Explorable Names

In a previous post I analyzed the Social Security Administration’s (SSA) name dataset to determine what information could be obtained from just a person’s first name. It was interesting to discover examples of how pop culture and historical events shaped the evolution of particular names. In this follow-up I expanded that post into an interactive application. Type in a name to get started. The four panels illustrate how the submitted name’s popularity varies over time and by geographic region, and how it compares to other names. Tap on the help button for more information.

How does it work?

The application combines three datasets:

The original name dataset referenced in the previous blog post. It lists the number of individuals born with each name in each year from 1910 to 2020.
An expanded dataset that breaks down the first dataset by state.
US actuarial tables, from which I derived survival curves. These let me estimate how many individuals with a given name, born in any given year, are still alive (for the ‘Age’ panel).
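To make the third item concrete, here is a minimal sketch of how a survival curve could be built from an actuarial table and applied to birth counts. It assumes the table gives a probability of death at each age, which is a common layout but not confirmed by the post; the function names and the numbers are hypothetical and are not the application's actual code or data.

```python
def survival_curve(death_prob_by_age):
    """Convert per-age death probabilities q[x] into S[x], the chance of reaching age x."""
    curve = [1.0]  # everyone is alive at age 0
    for q in death_prob_by_age:
        curve.append(curve[-1] * (1.0 - q))
    return curve


def estimate_alive(births_by_year, curve, current_year):
    """Estimate how many people born in each year are still alive in current_year."""
    alive = {}
    for birth_year, births in births_by_year.items():
        age = current_year - birth_year
        if age < 0:
            continue  # skip years after current_year
        # Ages beyond the end of the table are treated as having no survivors.
        p_survive = curve[age] if age < len(curve) else 0.0
        alive[birth_year] = births * p_survive
    return alive


# Made-up numbers for illustration only (not real SSA or actuarial values).
made_up_death_probs = [0.006, 0.0004, 0.0003]
made_up_births = {2018: 1200, 2019: 1100, 2020: 1000}
curve = survival_curve(made_up_death_probs)
print(estimate_alive(made_up_births, curve, current_year=2020))
```

Real actuarial tables are usually split by sex, so in practice one curve per sex would be applied to the matching birth counts.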

Observations

One interesting outcome we can derive is the most prominent name for each state. If we were to look at the most popular names for each state, they would likely all be similar (in recent years: ‘Noah’ and ‘Emma’). In contrast, prominence looks at how unique a name is for a given state, irrespective of its overall popularity. You can think of a state’s most prominent name as the one that best distinguishes it from the rest of the country.
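The exact prominence formula is not spelled out here, so the following is only one plausible way to quantify it: divide a name's share of births within a state by its share of births nationwide, and pick the name with the highest ratio. The function name and the counts are hypothetical.

```python
from collections import defaultdict


def most_prominent_names(counts):
    """counts maps (state, name) -> births; returns {state: most prominent name}."""
    state_totals = defaultdict(int)
    name_totals = defaultdict(int)
    grand_total = 0
    for (state, name), births in counts.items():
        state_totals[state] += births
        name_totals[name] += births
        grand_total += births

    best = {}
    for (state, name), births in counts.items():
        state_share = births / state_totals[state]
        national_share = name_totals[name] / grand_total
        score = state_share / national_share  # above 1 means over-represented in the state
        if state not in best or score > best[state][1]:
            best[state] = (name, score)
    return {state: name for state, (name, _) in best.items()}


# Made-up counts for illustration only.
made_up_counts = {
    ("CO", "Noah"): 100, ("CO", "Aspen"): 20,
    ("TX", "Noah"): 300, ("TX", "Santos"): 30, ("TX", "Aspen"): 5,
}
print(most_prominent_names(made_up_counts))
```

In this toy data ‘Noah’ is the most popular name in both states, yet it is not the most prominent in either, because its share is roughly the same everywhere. A ratio like this can be noisy for very rare names, so a real implementation would probably need a minimum count or some smoothing.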

Most Prominent Name (Per State)

State	Male	Female
AK	Orion	Aurora
AL	Willie	Mattie
AR	Billy	Jewel
AZ	Tatum	Reyna
CA	Salvador	Mayra
CO	Bridger	Aspen
CT	Salvatore	Giuliana
DC	Davon	India
DE	Nasir	Aniyah
FL	Giancarlo	Anabella
GA	Willie	Ansley
HI	Keanu	Malia
IA	Darwin	Jolene
ID	Bridger	Oakley
IL	Anton	Angeline
IN	Rex	Gracelynn
KS	Kale	Jolene
KY	Denver	Briley
LA	Lionel	Demi
MA	Seamus	Maeve
MD	Davon	Amirah
ME	Marcel	Juliette
MI	Hassan	Joslyn
MN	Anders	Maren
MO	Kolten	Willa
MS	Willie	Jakayla
MT	Bridger	Aspen
NC	Turner	Annie
ND	Anton	Tenley
NE	Briggs	Joslyn
NH	Roland	Maeve
NJ	Yehuda	Chana
NM	Santana	Adelina
NV	Nixon	Litzy
NY	Chaim	Chaya
OH	Denver	Halle
OK	Cale	Charley
OR	Soren	Juniper
PA	Francis	Angeline
RI	Marcel	Michaela
SC	Willie	Hattie
SD	Briggs	Tenley
TN	Houston	Briley
TX	Santos	Guadalupe
UT	Bridger	Oakley
VA	Ryland	Emory
VT	Marcel	Juniper
WA	Soren	Juniper
WI	Anton	Angeline
WV	Denver	Willa
WY	Bridger	Aspen

Some of these prominent names relate to a state’s natural environment, such as ‘Aspen’ in Colorado, or ‘Orion’ and ‘Aurora’ in Alaska. Others are associated with a state’s demographics: New York has ‘Chaim’ and ‘Chaya’ owing to its large Orthodox Jewish population. Similarly, Michigan (‘Hassan’) and Texas (‘Santos’) have names corresponding to their Muslim and Hispanic populations, respectively.

Limitations

Only the top 1000 male and female names (as measured from 2000-2020) are considered due to performance constraints. As this website is served statically, keeping download sizes small was a challenge in developing this application. For comparison, there are approximately 250,000 unique names in the name database, half of which have fewer than five individuals.
As the names come from the SSA, the dataset only covers names of individuals born in the US.
For privacy reasons, any names with fewer than five individuals in a given year are omitted. This means that if you have a particularly obscure name, you might not see any births registered in your birth year (assuming it’s in the dataset at all).
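The first limitation amounts to ranking names by total births inside a year window and keeping the top N. A minimal sketch of that selection, assuming records arrive as (name, year, births) tuples; the function name and the sample records are hypothetical.

```python
from collections import Counter


def top_names(records, n, first_year, last_year):
    """Return the n names with the most births between first_year and last_year, inclusive."""
    totals = Counter()
    for name, year, births in records:
        if first_year <= year <= last_year:
            totals[name] += births
    return [name for name, _ in totals.most_common(n)]


# Made-up records for illustration only.
made_up_records = [
    ("Emma", 2005, 50),
    ("Noah", 2010, 40),
    ("Mildred", 1920, 900),  # outside the window, so it does not count
    ("Mildred", 2001, 5),
    ("Aspen", 2015, 10),
]
print(top_names(made_up_records, n=2, first_year=2000, last_year=2020))
```

To mirror the separate male and female lists, the same function would be called once per sex on pre-split records.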

Perhaps one day I too will enjoy a novelty miniature license plate. To paraphrase a friend of mine with an equally obscure name: I dream of the day I enter a gift shop, spin that rotating stand, and see those magical four letters...


Itai Katz
