Explorable Names

In a previous post I analyzed the Social Security Administration’s (SSA) name dataset to determine what information could be obtained from just a person’s first name. It was interesting to discover examples of how pop culture and historical events shaped the evolution of particular names. In this follow-up I expanded that post into an interactive application. Type in a name to get started. The four panels illustrate how the submitted name’s popularity varies over time and by geographic region, and how it compares to other names. Tap on the help button for more information.

How does it work?

The application combines three datasets:

  1. The original name dataset referenced in the previous blog post. This data lists the number of individuals born with any given name, for the years 1910-2020.
  2. An expanded dataset that breaks down the first dataset by state.
  3. US actuarial tables, from which I derived survival curves. These let me estimate the number of individuals still alive with a given name from any birth year (for the 'Age' panel).
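
To make the third item concrete, here is a minimal sketch of how a survival curve can turn birth counts into an estimate of people still alive. It is an illustration only, not the application's actual code: the function name `estimate_alive`, the dictionary layout, and every number in the example are hypothetical. It also assumes a single survival curve, whereas real actuarial tables are usually split by sex.

```python
def estimate_alive(births_by_year, survival_by_year):
    """Estimate how many people with a given name are still alive.

    births_by_year:   {birth_year: number of births with the name}
    survival_by_year: {birth_year: probability of still being alive today},
                      assumed to have been derived from actuarial tables
    Returns {birth_year: estimated number alive}.
    """
    alive = {}
    for year, births in births_by_year.items():
        # Years missing from the survival curve are skipped rather than guessed
        if year in survival_by_year:
            alive[year] = births * survival_by_year[year]
    return alive


# Made-up numbers, purely for illustration
births = {1940: 1000, 1980: 2000, 2015: 500}
survival = {1940: 0.25, 1980: 0.5, 2015: 1.0}
alive = estimate_alive(births, survival)
total_alive = sum(alive.values())
```

Plotting the per-year estimates against age (the current year minus the birth year) would give an age distribution of the kind the 'Age' panel describes.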

Observations

One interesting outcome we can derive is the most prominent name for each state. If we were to look at the most popular names for each state, they would likely all be similar (in recent years: ‘Noah’ and ‘Emma’). In contrast, prominence looks at how unique a name is for a given state, irrespective of its overall popularity. You can think of a state’s most prominent name as the one that best distinguishes it from the rest of the country.
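
The post does not spell out the formula behind prominence, so the sketch below uses one plausible definition as an assumption: a name's share of births within a state divided by its share of births nationally. The function name `most_prominent_names` and the counts are hypothetical.

```python
from collections import defaultdict


def most_prominent_names(counts):
    """counts: {(state, name): births}. Returns {state: most prominent name}.

    Assumed definition (not taken from the post): prominence is the name's
    share of births within the state divided by its share nationally.
    """
    state_totals = defaultdict(int)
    name_totals = defaultdict(int)
    for (state, name), n in counts.items():
        state_totals[state] += n
        name_totals[name] += n
    national_total = sum(state_totals.values())

    best = {}  # state -> (prominence, name)
    for (state, name), n in counts.items():
        state_share = n / state_totals[state]
        national_share = name_totals[name] / national_total
        prominence = state_share / national_share
        if state not in best or prominence > best[state][0]:
            best[state] = (prominence, name)
    return {state: name for state, (_, name) in best.items()}


# Made-up counts, purely for illustration
counts = {
    ("AK", "Noah"): 90, ("AK", "Orion"): 10,
    ("TX", "Noah"): 900, ("TX", "Orion"): 5, ("TX", "Santos"): 95,
}
result = most_prominent_names(counts)
```

In this toy data 'Noah' is the most popular name in both states, yet the most prominent names differ ('Orion' for AK, 'Santos' for TX). A ratio like this is sensitive to very small counts, so a real implementation would probably need a minimum-count threshold as well.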

Most Prominent Name (Per State)

State Male Female
AK Orion Aurora
AL Willie Mattie
AR Billy Jewel
AZ Tatum Reyna
CA Salvador Mayra
CO Bridger Aspen
CT Salvatore Giuliana
DC Davon India
DE Nasir Aniyah
FL Giancarlo Anabella
GA Willie Ansley
HI Keanu Malia
IA Darwin Jolene
ID Bridger Oakley
IL Anton Angeline
IN Rex Gracelynn
KS Kale Jolene
KY Denver Briley
LA Lionel Demi
MA Seamus Maeve
MD Davon Amirah
ME Marcel Juliette
MI Hassan Joslyn
MN Anders Maren
MO Kolten Willa
MS Willie Jakayla
MT Bridger Aspen
NC Turner Annie
ND Anton Tenley
NE Briggs Joslyn
NH Roland Maeve
NJ Yehuda Chana
NM Santana Adelina
NV Nixon Litzy
NY Chaim Chaya
OH Denver Halle
OK Cale Charley
OR Soren Juniper
PA Francis Angeline
RI Marcel Michaela
SC Willie Hattie
SD Briggs Tenley
TN Houston Briley
TX Santos Guadalupe
UT Bridger Oakley
VA Ryland Emory
VT Marcel Juniper
WA Soren Juniper
WI Anton Angeline
WV Denver Willa
WY Bridger Aspen

Some of these prominent names relate to a state’s natural environment, such as ‘Aspen’ in Colorado, or ‘Orion’ and ‘Aurora’ in Alaska. Other prominent names are associated with a state’s demographics: New York has ‘Chaim’ and ‘Chaya’ owing to its large Orthodox Jewish population. Similarly, Michigan (‘Hassan’) and Texas (‘Santos’) have names corresponding to their Muslim and Hispanic populations, respectively.

Limitations

  • Only the top 1000 male and female names (as measured from 2000-2020) are considered due to performance constraints. As this website is served statically, keeping download sizes small was a challenge in developing this application. For comparison, there are approximately 250,000 unique names in the name database, half of which have fewer than five individuals.
  • As the names come from the SSA, the dataset only covers individuals born in the US.
  • For privacy reasons, any names with fewer than five individuals in a given year are omitted. This means that if you have a particularly obscure name, you might not see any births registered in your birth year (assuming it's in the dataset at all).

Perhaps one day I too will enjoy a novelty miniature license plate. To paraphrase a friend of mine with an equally obscure name: I dream of the day I enter a gift shop, spin that rotating stand, and see those magical four letters...
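
For readers curious about the top-1000 restriction in the first limitation, it amounts to a ranking step over a year window. The sketch below is an assumption about how such a filter could look, not the code behind this application: the function name `top_names`, the tuple layout, and the records are hypothetical.

```python
from collections import Counter


def top_names(records, n=1000, first_year=2000, last_year=2020):
    """records: iterable of (name, year, births) tuples for one sex.

    Returns the n names with the most births between first_year and
    last_year (inclusive), most common first.
    """
    totals = Counter()
    for name, year, births in records:
        # Births outside the measurement window do not count toward the rank
        if first_year <= year <= last_year:
            totals[name] += births
    return [name for name, _ in totals.most_common(n)]


# Made-up records, purely for illustration
records = [
    ("Emma", 2005, 300), ("Emma", 1950, 50),
    ("Gertrude", 1920, 900), ("Gertrude", 2010, 20),
    ("Aspen", 2015, 100),
]
kept = top_names(records, n=2)
```

Because only the 2000-2020 window is ranked, a name that was common a century ago but rare today (the hypothetical 'Gertrude' above) drops out of the list.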

Want to learn more? Join our meetup group! Bethesda Data Science Meetup