Name Similarity API Released

You’ve probably noticed a theme in the APIs we’ve been releasing recently: they all deal with processing people’s names. Today, we’re announcing the release of the Name Similarity API, which answers the question “How similar are these two names?”

What is the Name Similarity API?

The Name Similarity API compares two names, given as the string parameters q1 and q2, and returns a score indicating how similar they are. Although it’s easy for a human to look at two names like “Robbie” and “Robert” and realize they are similar, this is a fairly difficult question for a computer to answer. Luckily, there are established string similarity algorithms that analyze two strings and provide an algorithm-specific measure of similarity.

Why would you use it?

The primary use case for this endpoint is deciding, to some degree of certainty, whether two strings actually represent the same underlying name (e.g., “Robbie” and “Robert”).

How does it work?

Because the performance of different similarity algorithms can vary across data sets, the Name Similarity API returns results from three algorithms: Levenshtein, Jaro-Winkler, and Bigram Analysis using Dice’s Coefficient. It’s up to you, the developer, to determine which algorithms work best for your data set and use case. In the next few weeks, we will be releasing individual Name Similarity endpoints for each of these algorithms, allowing you to access the results independently.
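To get a feel for what two of these measures compute, here is a minimal Python sketch of Levenshtein similarity and a bigram Dice coefficient. It is an illustration only, not the code behind the API: the function names are made up for this example, normalizing the Levenshtein distance by the longer string's length is an assumption, and the bigram definition (plain adjacent character pairs, no padding) is also an assumption, so scores from the service may differ.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (or keep) a character
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Distance scaled to 0..1 by the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


def bigrams(s: str) -> set:
    """Set of adjacent character pairs, without padding (assumed definition)."""
    return {s[i:i + 2] for i in range(len(s) - 1)}


def dice_similarity(a: str, b: str) -> float:
    """Dice's coefficient over the two bigram sets."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 1.0
    return 2 * len(x & y) / (len(x) + len(y))


print(levenshtein_distance("Robbie", "Robert"))    # 3
print(levenshtein_similarity("Robbie", "Robert"))  # 0.5
print(dice_similarity("Robbie", "Robert"))         # 0.4
```

For “Robbie” and “Robert”, three substitutions are needed, so the sketch reports a distance of 3 and a similarity of 0.5. The Dice value depends heavily on how bigrams are formed, which is one reason to compare algorithms against your own data.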

Example Query

{
  "result": {
    "SimMetrics": {
      "jaroWinkler": {
        "similarity": 0.8444445,
        "timeEstimated": 0.00156312,
        "timeActual": 0
      },
      "levenshtein": {
        "similarity": 0.5,
        "timeEstimated": 0.00648,
        "timeActual": 0
      }
    },
    "SecondString": {
      "jaroWinkler": {
        "similarity": 0.8444444444444443,
        "timeTaken": "4 ms"
      },
      "levenshtein": {
        "similarity": 0.5,
        "timeTaken": "4 ms",
        "distance": 3
      },
      "level2jaroWinkler": {
        "similarity": 0.8444444444444443,
        "timeTaken": "5 ms"
      },
      "level2levenshtein": {
        "similarity": 0.5,
        "timeTaken": "2 ms",
        "distance": 3
      }
    },
    "FullContact": {
      "BigramAnalysis": {
        "dice": {
          "similarity": 0.5714285714,
          "timeTaken": "16 ms"
        }
      }
    }
  },
  "status": 200
}
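Since the scores in the JSON above are nested under several groups, it can help to flatten them before comparing algorithms. The following Python sketch is one way to do that, assuming only the structure shown in the sample; the helper name collect_similarities is hypothetical, and the embedded JSON is a trimmed copy of the sample rather than a live response.

```python
import json


def collect_similarities(node, path=()):
    """Walk nested dicts and map each dotted path to its "similarity" value."""
    scores = {}
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "similarity" and isinstance(value, (int, float)):
                scores[".".join(path)] = value
            else:
                scores.update(collect_similarities(value, path + (key,)))
    return scores


# Trimmed copy of the sample shown above, used here as stand-in input.
response_text = """
{
  "result": {
    "SimMetrics": {
      "jaroWinkler": {"similarity": 0.8444445},
      "levenshtein": {"similarity": 0.5}
    },
    "FullContact": {
      "BigramAnalysis": {"dice": {"similarity": 0.5714285714}}
    }
  },
  "status": 200
}
"""

payload = json.loads(response_text)
scores = collect_similarities(payload["result"])
for name, value in sorted(scores.items()):
    print(f"{name}: {value}")
```

From there, picking a threshold for “same name” is a judgment call that depends on your data set.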

Parameters

The q1 and q2 parameters allow you to pass two name strings to compare.
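As a rough illustration of passing the two names, this Python sketch URL-encodes q1 and q2 into a query string. The base URL is a placeholder invented for this example, not the real endpoint, and authentication (such as your API key) is left out; check the API docs for the actual address and required parameters.

```python
from urllib.parse import urlencode

# Placeholder address for illustration only; NOT the real endpoint.
BASE_URL = "https://api.example.com/name/similarity"


def build_similarity_url(q1: str, q2: str, base_url: str = BASE_URL) -> str:
    """Return a request URL with both names safely URL-encoded."""
    return base_url + "?" + urlencode({"q1": q1, "q2": q2})


url = build_similarity_url("Robbie", "Robert")
print(url)
```

Encoding the values matters for names containing spaces or punctuation, which would otherwise break the query string.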

Example Response

<person>
  <status>200</status>
  <likelihood>0.665</likelihood>
  <nameDetails>
    <givenName>Bart</givenName>
    <familyName>Lorang</familyName>
    <middleNames>
        <middleNam>D.</middleNam>
    </middleNames>
    <fullName>Bart D. Lorang</fullName>
  </nameDetails>
  <region>USA</region>
</person>

Get Started Today

As with all of our recently released Name APIs, the Name Similarity API is free to use. Just sign up for an API key to get started. And if you need more detailed information, refer to the FullContact API Docs.
