We recently completed a number of improvements to our CardShark API (a human-powered business card transcription service), and we want to share the major lessons we learned while the API was in beta.
The most common issue, and really the only complaint we heard during the beta, was card transcription accuracy. We think these improvements go a long way toward solving that problem: a developer can now optionally request multiple transcriptions of every card they submit.
When a developer requests multiple transcriptions of a business card, the results are compared field by field until each data element on the card converges on a value that multiple transcribers agree upon. To control this, we’ve added a parameter (verified=low | medium | high) that sets how many transcription entries must converge before a data element is considered valid.
You can learn more about the new verified parameter in our CardShark API docs.
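To make the field-by-field comparison more concrete, here is a minimal sketch of the idea. It is an illustration only, not the CardShark implementation: the function name `converge_fields`, the dict-based card format, the `required_agreement` threshold, and the light normalization (trimming whitespace, ignoring case) are all assumptions made for the example.

```python
from collections import Counter


def converge_fields(transcriptions, required_agreement):
    """Compare transcriptions field by field (hypothetical sketch).

    A field is accepted once at least `required_agreement` transcribers
    entered the same value, ignoring case and surrounding whitespace.
    Returns (accepted, unresolved).
    """
    accepted, unresolved = {}, []
    field_names = sorted({name for t in transcriptions for name in t})
    for name in field_names:
        first_seen = {}      # normalized value -> first spelling we saw
        counts = Counter()   # normalized value -> number of transcribers
        for t in transcriptions:
            raw = (t.get(name) or "").strip()
            if not raw:
                continue
            key = raw.lower()
            first_seen.setdefault(key, raw)
            counts[key] += 1
        if counts:
            key, count = counts.most_common(1)[0]
            if count >= required_agreement:
                accepted[name] = first_seen[key]
                continue
        unresolved.append(name)
    return accepted, unresolved


# Three made-up transcriptions of the same card.
cards = [
    {"given_name": "Jane", "email": "jane@example.com", "phone": "555-0100"},
    {"given_name": "Jane", "email": "jane@example.com", "phone": "555-0108"},
    {"given_name": "jane ", "email": "Jane@Example.com", "phone": "555-0100"},
]
accepted, unresolved = converge_fields(cards, required_agreement=3)
```

With a threshold of three, the name and email converge because all three entries match after normalization, while the phone number stays unresolved because only two of the three entries agree. A real service would presumably request another transcription at that point rather than give up.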
Business cards often carry ambiguous data, so human interpretation will always introduce some errors. That said, here are some high-level results from our testing with US-based business cards that had limited ambiguity.
- (verified=low) – Single transcription – Roughly 70% of cards transcribed were error-free. Cards that did have errors typically had about 1.5 incorrect fields per transcription (given name, family name, email address, phone number, etc.).
- (verified=medium) – Single verification capped at 4 transcriptions – Roughly 95% of cards transcribed were error-free.
- (verified=high) – Double verification capped at 6 transcriptions – Roughly 99% of cards transcribed were error-free.
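One way to read the three levels above is as a pair of settings: how many matching entries a value needs, and how many transcriptions may be requested before giving up. The sketch below encodes that reading for a single field. The caps (1, 4, 6) come from the list above; the agreement counts (1, 2, 3) are an assumed interpretation of "single" and "double" verification, and the names `LEVELS` and `verify_field` are hypothetical, so check the API docs for the actual behavior.

```python
from collections import Counter

# Caps are taken from the results above; agreement counts are assumed.
LEVELS = {
    "low": {"agreement": 1, "cap": 1},
    "medium": {"agreement": 2, "cap": 4},
    "high": {"agreement": 3, "cap": 6},
}


def verify_field(entries, verified="medium"):
    """Consume entries one at a time until a value converges or the cap is hit.

    Returns (value, transcriptions_used); value is None if nothing converged.
    """
    level = LEVELS[verified]
    counts = Counter()
    used = 0
    for entry in entries:
        if used == level["cap"]:
            break  # cap reached without agreement
        used += 1
        counts[entry] += 1
        if counts[entry] >= level["agreement"]:
            return entry, used
    return None, used


# Made-up phone number entries from successive transcribers.
value, used = verify_field(["555-0100", "555-0108", "555-0100"], "medium")
```

Here the third entry confirms the first, so the field converges after three transcriptions. If four entries in a row all disagreed, the medium level would stop at its cap and report no agreed value.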
Thanks to everyone who gave us feedback on the API over the past few months. We will be taking the CardShark API endpoint out of beta on January 2nd. Pricing for the endpoint can be found on our pricing page.
Let us know your thoughts, and please send us any questions you have.