We’re in the era of data – tsunamis of data, in fact, growing exponentially. With them come concerns about what we can learn from the data – separating the melody from the noise – as well as worries surrounding privacy and fraud. Entity resolution (ER) is an important tool for addressing these data issues. On the cutting edge of ER technology is Jeff Jonas, CEO and Chief Scientist of Senzing, Inc., an artificial intelligence (AI)-based software company focused on ER.
What is Entity Resolution (ER)?
In the book “Entity Resolution and Information Quality,” writer Terry Talley states:
Entity resolution is the process of probabilistically identifying some real thing based upon a set of possibly ambiguous clues. Humans have been performing entity resolution throughout history. Early humans looked at footprints and tried to match that clue to the animals that made the tracks.
In the same book, writer John R. Talburt gives it a more formal, scientific view:
Entity resolution is about determining when references to real-world entities are equivalent (refer to the same entity) or not equivalent (refer to different entities). Linking is appending a common identifier to reference instances to denote the decision that they are equivalent. Identity resolution, record linking, record matching, record deduplication, merge-purge, and entity analytics all represent particular forms or aspects of ER.
Jeff Jonas gives ER a simpler, more approachable definition: ER determines “who is who and who is related to who” in data.
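To make the "linking" part of these definitions concrete, here is a minimal Python sketch. It is not Senzing's method or the approach from the book; it simply assumes that two records sharing an exact email or phone value refer to the same entity, and it appends a common identifier to every record in the same group using a small union-find structure. The records, field names and the `link` function are all hypothetical.

```python
# Hypothetical records; names, emails and phone numbers are illustrative only.
records = [
    {"id": "r1", "name": "K. Coggs", "email": "kcoggs@example.com", "phone": "555-0100"},
    {"id": "r2", "name": "KT Coggs", "email": "kcoggs@example.com", "phone": None},
    {"id": "r3", "name": "Katie Coggs", "email": None, "phone": "555-0100"},
    {"id": "r4", "name": "Kathryn Coggs", "email": "kathryn@example.com", "phone": "555-0199"},
]


def link(records, keys=("email", "phone")):
    """Return {record id: entity id}, assuming a shared key value means same entity."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Follow parent pointers to the group's representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller id as the representative so results are stable.
            parent[max(ra, rb)] = min(ra, rb)

    first_seen = {}  # (field, value) -> id of the first record carrying it
    for r in records:
        for k in keys:
            v = r.get(k)
            if v is None:
                continue
            if (k, v) in first_seen:
                union(r["id"], first_seen[(k, v)])
            else:
                first_seen[(k, v)] = r["id"]

    return {r["id"]: find(r["id"]) for r in records}


entity_ids = link(records)
for r in records:
    print(r["name"], "->", entity_ids[r["id"]])
```

In this toy data, the first three records end up with the same identifier because they are chained together by a shared email and a shared phone number, while the fourth stays separate. Real ER relies on probabilistic, fuzzy evidence rather than exact matches, so treat this only as an illustration of the linking idea.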
Why is ER Important?
Entity resolution is important to a business because it tries to create a single version of the “truth” for any given entity or thing that the business deals with. An example many companies can relate to is the “single customer view.” A company may have many different systems holding separate pieces of information – purchase history, demographics, credit info, points or loyalty programs and so on – all for the same person. (Or is it the same person?) Say your company used its systems to send an email blast that included K. Coggs, KT Coggs, Katie Coggs and Kathryn Coggs. Is there just one Ms. Coggs who is now annoyed with your company for getting four emails, or are there two, three or four separate customers?
Beyond customer experience, having a single customer view gives cleaner data for advanced and predictive analytics. Otherwise, based on Katie Coggs’ purchase history you might erroneously send a related new product advertisement and coupon to her mother, Kathryn.
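The Coggs example can be sketched in a few lines of Python to show why names alone cannot settle the question. The rule below is an assumption made for illustration, not a real matching algorithm: two names are "compatible" if the surnames match, the given names start with the same letter, and either one is an initial or short abbreviation or the given names are identical. The function names are hypothetical.

```python
from itertools import combinations

names = ["K. Coggs", "KT Coggs", "Katie Coggs", "Kathryn Coggs"]


def split_name(full):
    """Split 'Given Surname' into lowercase (given, surname), dropping periods."""
    given, surname = full.rsplit(" ", 1)
    return given.replace(".", "").lower(), surname.lower()


def compatible(a, b):
    """Assumed rule: could these two names plausibly belong to one person?"""
    given_a, surname_a = split_name(a)
    given_b, surname_b = split_name(b)
    if surname_a != surname_b or given_a[0] != given_b[0]:
        return False
    # Treat one- or two-letter given names as initials or abbreviations.
    if len(given_a) <= 2 or len(given_b) <= 2:
        return True
    return given_a == given_b


pairs = {(a, b): compatible(a, b) for a, b in combinations(names, 2)}
for (a, b), ok in pairs.items():
    print(f"{a!r} vs {b!r}: {'compatible' if ok else 'conflict'}")
```

Under this rule, "K. Coggs" and "KT Coggs" are compatible with both Katie and Kathryn, yet Katie and Kathryn conflict with each other. The names by themselves cannot say whose record "K. Coggs" is, which is why ER draws on additional attributes such as addresses, phone numbers or dates of birth.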
It can get more serious than emails and coupons. Has anyone at your bank noticed that several people are using the same assets as loan collateral at different branches? Finding fraud is a major driver of entity resolution.
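As a rough sketch of the bank scenario, the Python below groups loans by collateral and flags any asset pledged by more than one borrower at more than one branch. The loan records, field names and the `shared_collateral` function are hypothetical, and the sketch assumes each asset already carries a clean, consistent identifier. In practice, working out that two differently described assets are the same thing is itself an entity resolution problem.

```python
from collections import defaultdict

# Hypothetical loan records; all values are illustrative only.
loans = [
    {"loan": "L-1", "branch": "North", "borrower": "A. Smith", "collateral": "VIN-123"},
    {"loan": "L-2", "branch": "South", "borrower": "B. Jones", "collateral": "VIN-123"},
    {"loan": "L-3", "branch": "North", "borrower": "C. Lee", "collateral": "DEED-77"},
    {"loan": "L-4", "branch": "East", "borrower": "D. Park", "collateral": "VIN-123"},
]


def shared_collateral(loans):
    """Return {asset: [loan ids]} for assets pledged by several borrowers across branches."""
    by_asset = defaultdict(list)
    for loan in loans:
        by_asset[loan["collateral"]].append(loan)

    flagged = {}
    for asset, group in by_asset.items():
        branches = {entry["branch"] for entry in group}
        borrowers = {entry["borrower"] for entry in group}
        if len(branches) > 1 and len(borrowers) > 1:
            flagged[asset] = sorted(entry["loan"] for entry in group)
    return flagged


flagged = shared_collateral(loans)
print(flagged)
```

A flagged asset is only a lead for a human reviewer to check, not proof of fraud.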
ER is also important in light of the data privacy movement, which has given rise to laws and regulations like the EU’s General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). These laws and others like them give individuals a level of control over how companies use their collected data. This includes what’s generally referred to as the “right to be forgotten” – having a company delete the requester’s information as mandated by the applicable law. Hefty fines can be imposed for non-compliance. Imagine KT Coggs sent your company a “right to be forgotten” request. Will you be paying a fine when the auditors find K. Coggs’ info in your systems – and determine that K. and KT are the same person?
Jeff Jonas – “Wizard of Big Data”
Jeff Jonas tackles all these ER issues, and more, at Senzing. His work ranges from the simple – finding duplicates in contact lists – to the complex – searching for criminal identities in real time across thousands of data sources and billions of pieces of information.
“Wizard of Big Data” is a moniker that stuck after a 2014 National Geographic feature article that covered his life and work. As related in the article, some of Jonas’ high-profile ER cases have included identifying potential terrorists, detecting fraudulent behavior in casinos, connecting loved ones after a natural disaster, and modernizing voter registration systems. Describing fraud detection in casinos, the National Geographic article said:
Using available and legally obtained data (Jonas emphasizes the program has built-in privacy safeguards) – such as employee records, phone numbers, addresses, job applications, hotel reservations, customer loyalty program information, and the gaming commission’s list of banned players – (Jonas’s program) figures out if an employee and a bad guy are related, live near each other, or share the same phone number; it may also detect if a guest has links to an employee.
Jonas’ goal is to make high-quality ER available to mainstream companies. Senzing’s collateral says, “You don’t need a million-dollar-plus budget, expensive ER experts, or a large number of IT resources to deploy Senzing ER.” The company offers a plug-and-play desktop app for real-time, AI-based ER and a more advanced API version for developers.
Jonas works on innovation, national security and privacy with government leaders, think tanks and executives all over the world. He is the author or co-author of more than a dozen patents, and his work has been featured in documentaries and books. He is one of only five people in the world who have completed every Ironman triathlon currently on the global circuit.
See Jeff Jonas live on “Innovation Sandbox, Powered by Prolifics”
Jeff Jonas will be our first guest for the inaugural season of “Innovation Sandbox, Powered by Prolifics,” a new YouTube Live series featuring today’s brightest minds, latest tech and most creative ideas. This first episode – “We’re LIVE to see the Wizard!” – airs Thursday, July 9, from 10 to 10:30 a.m. ET on Prolifics TV.
Prolifics’ chief technology officer and Innovation Center leader Greg Hodgkinson will host this episode. Join us and get ready for a dynamic discussion with Jonas on topics ranging from how ER affects the average person to Ironman triathlon training during a pandemic.
More information and registration here.