Expanded data processing for improved entity matching
First Claim
1. A method for improving reliability in entity matching, comprising:
receiving, at an entity matching system, a request for an identifier associated with an entity;
parsing the request for requesting demographic fields data related to the entity;
retrieving records that include indexed demographic data, wherein at least one of the indexed demographic data for each record retrieved matches a corresponding field from the requesting demographic fields data;
augmenting the indexed demographic data with expanded demographic data, wherein the expanded demographic data are received from a third party;
determining a matching confidence for whether the entity is associated with a given record of the records based on the requesting demographic fields data that match the indexed demographic data as augmented by the expanded demographic data;
in response to the matching confidence satisfying a threshold, selecting a prior-assigned identifier that is associated with the given record as the identifier for the entity;
in response to the matching confidence not satisfying the threshold, creating a new identifier as the identifier for the entity;
transmitting the identifier to the requestor; and
in response to receiving updated expanded demographic data:
locating the new identifier;
retrieving demographic data related to the new identifier;
retrieving existing records that include prior-indexed demographic data, wherein at least one of the prior-indexed demographic data for each existing record retrieved matches a corresponding field from the demographic data related to the new identifier;
augmenting the prior-indexed demographic data with the updated expanded demographic data;
determining an updated matching confidence for whether the entity is associated with a prior-existing identifier based on the demographic data related to the new identifier that match the prior-indexed demographic data as augmented by the updated expanded demographic data; and
in response to the updated matching confidence satisfying the threshold:
selecting the prior-existing identifier as the identifier for the entity;
associating the demographic data related to the new identifier with the prior-existing identifier; and
removing the new identifier from an identifier index.
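The first half of the claim (retrieve, augment, score, then select or create an identifier) can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the `Record` type, the `augment`, `confidence`, and `resolve` helpers, the fraction-of-matching-fields score, and the `THRESHOLD` value are all assumptions introduced here, since the claim does not specify a scoring rule or a threshold.

```python
from dataclasses import dataclass
from itertools import count

THRESHOLD = 0.75  # hypothetical value; the claim does not fix one


@dataclass
class Record:
    identifier: str
    demographics: dict


def augment(demographics, expanded):
    """Overlay third-party fields; indexed values win on conflict (an assumption)."""
    return {**expanded, **demographics}


def confidence(requested, candidate):
    """Assumed scoring rule: fraction of requested fields whose values match."""
    if not requested:
        return 0.0
    hits = sum(1 for k, v in requested.items() if candidate.get(k) == v)
    return hits / len(requested)


def resolve(requested, index, expanded_by_id, new_ids):
    """Return a prior-assigned identifier on a confident match, else create one."""
    # Retrieve records sharing at least one field value with the request.
    candidates = [
        r for r in index
        if any(r.demographics.get(k) == v for k, v in requested.items())
    ]
    best_id, best_score = None, 0.0
    for r in candidates:
        merged = augment(r.demographics, expanded_by_id.get(r.identifier, {}))
        score = confidence(requested, merged)
        if score > best_score:
            best_id, best_score = r.identifier, score
    if best_id is not None and best_score >= THRESHOLD:
        return best_id
    # No confident match: create and index a new identifier.
    new_id = f"new-{next(new_ids)}"
    index.append(Record(new_id, dict(requested)))
    return new_id
```

Under these assumptions, a request carrying a name, postal code, phone number, and e-mail address would match an indexed record holding only the name and postal code once the third-party data supplies the phone number and e-mail address; without that expanded data the same request falls below the threshold and receives a new identifier.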
Abstract
Improvements in data processing to match entities allow for more accurate records to be kept, with less memory storage used and fewer processing resources to be expended when accessing records. When receiving a request for an identifier for an entity, the request is parsed to identify various demographic fields within the request. A probabilistic search is performed to compare the entity to the candidate records that are augmented with expanded demographic data, which improve the reliability in matching the requested entity to its records. As updates are made to the external resources, the requests are rerun to update the internal resource and to eliminate any new identifiers created due to non-updated data, thus reducing data storage overhead.
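The rerun described at the end of the abstract, where identifiers created because of stale external data are folded back into existing records, might look something like the sketch below. It is a minimal illustration under stated assumptions: the index is modelled as a plain dictionary from identifier to demographic fields, `provisional_ids` is a hypothetical set tracking identifiers created on a failed match, and the `score` rule and `THRESHOLD` value are placeholders rather than anything the patent specifies.

```python
THRESHOLD = 0.75  # hypothetical value


def score(requested, candidate):
    """Assumed scoring rule: fraction of requested fields that match."""
    if not requested:
        return 0.0
    return sum(candidate.get(k) == v for k, v in requested.items()) / len(requested)


def reconcile(index, provisional_ids, updated_expanded):
    """Re-match provisional identifiers after the external data is updated.

    index: dict mapping identifier -> demographic fields.
    provisional_ids: set of identifiers created when no match was found.
    updated_expanded: dict mapping identifier -> updated third-party fields.
    Returns a dict mapping each eliminated identifier to its replacement.
    """
    merged_into = {}
    for new_id in sorted(provisional_ids):
        if new_id not in index:
            continue
        data = index[new_id]
        best_id, best = None, 0.0
        for old_id, old in index.items():
            if old_id in provisional_ids:
                continue
            # Only consider existing records sharing at least one field value.
            if not any(old.get(k) == v for k, v in data.items()):
                continue
            augmented = {**updated_expanded.get(old_id, {}), **old}
            s = score(data, augmented)
            if s > best:
                best_id, best = old_id, s
        if best_id is not None and best >= THRESHOLD:
            # Associate the data with the prior-existing identifier,
            # then remove the provisional identifier from the index.
            index[best_id] = {**data, **index[best_id]}
            del index[new_id]
            merged_into[new_id] = best_id
    provisional_ids.difference_update(merged_into)
    return merged_into
```

Removing the provisional entry is what yields the storage saving the abstract mentions: after reconciliation, one record carries the combined demographic data instead of two.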
17 Citations
18 Claims
1. A method for improving reliability in entity matching, comprising:
receiving, at an entity matching system, a request for an identifier associated with an entity;
parsing the request for requesting demographic fields data related to the entity;
retrieving records that include indexed demographic data, wherein at least one of the indexed demographic data for each record retrieved matches a corresponding field from the requesting demographic fields data;
augmenting the indexed demographic data with expanded demographic data, wherein the expanded demographic data are received from a third party;
determining a matching confidence for whether the entity is associated with a given record of the records based on the requesting demographic fields data that match the indexed demographic data as augmented by the expanded demographic data;
in response to the matching confidence satisfying a threshold, selecting a prior-assigned identifier that is associated with the given record as the identifier for the entity;
in response to the matching confidence not satisfying the threshold, creating a new identifier as the identifier for the entity;
transmitting the identifier to the requestor; and
in response to receiving updated expanded demographic data:
locating the new identifier;
retrieving demographic data related to the new identifier;
retrieving existing records that include prior-indexed demographic data, wherein at least one of the prior-indexed demographic data for each existing record retrieved matches a corresponding field from the demographic data related to the new identifier;
augmenting the prior-indexed demographic data with the updated expanded demographic data;
determining an updated matching confidence for whether the entity is associated with a prior-existing identifier based on the demographic data related to the new identifier that match the prior-indexed demographic data as augmented by the updated expanded demographic data; and
in response to the updated matching confidence satisfying the threshold:
selecting the prior-existing identifier as the identifier for the entity;
associating the demographic data related to the new identifier with the prior-existing identifier; and
removing the new identifier from an identifier index.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A system for improving reliability in entity matching, comprising:
a processor; and
a memory storage device including instructions that when executed by the processor are operable to:
receive, from a requestor system, a request for an identifier associated with an entity;
parse the request for requesting demographic fields data related to the entity;
retrieve records that include indexed demographic data, wherein at least one of the indexed demographic data for each record retrieved matches a corresponding field from the requesting demographic fields data;
augment the indexed demographic data with expanded demographic data, wherein the expanded demographic data are received from a third party;
determine a matching confidence for whether the entity is associated with a given record of the records based on the requesting demographic fields data that match the indexed demographic data as augmented by the expanded demographic data;
in response to the matching confidence satisfying a threshold, select a prior-assigned identifier that is associated with the given record as the identifier for the entity;
in response to the matching confidence not satisfying the threshold, create a new identifier as the identifier for the entity;
transmit the identifier to the requestor; and
in response to receiving updated expanded demographic data:
locate the new identifier;
retrieve demographic data related to the new identifier;
retrieve existing records that include prior-indexed demographic data, wherein at least one of the prior-indexed demographic data for each existing record retrieved matches a corresponding field from the demographic data related to the new identifier;
augment the prior-indexed demographic data with the updated expanded demographic data;
determine an updated matching confidence for whether the entity is associated with a prior-existing identifier based on the demographic data related to the new identifier that match the prior-indexed demographic data as augmented by the updated expanded demographic data; and
in response to the updated matching confidence satisfying the threshold:
select the prior-existing identifier as the identifier for the entity;
associate the demographic data related to the new identifier with the prior-existing identifier; and
remove the new identifier from an identifier index.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. A system for improving reliability in entity matching, comprising:
a processor; and
a memory storage device including instructions that when executed by the processor are operable to provide a matching system operable to receive identity requests for a given entity and produce an identifier response that includes an identifier for the given entity, the matching system including:
an index, storing indexed demographic data for entities, wherein each entity of the entities is associated with an identifier, and wherein the indexed demographic data include one or more indexed demographic data fields with one or more values that are associated with the entities;
a blocking system, operable to parse the identity requests for requesting demographic data fields and return blocks of candidate entities, wherein the candidate entities included in a given block match the requesting demographic data fields according to one or more values of the one or more indexed demographic data fields associated with the candidate entities; and
a matching engine, operable to determine a confidence score for each candidate entity included in the given block to determine whether a matching entity for the given entity is present in the given block, wherein when a matching entity is determined to be present in the given block, the identifier included in the identifier response is selected as a prior-existing identifier associated with the matching entity in the index, and wherein when a matching entity is determined to not be present in the given block, the identifier included in the identifier response is created as a new identifier which is associated with the given entity in the index,
wherein the matching system:
in response to receiving updated expanded demographic data:
locates the new identifier;
retrieves demographic data related to the new identifier;
retrieves existing records that include prior-indexed demographic data, wherein at least one of the prior-indexed demographic data for each existing record retrieved matches a corresponding field from the demographic data related to the new identifier;
augments the prior-indexed demographic data with the updated expanded demographic data;
determines an updated matching confidence for whether the given entity is associated with a prior-existing identifier based on the demographic data related to the new identifier that match the prior-indexed demographic data as augmented by the updated expanded demographic data; and
in response to the updated matching confidence satisfying the threshold:
selects the prior-existing identifier as the identifier for the given entity;
associates the demographic data related to the new identifier with the prior-existing identifier; and
removes the new identifier from the index.
Dependent claims: 18
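The division of labour between the blocking system and the matching engine in claim 17 can be illustrated with a small sketch. Everything named here is hypothetical: `BlockingSystem` uses an inverted index keyed on (field, value) pairs as one plausible way to return a block of candidates, and `MatchingEngine` scores each candidate by the fraction of requested fields that match. The claim itself does not prescribe either technique, and field values are assumed to be hashable.

```python
from collections import defaultdict


class BlockingSystem:
    """Hypothetical blocker: inverted index from (field, value) to identifiers."""

    def __init__(self):
        self._by_key = defaultdict(set)

    def add(self, identifier, demographics):
        for item in demographics.items():
            self._by_key[item].add(identifier)

    def block(self, requested):
        """Return identifiers sharing at least one field value with the request."""
        found = set()
        for item in requested.items():
            found |= self._by_key.get(item, set())
        return found


class MatchingEngine:
    """Hypothetical engine: scores each candidate in a block against the request."""

    def __init__(self, threshold):
        self.threshold = threshold

    def best_match(self, requested, block, index):
        """Return the best identifier meeting the threshold, or None."""
        if not requested or not block:
            return None
        scored = {
            i: sum(index[i].get(k) == v for k, v in requested.items()) / len(requested)
            for i in block
        }
        best = max(scored, key=scored.get)
        return best if scored[best] >= self.threshold else None
```

Blocking keeps the expensive per-candidate scoring confined to records that already share a field value with the request; a `None` result from `best_match` corresponds to the branch of the claim in which a new identifier is created.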
Specification