Best-value determination rules for an entity resolution system
First Claim
1. A computer-implemented method to reconcile distinct candidate values in an entity resolution system storing distinct identity records that resolve to a plurality of entities, the computer-implemented method comprising:
receiving a selection of an entity from the plurality of entities stored in the entity resolution system, wherein each entity is associated with a respective plurality of distinct identity records that previously resolved to the respective entity based on one or more entity resolution rules, wherein each identity record includes two or more attribute types and attribute values, wherein the attribute values in each identity record include information corresponding to one individual, wherein a relationship is identified between at least two of the plurality of entities;
subsequent to resolving the plurality of distinct identity records to the selected entity based on the one or more entity resolution rules, evaluating the plurality of distinct identity records of the selected entity against a plurality of representative value determination rules distinct from the one or more entity resolution rules, wherein each representative value determination rule has a respective rule name and specifies to evaluate a respective attribute type using a distinct criterion for selecting a representative value, wherein the evaluation for each representative value determination rule comprises:
identifying two or more attribute types associated with the respective representative value determination rule, the representative value determination rule specifying to qualify a first attribute type by a specified value of a second attribute type different from the first attribute type, wherein at least two of the representative value determination rules are associated with at least the first attribute type;
identifying a set of attribute values, stored in the plurality of distinct identity records of the selected entity, that correspond to the first attribute type qualified by the specified value of the second attribute type;
determining whether the representative value determination rule applies to the entity, based on the identified set of attribute values;
upon determining that the representative value determination rule applies to the entity, selecting, from the identified set of attribute values, a respective, distinct, candidate value to represent the first attribute type qualified by the specified value of the second attribute type, wherein each of the at least two representative value determination rules assigns a respective, distinct confidence level to the candidate value selected by the respective representative value determination rule;
determining, by operation of one or more computer processors, the candidate value having the highest confidence level among the selected candidate values;
designating only the candidate value as a sole value representing the first attribute type qualified by the specified value of the second attribute type, including storing an indication of the sole value as representing the first attribute type; and
retaining each of the identified set of attribute values, represented by the sole value, as part of the plurality of distinct identity records of the selected entity without merging or replacing the identified set of attribute values in the plurality of distinct identity records in the selected entity; and
generating an entity display summary including the sole value representing the first attribute type qualified by the specified value of the second attribute type, whereafter the entity display summary is output.
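The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the record layout, the `Rule` fields, the two criteria (`most_frequent`, `most_recent`), the rule names, and the confidence numbers are all hypothetical, chosen only to show qualification by a second attribute type, per-rule candidate selection, selection of the highest-confidence candidate, and retention of the underlying records.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, str]  # hypothetical layout: attribute type -> attribute value


@dataclass(frozen=True)
class Rule:
    name: str             # rule name
    first_type: str       # attribute type to represent, e.g. "address"
    qualifier_type: str   # second attribute type, e.g. "address_kind"
    qualifier_value: str  # specified value of the second type, e.g. "home"
    criterion: Callable[[List[Record], str], Optional[str]]
    confidence: int       # assumed distinct per rule


def most_frequent(records: List[Record], first_type: str) -> Optional[str]:
    """Pick the value seen most often; the rule does not apply without a repeat."""
    value, count = Counter(r[first_type] for r in records).most_common(1)[0]
    return value if count > 1 else None


def most_recent(records: List[Record], first_type: str) -> Optional[str]:
    """Pick the value from the latest record (ISO dates in an assumed 'updated' field)."""
    dated = [r for r in records if "updated" in r]
    return max(dated, key=lambda r: r["updated"])[first_type] if dated else None


def best_value(records: List[Record], rules: List[Rule]) -> Optional[Tuple[str, str]]:
    """Return (sole value, winning rule name) for the qualified attribute, or None."""
    candidates = []
    for rule in rules:
        # Values of the first type, qualified by the specified value of the second type.
        qualified = [r for r in records
                     if rule.first_type in r
                     and r.get(rule.qualifier_type) == rule.qualifier_value]
        if not qualified:
            continue  # rule does not apply to this entity
        value = rule.criterion(qualified, rule.first_type)
        if value is not None:
            candidates.append((rule.confidence, value, rule.name))
    if not candidates:
        return None
    _, value, name = max(candidates, key=lambda c: c[0])
    return value, name


# Identity records already resolved to one entity (made-up data).
records = [
    {"name": "Pat Doe", "address": "12 Oak St", "address_kind": "home", "updated": "2009-03-01"},
    {"name": "Patricia Doe", "address": "12 Oak St", "address_kind": "home", "updated": "2010-06-15"},
    {"name": "P. Doe", "address": "98 Elm Ave", "address_kind": "home", "updated": "2011-01-20"},
    {"name": "Pat Doe", "address": "500 Main St", "address_kind": "work", "updated": "2012-02-02"},
]
rules = [
    Rule("MOST_FREQUENT_HOME", "address", "address_kind", "home", most_frequent, 60),
    Rule("MOST_RECENT_HOME", "address", "address_kind", "home", most_recent, 80),
]

result = best_value(records, rules)
summary = {"home address": result[0]} if result else {}
print(summary)
```

In this sketch both rules apply and each proposes a different home address; the candidate from the higher-confidence rule is the one placed in the display summary. The `records` list is only read, never rewritten, which mirrors the claim's requirement that the underlying attribute values are retained without merging or replacement.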
Abstract
Primary value determination rules may be used by an entity resolution system to select a “best” or “primary” value of an attribute from a plurality of attribute values. For example, the “best” name, address, phone number, etc. to use in presenting a summary of information about an entity may be determined. Further, the primary value determination rules may each be configured to assign a confidence score to the “best” values selected for a given entity. Doing so allows a selection of a “best” value for a given attribute made by one rule to be overridden by a selection of another “best” value made by another rule for that same attribute.
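The override behavior described in the abstract can be pictured as keeping a running "best" value per attribute while rules are evaluated in turn. The following is a rough sketch under assumed conventions: the tuple layout, rule names, and scores are hypothetical, and keeping the earlier selection on a tie is an assumption the abstract does not address.

```python
from typing import Dict, Iterable, Tuple

# (rule name, attribute, selected value, confidence score) -- hypothetical layout
Selection = Tuple[str, str, str, int]


def apply_selections(selections: Iterable[Selection]) -> Dict[str, Tuple[str, int, str]]:
    """Keep, per attribute, the selection with the highest confidence score."""
    best: Dict[str, Tuple[str, int, str]] = {}
    for rule_name, attribute, value, score in selections:
        current = best.get(attribute)
        # A later rule overrides an earlier one only with a strictly higher score.
        if current is None or score > current[1]:
            best[attribute] = (value, score, rule_name)
    return best


selections = [
    ("LONGEST_NAME", "name", "Patricia Doe", 50),
    ("MOST_COMMON_NAME", "name", "Pat Doe", 70),  # overrides LONGEST_NAME
    ("ONLY_PHONE", "phone", "555-0100", 90),
]
best = apply_selections(selections)
print(best["name"][0], best["phone"][0])
```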
21 Claims
16. A computer program product to reconcile distinct candidate values in an entity resolution system storing distinct identity records that resolve to a plurality of entities, the computer program product comprising:
a non-transitory computer-readable medium having computer-usable program code embodied therewith, the computer-usable program code executable by one or more computer processors to:
receive a selection of an entity from the plurality of entities stored in the entity resolution system, wherein each entity is associated with a respective plurality of distinct identity records that previously resolved to the respective entity based on one or more entity resolution rules, wherein each identity record includes two or more attribute types and attribute values, wherein the attribute values in each identity record include information corresponding to one individual, wherein a relationship is identified between at least two of the plurality of entities;
subsequent to resolving the plurality of distinct identity records to the selected entity based on the one or more entity resolution rules, evaluate the plurality of distinct identity records of the selected entity against a plurality of representative value determination rules distinct from the one or more entity resolution rules, wherein each representative value determination rule has a respective rule name and specifies to evaluate a respective attribute type using a distinct criterion for selecting a representative value, wherein the evaluation for each representative value determination rule comprises:
identifying two or more attribute types associated with the respective representative value determination rule, the representative value determination rule specifying to qualify a first attribute type by a specified value of a second attribute type different from the first attribute type, wherein at least two of the representative value determination rules are associated with at least the first attribute type;
identifying a set of attribute values, stored in the plurality of distinct identity records of the selected entity, that correspond to the first attribute type qualified by the specified value of the second attribute type;
determining whether the representative value determination rule applies to the entity, based on the identified set of attribute values;
upon determining that the representative value determination rule applies to the entity, selecting, from the identified set of attribute values, a respective, distinct, candidate value to represent the first attribute type qualified by the specified value of the second attribute type, wherein each of the at least two representative value determination rules assigns a respective, distinct confidence level to the candidate value selected by the respective representative value determination rule;
determining the candidate value having the highest confidence level among the selected candidate values;
designating only the candidate value as a sole value representing the first attribute type qualified by the specified value of the second attribute type, including storing an indication of the sole value as representing the first attribute type; and
retaining each of the identified set of attribute values, represented by the sole value, as part of the plurality of distinct identity records of the selected entity without merging or replacing the identified set of attribute values in the plurality of distinct identity records in the selected entity; and
generate an entity display summary including the sole value representing the first attribute type qualified by the specified value of the second attribute type, whereafter the entity display summary is output.
19. A system to reconcile distinct candidate values in an entity resolution system storing distinct identity records that resolve to a plurality of entities, the system comprising:
one or more computer processors; and
a memory containing a program which, when executed by the one or more computer processors, is configured to perform an operation comprising:
receiving a selection of an entity from the plurality of entities stored in the entity resolution system, wherein each entity is associated with a respective plurality of distinct identity records that previously resolved to the respective entity based on one or more entity resolution rules, wherein each identity record includes two or more attribute types and attribute values, wherein the attribute values in each identity record include information corresponding to one individual, wherein a relationship is identified between at least two of the plurality of entities;
subsequent to resolving the plurality of distinct identity records to the selected entity based on the one or more entity resolution rules, evaluating the plurality of distinct identity records of the selected entity against a plurality of representative value determination rules distinct from the one or more entity resolution rules, wherein each representative value determination rule has a respective rule name and specifies to evaluate a respective attribute type using a distinct criterion for selecting a representative value, wherein the evaluation for each representative value determination rule comprises:
identifying two or more attribute types associated with the representative value determination rule, the representative value determination rule specifying to qualify a first attribute type by a specified value of a second attribute type different from the first attribute type, wherein at least two of the representative value determination rules are associated with at least the first attribute type;
identifying a set of attribute values, stored in the plurality of distinct identity records of the selected entity, that correspond to the first attribute type qualified by the specified value of the second attribute type;
determining whether the representative value determination rule applies to the entity, based on the identified set of attribute values;
upon determining that the representative value determination rule applies to the entity, selecting, from the identified set of attribute values, a respective, distinct, candidate representative value to represent the first attribute type qualified by the specified value of the second attribute type, wherein each of the at least two representative value determination rules assigns a respective, distinct confidence level to the candidate value selected by the respective representative value determination rule;
determining the candidate value having the highest confidence level among the selected candidate values;
designating only the candidate value as a sole value representing the first attribute type qualified by the specified value of the second attribute type, including storing an indication of the sole value as representing the first attribute type; and
retaining each of the identified set of attribute values, represented by the sole value, as part of the plurality of distinct identity records of the selected entity without merging or replacing the identified set of attribute values in the plurality of distinct identity records in the selected entity; and
generating an entity display summary including the sole value representing the first attribute type qualified by the specified value of the second attribute type, whereafter the entity display summary is output.
Specification