Statistical measure and calibration of internally inconsistent search criteria where one or both of the search criteria and database is incomplete
Abstract
Disclosed is a system for, and method of, searching for and identifying an entity representation. Some embodiments permit search criteria that are internally inconsistent. Such internally inconsistent criteria may include, for example, a maiden last name and a married last name. Certain embodiments account for such criteria in an intelligent manner and identify matching entity representations with a known confidence level of accuracy.
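As an illustration of what "internally inconsistent" criteria can look like in practice, the following minimal Python sketch groups search criteria by field and reports any field that was given more than one distinct value. The function name, the field names, and the sample names are hypothetical and are not taken from the patent; the upper-casing step is likewise an assumed normalization.

```python
from collections import defaultdict

def inconsistent_fields(criteria):
    """Return fields that were given more than one distinct value.

    `criteria` is a list of (field, value) pairs; the same field may repeat.
    """
    values_by_field = defaultdict(list)
    for field, value in criteria:
        normalized = value.strip().upper()  # assumed normalization
        if normalized not in values_by_field[field]:
            values_by_field[field].append(normalized)
    return {f: v for f, v in values_by_field.items() if len(v) > 1}

# Hypothetical query: one person searched under two last names.
criteria = [
    ("first_name", "Mary"),
    ("last_name", "Smith"),  # e.g. a maiden name
    ("last_name", "Jones"),  # e.g. a married name
]
conflicts = inconsistent_fields(criteria)
print(conflicts)  # {'last_name': ['SMITH', 'JONES']}
```

A conventional exact-match search would typically treat the two last names as contradictory; the claims instead keep both values and let each contribute to the match.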
20 Claims
1. A method of identifying, using an internally inconsistent search criteria, an entity representation in an electronic universal database that corresponds to an entity representation in an electronic foreign database, each database comprising a plurality of entity representations each corresponding to a definitive identifier, each entity representation comprising a plurality of linked records, each record comprising a plurality of fields, each field capable of containing a field value, each field value associated with a field value weight indicating the likelihood that a record or entity representation chosen at random contains the associated field value, the method comprising:
receiving a plurality of search criteria field values, each search criteria field value associated with a field, wherein at least two search criteria field values are associated with a same field, wherein the at least two search criteria field values are not identical;
receiving at least one match template specifying an ordered plurality of fields;
forming and electronically storing, for each match template, a table comprising field value weights for a plurality of records having matches between a search criteria field value and a field value appearing in a record in the universal database, wherein the at least two search criteria field values match field values in two records corresponding to a same entity representation, and wherein at least one table comprises an inclusion field comprising a sum of at least a portion of field value weights for the at least two search criteria field values that match field values in records corresponding to the same entity representation;
merging the tables according to entity representation, resulting in a merged table, wherein the at least two search criteria field values that are not identical are grouped in the merged table and associated with the same definitive identifier corresponding to the entity representation;
summing field value weights according to entity representation in the merged table, resulting in a plurality of summed weights, one summed weight for each entity representation;
ranking entity representations in the merged table according to the plurality of summed weights;
determining a highest ranked entity representation;
calculating a confidence level reflecting a likelihood that the highest ranked entity representation corresponds to the plurality of query field values; and
outputting, when the confidence level exceeds a predetermined threshold, an identifier sufficient to identify the entity representation for the highest ranked entity representation.
Dependent claims: 2, 3, 4, 5, 6
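The steps recited in claim 1 can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the patented implementation: the records and names are hypothetical, the weight formula (negative log2 of a value's share of records) is an assumed way to make rarer values count for more, and the confidence formula (the top entity's share of 2**score across all candidates) is likewise an assumption, since the claim does not specify either calculation.

```python
import math
from collections import Counter, defaultdict

# Toy "universal database": (definitive identifier, record). All data hypothetical.
RECORDS = [
    ("E1", {"first_name": "MARY", "last_name": "SMITH", "city": "DAYTON"}),
    ("E1", {"first_name": "MARY", "last_name": "JONES", "city": "DAYTON"}),
    ("E2", {"first_name": "MARY", "last_name": "SMITH", "city": "BOISE"}),
    ("E3", {"first_name": "JOHN", "last_name": "JONES", "city": "DAYTON"}),
]

def field_value_weights(records):
    """Assumed weighting: -log2 of the share of records holding the value."""
    counts = Counter((f, v) for _, rec in records for f, v in rec.items())
    return {key: -math.log2(n / len(records)) for key, n in counts.items()}

def build_table(records, weights, criteria, template):
    """One table per match template: rows of (entity, field, value, weight)."""
    rows = []
    for entity, rec in records:
        for field in template:
            value = rec.get(field)
            if value in criteria.get(field, ()):
                rows.append((entity, field, value, weights[(field, value)]))
    return rows

def merge_and_sum(tables):
    """Merge tables by entity, counting each matched (field, value) once."""
    merged = defaultdict(dict)
    for table in tables:
        for entity, field, value, weight in table:
            merged[entity][(field, value)] = weight
    return {entity: sum(cells.values()) for entity, cells in merged.items()}

def identify(records, criteria, templates, threshold=0.9):
    """Return (identifier or None, confidence, ranked list of (entity, score))."""
    weights = field_value_weights(records)
    tables = [build_table(records, weights, criteria, t) for t in templates]
    scores = merge_and_sum(tables)
    if not scores:
        return None, 0.0, []
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    # Assumed confidence: the top entity's share of 2**score over all candidates.
    total = sum(2 ** score for _, score in ranked)
    confidence = 2 ** ranked[0][1] / total
    best = ranked[0][0] if confidence > threshold else None
    return best, confidence, ranked

# Internally inconsistent criteria: two different values for last_name.
criteria = {"first_name": {"MARY"}, "last_name": {"SMITH", "JONES"}}
templates = [("first_name", "last_name"), ("last_name", "city")]
best, confidence, ranked = identify(RECORDS, criteria, templates, threshold=0.5)
print(best, round(confidence, 3), [entity for entity, _ in ranked])
```

In this sketch entity E1 ranks first because both last names match records linked to it, so both weights are added to its sum; this loosely mirrors the role of the inclusion field, although the claim leaves the exact portion of each weight open. The identifier is output only because the assumed confidence exceeds the chosen threshold of 0.5; with the default threshold of 0.9 the function would return no identifier.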
7. A method of identifying, using an internally inconsistent search criteria, an entity representation in an electronic universal database that corresponds to an entity representation in an electronic foreign database, each database comprising a plurality of entity representations each corresponding to a definitive identifier, each entity representation comprising a plurality of linked records, each record comprising a plurality of fields, each field capable of containing a field value, each field value associated with a field value weight indicating the likelihood that a record or entity representation chosen at random contains the associated field value, the method comprising:
receiving a plurality of search criteria field values, each search criteria field value associated with a field, wherein at least two search criteria field values are associated with a same field, wherein the at least two search criteria field values are not identical;
forming and electronically storing a table comprising field value weights for a plurality of records having matches between a search criteria field value and a field value appearing in a record in the universal database, wherein the at least two search criteria field values match field values in two records corresponding to a same entity representation, and wherein at least one table comprises an inclusion field comprising a sum of at least a portion of field value weights for the at least two search criteria field values that match field values in records corresponding to the same entity representation;
merging the tables according to entity representation, resulting in a merged table, wherein the at least two search criteria field values that are not identical are grouped in the merged table and associated with the same definitive identifier corresponding to the entity representation;
summing field value weights according to entity representation in the merged table, resulting in a plurality of summed weights, one summed weight for each entity representation;
ranking entity representations in the merged table according to the plurality of summed weights;
determining a highest ranked entity representation;
calculating a confidence level reflecting a likelihood that the highest ranked entity representation corresponds to the plurality of query field values; and
outputting, when the confidence level exceeds a predetermined threshold, an identifier sufficient to identify the entity representation for the highest ranked entity representation.
Dependent claims: 8, 9, 10
11. A system for identifying, using an internally inconsistent search criteria, an entity representation in an electronic universal database that corresponds to an entity representation in an electronic foreign database, each database comprising a plurality of entity representations each corresponding to a definitive identifier, each entity representation comprising a plurality of linked records, each record comprising a plurality of fields, each field capable of containing a field value, each field value associated with a field value weight indicating the likelihood that a record or entity representation chosen at random contains the associated field value, the system comprising:
an electronic universal database comprising a plurality of electronically stored entity representations, each entity representation comprising a plurality of linked records, each record comprising a plurality of fields, each field capable of containing a field value, each field value associated with a field value weight;
an electronic memory storing a plurality of search criteria field values, each search criteria field value associated with a field, wherein at least two search criteria field values are associated with a same field, wherein the at least two search criteria field values are not identical;
an electronic memory storing at least one match template specifying an ordered plurality of fields;
a processor programmed to form and electronically store, for each match template, a table comprising field value weights for a plurality of records having matches between a search criteria field value and a field value appearing in a record in the universal database, wherein the at least two search criteria field values match field values in two records corresponding to a same entity representation, and wherein at least one table comprises an inclusion field comprising a sum of at least a portion of field value weights for the at least two search criteria field values that match field values in records corresponding to the same entity representation;
a processor programmed to merge and electronically store the tables according to entity representation, resulting in a stored merged table, wherein the at least two search criteria field values that are not identical are grouped in the merged table and associated with the same definitive identifier corresponding to the entity representation;
a processor programmed to sum and electronically store field value weights according to entity representation in the merged table, resulting in a plurality of stored summed weights, one summed weight for each entity representation;
a processor programmed to rank entity representations in the merged table according to the plurality of summed weights;
a processor programmed to determine a highest ranked entity representation;
a processor programmed to calculate a confidence level reflecting a likelihood that the highest ranked entity representation corresponds to the plurality of query field values; and
an output configured to output, when the confidence level exceeds a predetermined threshold, an identifier sufficient to identify the entity representation for the highest ranked entity representation.
Dependent claims: 12, 13, 14, 15, 16
17. A system for identifying, using an internally inconsistent search criteria, an entity representation in an electronic universal database that corresponds to an entity representation in an electronic foreign database, each database comprising a plurality of entity representations each corresponding to a definitive identifier, each entity representation comprising a plurality of linked records, each record comprising a plurality of fields, each field capable of containing a field value, each field value associated with a field value weight indicating the likelihood that a record or entity representation chosen at random contains the associated field value, the system comprising:
an electronic universal database comprising a plurality of electronically stored entity representations, each entity representation comprising a plurality of linked records, each record comprising a plurality of fields, each field capable of containing a field value, each field value associated with a field value weight;
an electronic memory storing a plurality of search criteria field values, each search criteria field value associated with a field, wherein at least two search criteria field values are associated with a same field, wherein the at least two search criteria field values are not identical;
a processor programmed to form and electronically store a table comprising field value weights for a plurality of records having matches between a search criteria field value and a field value appearing in a record in the universal database, wherein the at least two search criteria field values match field values in two records corresponding to a same entity representation, and wherein at least one table comprises an inclusion field comprising a sum of at least a portion of field value weights for the at least two search criteria field values that match field values in records corresponding to the same entity representation;
a processor programmed to merge and electronically store the tables according to entity representation, resulting in a stored merged table, wherein the at least two search criteria field values that are not identical are grouped in the merged table and associated with the same definitive identifier corresponding to the entity representation;
a processor programmed to sum and electronically store field value weights according to entity representation in the merged table, resulting in a plurality of stored summed weights, one summed weight for each entity representation;
a processor programmed to rank entity representations in the merged table according to the plurality of summed weights;
a processor programmed to determine a highest ranked entity representation;
a processor programmed to calculate a confidence level reflecting a likelihood that the highest ranked entity representation corresponds to the plurality of query field values; and
an output configured to output, when the confidence level exceeds a predetermined threshold, an identifier sufficient to identify the entity representation for the highest ranked entity representation.
Dependent claims: 18, 19, 20
Specification