Statistical measure and calibration of search criteria where one or both of the search criteria and database is incomplete
First Claim
1. A method of identifying an entity representation in an electronic universal database that corresponds to an entity representation in an electronic foreign database, each database comprising a plurality of entity representations, each entity representation comprising a plurality of linked records, each record comprising a plurality of fields, each field capable of containing a field value, each field value associated with a field value weight, the method comprising:
electronically storing a plurality of field tables, each field table corresponding to a particular field, each field table comprising field value weights for each unique pair consisting of an arbitrary entity representation from the universal database and a field value appearing in the particular field of a record in the arbitrary entity representation from the universal database, wherein each field value weight comprises a logarithm of a probability that an arbitrary entity representation in the universal database comprises a corresponding field value in a field of a record in the arbitrary entity representation, wherein each probability comprise a ratio of entity representation in the universal database that contain a corresponding field value to a total number of entity representations in the universal database;
receiving a plurality of search criteria field values identifying an entity representation in the foreign database;
performing a fetch operation from an associated field table for each search criterion, by for each search criteria field value, fetching a field value weight from an the associated field table corresponding to the search criteria field value;
summing results of the step of fetching field value weights for each field from the fetch operation according to entity representations from the universal database, resulting in a plurality of summed weights, one summed weight for each of a plurality of entity representations from the universal database;
ranking entity representations according to the plurality of summed weights;
determining a highest ranked entity representation;
calculating a confidence level reflecting a likelihood that the highest ranked entity representation corresponds to the entity representation identified by the search criteria field values, wherein the calculation is based on the summed field value weight; and
outputting, if the confidence level exceeds a predetermined threshold, wherein the threshold comprises a logarithm of a term comprising a confidence level, an identifier for the highest ranked entity representation.
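The steps recited in claim 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation. The function names (`build_field_tables`, `rank_entities`, `identify`), the dictionary layout, the negated base-2 logarithm (chosen so that rarer values carry larger weights) and the threshold formula log2(N / (1 - c)) are all hypothetical choices made here. The claim itself only requires that each weight comprise a logarithm of a probability and that the threshold comprise a logarithm of a term comprising a confidence level.

```python
import math
from collections import defaultdict


def build_field_tables(entities):
    """Build one table per field from {entity_id: [record, ...]}.

    Each record is a {field: value} dict. The result maps
    field -> value -> (weight, ids of entities holding that value).
    ASSUMPTION: weight = -log2(holders / total); the sign and the base
    are illustrative choices, not taken from the claim.
    """
    total = len(entities)
    holders = defaultdict(lambda: defaultdict(set))
    for entity_id, records in entities.items():
        for record in records:
            for field, value in record.items():
                if value is not None:
                    holders[field][value].add(entity_id)
    return {
        field: {
            value: (-math.log2(len(ids) / total), ids)
            for value, ids in by_value.items()
        }
        for field, by_value in holders.items()
    }


def rank_entities(tables, criteria):
    """Fetch one weight per search criterion, sum per entity, rank descending."""
    summed = defaultdict(float)
    for field, value in criteria.items():
        weight, ids = tables.get(field, {}).get(value, (0.0, ()))
        for entity_id in ids:
            summed[entity_id] += weight
    return sorted(summed.items(), key=lambda item: item[1], reverse=True)


def identify(tables, criteria, total, confidence=0.99):
    """Return the top-ranked entity id, or None if its score is too low.

    ASSUMPTION: the top summed weight is used directly as the confidence
    measure and compared with the hypothetical threshold
    log2(total / (1 - confidence)).
    """
    ranked = rank_entities(tables, criteria)
    if not ranked:
        return None
    best_id, best_score = ranked[0]
    threshold = math.log2(total / (1.0 - confidence))
    return best_id if best_score >= threshold else None


# Hypothetical toy data: four entity representations, some with linked records.
entities = {
    "E1": [{"first": "ann", "city": "boca"}, {"first": "ann", "city": "miami"}],
    "E2": [{"first": "bob", "city": "miami"}],
    "E3": [{"first": "cat", "city": "tampa"}],
    "E4": [{"first": "dan", "city": "miami"}],
}
tables = build_field_tables(entities)
match = identify(tables, {"first": "ann", "city": "boca"}, len(entities), confidence=0.5)
```

In this toy data, "ann" and "boca" each appear in one of four entity representations, so each contributes a weight of 2 and E1 scores 4, which clears the threshold log2(4 / 0.5) = 3. A common value such as "miami" alone contributes far less and would not produce an output.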
Abstract
Disclosed is a system for, and method of, identifying an entity representation. In some embodiments, search criteria are used to identify an entity representation in a universal database, and this identification is then used to identify a corresponding entity representation in a foreign database. Certain embodiments provide assurance, with a known probability of error, that the entity representation identified in the universal database is correct.
12 Claims
1. A method of identifying an entity representation in an electronic universal database that corresponds to an entity representation in an electronic foreign database, each database comprising a plurality of entity representations, each entity representation comprising a plurality of linked records, each record comprising a plurality of fields, each field capable of containing a field value, each field value associated with a field value weight, the method comprising:
electronically storing a plurality of field tables, each field table corresponding to a particular field, each field table comprising field value weights for each unique pair consisting of an arbitrary entity representation from the universal database and a field value appearing in the particular field of a record in the arbitrary entity representation from the universal database, wherein each field value weight comprises a logarithm of a probability that an arbitrary entity representation in the universal database comprises a corresponding field value in a field of a record in the arbitrary entity representation, wherein each probability comprise a ratio of entity representation in the universal database that contain a corresponding field value to a total number of entity representations in the universal database;
receiving a plurality of search criteria field values identifying an entity representation in the foreign database;
performing a fetch operation from an associated field table for each search criterion, by for each search criteria field value, fetching a field value weight from an the associated field table corresponding to the search criteria field value;
summing results of the step of fetching field value weights for each field from the fetch operation according to entity representations from the universal database, resulting in a plurality of summed weights, one summed weight for each of a plurality of entity representations from the universal database;
ranking entity representations according to the plurality of summed weights;
determining a highest ranked entity representation;
calculating a confidence level reflecting a likelihood that the highest ranked entity representation corresponds to the entity representation identified by the search criteria field values, wherein the calculation is based on the summed field value weight; and
outputting, if the confidence level exceeds a predetermined threshold, wherein the threshold comprises a logarithm of a term comprising a confidence level, an identifier for the highest ranked entity representation.
Dependent claims: 2, 3, 4, 5, 6
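As a worked illustration of the weighting and threshold recited in claim 1, the following sketch puts numbers on the quantities involved. The symbols N, n_{f,v}, w, S, c and T are introduced here for illustration only. The negated base-2 logarithm and the specific threshold form are assumptions, as is the independence of field values used to motivate the threshold. The claim itself states only that a weight comprises a logarithm of a probability and that the threshold comprises a logarithm of a term comprising a confidence level.

```latex
% N: total number of entity representations in the universal database
% n_{f,v}: number of entity representations containing value v in field f
\[
  w(f,v) = -\log_2 \frac{n_{f,v}}{N},
  \qquad
  S(e) = \sum_{i} w(f_i, v_i)
  \quad \text{(over search criteria } (f_i, v_i) \text{ that } e \text{ contains)}
\]
% Assumed threshold: if field values were independent, the expected number of
% entity representations matching every criterion by chance would be
% N \cdot 2^{-S}; requiring this to be at most 1 - c gives
\[
  N \cdot 2^{-S} \le 1 - c
  \;\Longleftrightarrow\;
  S \ge T = \log_2 \frac{N}{1 - c}.
\]
% Example with N = 10^6 and c = 0.99:
\[
  T = \log_2 10^{8} \approx 26.58
\]
% Two criteria held by 1000 and 500 entity representations:
\[
  S = \log_2 10^{3} + \log_2 (2 \times 10^{3}) \approx 9.97 + 10.97 = 20.93 < T
\]
% Adding a third criterion held by 100 entity representations:
\[
  S \approx 20.93 + \log_2 10^{4} \approx 20.93 + 13.29 = 34.22 > T
\]
```

Under these assumptions, two moderately rare values are not enough to output an identifier at 99% confidence in a database of one million entity representations, while a third rare value is.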
7. A system for identifying an entity representation in an electronic universal database that corresponds to an entity representation in an electronic foreign database, each database comprising a plurality of entity representations, each entity representation comprising a plurality of linked records, each record comprising a plurality of fields, each field capable of containing a field value, each field value associated with a field value weight, the system comprising:
an electronic universal database comprising a plurality of electronically stored entity representations, each entity representation comprising a plurality of linked records, each record comprising a plurality of fields, each field capable of containing a field value, each field value associated with a field value weight;
a processor programmed to form and store a plurality of field tables, each field table corresponding to a particular field, each field table comprising field value weights for each unique pair consisting of an arbitrary entity representation from the universal database and a field value appearing in the particular field of a record in the arbitrary entity representation from the universal database, wherein each field value weight comprises a logarithm of a probability that an arbitrary entity representation in the universal database comprises a corresponding field value in a field of a record in the arbitrary entity representation, wherein each probability comprises a ratio of entity representations in the universal database that contain a corresponding field value to a total number of entity representations in the universal database;
an electronic memory storing a plurality of search criteria field values identifying an entity representation in the foreign database;
a processor programmed to, for each search criteria field value, perform a fetch operation from an associated field table for each search criteria, fetch a field value weight from an the associated field table corresponding to the search criteria field value;
a processor programmed to sum the fetched weights field value weights for each field from the fetch operation according to entity representations from the universal database, resulting in a stored plurality of summed weights, one summed weight for each of a plurality of entity representations from the universal database;
a processor configured to rank entity representations according to the plurality of summed weights;
a processor programmed to determine a highest ranked entity representation;
a processor programmed to calculate a confidence level reflecting a likelihood that the highest ranked entity representation corresponds to the entity representation identified by the search criteria field values, wherein the calculation is based on the summed field value weight; and
a processor programmed to output, if the confidence level exceeds a predetermined threshold, wherein the threshold comprises a logarithm of a term comprising a confidence level, an identifier for the highest ranked entity representation.
Dependent claims: 8, 9, 10, 11, 12
Specification