Statistical measure and calibration of reflexive, symmetric and transitive fuzzy search criteria where one or both of the search criteria and database is incomplete
First Claim
1. A method of identifying, using a search criteria, an entity representation in an electronic universal database that corresponds to an entity representation in an electronic foreign database, each database comprising a plurality of entity representations, each entity representation comprising a plurality of linked records, each record comprising a plurality of fields, each field capable of containing a field value, each field value associated with a field value weight which is calculated based on the data value stored in the field of the record, wherein the search criteria comprises at least one field value that is not identical to a field value in a record in an entity representation that is identified by the method, the method comprising:
selecting a field;
applying a symmetric, reflexive and transitive function to each field value in the selected field of each of a plurality of records, whereby a plurality of field value codes are generated, and whereby applying the symmetric, reflexive and transitive function to each field value in the selected field of each of a plurality of records in the database defines a partition of the plurality of records;
populating a field of each of the plurality of records with a field value code;
computing a field value weight for each field value code;
distributing, for each record, a field value weight associated with a field value in the selected field, among the field value in the selected field and a field value code, wherein the distributing comprises, for each record of the plurality of records, calculating a difference between a field value weight associated with a field value in the selected field and a field value weight for a field value code;
receiving a plurality of search criteria field values;
determining a highest ranked entity representation according to summed field value weights for field values matching the plurality of search criteria field values;
calculating a confidence level reflecting a likelihood that the highest ranked entity representation corresponds to the plurality of search criteria field values; and
outputting, if the confidence level exceeds a predetermined threshold, an identifier for the highest ranked entity representation.
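The code-generation and weight-distribution steps recited above can be sketched as follows. This is an illustrative sketch only, not the patented implementation. The code function `first_letter_code`, the helper `distribute_weights`, and the weight formula (negative base-2 logarithm of relative frequency) are all assumptions introduced here; the claim says only that a weight is calculated from the stored data value. Equality of codes is reflexive, symmetric and transitive, which is why grouping records by code yields a partition.

```python
import math
from collections import Counter


def first_letter_code(value: str) -> str:
    """Hypothetical code function: two values are 'equivalent' when their codes are equal."""
    return value.strip().lower()[:1]


def distribute_weights(values, code_fn=first_letter_code):
    """For each record's field value, split its weight between a code and a residual.

    Assumed weight formula (not from the source): weight = -log2(count / total).
    """
    n = len(values)
    value_counts = Counter(values)
    code_counts = Counter(code_fn(v) for v in values)

    rows = []
    for v in values:
        c = code_fn(v)
        w_value = -math.log2(value_counts[v] / n)  # weight of the exact field value
        w_code = -math.log2(code_counts[c] / n)    # weight of the coarser field value code
        rows.append({
            "value": v,
            "code": c,
            "code_weight": w_code,
            # The difference recited in the claim: what an exact match adds
            # on top of a code-only (fuzzy) match.
            "residual_weight": w_value - w_code,
        })
    return rows


if __name__ == "__main__":
    for row in distribute_weights(["Smith", "Smyth", "Jones", "Smith"]):
        print(row)
```

Under these assumptions a code covers at least as many records as any single value it contains, so the residual is never negative, and a code-only match earns only the code weight while an exact match earns the code weight plus the residual.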
Abstract
Disclosed is a system for, and method of, searching for and identifying an entity representation. Some embodiments utilize a reflexive, symmetric and transitive function to allow for non-identical matches between field values. The function may be used to generate field value codes, which are associated with a portion of a field value weight for the original field value. In such embodiments, the field value weight for the original field values may be distributed among the original field value and the associated field value code.
20 Claims
1. A method of identifying, using a search criteria, an entity representation in an electronic universal database that corresponds to an entity representation in an electronic foreign database, each database comprising a plurality of entity representations, each entity representation comprising a plurality of linked records, each record comprising a plurality of fields, each field capable of containing a field value, each field value associated with a field value weight which is calculated based on the data value stored in the field of the record, wherein the search criteria comprises at least one field value that is not identical to a field value in a record in an entity representation that is identified by the method, the method comprising:
selecting a field;
applying a symmetric, reflexive and transitive function to each field value in the selected field of each of a plurality of records, whereby a plurality of field value codes are generated, and whereby applying the symmetric, reflexive and transitive function to each field value in the selected field of each of a plurality of records in the database defines a partition of the plurality of records;
populating a field of each of the plurality of records with a field value code;
computing a field value weight for each field value code;
distributing, for each record, a field value weight associated with a field value in the selected field, among the field value in the selected field and a field value code, wherein the distributing comprises, for each record of the plurality of records, calculating a difference between a field value weight associated with a field value in the selected field and a field value weight for a field value code;
receiving a plurality of search criteria field values;
determining a highest ranked entity representation according to summed field value weights for field values matching the plurality of search criteria field values;
calculating a confidence level reflecting a likelihood that the highest ranked entity representation corresponds to the plurality of search criteria field values; and
outputting, if the confidence level exceeds a predetermined threshold, an identifier for the highest ranked entity representation.
View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
11. A system of identifying, using a search criteria, an entity representation in an electronic universal database that corresponds to an entity representation in an electronic foreign database, each database comprising a plurality of entity representations, each entity representation comprising a plurality of linked records, each record comprising a plurality of fields, each field capable of containing a field value, each field value associated with a field value weight which is calculated based on the data value stored in the field of the record, wherein the search criteria comprises at least one field value that is not identical to a field value in a record in an entity representation that is identified, the system comprising:
an electronic database comprising a plurality of entity representations, each entity representation comprising a plurality of linked records, each record comprising a plurality of fields, each field capable of containing a field value, each field value associated with a field value weight;
a processor programmed to apply a symmetric, reflexive and transitive function to each field value in a selected field of each of a plurality of records, whereby a plurality of field value codes are generated and electronically stored, and whereby applying the symmetric, reflexive and transitive function to each field value in the selected field of each of a plurality of records in the database defines a partition of the plurality of records;
a processor programmed to store, in a field of each of the plurality of records, a field value code;
a processor programmed to compute a field value weight for each field value code;
a processor programmed to distribute electronic storage of, for each record, a field value weight associated with a field value in the selected field, among an electronic storage of a field value weight for the field value in the selected field and an electronic storage of the field value weight of the field value, wherein the distributing comprises, for each record of the plurality of records, calculating a difference between a field value weight associated with a field value in the selected field and a field value weight for a field value code;
an electronic memory storing a plurality of search criteria field values;
a processor programmed to determine a highest ranked entity representation according to summed field value weights for field values matching the plurality of search criteria field values;
a processor programmed to calculate a confidence level reflecting a likelihood that the highest ranked entity representation corresponds to the plurality of search criteria field values; and
an output configured to output, if the confidence level exceeds a predetermined threshold, an identifier for the highest ranked entity representation.
View Dependent Claims (12, 13, 14, 15, 16, 17, 18, 19, 20)
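The ranking, confidence and threshold elements recited in the claims can be illustrated with a short sketch. Everything specific here is an assumption made for illustration: the record layout, the names `rank_entities` and `identify`, the rule of counting each matching field once per entity, and above all the confidence formula, which converts the score gap between the best and second-best entity into a value between 0.5 and 1. The claims do not say how the confidence level is computed.

```python
from collections import defaultdict


def rank_entities(records, criteria):
    """Sum, per entity, the weights of fields that match the search criteria.

    records: list of dicts with keys "entity", "fields" and "weights" (hypothetical layout).
    criteria: dict mapping field name -> wanted value (a raw value or a field value code).
    """
    best = defaultdict(dict)  # entity -> field -> best matching weight seen so far
    for rec in records:
        for field, wanted in criteria.items():
            if rec["fields"].get(field) == wanted:
                weight = rec["weights"][field]
                per_field = best[rec["entity"]]
                # Assumption: a field counts once per entity, at its highest weight.
                per_field[field] = max(per_field.get(field, 0.0), weight)
    scores = {entity: sum(fields.values()) for entity, fields in best.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def identify(records, criteria, threshold=0.9):
    """Return the top entity's identifier only if the assumed confidence exceeds threshold."""
    ranked = rank_entities(records, criteria)
    if not ranked:
        return None
    top_id, top_score = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    # Assumed confidence: a tie gives 0.5; each extra bit of margin halves the doubt.
    confidence = 1.0 / (1.0 + 2.0 ** -(top_score - runner_up))
    return top_id if confidence > threshold else None


if __name__ == "__main__":
    records = [
        {"entity": "E1", "fields": {"name_code": "s", "city": "boca"},
         "weights": {"name_code": 0.4, "city": 5.0}},
        {"entity": "E2", "fields": {"name_code": "s", "city": "miami"},
         "weights": {"name_code": 0.4, "city": 3.0}},
    ]
    print(identify(records, {"name_code": "s", "city": "boca"}))
    print(identify(records, {"name_code": "s"}))
```

In this sketch a search that matches only the shared code leaves the two entities tied, so no identifier is output; adding a distinguishing field value widens the margin enough to clear the threshold.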
Specification