ADAPTIVE CLUSTERING OF RECORDS AND ENTITY REPRESENTATIONS
Abstract
Disclosed is a system for, and method of, determining whether records and entity representations should be linked. The system and method include assigning to each pair of entity references a match value reflecting the likelihood that the entity references are related. Based on the match values, each entity reference may then be associated with a preferred entity reference. Pairs of entity references that are mutually preferred may then be identified and linked. The process may be iterated to generate further links.
18 Claims
1. A computer implemented iterative process for generating entity representations by identifying and linking related records in a computer implemented database using a record matching formula, each record and entity representation electronically stored in the database, each record comprising a plurality of fields, each field capable of containing a field value, the process comprising:
assigning to each pair of records from a plurality of records in the database a match value using the record matching formula, the match value reflecting a likelihood that the pair of records is related, the match value computed by a programmed processor;
assigning, for each record from the plurality of records, at least one associated preferred record, wherein a match value assigned to a given record together with its associated preferred record is at least as great as a match value assigned to the record together with any other record in the plurality of records;
identifying mutually preferred pairs of records from the plurality of records, each mutually preferred pair of records consisting of a first record and a second record, the first record consisting of a preferred record associated with the second record and the second record consisting of a preferred record associated with the first record;
forming and storing a plurality of entity representations in the database, each entity representation of the plurality of entity representations comprising at least one linked pair of mutually preferred records; and
retrieving information from at least one record of a pair of mutually preferred records.
Dependent claims: 2, 3, 4, 5, 6, 7
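The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `match_value` formula (fraction of fields that agree), the sample records, and the choice to merge overlapping linked pairs into one entity with a union-find structure are all assumptions made for illustration. Ties are kept, so a record may have more than one preferred record, consistent with the claim's "at least one associated preferred record". Claim 1 as listed recites no minimum score, so none is applied here.

```python
from itertools import combinations

def match_value(a, b):
    """Hypothetical matching formula: share of fields with equal, non-empty values."""
    fields = set(a) | set(b)
    agree = sum(1 for f in fields if a.get(f) and a.get(f) == b.get(f))
    return agree / len(fields) if fields else 0.0

def preferred_records(records):
    """For each record index, the set of other records with the highest match value."""
    n = len(records)
    scores = {}
    for i, j in combinations(range(n), 2):
        scores[(i, j)] = scores[(j, i)] = match_value(records[i], records[j])
    preferred = {}
    for i in range(n):
        others = [j for j in range(n) if j != i]
        best = max(scores[(i, j)] for j in others)
        preferred[i] = {j for j in others if scores[(i, j)] == best}
    return preferred

def mutually_preferred_pairs(preferred):
    """Pairs (i, j), i < j, where each record is a preferred record of the other."""
    return sorted(
        (i, j)
        for i, prefs in preferred.items()
        for j in prefs
        if i < j and i in preferred[j]
    )

def form_entities(n, pairs):
    """Assumed design choice: linked pairs that share a record form one entity."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in pairs:
        parent[find(i)] = find(j)
    groups = {}
    for pair in pairs:
        for r in pair:
            groups.setdefault(find(r), set()).add(r)
    return sorted(sorted(g) for g in groups.values())

# Illustrative data only; field names are hypothetical.
records = [
    {"name": "J Smith", "dob": "1970-01-01", "city": "Miami"},
    {"name": "J Smith", "dob": "1970-01-01", "city": "Tampa"},
    {"name": "A Jones", "dob": "1985-05-05", "city": "Tampa"},
    {"name": "A Jones", "dob": "1985-05-05", "city": "Tampa"},
    {"name": "J Smith", "dob": "1971-02-02", "city": "Miami"},
]
pairs = mutually_preferred_pairs(preferred_records(records))
entities = form_entities(len(records), pairs)
# Final step of the claim: retrieve information from a record of a linked pair.
retrieved_name = records[pairs[0][0]]["name"]
```

With this sample data, record 0 ties between records 1 and 4, so `pairs` is `[(0, 1), (0, 4), (2, 3)]` and `entities` is `[[0, 1, 4], [2, 3]]`. Scoring every pair is quadratic in the number of records; the sketch makes no attempt to reduce that cost.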
8. A computer implemented iterative process for generating entity representations by identifying and linking related records in a computer implemented database using a record matching formula, each record and entity representation electronically stored in the database, each record comprising a plurality of fields, each field capable of containing a field value, the process comprising:
determining a mutually preferred pair of records consisting of a first record and a second record, wherein a match score of the first record and the second record as computed using the record matching formula is at least as great as a match score of the first record and any other record in the database, and wherein the match score of the first record and the second record as computed using the record matching formula is at least as great as a match score for the second record and any other record in the database;
forming a new entity representation in the database, the new entity representation comprising at least the first record and the second record;
determining a mutually preferred pair of records consisting of the first record and a third record, the third record from a different entity representation than the new entity representation, wherein a match score of the first record and the third record as computed using the record matching formula is at least as great as a match score of the first record and any other record not in the new entity representation, and wherein a match score of the first record and the third record as computed using the record matching formula is at least as great as a match score for the third record and any other record not in the different entity representation;
consolidating the new entity representation with the different entity representation by linking the new entity representation with the different entity representation resulting in a consolidated entity representation; and
retrieving information from the consolidated entity representation.
Dependent claims: 9
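The iteration recited in claim 8 can be sketched as repeated rounds in which each record looks for its best match outside its own entity representation, and entities joined by a mutually preferred pair are consolidated. The sketch below is an illustration under stated assumptions, not the patented method: the score table is made up, each record starts as its own singleton entity, and the `min_score` cutoff is a hypothetical stopping rule that the claim does not recite (without some cutoff, the rounds would eventually merge everything).

```python
def external_preferred(scores, entity_of, r):
    """Best-matching records outside r's entity (ties kept), plus that best score."""
    outside = [s for s in entity_of if entity_of[s] != entity_of[r]]
    if not outside:
        return set(), 0.0
    best = max(scores[frozenset((r, s))] for s in outside)
    return {s for s in outside if scores[frozenset((r, s))] == best}, best

def consolidation_round(scores, entity_of, min_score=0.0):
    """One round: merge entities linked by a mutually preferred cross-entity pair.

    min_score is a hypothetical cutoff, not part of the claim language.
    """
    prefs = {r: external_preferred(scores, entity_of, r) for r in entity_of}
    merged = dict(entity_of)
    changed = False
    for r, (candidates, best) in prefs.items():
        if best < min_score:
            continue
        for s in candidates:
            if r in prefs[s][0]:  # r and s prefer each other
                old, new = merged[s], merged[r]
                if old != new:
                    for k in merged:  # relabel every record of s's entity
                        if merged[k] == old:
                            merged[k] = new
                    changed = True
    return merged, changed

# Illustrative match scores between four records (assumed values).
scores = {
    frozenset("AB"): 0.9, frozenset("AC"): 0.7, frozenset("BC"): 0.6,
    frozenset("AD"): 0.1, frozenset("BD"): 0.2, frozenset("CD"): 0.3,
}
entity_of = {r: r for r in "ABCD"}  # every record starts as its own entity
changed = True
while changed:
    entity_of, changed = consolidation_round(scores, entity_of, min_score=0.5)

# Retrieve the records of the consolidated entity that contains "A".
members = [r for r, e in entity_of.items() if e == entity_of["A"]]
```

In the first round A and B are mutually preferred and form a new entity. In the second round A (the first record) and C (the third record) are mutually preferred once records inside their own entities are excluded, so the two entities are consolidated. D never reaches the assumed cutoff and stays separate, leaving `members` as `["A", "B", "C"]`. Each round that changes anything reduces the number of entities, so the loop terminates.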
10. A computer system for iteratively generating entity representations in a computer implemented database using a record matching formula, the database comprising a plurality of records, each record comprising a plurality of fields, each field capable of containing a field value, the system comprising:
a computer implemented database comprising a plurality of records, each record comprising a plurality of fields, each field capable of containing a field value;
a processor programmed to assign to each pair of records from a plurality of records in the database a match value using the record matching formula, the match value reflecting a likelihood that the pair of records is related, the match value computed by a programmed processor;
a processor programmed to assign, for each record from the plurality of records, at least one associated preferred record, wherein a match value assigned to a given record together with its associated preferred record is at least as great as a match value assigned to the record together with any other record in the plurality of records;
a processor programmed to identify mutually preferred pairs of records from the plurality of records, each mutually preferred pair of records consisting of a first record and a second record, the first record consisting of a preferred record associated with the second record and the second record consisting of a preferred record associated with the first record; and
a processor programmed to form and store a plurality of entity representations in the database, each entity representation of the plurality of entity representations comprising at least one linked pair of mutually preferred records.
Dependent claims: 11, 12, 13, 14, 15, 16
17. A computer system for iteratively generating entity representations by identifying and linking related records in a computer implemented database using a record matching formula, each record and entity representation electronically stored in the database, each record comprising a plurality of fields, each field capable of containing a field value, the system comprising:
a computer implemented database comprising a plurality of records, each record comprising a plurality of fields, each field capable of containing a field value;
a processor programmed to determine a mutually preferred pair of records consisting of a first record and a second record, wherein a match score of the first record and the second record as computed using the record matching formula is at least as great as a match score of the first record and any other record in the database, and wherein the match score of the first record and the second record as computed using the record matching formula is at least as great as a match score for the second record and any other record in the database;
a processor programmed to form and store a new entity representation in the database, the new entity representation comprising at least the first record and the second record;
a processor programmed to determine a mutually preferred pair of records consisting of the first record and a third record, the third record from a different entity representation than the new entity representation, wherein a match score of the first record and the third record as computed using the record matching formula is at least as great as a match score of the first record and any other record not in the new entity representation, and wherein a match score of the first record and the third record as computed using the record matching formula is at least as great as a match score for the third record and any other record not in the different entity representation; and
a processor programmed to consolidate the new entity representation with the different entity representation by linking the new entity representation with the different entity representation, resulting in a consolidated entity representation.
Dependent claims: 18
Specification