Database systems and methods for linking records and entity representations with sufficiently high confidence

  • US 8,266,168 B2
  • Filed: 08/08/2008
  • Issued: 09/11/2012
  • Est. Priority Date: 04/24/2008
  • Status: Active Grant
First Claim

1. A computer implemented method of linking a first record in a database to a second record in a database upon a determination that the first record and the second record correspond to a same individual, the method comprising:

    calculating a plurality of match probabilities using an iterative process, each of the plurality of match probabilities corresponding to a different field common to the first record and the second record;

    selecting a matching formula from the group consisting of a field weight matching formula, a field value weight matching formula, and a supplemental weight matching formula;

    calculating a match score based on a plurality of terms using the selected matching formula, each of the plurality of terms corresponding to a different field common to the first record and the second record, each of the plurality of terms comprising:

    (1) a probability that a field value in a corresponding field of the first record matches a field value in a corresponding field in the second record, and (2) a weight comprising a match probability, wherein the weight comprises a logarithm of the match probability, wherein the match probability comprises a probability that an arbitrary entity representation in the database comprises a particular field value, wherein the probability comprises a ratio of entity representations in the database that include the particular field value to a total number of entity representations in the database;

    determining, based on the match score and a size of a population associated with the database, whether there is a sufficiently high confidence level that the first record and the second record correspond to the same individual; and

    linking, in the database, the first record with the second record based on the determining.
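
The scoring and confidence steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes exact-equality agreement in place of the claim's per-field match probability, takes the weight as the negative base-2 logarithm of the field-value frequency (the claim says only "a logarithm"), and uses a hypothetical threshold of log2(population size) for "sufficiently high confidence". The function names and the `margin_bits` parameter are illustrative only. The iterative estimation of match probabilities and the selection among the three matching formulas are not modeled.

```python
import math
from collections import Counter


def field_value_weights(entities, field):
    """Weight for each value of `field`.

    Frequency = (entity representations holding the value) / (all entity
    representations), as in the claim. The weight is -log2(frequency), so
    rarer values score higher. The sign and base are assumptions.
    """
    total = len(entities)
    counts = Counter(e[field] for e in entities if e.get(field) is not None)
    return {value: -math.log2(n / total) for value, n in counts.items()}


def exact_agreement(a, b):
    """Assumed stand-in for the claim's field match probability."""
    return 1.0 if a == b else 0.0


def match_score(rec_a, rec_b, weights_by_field, agreement=exact_agreement):
    """Sum one term per common field: agreement probability times weight."""
    score = 0.0
    for field, weights in weights_by_field.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue  # only fields present in both records contribute
        score += agreement(a, b) * weights.get(a, 0.0)
    return score


def sufficiently_confident(score, population_size, margin_bits=0.0):
    """Hypothetical rule: the score must reach log2(population) plus a margin."""
    return score >= math.log2(population_size) + margin_bits


# Toy data: four entity representations with two fields each.
entities = [
    {"name": "smith", "city": "x"},
    {"name": "smith", "city": "x"},
    {"name": "jones", "city": "x"},
    {"name": "lee", "city": "y"},
]
weights_by_field = {
    f: field_value_weights(entities, f) for f in ("name", "city")
}

rare_pair_score = match_score(entities[3], entities[3], weights_by_field)
common_pair_score = match_score(entities[0], entities[1], weights_by_field)
```

In this toy data, two records that agree on the rare values ("lee", "y") accumulate more weight than two that agree on the common values ("smith", "x"), so under the assumed threshold only the rare pair would be linked. This reflects the role the claim gives to population size: a larger population requires a higher score before two records are treated as the same individual.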
