Automated calibration of negative field weighting without the need for human interaction

  • US 8,135,681 B2
  • Filed: 04/24/2009
  • Issued: 03/13/2012
  • Est. Priority Date: 04/24/2008
  • Status: Active Grant
First Claim

1. A computer implemented iterative process for generating entity representations in a computer implemented database using a record matching formula and for generating parameters for the record matching formula, each entity representation comprising at least one record, the database comprising a plurality of records, each record comprising a plurality of fields, each field capable of containing a field value, the process comprising:

    calculating a field weight for a selected field, the field weight for the selected field derived from each of a plurality of field value weights for the selected field, wherein each field value weight comprises a logarithm of a first probability that an arbitrary record in the database comprises a corresponding field value in a field of a record in the arbitrary record, wherein each first probability comprises a ratio of records in the database that contain a corresponding field value to a total number of records in the database;

    forming a plurality of entity representations in the database, each entity representation comprising at least two records linked using a first instance of the record matching formula, at least one entity representation comprising a first record linked to a second record using a first instance of the record matching formula wherein the first record comprises a different field value in its selected field than that of the second record, the first instance of the record matching formula comprising a negative of the field weight for the selected field;

    calculating a weight parameter for the selected field, the weight parameter for the selected field reflecting a second probability that an arbitrary entity representation in the database comprises two different records each comprising a different field value in its respective selected field, the weight parameter being a negative number;

    linking at least two entity representations in the database based on a second instance of the record matching formula, wherein the second instance of the record matching formula comprises the weight parameter, whereby a number of entity representations in the database is reduced by the linking at least two entity representations relative to a number of entity representations in the database prior to the linking at least two entity representations; and

    retrieving information from at least one record in the database.
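
The two weight calculations recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the sample records, the sign convention (`-log2(p)` for field value weights), the frequency-weighted average used to derive the field weight, and the add-one smoothing in the weight parameter are all assumptions made here for illustration. The claim itself only requires that each field value weight comprise a logarithm of the ratio of records containing the value to total records, and that the weight parameter be a negative number reflecting how often an entity representation contains records that disagree on the selected field.

```python
import math
from collections import Counter


def value_counts(records, field):
    """Count how many records contain each non-missing value of `field`."""
    return Counter(r[field] for r in records if r.get(field) is not None)


def field_value_weights(records, field):
    """Weight per value: -log2(records with that value / total records).

    Assumption: the negated logarithm is used so that rarer values get
    larger positive weights.
    """
    total = len(records)
    counts = value_counts(records, field)
    return {value: -math.log2(n / total) for value, n in counts.items()}


def field_weight(records, field):
    """Field weight derived from the field value weights.

    Assumption: a frequency-weighted average over populated records.
    """
    counts = value_counts(records, field)
    weights = field_value_weights(records, field)
    populated = sum(counts.values())
    return sum(counts[v] * weights[v] for v in counts) / populated


def negative_weight_parameter(entities, field):
    """Negative weight for disagreement on `field`, estimated from entities.

    `entities` is a list of entity representations, each a list of records.
    Assumption: add-one smoothing keeps the probability strictly between
    0 and 1, so the returned logarithm is always negative.
    """
    def has_disagreement(entity):
        values = {r[field] for r in entity if r.get(field) is not None}
        return len(values) > 1

    disagreeing = sum(1 for entity in entities if has_disagreement(entity))
    probability = (disagreeing + 1) / (len(entities) + 2)
    return math.log2(probability)


# Hypothetical sample data: four records, grouped into two entities by an
# earlier linking pass.
records = [
    {"name": "J Smith", "city": "A"},
    {"name": "John Smith", "city": "A"},
    {"name": "M Jones", "city": "B"},
    {"name": "Mary Jones", "city": "C"},
]
entities = [records[0:2], records[2:4]]

initial_penalty = -field_weight(records, "city")
calibrated_penalty = negative_weight_parameter(entities, "city")
```

In this sketch, `initial_penalty` plays the role of the negative of the field weight used in the first instance of the record matching formula, and `calibrated_penalty` plays the role of the weight parameter used in the second instance. With the sample data the calibrated penalty is smaller in magnitude than the initial one, so a second linking pass would penalize a city mismatch less and could merge more entity representations.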
