
Statistical measure and calibration of search criteria where one or both of the search criteria and database is incomplete

  • US 8,090,733 B2
  • Filed: 07/02/2009
  • Issued: 01/03/2012
  • Est. Priority Date: 07/02/2008
  • Status: Active Grant
First Claim

1. A method of identifying an entity representation in an electronic universal database that corresponds to an entity representation in an electronic foreign database, each database comprising a plurality of entity representations, each entity representation comprising a plurality of linked records, each record comprising a plurality of fields, each field capable of containing a field value, each field value associated with a field value weight, the method comprising:

  • electronically storing a plurality of field tables, each field table corresponding to a particular field, each field table comprising field value weights for each unique pair consisting of an arbitrary entity representation from the universal database and a field value appearing in the particular field of a record in the arbitrary entity representation from the universal database, wherein each field value weight comprises a logarithm of a probability that an arbitrary entity representation in the universal database comprises a corresponding field value in a field of a record in the arbitrary entity representation, wherein each probability comprises a ratio of entity representations in the universal database that contain a corresponding field value to a total number of entity representations in the universal database;

    receiving a plurality of search criteria field values identifying an entity representation in the foreign database;

    performing a fetch operation from an associated field table for each search criterion, by, for each search criteria field value, fetching a field value weight from the associated field table corresponding to the search criteria field value;

    summing results of the step of fetching field value weights for each field from the fetch operation according to entity representations from the universal database, resulting in a plurality of summed weights, one summed weight for each of a plurality of entity representations from the universal database;

    ranking entity representations according to the plurality of summed weights;

    determining a highest ranked entity representation;

    calculating a confidence level reflecting a likelihood that the highest ranked entity representation corresponds to the entity representation identified by the search criteria field values, wherein the calculation is based on the summed field value weight; and

    outputting, if the confidence level exceeds a predetermined threshold, wherein the threshold comprises a logarithm of a term comprising a confidence level, an identifier for the highest ranked entity representation.
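
The steps recited in claim 1 (store field tables, fetch weights, sum per entity, rank, threshold) can be illustrated with a minimal Python sketch. This is not the patented implementation. The function names `build_field_tables` and `best_match`, the sample data, the use of a negated base-2 logarithm (so that rarer field values score higher), and the specific threshold formula `log2(N * c / (1 - c))` are all assumptions introduced for illustration; the claim itself only requires that each weight comprise a logarithm of the stated probability and that the threshold comprise a logarithm of a term comprising a confidence level.

```python
import math
from collections import defaultdict


def build_field_tables(entities):
    """Build one table per field: value -> {entity_id: weight}.

    `entities` maps an entity id to its list of linked records, each record
    being a dict of field -> value. (Hypothetical layout, for illustration.)
    """
    total = len(entities)
    # field -> value -> set of entity ids with that value in that field
    holders = defaultdict(lambda: defaultdict(set))
    for entity_id, records in entities.items():
        for record in records:
            for field, value in record.items():
                holders[field][value].add(entity_id)

    tables = {}
    for field, values in holders.items():
        tables[field] = {}
        for value, ids in values.items():
            # Probability = entities containing the value / all entities.
            # Assumption: weight is the NEGATED log2 of that probability.
            weight = -math.log2(len(ids) / total)
            tables[field][value] = {entity_id: weight for entity_id in ids}
    return tables


def best_match(tables, criteria, total, min_confidence):
    """Return the id of the top-ranked entity, or None if below threshold."""
    summed = defaultdict(float)
    for field, value in criteria.items():
        # Fetch the weights stored for this search value, if any.
        for entity_id, weight in tables.get(field, {}).get(value, {}).items():
            summed[entity_id] += weight
    if not summed:
        return None

    # Rank entities by summed weight, highest first.
    ranked = sorted(summed.items(), key=lambda kv: kv[1], reverse=True)
    top_id, top_weight = ranked[0]

    # Assumed threshold: a logarithm of a term containing the confidence
    # level c, with 0 < c < 1. Not taken from the patent text.
    threshold = math.log2(total * min_confidence / (1 - min_confidence))
    return top_id if top_weight > threshold else None


# Hypothetical universal database of four entities.
entities = {
    "E1": [{"first": "ann", "last": "smith", "city": "dayton"},
           {"first": "ann", "last": "smith", "city": "miami"}],
    "E2": [{"first": "ann", "last": "jones", "city": "dayton"}],
    "E3": [{"first": "bob", "last": "smith", "city": "miami"}],
    "E4": [{"first": "bob", "last": "jones", "city": "tampa"}],
}
tables = build_field_tables(entities)

# Search criteria taken from a (hypothetical) foreign-database entity.
criteria = {"first": "ann", "last": "smith", "city": "dayton"}
loose = best_match(tables, criteria, len(entities), min_confidence=0.5)
strict = best_match(tables, criteria, len(entities), min_confidence=0.9)
```

With this sample data, `E1` accumulates the highest summed weight because it is the only entity matching all three criteria. Under the assumed threshold formula it is returned at a confidence level of 0.5 but not at 0.9, since three common field values carry too little weight to single out one entity with high confidence.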

  • 2 Assignments