System of and method for entity representation splitting without the need for human interaction

  • US 9,189,505 B2
  • Filed: 08/09/2010
  • Issued: 11/17/2015
  • Est. Priority Date: 08/09/2010
  • Status: Active Grant
First Claim

1. A computer-implemented process for delinking, based on a bloat index formula, entity representations in an electronic database associated with a population of individuals, the electronic database stored at least partially in a memory and comprising a plurality of entity representations, each entity representation comprising a plurality of linked electronic records that likely refer to a same individual of the population of individuals, each electronic record comprising a plurality of fields, each field capable of containing a field value, the process comprising:

    calculating a field inconsistency weight for a plurality of fields in the electronic database, wherein each field inconsistency weight is derived from a field inconsistency probability associated with the corresponding field and each field inconsistency probability reflects a likelihood that an arbitrary entity representation in the electronic database includes records with different field values in the corresponding field;

    selecting an entity representation in the electronic database;

    calculating, for the selected entity representation, a bloat index reflecting a sum of field inconsistency weights over a plurality of fields common to a plurality of linked electronic records of the selected entity representation;

    responsive to a field or record being added to the electronic database, determining, based on the bloat index and a known or expected size of the population of individuals associated with the electronic database, whether there is a sufficiently high confidence level that the plurality of linked electronic records of the selected entity representation do not correspond to the respective same individual; and

    delinking, by the processor, in the electronic database, each of the plurality of linked electronic records of the selected entity representation based on the determining;

    wherein an individual is at least one of a natural person and company.
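
The steps recited in claim 1 can be illustrated with a minimal Python sketch. The claim does not disclose the bloat index formula itself, so everything concrete here is an assumption made for illustration: the weight is taken to be `-log2(p)` of the field inconsistency probability, the probability is estimated with add-one smoothing, only fields in which the selected entity's records disagree contribute to the sum, and the confidence test compares the bloat index against `log2` of the population size plus an optional margin. The function names (`field_inconsistency_weights`, `bloat_index`, `should_delink`, `delink`) and the dictionary-based data layout are hypothetical and do not come from the patent.

```python
import math


def is_inconsistent(records, field):
    """True if the records carry more than one distinct non-empty value in `field`."""
    values = {r[field] for r in records if r.get(field) not in (None, "")}
    return len(values) > 1


def field_inconsistency_weights(entities, fields):
    """Estimate one weight per field from the whole database.

    entities: dict mapping entity id -> list of records (dicts of field -> value).
    Assumption: weight = -log2(p), where p is the smoothed share of entity
    representations whose records disagree in that field.
    """
    n = len(entities)
    weights = {}
    for field in fields:
        inconsistent = sum(
            1 for records in entities.values() if is_inconsistent(records, field)
        )
        p = (inconsistent + 1) / (n + 2)  # add-one smoothing keeps 0 < p < 1
        weights[field] = -math.log2(p)
    return weights


def bloat_index(records, weights):
    """Sum the weights of the fields in which this entity's records disagree."""
    return sum(w for field, w in weights.items() if is_inconsistent(records, field))


def should_delink(bloat, population_size, margin_bits=0.0):
    """Assumed confidence test: the bloat index must exceed log2(population size)."""
    return bloat > math.log2(population_size) + margin_bits


def delink(entities, entity_id):
    """Return a copy of the database with the entity split into single-record entities."""
    result = {k: v for k, v in entities.items() if k != entity_id}
    for i, record in enumerate(entities[entity_id]):
        result[f"{entity_id}#{i}"] = [record]
    return result


# Usage on a toy database of four entity representations.
entities = {
    "e1": [{"name": "Ann Lee", "dob": "1980-01-01"}, {"name": "Ann Lee", "dob": "1980-01-01"}],
    "e2": [{"name": "Bo Chan", "dob": "1975-05-05"}, {"name": "Bo Chan", "dob": "1975-05-05"}],
    "e3": [{"name": "Cy Diaz", "dob": "1990-02-02"}, {"name": "Dee Fox", "dob": "1961-09-09"}],
    "e4": [{"name": "Ed Gray", "dob": "2000-03-03"}],
}
weights = field_inconsistency_weights(entities, ["name", "dob"])
for entity_id in list(entities):
    if should_delink(bloat_index(entities[entity_id], weights), population_size=8):
        entities = delink(entities, entity_id)
```

In this sketch a field that rarely disagrees within an entity representation receives a large weight, so disagreement in several such fields pushes the bloat index up quickly. The claim additionally ties the determination to a field or record being added to the database; the sketch leaves that trigger to the caller.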
