Entity normalization via name normalization

  • US 8,700,568 B2
  • Filed: 03/31/2006
  • Issued: 04/15/2014
  • Est. Priority Date: 02/17/2006
  • Status: Active Grant
First Claim

1. A computer-implemented method of identifying duplicate objects in a plurality of objects, each of the plurality of objects having one or more associated facts and being stored in a computer memory, each of the one or more facts having a value, the method comprising:

  • using a computer processor to perform:

    extracting facts from web documents that are located on document hosts;

    associating the facts extracted from the web documents with a plurality of objects;

    for each of the plurality of objects, normalizing the value of a name fact, the name fact being among the one or more facts associated with the object;

    grouping the plurality of objects into a plurality of buckets in accordance with the normalized value of the name facts of the plurality of objects; and

    applying a matcher to a pair of objects in one of the plurality of buckets to determine if the pair of objects are duplicates, one of the pair of objects having an associated fact that is not a common fact of the pair of objects.
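The normalize, bucket, and match steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation: the normalization rule (lowercasing, stripping punctuation, collapsing whitespace), the matcher criterion (every shared non-name fact must agree), the object layout, and all names such as `normalize_name`, `simple_matcher`, and the sample data are hypothetical choices made only for illustration. Fact extraction from web documents is assumed to have already happened.

```python
import re
from collections import defaultdict
from itertools import combinations


def normalize_name(name):
    """Assumed normalization: lowercase, drop punctuation, collapse spaces."""
    cleaned = re.sub(r"[^\w\s]", "", name.lower())
    return " ".join(cleaned.split())


def bucket_by_name(objects):
    """Group objects by the normalized value of their name fact."""
    buckets = defaultdict(list)
    for obj in objects:
        buckets[normalize_name(obj["facts"]["name"])].append(obj)
    return buckets


def simple_matcher(a, b):
    """Hypothetical matcher: duplicates if the pair shares at least one
    non-name fact and all shared non-name facts have equal values.
    Facts present on only one object are ignored, so an object may carry
    a fact that is not common to the pair."""
    shared = (a["facts"].keys() & b["facts"].keys()) - {"name"}
    return bool(shared) and all(a["facts"][k] == b["facts"][k] for k in shared)


def find_duplicates(objects):
    """Apply the matcher only to pairs that fall in the same bucket."""
    pairs = []
    for bucket in bucket_by_name(objects).values():
        for a, b in combinations(bucket, 2):
            if simple_matcher(a, b):
                pairs.append((a["id"], b["id"]))
    return pairs


# Made-up sample objects; "o2" has a "sector" fact that "o1" lacks.
SAMPLE_OBJECTS = [
    {"id": "o1", "facts": {"name": "Acme Corp.", "founded": "1990"}},
    {"id": "o2", "facts": {"name": "acme  corp", "founded": "1990",
                           "sector": "tools"}},
    {"id": "o3", "facts": {"name": "ACME Corp", "founded": "2005"}},
]
```

Because the matcher runs only within a bucket, the number of pairwise comparisons depends on bucket sizes rather than on the total number of objects, which is the practical point of grouping by normalized name before matching.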

  • 2 Assignments