Entity normalization via name normalization

  • US 9,710,549 B2
  • Filed: 03/28/2014
  • Issued: 07/18/2017
  • Est. Priority Date: 02/17/2006
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method of identifying duplicate objects in a plurality of objects, wherein each object in the plurality of objects is associated with one or more facts, and each of the one or more facts has an attribute and a value, the method comprising using a computer processor to perform:

    associating facts extracted from web documents with the plurality of objects;

    for each of the plurality of objects, normalizing a value of a name fact, the name fact being among one or more facts associated with the object;

    based on the normalized values of the name facts, grouping the plurality of objects into a plurality of buckets, each object in a bucket having the same normalized value of a name fact;

    processing the plurality of objects in a bucket to identify at least one pair of duplicate objects in the plurality of objects in the bucket, based on a similarity of values of facts other than the name fact for the objects in the bucket; and

    merging the duplicate objects together, the merging including removing one of the duplicate objects from a memory repository.
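
The steps recited in the claim (normalize the name fact, bucket by normalized name, compare the remaining facts within each bucket, merge and remove duplicates) can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the function names `normalize_name`, `similarity`, and `merge_duplicates`, the token-sorting normalization, the Jaccard similarity measure, the 0.5 threshold, and the simplification of one value per attribute are all assumptions chosen for the example.

```python
from collections import defaultdict
from itertools import combinations


def normalize_name(name):
    """Assumed normalization: lowercase, strip punctuation, sort the tokens."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(sorted(cleaned.split()))


def similarity(facts_a, facts_b):
    """Assumed measure: Jaccard overlap of (attribute, value) pairs, name fact excluded."""
    a = {(k, v) for k, v in facts_a.items() if k != "name"}
    b = {(k, v) for k, v in facts_b.items() if k != "name"}
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def merge_duplicates(repository, threshold=0.5):
    """repository maps object id -> {attribute: value}; it is modified in place."""
    # Group object ids into buckets keyed by the normalized name value.
    buckets = defaultdict(list)
    for obj_id, facts in repository.items():
        buckets[normalize_name(facts["name"])].append(obj_id)

    # Compare objects only within a bucket, using facts other than the name.
    for ids in buckets.values():
        for keep, drop in combinations(ids, 2):
            if keep not in repository or drop not in repository:
                continue  # one of the pair was already merged away
            if similarity(repository[keep], repository[drop]) >= threshold:
                # Merge: copy facts the kept object lacks, then remove the duplicate.
                for attr, value in repository[drop].items():
                    repository[keep].setdefault(attr, value)
                del repository[drop]
    return repository


repo = {
    "o1": {"name": "Bill Clinton", "born": "1946", "spouse": "Hillary Clinton"},
    "o2": {"name": "Clinton, Bill", "born": "1946"},
    "o3": {"name": "Bill Clinton", "born": "1990", "occupation": "Chef"},
}
result = merge_duplicates(repo)
print(sorted(result))  # ['o1', 'o3']
```

In this sketch all three objects land in the same bucket because their names normalize to the same string, but only "o1" and "o2" share enough other facts to be treated as duplicates. Bucketing first keeps the pairwise comparison confined to objects that already agree on the normalized name, rather than comparing every object against every other.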
