Finding and disambiguating references to entities on web pages

  • US 8,751,498 B2
  • Filed: 02/01/2012
  • Issued: 06/10/2014
  • Est. Priority Date: 10/20/2006
  • Status: Active Grant
First Claim

1. A method for identifying documents referring to an entity, the entity being associated with a first set of features, the method comprising:

  • at a computer having one or more processors and memory storing programs for execution by the one or more processors:

    identifying a first set of documents based on a first model and the first set of features, wherein the first model includes a first set of rules specifying at least one combination of features from the first set of features that are sufficient for identifying a document referring to the entity, and each document in the first set of documents includes a sufficient number of features in common with the first set of features to identify a document referring to the entity according to the first model;

    determining a second model based on features included in one or more documents in the first set of documents, wherein the second model includes a second set of rules specifying at least one combination of features from the first set of documents that are sufficient for identifying a document referring to the entity;

    identifying a second set of documents based on the second model, wherein each document in the second set of documents includes a sufficient number of features in common with the first set of features to identify a document referring to the entity according to the second model, and wherein the second set of documents includes at least one document not included in the first set of documents; and

    extracting one or more facts from the second set of documents and associating the extracted facts with the entity.
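
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: it assumes documents and the entity are represented as plain sets of feature strings, that a "rule" is a set of features that must all be present, and that the second model is derived by keeping feature pairs shared by at least two first-set documents. The names `matches`, `identify`, `derive_rules`, `find_and_extract`, the sample data, and the "every non-entity feature is a fact" extraction step are all hypothetical stand-ins chosen for illustration.

```python
from itertools import combinations


def matches(doc_features, rules):
    # Assumption: a rule is a set of features, and a document satisfies the
    # model when it contains every feature of at least one rule.
    return any(rule <= doc_features for rule in rules)


def identify(docs, rules):
    # Return the ids of all documents that satisfy the given model.
    return {doc_id for doc_id, feats in docs.items() if matches(feats, rules)}


def derive_rules(docs, doc_ids, size=2, min_support=2):
    # Hypothetical way to build the second model: keep every combination of
    # `size` features that occurs in at least `min_support` of the documents.
    counts = {}
    for doc_id in doc_ids:
        for combo in combinations(sorted(docs[doc_id]), size):
            counts[combo] = counts.get(combo, 0) + 1
    return [frozenset(c) for c, n in counts.items() if n >= min_support]


def find_and_extract(entity_features, first_rules, docs):
    first_set = identify(docs, first_rules)        # first model
    second_rules = derive_rules(docs, first_set)   # second model
    second_set = identify(docs, second_rules)
    # Stand-in for fact extraction: treat every feature not already known
    # for the entity as a candidate fact.
    facts = set()
    for doc_id in second_set:
        facts |= docs[doc_id] - entity_features
    return first_set, second_set, facts


# Illustrative data only.
entity_features = {"name:J. Doe", "born:1970", "occupation:engineer"}
first_rules = [
    frozenset({"name:J. Doe", "born:1970"}),
    frozenset({"name:J. Doe", "occupation:engineer"}),
]
docs = {
    "d1": {"name:J. Doe", "born:1970", "employer:Acme", "city:Springfield"},
    "d2": {"name:J. Doe", "occupation:engineer", "employer:Acme", "city:Springfield"},
    "d3": {"employer:Acme", "city:Springfield", "award:Best Paper"},
    "d4": {"name:J. Doe", "occupation:chef"},
}

first_set, second_set, facts = find_and_extract(entity_features, first_rules, docs)
```

In this toy data, `d3` never mentions the entity's name, so the first model misses it; the second model picks it up through the employer and city features learned from `d1` and `d2`, while `d4` (a different person with the same name) is rejected by both models.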
