Automatic disambiguation based on a reference resource

  • US 8,112,402 B2
  • Filed: 02/26/2007
  • Issued: 02/07/2012
  • Est. Priority Date: 02/26/2007
  • Status: Active Grant
First Claim

1. A computer implemented method, performed by a computer having a processor, of disambiguating references to named entities, comprising:

    identifying a surface form of a named entity in a text, the surface form being an ambiguous orthographic representation of a common name for the named entity, the surface form having a corresponding surface form reference in a surface form reference database;

    enumerating, from the surface form reference, a plurality of different reference named entities based on the identified surface form of the named entity, wherein the surface form is associated in the surface form reference with the plurality of different reference named entities each being formed of a different set of words, and each of the different reference named entities is associated with a named entity reference, the named entity references being stored in a named entity reference database that is separate from the surface form reference database, each of the named entity references associating one of the different reference named entities to multiple entity indicators, the entity indicators including both labels applied to a respective named entity in an information resource, and context indicators applied to the respective named entity in the information resource, in which the labels comprise classifying identifiers applied to the respective named entities in the information resource;

    evaluating, with the processor, one or more measures of correlation between one or more of the entity indicators in the information resource for each of the identified reference named entities, and the text, the evaluation including comparisons of the text to both the labels and the context indicators;

    identifying, with the processor, one of the reference named entities for which the associated entity indicators have a relatively high correlation to the text; and

    providing a disambiguation output that indicates the identified reference named entity to be associated with the surface form of the named entity in the text.
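
The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the two in-memory dictionaries stand in for the separate surface form and named entity reference databases, all sample entities, labels, and context words are invented for illustration, and cosine similarity over bag-of-words counts is only one assumed choice for the "measures of correlation", which the claim leaves open. The names `SURFACE_FORMS`, `ENTITY_REFERENCES`, `tokenize`, `correlation`, and `disambiguate` are hypothetical, and the surface form is passed in by the caller rather than detected in the text.

```python
import math
import re
from collections import Counter

# Hypothetical surface form reference database:
# ambiguous surface form -> candidate reference named entities.
SURFACE_FORMS = {
    "columbia": [
        "Columbia University",
        "Space Shuttle Columbia",
        "Columbia River",
    ],
}

# Hypothetical named entity reference database, kept separate from the
# surface form database. Each entity maps to its entity indicators:
# labels (classifying identifiers) and context indicators.
ENTITY_REFERENCES = {
    "Columbia University": {
        "labels": ["universities", "education", "new york"],
        "context": ["campus", "students", "professor", "manhattan"],
    },
    "Space Shuttle Columbia": {
        "labels": ["spacecraft", "nasa", "space shuttle"],
        "context": ["orbit", "launch", "astronauts", "mission"],
    },
    "Columbia River": {
        "labels": ["rivers", "geography"],
        "context": ["dam", "salmon", "basin", "oregon"],
    },
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def correlation(text_tokens, indicators):
    """Cosine similarity between the text and a list of indicators
    (an assumed correlation measure, not one named by the claim)."""
    text_counts = Counter(text_tokens)
    ind_counts = Counter(tok for ind in indicators for tok in tokenize(ind))
    dot = sum(text_counts[tok] * count for tok, count in ind_counts.items())
    norm_text = math.sqrt(sum(c * c for c in text_counts.values()))
    norm_ind = math.sqrt(sum(c * c for c in ind_counts.values()))
    if norm_text == 0 or norm_ind == 0:
        return 0.0
    return dot / (norm_text * norm_ind)


def disambiguate(text, surface_form):
    """Return (best_entity, scores) for a surface form found in text."""
    # Enumerate the candidate reference named entities.
    candidates = SURFACE_FORMS.get(surface_form.lower(), [])
    if not candidates:
        return None, {}
    tokens = tokenize(text)
    scores = {}
    for entity in candidates:
        reference = ENTITY_REFERENCES[entity]
        # Compare the text to both the labels and the context indicators.
        scores[entity] = (
            correlation(tokens, reference["labels"])
            + correlation(tokens, reference["context"])
        )
    # Pick the candidate whose indicators correlate most with the text.
    best = max(scores, key=scores.get)
    return best, scores


text = ("NASA confirmed that Columbia reached orbit after a flawless "
        "launch with seven astronauts aboard.")
entity, scores = disambiguate(text, "Columbia")
print(entity)  # Space Shuttle Columbia
```

In this sketch, ties (including the case where no indicator matches the text at all) fall back to the first candidate listed; a practical system would more likely apply a threshold or a prior over entities.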
