Annotating entities using cross-document signals

  • US 9,275,135 B2
  • Filed: 05/29/2012
  • Issued: 03/01/2016
  • Est. Priority Date: 05/29/2012
  • Status: Expired due to Fees
First Claim

1. An article of manufacture comprising a non-transitory computer readable storage medium having computer readable instructions tangibly embodied thereon which, when implemented, cause a computer to carry out a plurality of method steps comprising:

  • determining which documents in a document corpus of multiple documents mention an entity of interest;

  • clustering the documents that mention an entity of interest according to similarities across a temporal signal, a structural signal and a content signal, thereby forming multiple clusters of documents;

  • annotating each document in the multiple clusters of documents with an annotation by marking each occurrence of the entity in each document;

  • calculating a confidence measure for each occurrence of the entity in each document in each of the multiple clusters, wherein said confidence measure comprises the sum of (i) a measure of similarity between the given occurrence of the entity and the entity of interest, and (ii) a measure of similarity between the documents within the cluster of the given document via
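
The confidence measure in the final step is described as a sum of two similarity terms. The following Python sketch illustrates one way such a sum could be computed. It is not the patented method: the claim text above is cut off before it says how term (ii) is obtained, so the choices here are assumptions made purely for illustration. Term (i) uses a character-level ratio from the standard library's `difflib.SequenceMatcher`, and term (ii) uses the mean token-set Jaccard similarity between a document and the other documents in its cluster. The function names `mention_similarity`, `jaccard`, `cluster_similarity` and `confidence` are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Sequence


def mention_similarity(mention: str, entity: str) -> float:
    """Term (i): similarity between one occurrence and the entity of interest.

    Assumption: a case-insensitive character-level ratio in [0, 1].
    """
    return SequenceMatcher(None, mention.lower(), entity.lower()).ratio()


def jaccard(doc_a: str, doc_b: str) -> float:
    """Token-set Jaccard similarity between two documents (assumed metric)."""
    tokens_a = set(doc_a.lower().split())
    tokens_b = set(doc_b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty documents are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def cluster_similarity(doc_index: int, cluster: Sequence[str]) -> float:
    """Term (ii): mean similarity of one document to the rest of its cluster.

    Assumption: a singleton cluster contributes 0.0, since there is no
    other document to compare against.
    """
    others = [doc for i, doc in enumerate(cluster) if i != doc_index]
    if not others:
        return 0.0
    target = cluster[doc_index]
    return sum(jaccard(target, doc) for doc in others) / len(others)


def confidence(mention: str, entity: str,
               doc_index: int, cluster: Sequence[str]) -> float:
    """Confidence for one occurrence: the sum of terms (i) and (ii)."""
    return (mention_similarity(mention, entity)
            + cluster_similarity(doc_index, cluster))


if __name__ == "__main__":
    # Hypothetical cluster of three short documents mentioning one entity.
    cluster = [
        "Acme Corp reported quarterly earnings",
        "Acme Corp earnings beat forecasts",
        "Earnings call scheduled by Acme",
    ]
    print(confidence("Acme", "Acme Corp", 0, cluster))
```

Under these assumptions each term lies in [0, 1], so the resulting confidence lies in [0, 2]. An exact mention in a document that closely resembles its cluster neighbours scores near the top of that range.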
