Leveraging cross-document context to label entity

  • US 7,970,808 B2
  • Filed: 05/05/2008
  • Issued: 06/28/2011
  • Est. Priority Date: 05/05/2008
  • Status: Active Grant
First Claim

1. A method of classifying entities, the method comprising:

  • using a processor to perform acts comprising:

    recognizing occurrences of an entity in a plurality of documents;

    identifying a plurality of features in contexts of said occurrences, a first one of the features being derived from a first context of a first one of the occurrences in a first one of the documents, and a second one of the features being derived from a second context of a second one of the occurrences in a second one of the documents;

    calculating a sum of a plurality of weights, wherein each of the features is associated with one of the weights;

    making a first determination that said sum exceeds a first threshold;

    making a second determination that a label applies to said entity based on said first determination; and

    storing or communicating a fact that said label applies to said entity, wherein said features comprise membership in a list, and wherein said acts further comprise:

    choosing a subset of members of said list based on members in said subset being estimated to occur more frequently in said documents than other members of said list; and

    comparing a string that occurs in at least one of said contexts with members of said subset; and

    making a third determination that said string represents a member of said list, said third determination being made (a) based on said string's being among said subset, and (b) without use of a filter that accepts all strings that are members of said list and accepts at least one string that is not a member of said list.
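
The acts recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, whitespace tokenization, two-token context window, feature strings, weights, and threshold are all assumptions made for illustration. An exact set lookup stands in for the list-membership test, because a set never accepts a string that is not a member, unlike a probabilistic filter such as a Bloom filter.

```python
from collections import Counter


def frequent_subset(list_members, documents, k):
    """Keep the k list members estimated to occur most often in the documents.

    Assumption: frequency is estimated by a simple token count.
    """
    counts = Counter()
    for doc in documents:
        counts.update(t for t in doc.lower().split() if t in list_members)
    return {member for member, _ in counts.most_common(k)}


def context_features(entity, documents, subset, window=2):
    """Collect features from the context of each occurrence, across documents."""
    features = []
    for doc in documents:
        tokens = doc.lower().split()
        for i, token in enumerate(tokens):
            if token != entity:
                continue
            # Context: up to `window` tokens on each side of the occurrence.
            context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            for word in context:
                # Exact membership test against the subset (no false positives).
                if word in subset:
                    features.append("in_list")
                features.append("ctx=" + word)
    return features


def label_applies(features, weights, threshold):
    """Sum the weight of each feature and compare the sum with the threshold.

    Assumption: a feature with no entry in `weights` contributes 0.0.
    """
    total = sum(weights.get(f, 0.0) for f in features)
    return total > threshold


# Hypothetical data: three documents mentioning the entity "acme".
documents = [
    "acme hired a new ceo",
    "shares of acme rose today",
    "the ceo of acme resigned",
]
business_terms = {"ceo", "shares", "founder"}  # hypothetical list
weights = {"in_list": 1.0, "ctx=hired": 0.5, "ctx=resigned": 0.5}

subset = frequent_subset(business_terms, documents, k=2)
features = context_features("acme", documents, subset)
is_company = label_applies(features, weights, threshold=2.5)
```

In this sketch the features come from contexts in different documents, so evidence that is weak in any single document can still push the sum over the threshold when combined.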
