Scalable lookup-driven entity extraction from indexed document collections

  • US 20090319500A1
  • Filed: 06/24/2008
  • Published: 12/24/2009
  • Est. Priority Date: 06/24/2008
  • Status: Active Grant
First Claim
1. A method for filtering a set of documents, comprising:

  • receiving a list of entity strings;

  • determining a set of token sets that covers the entity strings in the list;

  • querying an inverted index generated on a first set of documents using the set of token sets to determine a set of document identifiers for a subset of the documents in the first set;

  • retrieving from the first set of documents a second set of documents identified by the set of document identifiers; and

  • filtering the second set of documents to include one or more documents of the second set that each include a match with at least one entity string of the list of entity strings.
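
The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes whitespace tokenization, lowercase matching, and the simplest possible cover (one token set per entity string). All function names (`tokenize`, `build_inverted_index`, `covering_token_sets`, `candidate_doc_ids`, `filter_documents`) are hypothetical and do not appear in the patent.

```python
from collections import defaultdict


def tokenize(text):
    # Assumption: lowercase, whitespace-delimited tokens.
    return text.lower().split()


def build_inverted_index(docs):
    """Map each token to the set of ids of documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def covering_token_sets(entities):
    """Assumed trivial cover: one token set per non-empty entity string."""
    return {frozenset(tokenize(e)) for e in entities if tokenize(e)}


def candidate_doc_ids(index, token_sets):
    """Union, over token sets, of the documents containing every token in the set."""
    ids = set()
    for token_set in token_sets:
        postings = [index.get(token, set()) for token in token_set]
        ids |= set.intersection(*postings)
    return ids


def filter_documents(docs, entities):
    """Return the documents that contain at least one entity string."""
    index = build_inverted_index(docs)           # inverted index on the first set
    token_sets = covering_token_sets(entities)   # token sets covering the list
    ids = candidate_doc_ids(index, token_sets)   # document identifiers
    second_set = {i: docs[i] for i in ids}       # retrieve the second set

    # Final filter: keep documents with a contiguous token match for some entity.
    patterns = [" " + " ".join(tokenize(e)) + " " for e in entities if tokenize(e)]
    result = {}
    for doc_id, text in second_set.items():
        padded = " " + " ".join(tokenize(text)) + " "
        if any(p in padded for p in patterns):
            result[doc_id] = text
    return result


docs = {
    1: "Apple iPod nano review",
    2: "nano sized apple pie with iPod",
    3: "Canon camera",
}
matches = filter_documents(docs, ["iPod nano", "Canon camera"])
```

In this example document 2 contains both tokens of "iPod nano", so the index query returns it as a candidate, but the final filtering step drops it because the tokens are not adjacent. The index lookup narrows the collection cheaply, and only the retrieved second set is checked for actual matches.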
