System and method for word indexing in a capture system and querying thereof

  • US 8,554,774 B2
  • Filed: 09/01/2010
  • Issued: 10/08/2013
  • Est. Priority Date: 08/31/2005
  • Status: Active Grant
First Claim

1. A method, comprising:

  • receiving a query to search a plurality of objects captured by a capture system, the query including a search term;

  • generating a search token from the search term using a context-aware parser, wherein the context-aware parser uses a list of patterns associated with a content type indicated by the query to generate sub-tokenized search tokens from the search token;

  • hashing the sub-tokenized search tokens to one or more term bit positions using a hash function;

  • searching a first word index associated with a first object;

  • eliminating the first object from the query if a bit is not set in each of the one or more term bit positions of a first bit vector of the first word index, wherein each bit that is set in the first bit vector represents at least one token generated from the first object, and wherein the hashing of the sub-tokenized search tokens includes truncating several of the sub-tokenized search tokens such that they stem to a same token.
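
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation. The claim does not name the patterns, the hash function, the bit vector width, or the truncation length, so the `PATTERNS` table, the use of `zlib.crc32`, `VECTOR_BITS`, `STEM_LENGTH`, and all function names below are hypothetical choices made for this example. A Python integer stands in for the bit vector of a word index.

```python
import re
import zlib

# Hypothetical pattern lists keyed by content type (assumed, not from the claim).
PATTERNS = {
    "email": [r"[A-Za-z0-9]+"],
    "text": [r"[A-Za-z]+"],
}

VECTOR_BITS = 1024  # assumed width of each object's bit vector
STEM_LENGTH = 5     # assumed truncation length used for crude stemming


def sub_tokenize(term, content_type):
    """Split a term into sub-tokens using the patterns for the content type."""
    tokens = []
    for pattern in PATTERNS[content_type]:
        tokens.extend(match.lower() for match in re.findall(pattern, term))
    return tokens


def bit_position(token):
    """Truncate the token, then hash it to one bit position.

    Truncation makes related words (for example "indexing" and "indexes")
    map to the same position. CRC-32 is only a stand-in hash function.
    """
    stem = token[:STEM_LENGTH]
    return zlib.crc32(stem.encode("utf-8")) % VECTOR_BITS


def build_word_index(text, content_type):
    """Build the bit vector for one captured object."""
    vector = 0
    for token in sub_tokenize(text, content_type):
        vector |= 1 << bit_position(token)
    return vector


def candidate_objects(search_term, content_type, word_indexes):
    """Return IDs of objects that are not eliminated by the bit test.

    An object is eliminated when any term bit position is unset in its
    vector. Survivors may still be false positives from hash collisions.
    """
    positions = {
        bit_position(token)
        for token in sub_tokenize(search_term, content_type)
    }
    return [
        object_id
        for object_id, vector in word_indexes.items()
        if all((vector >> position) & 1 for position in positions)
    ]


if __name__ == "__main__":
    indexes = {
        "obj-1": build_word_index("alice@example.com sent the report", "email"),
        "obj-2": 0,  # an object whose vector has no bits set
    }
    print(candidate_objects("alice@example.com", "email", indexes))
```

In this sketch an object that actually contains every sub-token is never eliminated, while an object that survives the test may still lack the term, so the surviving objects would need a full content check afterward.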
