Complex queries for corpus indexing and search

  • US 8,266,169 B2
  • Filed: 12/18/2008
  • Issued: 09/11/2012
  • Est. Priority Date: 12/18/2008
  • Status: Active Grant
First Claim

1. A computer implemented method, comprising:

    receiving in a memory a complex-query pattern, wherein the complex-query pattern identifies a relationship between a plurality of words using a query language;

    receiving a corpus;

    transforming with a processor the complex-query pattern into a region matching transducer, wherein the transducer is a form of a finite state network;

    determining whether a corpus index exists;

    in response to determining the corpus index does not exist;

        combining a corpus-level transducer and the region matching transducer; and

        applying the combined transducer to the corpus to identify strings therein that satisfy patterns defined in the corpus-level transducer, including the complex-query pattern, with each identified pattern being recorded in the corpus index;

    in response to determining the corpus index exists;

        applying the region matching transducer to the corpus to identify strings therein that satisfy the complex-query pattern, with each identified pattern being recorded in an augmented index; and

        merging the corpus index with the augmented index specifying locations in the corpus satisfying the complex-query pattern;

    storing in the memory the corpus index that records a query tag for indexing locations in the corpus satisfying the complex-query pattern.
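
The control flow of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: compiled regular expressions stand in for the finite-state transducers, and every name in it (compile_query, apply_matchers, index_query, corpus_matchers, the tags NAME and MET) is hypothetical and chosen only for illustration. The sketch shows the two branches: building the index in one pass when none exists, and building an augmented index and merging it when one does.

```python
import re


def compile_query(pattern):
    # Assumption: a compiled regex stands in for the "region matching
    # transducer". It is a finite-state matcher, not a true transducer.
    return re.compile(pattern)


def apply_matchers(matchers, corpus):
    # Apply each tagged matcher to the corpus and record the (start, end)
    # span of every match under that matcher's tag.
    index = {tag: [] for tag in matchers}
    for tag, matcher in matchers.items():
        for match in matcher.finditer(corpus):
            index[tag].append(match.span())
    return index


def index_query(query_tag, pattern, corpus, corpus_matchers, corpus_index=None):
    region_matcher = compile_query(pattern)

    if corpus_index is None:
        # No index yet: combine the corpus-level matchers with the query
        # matcher and record everything in a new corpus index.
        combined = {**corpus_matchers, query_tag: region_matcher}
        return apply_matchers(combined, corpus)

    # Index exists: apply only the query matcher, record its matches in an
    # augmented index, then merge that into a copy of the corpus index.
    augmented = apply_matchers({query_tag: region_matcher}, corpus)
    merged = {tag: list(spans) for tag, spans in corpus_index.items()}
    for tag, spans in augmented.items():
        merged.setdefault(tag, []).extend(spans)
    return merged


if __name__ == "__main__":
    corpus = "Alice met Bob. Bob met Carol."
    corpus_matchers = {"NAME": re.compile(r"\b[A-Z][a-z]+\b")}
    query = r"\b[A-Z][a-z]+ met [A-Z][a-z]+\b"

    # First call: no index exists, so it is built with the combined matchers.
    index = index_query("MET", query, corpus, corpus_matchers)
    print(index["MET"])
```

In this sketch both branches end with the same result: an index in which the query tag maps to the corpus locations that satisfy the query pattern. Merging into a copy leaves the pre-existing index untouched, which is a design choice of the sketch and not something the claim requires.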
