Method and system for document indexing and data querying

  • US 9,946,753 B2
  • Filed: 12/17/2015
  • Issued: 04/17/2018
  • Est. Priority Date: 07/23/2009
  • Status: Active Grant
First Claim

1. A system, comprising:

  • a processor; and

    a memory coupled with the processor, wherein the memory is configured to provide the processor with instructions which when executed cause the processor to:

    generate a preset filter characters list based at least in part on a sample set of documents and appearance frequencies of monadic partitions that are present in the sample set of documents, wherein the monadic partitions comprise character text;

    obtain a document to be indexed;

    perform a monadic partition operation on the document to obtain a plurality of monadic partitions associated with the document;

    determine whether a first monadic partition of the plurality of monadic partitions associated with the document should be indexed directly or indexed with at least one other monadic partition from the plurality of monadic partitions as at least one polynary partition, wherein the determination comprises to:

    determine that the first monadic partition matches a filter character monadic partition included in the preset filter characters list;

    in response to the determination that the first monadic partition matches the filter character monadic partition, index the first monadic partition as the at least one polynary partition, including to:

    determine whether the first monadic partition precedes a second monadic partition in the plurality of monadic partitions associated with the document, wherein the second monadic partition is adjacent to the first monadic partition in the document;

    in response to a first determination that the first monadic partition precedes the second monadic partition, form a first binary partition by combining the first monadic partition with the second monadic partition;

    determine whether the first monadic partition succeeds a third monadic partition in the plurality of monadic partitions associated with the document, wherein the third monadic partition is adjacent to the first monadic partition in the document;

    in response to a second determination that the first monadic partition succeeds the third monadic partition, form a second binary partition by combining the first monadic partition with the third monadic partition; and

    add a first entry in a document index corresponding to the first binary partition and a second entry in the document index corresponding to the second binary partition, without directly indexing the first monadic partition in the document index.
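
The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that a monadic partition is a single non-whitespace character, that the filter list holds characters whose share of all sample characters meets a threshold, and that a binary partition keeps the two characters in document order. The names `monadic_partitions`, `build_filter_list`, `index_document`, and the `min_share` parameter are hypothetical and do not appear in the patent.

```python
from collections import Counter, defaultdict


def monadic_partitions(text):
    """Split text into single-character partitions (assumption: whitespace is skipped)."""
    return [ch for ch in text if not ch.isspace()]


def build_filter_list(sample_docs, min_share=0.05):
    """Return characters whose share of all sample characters is at least min_share.

    The threshold rule is an assumption; the claim only says the list is based
    on appearance frequencies in a sample set of documents.
    """
    counts = Counter(ch for doc in sample_docs for ch in monadic_partitions(doc))
    total = sum(counts.values())
    return {ch for ch, n in counts.items() if total and n / total >= min_share}


def index_document(doc_id, text, filter_chars, index=None):
    """Add one document to an inverted index mapping partition -> set of doc ids."""
    index = defaultdict(set) if index is None else index
    parts = monadic_partitions(text)
    for i, part in enumerate(parts):
        if part not in filter_chars:
            # Not a filter character: index the monadic partition directly.
            index[part].add(doc_id)
            continue
        # Filter character: index only binary partitions with its neighbours.
        if i + 1 < len(parts):
            # It precedes a second partition: first binary partition.
            index[part + parts[i + 1]].add(doc_id)
        if i > 0:
            # It succeeds a third partition: second binary partition.
            index[parts[i - 1] + part].add(doc_id)
    return index
```

For example, with the filter list `{"b"}`, calling `index_document(1, "abc", {"b"})` produces entries for `a`, `ab`, `bc`, and `c`, and no entry for `b` on its own.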
