System and method for flexible indexing of document content

  • US 6,741,979 B1
  • Filed: 07/24/2001
  • Issued: 05/25/2004
  • Est. Priority Date: 07/24/2001
  • Status: Expired due to Term
First Claim

1. A method for flexible indexing of document content comprising:

  • obtaining a collection of documents to be indexed;

  • storing said collection of documents in a single document information stream;

  • parsing each one of said documents into constituent words to facilitate indexing;

  • creating a plurality of stem words to be indexed by stemming each word into a standard prefix;

  • creating a first record providing an entry point into an index structure, said first record having a plurality of entry blocks, each one of said entry blocks being uniquely associated with a character out of a character set, said collection of documents being formed by characters drawn from said character set;

  • creating a plurality of additional primary and secondary records providing character-by-character pathways for locating an occurrence of a stem word in said document information stream;

  • creating a translation vector mapping stem words to document locations; and

  • creating a plurality of streams for providing locations of occurrences of said stem word in said document information stream.
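
The steps recited in the claim can be illustrated with a minimal sketch. This is not the patented implementation; it is only one way to read the claim, under loudly stated assumptions: stemming is approximated by truncating each word to a fixed-length prefix (`PREFIX_LEN` is a hypothetical parameter), the "first record" and the "additional records" are modeled as the root and inner nodes of a character trie, and the "translation vector" and location "streams" are collapsed into one list of offset lists. All names (`Record`, `build_index`, `lookup`) are invented for illustration.

```python
import re

PREFIX_LEN = 5  # assumption: "standard prefix" stemming modeled as simple truncation


def stem(word):
    """Reduce a word to a lowercase fixed-length prefix (a stand-in for real stemming)."""
    return word.lower()[:PREFIX_LEN]


class Record:
    """One index record: entry blocks keyed by character, plus an optional slot."""

    def __init__(self):
        self.entries = {}  # character -> next Record (character-by-character pathway)
        self.slot = None   # index into the translation list, set where a stem ends


def build_index(documents):
    """Build a single document stream, a trie of records, and per-stem location lists."""
    stream = "\n".join(documents)  # single document information stream
    first_record = Record()        # entry point into the index structure
    translation = []               # translation[slot] -> list of offsets in the stream

    for match in re.finditer(r"[A-Za-z]+", stream):  # parse into constituent words
        record = first_record
        for ch in stem(match.group()):
            record = record.entries.setdefault(ch, Record())
        if record.slot is None:    # first time this stem is seen
            record.slot = len(translation)
            translation.append([])
        translation[record.slot].append(match.start())

    return stream, first_record, translation


def lookup(first_record, translation, word):
    """Follow the pathway for the word's stem and return its offsets in the stream."""
    record = first_record
    for ch in stem(word):
        record = record.entries.get(ch)
        if record is None:
            return []
    return translation[record.slot] if record.slot is not None else []
```

Calling `build_index` on a list of strings and then `lookup` with any word returns the character offsets in the combined stream at which words sharing that word's stem begin. Because lookup walks one record per character, its cost depends on the stem length rather than on the number of distinct stems.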
