System and method for portable document indexing using n-gram word decomposition

  • US 5,706,365 A
  • Filed: 04/10/1995
  • Issued: 01/06/1998
  • Est. Priority Date: 04/10/1995
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method for indexing stored documents, each document containing at least one page and containing a plurality of words, and searching for at least one document matching an input search query containing at least one query word, comprising the steps of:

  • for each document:

    identifying non-stop words on each page of the document;

    determining for each non-stop word at least one n-gram;

    for each n-gram, storing a map having a plurality of positions, each position corresponding to a page, and each position indicating whether or not the corresponding page contains the n-gram;

    determining at least one query word n-gram for the at least one query word; and

    retrieving documents having n-grams that match selected ones of the query word n-grams, by performing the steps of:

    determining a map corresponding to the query word n-gram;

    determining from the map at least one page containing the query word n-gram; and

    retrieving the page, and the document associated therewith.
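
The steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The trigram size, the stop-word list, the tokenizer, the use of a Python integer as the per-n-gram page map, and the choice to intersect the maps of all query n-grams are assumptions made for illustration, and the function names are hypothetical.

```python
import re

# Assumptions for illustration: a tiny stop-word list and n = 3.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "is"}
N = 3


def ngrams(word, n=N):
    """Return the set of character n-grams of a word (the whole word if short)."""
    word = word.lower()
    if len(word) <= n:
        return {word}
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def build_index(pages):
    """Map each n-gram to an integer bitmap; bit p is set if page p contains it."""
    index = {}
    for page_no, text in enumerate(pages):
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            if word in STOP_WORDS:
                continue  # only non-stop words are decomposed
            for gram in ngrams(word):
                index[gram] = index.get(gram, 0) | (1 << page_no)
    return index


def search(index, query_word, num_pages):
    """Return candidate page numbers whose maps contain every query n-gram."""
    bitmap = (1 << num_pages) - 1  # start with all pages as candidates
    for gram in ngrams(query_word):
        bitmap &= index.get(gram, 0)
    return [p for p in range(num_pages) if (bitmap >> p) & 1]


def search_documents(documents, query_word):
    """documents: dict of doc_id -> list of page texts. Returns (doc_id, page) pairs."""
    hits = []
    for doc_id, pages in documents.items():
        index = build_index(pages)
        hits.extend((doc_id, p) for p in search(index, query_word, len(pages)))
    return hits
```

Under these assumptions, a page returned by `search` is only a candidate: it contains every n-gram of the query word, but not necessarily the word itself, so a caller would typically confirm the match against the page text.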
