System and method for compressing inverted index files in document search/retrieval system

  • US 5,953,723 A
  • Filed: 04/02/1993
  • Issued: 09/14/1999
  • Est. Priority Date: 04/02/1993
  • Status: Expired due to Term
First Claim

1. A document database generator for use in connection with a document query processing system, the document database generator generating an encoded index file and a dictionary in response to a document text base, the document text base including a plurality of words organized into at least one document, each word having an associated location identifier identifying the location of the word in the at least one document, the location identifier having a series of location identifier entries for identifying the location in the at least one document of the associated word in a location hierarchy, the document database generator comprising:

  • A. an index file generating element for generating an index file including a plurality of records each associated with a unique word in the document text base, each record having at least one locator entry, with each locator entry identifying a location of the word in the at least one document, each locator entry having a series of locator fields containing locator values according to a location hierarchy;

  • B. an encoded index file generating element for generating, in response to the index file generated by the index file generating element, an encoded index file, the encoded index file generating element in generating the encoded index file selecting locator entries in a record in the index file which have a series of locator fields which contain corresponding locator values, and substituting for the series in one of the selected locator entries an indicator indicating the correspondence of the locator fields to the other of the selected locator entries; and

  • C. a dictionary generating element for generating a dictionary field comprising a plurality of record location identifiers each identifying one of the words in the document text base and the location in the encoded index file of an encoded record associated with the identified word.
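
The three elements of the claim can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a three-level location hierarchy (document, paragraph, word position) and assumes the "indicator" is a count of leading locator fields shared with the preceding locator entry; the claim itself fixes neither choice. All function names (build_index, encode_record, decode_record, build_encoded_index) are hypothetical.

```python
from collections import defaultdict


def build_index(documents):
    """Element A (sketch): one record per unique word, each record a list of
    locator entries. The (doc, paragraph, position) hierarchy is an assumption."""
    index = defaultdict(list)
    for doc_id, paragraphs in enumerate(documents):
        for para_id, text in enumerate(paragraphs):
            for pos, word in enumerate(text.lower().split()):
                index[word].append((doc_id, para_id, pos))
    return dict(index)


def encode_record(locators):
    """Element B (sketch): where a locator entry repeats leading field values of
    the previous entry, store a count of the shared fields (the assumed
    'indicator') followed only by the fields that differ."""
    encoded = []
    prev = ()
    for loc in locators:
        shared = 0
        while shared < len(prev) and shared < len(loc) and prev[shared] == loc[shared]:
            shared += 1
        encoded.append((shared, loc[shared:]))
        prev = loc
    return encoded


def decode_record(encoded):
    """Inverse of encode_record: rebuild the full locator entries."""
    locators = []
    prev = ()
    for shared, rest in encoded:
        loc = prev[:shared] + tuple(rest)
        locators.append(loc)
        prev = loc
    return locators


def build_encoded_index(index):
    """Element C (sketch): concatenate the encoded records into one list and
    build a dictionary mapping each word to (offset, length) in that list."""
    encoded_file = []
    dictionary = {}
    for word in sorted(index):
        record = encode_record(index[word])
        dictionary[word] = (len(encoded_file), len(record))
        encoded_file.extend(record)
    return encoded_file, dictionary


documents = [["the cat sat", "the cat ran"], ["a cat"]]
index = build_index(documents)
encoded_file, dictionary = build_encoded_index(index)

# Look up a word through the dictionary, then decode its record.
offset, length = dictionary["cat"]
cat_locators = decode_record(encoded_file[offset:offset + length])
```

In this sketch a query for a word reads its offset from the dictionary, slices the encoded record out of the encoded file, and expands the shared-field counts back into full locator entries. A real encoded index file would more likely be a byte stream with byte offsets in the dictionary; the list-based layout here is only for readability.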
