Lossy index compression
Abstract
An apparatus is provided for performing a method (FIG. 2) for pruning an index of a corpus of text documents, wherein the method includes steps for ranking (50) the postings in the index and pruning (48) from the index the postings below a given level in the ranking. The pruning methods of the invention are lossy, since some document postings are removed from the full index; however, the user cannot differentiate the lossy index from the full index.
14 Claims
- 1. Apparatus for indexing a corpus of text documents, characterized by an index processor which is arranged to create an inverted index of terms appearing in the documents, the index comprising postings of the terms in the documents, the processor being further arranged to create rankings of the postings in the index, and prune from the index the postings below a given level in the ranking.
- 14. Apparatus for performing a method for indexing a corpus of text documents, wherein the method is characterized by steps for:
  creating an inverted index of terms appearing in the documents, the index comprising postings of the terms in the documents;
  ranking the postings in the index; and
  pruning from the index the postings below a given level in the ranking.
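The three steps recited in claim 14 (create an inverted index, rank the postings, prune those below a given level) can be illustrated with a minimal sketch. It is not the patented implementation: the scoring function (raw term frequency), the per-term ranking, the `keep_top` cutoff, and the names `build_index` and `prune_index` are all assumptions chosen for illustration, since the claims do not fix how postings are scored or where the level is set.

```python
from collections import Counter, defaultdict


def build_index(docs):
    """Create an inverted index: term -> list of (doc_id, score) postings.

    Assumption: the score of a posting is the raw term frequency in the document.
    """
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for term, tf in Counter(text.lower().split()).items():
            index[term].append((doc_id, tf))
    return dict(index)


def prune_index(index, keep_top):
    """Rank each term's postings and drop those ranked below position keep_top.

    Assumption: the "given level" is a fixed rank cutoff applied per term.
    """
    pruned = {}
    for term, postings in index.items():
        # Highest score first; ties broken by doc_id so the result is deterministic.
        ranked = sorted(postings, key=lambda p: (-p[1], p[0]))
        pruned[term] = ranked[:keep_top]
    return pruned


# Hypothetical three-document corpus.
docs = {
    "d1": "index pruning removes postings from the index",
    "d2": "the index stores postings",
    "d3": "postings postings postings",
}

full = build_index(docs)
lossy = prune_index(full, keep_top=2)
```

Here the term "postings" has three postings in the full index and two in the pruned one. The pruned index is lossy in the sense of the abstract: the lowest-ranked posting is gone, while the highest-ranked ones, which a top-k query would return first, are retained.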
Specification