ESTIMATION OF POSTINGS LIST LENGTH IN A SEARCH SYSTEM USING AN APPROXIMATION TABLE
First Claim
Patent Images
1. A method of minimizing accesses to secondary storage when searching an inverted index for a search term, the method comprising:
- obtaining by at least one computing unit a predetermined size of a posting list for the search term, the predetermined size based on document frequency for the search term, wherein the posting list is stored in secondary storage; and
reading by the at least one computing unit at least a portion of the posting list into memory based on the predetermined size.
3 Assignments
0 Petitions
Accused Products
Abstract
The present invention provides a method of minimizing accesses to secondary storage when searching an inverted index for a search term. The method comprises automatically obtaining a predetermined size of a posting list for the search term, the predetermined size based on document frequency for the search term, wherein the posting list is stored in secondary storage, and reading at least a portion of the posting list into memory based on the predetermined size. Corresponding computer system and program products are also provided.
-
Citations
33 Claims
-
1. A method of minimizing accesses to secondary storage when searching an inverted index for a search term, the method comprising:
-
obtaining by at least one computing unit a predetermined size of a posting list for the search term, the predetermined size based on document frequency for the search term, wherein the posting list is stored in secondary storage; and reading by the at least one computing unit at least a portion of the posting list into memory based on the predetermined size. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
-
-
11. A computer system for minimizing accesses to secondary storage when searching an inverted index for a search term, the computer system comprising:
-
a memory; and a processor in communication with the memory to perform a method, the method comprising; obtaining a predetermined size of a posting list for the search term based on document frequency for the search term, wherein the posting list is stored in secondary storage; and reading at least a portion of the posting list into memory based on the predetermined size. - View Dependent Claims (12, 13, 14, 15, 16, 17, 18, 19, 20)
-
-
21. A program product for minimizing accesses to secondary storage when searching an inverted index for a search term, the program product comprising:
a storage medium readable by a processor and storing instructions for execution by the processor for performing a method, the method comprising; obtaining by at least one computing unit a predetermined size of a posting list for the search term, the predetermined size based on document frequency for the search term, wherein the posting list is stored in secondary storage; and reading by the at least one computing unit at least a portion of the posting list into memory based on the predetermined size. - View Dependent Claims (22, 23, 24, 25, 26, 27, 28, 29, 30)
-
31. A data structure for use in minimizing accesses to data stored in secondary storage when searching an inverted index for a search term, the data structure comprising:
a posting list length approximation table, comprising a hash table, the hash table comprising; a plurality of range IDs, each range ID corresponding to a subset of posting lists of predetermined similar size and representing a non-overlapping range of document frequencies; and a posting list length approximation for each range ID. - View Dependent Claims (32, 33)
Specification