MULTIPLE INDEX BASED INFORMATION RETRIEVAL SYSTEM
Abstract
An information retrieval system uses phrases to index, retrieve, organize and describe documents. Phrases are identified that predict the presence of other phrases in documents. Documents are then indexed according to their included phrases. The document index is partitioned into multiple indexes, including a primary index and a secondary index. The primary index stores phrase posting lists with relevance rank ordered documents. The secondary index stores excess documents from the posting lists in document order.
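The partitioning described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the specification: the function name `partition_posting_list`, the `(document identifier, relevance score)` tuple layout, and the tie-breaking rule are assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical layout: one posting is (document identifier, relevance score).
Posting = Tuple[int, float]


def partition_posting_list(
    postings: List[Posting], max_primary: int
) -> Tuple[List[Posting], List[Posting]]:
    """Split one phrase's postings into a primary and a secondary list.

    The primary list keeps up to max_primary postings in relevance rank
    order. The secondary list keeps the excess postings in document
    identifier order. Ties on score are broken by document identifier,
    which is an assumption of this sketch.
    """
    ranked = sorted(postings, key=lambda p: (-p[1], p[0]))
    primary = ranked[:max_primary]
    secondary = sorted(ranked[max_primary:], key=lambda p: p[0])
    return primary, secondary


postings = [(7, 0.25), (3, 1.0), (5, 0.5), (1, 0.125), (9, 0.75)]
primary, secondary = partition_posting_list(postings, max_primary=3)
```

In this example the three highest scoring documents (3, 9, 5) land in the primary list in rank order, and the remaining documents (1, 7) land in the secondary list in document identifier order.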
126 Citations
17 Claims
1-12. (canceled)
13. A method of providing an information retrieval system, the method comprising:
storing a primary index including primary phrase posting lists, each posting list associated with a phrase and including up to a maximum number of documents that contain the phrase, the documents rank ordered by respective relevance scores;
storing a secondary index including secondary phrase posting lists, each posting list associated with a primary phrase posting list in the primary index, and including documents that contain the phrase and which have relevance scores less than the relevance score of a lowest ranked document in the primary posting list for the phrase, the documents ordered by document identifier;
receiving a search query comprising at least one phrase;
responsive to the search query containing a first phrase having a primary posting list and a secondary posting list and a second phrase having only a primary posting list, intersecting, by operation of a processor adapted to manipulate data within a computer system, the primary posting list of the first phrase with the primary posting list of the second phrase to obtain a first set of common documents, and intersecting the secondary posting list of the first phrase with the primary posting list of the second phrase to obtain a second set of common documents, and conjoining the first and second sets of common documents; and
ranking the common documents.
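The query-handling steps recited in claim 13 can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: the function name `search_two_phrases` and the `(document identifier, relevance score)` tuple layout are hypothetical, and ranking by the sum of the two phrase scores is an assumption, since the claim only recites "ranking the common documents".

```python
from typing import Dict, List, Tuple

# Hypothetical layout: one posting is (document identifier, relevance score).
Posting = Tuple[int, float]


def search_two_phrases(
    first_primary: List[Posting],
    first_secondary: List[Posting],
    second_primary: List[Posting],
) -> List[Tuple[int, float]]:
    """Handle a query whose first phrase has a primary and a secondary
    posting list and whose second phrase has only a primary posting list."""
    second_scores: Dict[int, float] = dict(second_primary)

    # First set: primary list of phrase 1 intersected with primary list of phrase 2.
    first_set = {
        doc: score + second_scores[doc]
        for doc, score in first_primary
        if doc in second_scores
    }
    # Second set: secondary list of phrase 1 intersected with primary list of phrase 2.
    second_set = {
        doc: score + second_scores[doc]
        for doc, score in first_secondary
        if doc in second_scores
    }

    # Conjoin (union) the two sets. A document appears in only one of the two
    # lists for phrase 1, so the keys do not collide.
    common = {**first_set, **second_set}

    # Rank by combined score (an assumption), ties broken by document identifier.
    return sorted(common.items(), key=lambda item: (-item[1], item[0]))


first_primary = [(3, 1.0), (9, 0.75)]                  # rank ordered
first_secondary = [(1, 0.125), (5, 0.5), (7, 0.25)]    # document identifier order
second_primary = [(5, 1.0), (9, 0.5), (2, 0.25)]       # rank ordered
results = search_two_phrases(first_primary, first_secondary, second_primary)
```

In this example document 9 comes from the primary/primary intersection and document 5 from the secondary/primary intersection. Under the assumed scoring, document 5 ranks first, which shows why the secondary list still has to be consulted even though its documents score lower for the first phrase.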
14. An information retrieval system, comprising:
a primary index server system comprising a primary index, the primary index including primary phrase posting lists, each primary phrase posting list associated with a phrase and including up to a maximum number of documents that contain the phrase, the documents stored relative to one another in the primary index in rank order by respective relevance scores; and
a secondary index server system comprising a secondary index, the secondary index including secondary phrase posting lists, each secondary phrase posting list associated with a primary phrase posting list in the primary index, and including documents that contain the phrase and which have relevance scores less than the relevance score of a lowest ranked document in the primary posting list for the phrase, the documents stored relative to one another in the secondary index in order by respective document identifiers.
Specification