HYBRID-DISTRIBUTION MODEL FOR SEARCH ENGINE INDEXES
Abstract
Methods and systems are provided for using a hybrid-distribution system to identify relevant documents based on a search query. A group of documents is assigned to a particular segment. The group of documents is indexed both by atom and by document to form a reverse index and a forward index. Both indexes are divided amongst each node in that segment so that each node is responsible for storing and accessing a different portion of both the reverse and forward indexes. The reverse index portion is accessed on each of a first set of nodes to identify a first set of documents that is relevant to a particular search query. Document identifications associated with the first set of documents are used to identify a second set of nodes that access their forward index portions to limit the number of relevant documents to a second set of documents.
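The two-pass lookup described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `NODES` layout, the function names `first_pass` and `second_pass`, and the overlap-count score used to trim the candidate list are all hypothetical. The abstract only says that the forward index portions are used to limit the number of relevant documents.

```python
# Hypothetical segment of two nodes. Each node stores one portion of the
# reverse index (atom -> document IDs) and one portion of the forward
# index (document ID -> atoms found in that document).
NODES = [
    {
        "reverse": {"hybrid": {1, 2, 4}, "model": {2}},
        "forward": {1: {"hybrid", "index"}, 3: {"search", "engine"}},
    },
    {
        "reverse": {"index": {1, 2, 4}, "search": {3}, "engine": {3}},
        "forward": {2: {"hybrid", "index", "model"}, 4: {"hybrid", "index"}},
    },
]


def first_pass(nodes, atoms):
    """Pass 1: collect candidates from every reverse portion holding a query atom."""
    candidates = set()
    for node in nodes:
        for atom in atoms:
            candidates |= node["reverse"].get(atom, set())
    return candidates


def second_pass(nodes, candidates, atoms, limit):
    """Pass 2: score each candidate on the node owning its forward entry, keep `limit`."""
    scored = []
    for node in nodes:
        for doc_id in node["forward"].keys() & candidates:
            # Assumed toy score: how many query atoms the document contains.
            overlap = len(node["forward"][doc_id] & atoms)
            scored.append((-overlap, doc_id))
    # Best score first; ties are broken by document ID.
    return [doc_id for _, doc_id in sorted(scored)[:limit]]


query_atoms = {"hybrid", "model"}
first_set = first_pass(NODES, query_atoms)
second_set = second_pass(NODES, first_set, query_atoms, limit=2)
```

In this toy data the first pass is answered entirely by the first node, while the second pass touches both nodes, because the candidate documents' forward entries are spread across them.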
20 Claims
1. One or more computer-storage media storing computer-useable instructions that, when used by a computing device, cause the computing device to perform a method for utilizing a hybrid-distribution system for identifying relevant documents based on a search query, the method comprising:
allocating a group of documents to a segment, the group of documents being indexed by atom in a reverse index and indexed by document in a forward index;
storing a different portion of the reverse index and the forward index on each of a plurality of nodes that form the segment;
accessing the reverse index portion stored on each of a first set of nodes to identify a first set of documents that is relevant to the search query; and
based on document identifications associated with the first set of documents, accessing the forward index portion stored on each of a second set of nodes to limit a quantity of relevant documents in the first set of documents to a second set of documents.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. One or more computer-storage media storing computer-useable instructions that, when used by a computing device, cause the computing device to perform a method for generating a hybrid-distribution system for a multiprocess document-retrieval system, the method comprising:
receiving an indication of a group of documents assigned to a segment, the segment comprising a plurality of nodes;
for the segment, (1) indexing the allocated group of documents by atom to generate a reverse index, and (2) indexing the allocated group of documents by document to generate a forward index; and
assigning a portion of the reverse index and a portion of the forward index to each of a plurality of nodes that form the segment such that each of the plurality of nodes has stored a different portion of the forward index and a different portion of the reverse index.
Dependent claims: 11, 12, 13, 14, 15
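The index-generation steps of claim 10 can be sketched as below. The claim does not say how atoms are extracted or how portions are assigned to nodes, so everything specific here is an assumption for illustration: `build_segment` is a hypothetical name, atoms are taken to be lowercase whitespace-separated terms, forward entries are placed by `doc_id % num_nodes`, and reverse entries are placed by a CRC32 hash of the atom.

```python
import zlib
from collections import defaultdict


def build_segment(docs, num_nodes):
    """Index `docs` ({doc_id: text}) both ways and split both indexes over the nodes."""
    nodes = [{"reverse": defaultdict(set), "forward": {}} for _ in range(num_nodes)]
    for doc_id, text in docs.items():
        atoms = set(text.lower().split())  # assumed tokenization
        # Forward index (by document): one node owns each document's entry.
        nodes[doc_id % num_nodes]["forward"][doc_id] = atoms
        for atom in atoms:
            # Reverse index (by atom): one node owns each atom's posting list.
            # CRC32 is used because it is stable across Python processes.
            owner = zlib.crc32(atom.encode("utf-8")) % num_nodes
            nodes[owner]["reverse"][atom].add(doc_id)
    return nodes


docs = {1: "hybrid index", 2: "hybrid model", 3: "search engine"}
segment = build_segment(docs, num_nodes=2)
```

Because each atom and each document ID maps to exactly one node, no two nodes hold the same reverse or forward entry, which is one simple way to give every node a different portion of both indexes.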
16. One or more computer-storage media storing computer-useable instructions that, when used by a computing device, cause the computing device to perform a method for utilizing a hybrid-distribution system for identifying relevant documents based on a search query, the method comprising:
receiving a search query;
identifying one or more atoms in the search query;
communicating the one or more atoms to a plurality of segments that have each been assigned a group of documents that is indexed both by atom and by document such that a reverse index and a forward index are generated and stored at each of the plurality of segments, wherein each of the plurality of segments is comprised of a plurality of nodes that are each assigned a portion of the forward index and the reverse index;
based on the one or more atoms, identifying a first set of nodes at a first segment whose reverse index portions contain at least one of the one or more atoms from the search query;
accessing the reverse index portion stored at each of the first set of nodes to identify a first set of documents that is found to be relevant to the one or more atoms;
based on document identifications associated with the first set of documents, identifying a second set of nodes whose forward index portions contain one or more of the document identifications associated with the first set of documents; and
accessing the forward index portion stored at each of the second set of nodes to identify a second set of documents that is a subset of the first set of documents.
Dependent claims: 17, 18, 19, 20
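The node-selection steps of claim 16 can be sketched as follows, with the query fanned out to several segments and each segment reporting which nodes it consulted. This is only an illustration under assumptions: `SEGMENTS`, `search`, and the result keys are hypothetical names, atoms are assumed to be whitespace-separated terms, and the second-stage filter (keep a document only if it contains every query atom) is a stand-in for whatever narrowing the forward index actually supports.

```python
# Hypothetical data: two segments, each made of two nodes holding a
# reverse portion (atom -> doc IDs) and a forward portion (doc ID -> atoms).
SEGMENTS = {
    "A": [
        {"reverse": {"hybrid": {1, 2}}, "forward": {1: {"hybrid", "index"}}},
        {"reverse": {"index": {1}, "model": {2}}, "forward": {2: {"hybrid", "model"}}},
    ],
    "B": [
        {"reverse": {"search": {7}}, "forward": {8: {"hybrid", "index", "model"}}},
        {"reverse": {"hybrid": {8}, "index": {8}, "model": {8}}, "forward": {7: {"search"}}},
    ],
}


def search(query, segments):
    atoms = set(query.lower().split())  # assumed atom extraction
    results = {}
    for name, nodes in segments.items():  # the atoms go to every segment
        # First set of nodes: reverse portion contains at least one query atom.
        reverse_nodes = [i for i, n in enumerate(nodes) if n["reverse"].keys() & atoms]
        first_docs = set()
        for i in reverse_nodes:
            for atom in nodes[i]["reverse"].keys() & atoms:
                first_docs |= nodes[i]["reverse"][atom]
        # Second set of nodes: forward portion contains a candidate document ID.
        forward_nodes = [i for i, n in enumerate(nodes) if n["forward"].keys() & first_docs]
        second_docs = set()
        for i in forward_nodes:
            for doc_id in nodes[i]["forward"].keys() & first_docs:
                if atoms <= nodes[i]["forward"][doc_id]:  # assumed filter
                    second_docs.add(doc_id)
        results[name] = {
            "reverse_nodes": reverse_nodes,
            "forward_nodes": forward_nodes,
            "docs": sorted(second_docs),
        }
    return results


hits = search("hybrid model", SEGMENTS)
```

In segment B the reverse lookup is served by one node and the forward lookup by the other, which shows that the first and second sets of nodes need not coincide.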
Specification