Hybrid-distribution model for search engine indexes
First Claim
1. One or more computer-storage media storing computer-useable instructions that, when used by a computing device, cause the computing device to perform a method for utilizing a hybrid-distribution system for identifying relevant documents based on a search query, the method comprising:
- allocating a group of documents to a segment, the group of documents being indexed by atom in a reverse index and indexed by document in a forward index wherein atoms in the reverse index are accessed in a matching process and a preliminary ranking process and wherein documents in the forward index are accessed in a final ranking process;
- storing a different portion of the reverse index and the forward index on each of a plurality of nodes that form the segment;
- first, accessing the reverse index portion stored on each of a first set of nodes having portions of the reverse index;
- identifying a first set of documents that is relevant to the search query, wherein the first set of documents is identified as being relevant to the search query by way of the matching process and the preliminary ranking process;
- second, based on document identifications associated with the first set of documents, accessing the forward index portion stored on each of a second set of nodes having portions of the forward index;
- identifying a second set of documents from the first set of documents, wherein the second set of documents is identified by way of the final ranking process;
- limiting a quantity of relevant documents in the first set of documents identified to the second set of documents; and
- communicating for presentation search results for the search query based on the second set of documents.
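The two-stage flow recited in the claim can be illustrated with a minimal single-process sketch. Everything named here is hypothetical: the toy corpus, the whitespace tokenization used to produce atoms, the term-frequency preliminary score, and the query-atom-density final score are assumptions chosen for illustration, not the scoring the patent describes.

```python
from collections import defaultdict

# Hypothetical toy corpus; IDs and text are illustrative only.
DOCS = {
    1: "hybrid distribution of search indexes",
    2: "forward index stores data per document",
    3: "reverse index maps each atom to documents",
    4: "cooking recipes for pasta",
}

def build_indexes(docs):
    """Index the same documents by atom (reverse) and by document (forward)."""
    reverse = defaultdict(dict)  # atom -> {doc_id: term frequency}
    forward = {}                 # doc_id -> list of atoms
    for doc_id, text in docs.items():
        atoms = text.lower().split()  # assumed tokenization
        forward[doc_id] = atoms
        for atom in atoms:
            reverse[atom][doc_id] = reverse[atom].get(doc_id, 0) + 1
    return reverse, forward

def match_and_prerank(reverse, query_atoms, limit):
    """Matching and preliminary ranking: reads only the reverse index."""
    scores = defaultdict(int)
    for atom in query_atoms:
        for doc_id, tf in reverse.get(atom, {}).items():
            scores[doc_id] += tf  # assumed preliminary score: summed term frequency
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    return ranked[:limit]

def final_rank(forward, candidates, query_atoms, limit):
    """Final ranking: looks up each candidate by document ID in the forward index."""
    def score(doc_id):
        atoms = forward[doc_id]
        hits = sum(1 for a in atoms if a in query_atoms)
        return hits / len(atoms)  # assumed final score: query-atom density
    ranked = sorted(candidates, key=lambda d: (-score(d), d))
    return ranked[:limit]  # limits the first set to the second set

def search(reverse, forward, query, first_limit=3, second_limit=1):
    """Run both stages and return (first_set, second_set) as lists of doc IDs."""
    query_atoms = set(query.lower().split())
    first_set = match_and_prerank(reverse, query_atoms, first_limit)
    second_set = final_rank(forward, first_set, query_atoms, second_limit)
    return first_set, second_set
```

With this toy corpus, calling `search(*build_indexes(DOCS), "reverse index")` yields a first set of documents 3 and 2 from the reverse index, which the forward-index pass then narrows to document 3.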
Abstract
Methods and systems are provided for using a hybrid-distribution system to identify relevant documents based on a search query. A group of documents is assigned to a particular segment and indexed both by atom and by document, forming a reverse index and a forward index. Both indexes are divided among the nodes in that segment so that each node stores and accesses a different portion of both the reverse and forward indexes. The reverse index portion is accessed on each of a first set of nodes to identify a first set of documents that is relevant to a particular search query. Document identifications associated with the first set of documents are then used to identify a second set of nodes, which access their forward index portions to limit the relevant documents to a second set of documents.
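The node-level distribution described in the abstract can be sketched as follows. The text above does not say how index portions are assigned to nodes, so the placement rule (`doc_id % num_nodes`), the `Node` class, and all function names are assumptions made for illustration only.

```python
from collections import defaultdict

NUM_NODES = 3  # assumed number of nodes in the segment

def node_for(doc_id, num_nodes=NUM_NODES):
    """Assumed placement rule: a document's entries live on node doc_id % num_nodes."""
    return doc_id % num_nodes

class Node:
    """Holds one portion of the reverse index and one portion of the forward index."""
    def __init__(self):
        self.reverse = defaultdict(set)  # atom -> doc IDs stored on this node
        self.forward = {}                # doc_id -> atoms, for docs on this node

    def add(self, doc_id, atoms):
        self.forward[doc_id] = atoms
        for atom in atoms:
            self.reverse[atom].add(doc_id)

    def match(self, query_atoms):
        """Return local doc IDs containing any query atom (reverse index only)."""
        hits = set()
        for atom in query_atoms:
            hits |= self.reverse.get(atom, set())
        return hits

def build_segment(docs, num_nodes=NUM_NODES):
    """Spread the documents of one segment across its nodes."""
    nodes = [Node() for _ in range(num_nodes)]
    for doc_id, text in docs.items():
        nodes[node_for(doc_id, num_nodes)].add(doc_id, text.lower().split())
    return nodes

def first_pass(nodes, query_atoms):
    """Consult each node's reverse index portion and merge the matching doc IDs."""
    first_set = set()
    for node in nodes:
        first_set |= node.match(query_atoms)
    return first_set

def route_second_pass(first_set, num_nodes=NUM_NODES):
    """Group candidate doc IDs by owning node; only these nodes read their forward index."""
    by_node = defaultdict(list)
    for doc_id in sorted(first_set):
        by_node[node_for(doc_id, num_nodes)].append(doc_id)
    return dict(by_node)
```

Under these assumptions, the second set of nodes is simply the set of keys returned by `route_second_pass`, so a node holding no candidate documents is never asked to read its forward index portion.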
105 Citations
20 Claims
1. One or more computer-storage media storing computer-useable instructions that, when used by a computing device, cause the computing device to perform a method for utilizing a hybrid-distribution system for identifying relevant documents based on a search query, the method comprising:
- allocating a group of documents to a segment, the group of documents being indexed by atom in a reverse index and indexed by document in a forward index wherein atoms in the reverse index are accessed in a matching process and a preliminary ranking process and wherein documents in the forward index are accessed in a final ranking process;
- storing a different portion of the reverse index and the forward index on each of a plurality of nodes that form the segment;
- first, accessing the reverse index portion stored on each of a first set of nodes having portions of the reverse index;
- identifying a first set of documents that is relevant to the search query, wherein the first set of documents is identified as being relevant to the search query by way of the matching process and the preliminary ranking process;
- second, based on document identifications associated with the first set of documents, accessing the forward index portion stored on each of a second set of nodes having portions of the forward index;
- identifying a second set of documents from the first set of documents, wherein the second set of documents is identified by way of the final ranking process;
- limiting a quantity of relevant documents in the first set of documents identified to the second set of documents; and
- communicating for presentation search results for the search query based on the second set of documents.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A method for utilizing a hybrid-distribution system for identifying relevant documents based on a search query, the method comprising:
- allocating a group of documents to a segment, the group of documents being indexed by atom in a reverse index and indexed by document in a forward index, atoms in the reverse index are accessed in a matching process and a preliminary ranking process and documents in the forward index are accessed in a final ranking process;
- storing a different portion of the reverse index and the forward index on each of a plurality of nodes that form the segment;
- first, accessing the reverse index portion stored on each of a first set of nodes having portions of the reverse index;
- identifying a first set of documents that is relevant to the search query, wherein the first set of documents is identified as being relevant to the search query by way of the matching process and the preliminary ranking process;
- second, based on document identifications associated with the first set of documents, accessing the forward index portion stored on each of a second set of nodes having portions of the forward index;
- identifying a second set of documents from the first set of documents, wherein the second set of documents is identified by way of the final ranking process;
- limiting a quantity of relevant documents in the first set of documents identified to the second set of documents; and
- communicating for presentation search results for the search query based on the second set of documents.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. A system for utilizing a hybrid-distribution system for identifying relevant documents based on a search query, the system comprising:
- a hardware processor and a memory configured for providing computer program instructions to the processor;
- a hybrid distribution server configured for:
  - allocating a group of documents to a segment, the group of documents being indexed by atom in a reverse index and indexed by document in a forward index, atoms in the reverse index are accessed in a matching process and a preliminary ranking process and documents in the forward index are accessed in a final ranking process;
  - storing a different portion of the reverse index and the forward index on each of a plurality of nodes that form the segment;
  - first, accessing the reverse index portion stored on each of a first set of nodes having portions of the reverse index;
  - identifying a first set of documents that is relevant to the search query, wherein the first set of documents is identified as being relevant to the search query by way of the matching process and the preliminary ranking process;
  - second, based on document identifications associated with the first set of documents, accessing the forward index portion stored on each of a second set of nodes having portions of the forward index;
  - identifying a second set of documents from the first set of documents, wherein the second set of documents is identified by way of the final ranking process;
  - limiting a quantity of relevant documents in the first set of documents identified to the second set of documents; and
  - communicating for presentation search results for the search query based on the second set of documents.
Dependent claims: 20
Specification