Index partitioning based on document relevance for document indexes
First Claim
1. For use with a search engine that processes user queries, a system that locates documents containing search words corresponding to a present user query comprising:
- at least one processor;
at least one memory;
an index builder that stores locations of documents indexed by word in an index based on a present query-independent static rank that has been assigned to each document;
an index partitioner that orders and partitions the index into index partitions that each contain location information about a group of one or more documents having a continuous range of static ranks that is a subset of an overall range of static ranks;
an index scanner that progressively scans the index partitions starting with a partition containing those documents with the highest static rank to locate documents containing a search word; and
a scorer that calculates a score based on a present set of documents located thus far in the search and on the range of static ranks of a next partition to be scanned and wherein the index scanner scans the next partition to locate documents containing a search word when the calculated score is above a target score.
2 Assignments
0 Petitions
Accused Products
Abstract
Indexed documents are arranged in the index according to a static ranking and partitioned according to static ranking. Index queries reference the first partition and move to a subsequent partition when a static rank for the subsequent partition is higher than a weighted portion of the target score added to a weighted portion of a dynamic rank corresponding to the relevance of the results set generated thus far. By changing the weight of the target score and dynamic ranks in the subsequent partition score, searches can be stopped when no more relevant results will be found in the next partition.
90 Citations
30 Claims
-
1. For use with a search engine that processes user queries, a system that locates documents containing search words corresponding to a present user query comprising:
-
at least one processor; at least one memory; an index builder that stores locations of documents indexed by word in an index based on a present query-independent static rank that has been assigned to each document; an index partitioner that orders and partitions the index into index partitions that each contain location information about a group of one or more documents having a continuous range of static ranks that is a subset of an overall range of static ranks; an index scanner that progressively scans the index partitions starting with a partition containing those documents with the highest static rank to locate documents containing a search word; and a scorer that calculates a score based on a present set of documents located thus far in the search and on the range of static ranks of a next partition to be scanned and wherein the index scanner scans the next partition to locate documents containing a search word when the calculated score is above a target score. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
-
-
10. For use with a search engine that processes user queries, a method for locating documents containing a search word found in a present user query comprising:
-
assigning a present query-independent rank to each document to be searched; ordering the documents to be searched in order of the assigned present query-independent rank and grouping the ordered documents to be searched into partitions by present query-independent rank; indexing documents in a partition by mapping a location for each document to words contained in the document to form an index; scanning the partitions in present query-independent rank order by iteratively i) searching a highest ranked unsearched partition for a search word found in the user query to add to a present set of located documents located thus far;
ii) calculating a score based on a present set of located documents and the present query-independent rank of documents indexed in a next highest ranking unsearched partition;
iii) comparing the calculated score to a target score; and
iv) continuing to search the next highest ranking unsearched partition until the calculated score is higher than the target score; andreturning search results including the document locations in the present set of located documents when the calculated score is higher than a target score. - View Dependent Claims (11, 12, 13, 14, 15, 16, 17, 18)
-
-
19. For use with a search engine that processes user queries, one or more computer readable media storing computer readable instructions for retrieving documents containing search words in a query by:
-
assigning a static rank to documents; indexing the documents by mapping document locations to words contained in the document to construct an index; ordering and partitioning the index by document based on the static rank assigned to the document; iteratively searching, in static rank order, a highest ranking unsearched partition to return locations for documents containing search words in the query;
calculating a score based on a relevance of documents returned and the static rank assigned to a next partition to be searched;and continuing to search the next partition until the calculated score is higher than a target score; and returning document locations as a query result when the calculated score exceeds the target score. - View Dependent Claims (20, 21, 22, 23, 24)
-
-
25. For use with a search engine that processes user queries, an apparatus for locating documents containing a search word found in a present user query comprising:
-
means for assigning a present query-independent rank to each document to be searched; means for ordering the documents to be searched in order of the assigned present query-independent rank and grouping the ordered documents to be searched into partitions by present query-independent rank; means for indexing documents in a partition by mapping a location for each document to words contained in the document to form an index; means for scanning the partitions in present query-independent rank order by iteratively i) searching a highest ranked unsearched partition for a search word found in the user query to add to a present set of located documents located thus far;
ii) calculating a score based on a present set of located documents and the present query-independent rank of documents indexed in a next highest ranking unsearched partition;
iii) comparing the calculated score to a target score; and
iv) continuing to search the next highest ranking unsearched partition until the calculated score is higher than the target score; andmeans for returning search results including the document locations in the present set of located documents when the calculated score is higher than a target score. - View Dependent Claims (26, 27, 28, 29, 30)
-
Specification