Ranking search results by reranking the results based on local inter-connectivity
Abstract
A search engine for searching a corpus improves the relevancy of the results by refining a standard relevancy score based on the interconnectivity of the initially returned set of documents. The search engine obtains an initial set of relevant documents by matching a user's search terms to an index of a corpus. A re-ranking component in the search engine then refines the initially returned document rankings so that documents that are frequently cited in the initial set of relevant documents are preferred over documents that are less frequently cited within the initial set.
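The re-ranking idea in the abstract can be illustrated with a minimal sketch. It assumes the initial result set is a list of document ids in their original order and that `links` maps each document to the documents it cites; the function name and the data are hypothetical and are not taken from the patent.

```python
from collections import Counter


def rerank_by_local_citations(initial, links):
    """Prefer documents cited more often by other documents in the initial set.

    initial: document ids in their initially returned order.
    links:   hypothetical mapping of document id -> ids of documents it cites.
    """
    in_set = set(initial)
    cited = Counter()
    for src in initial:
        # Count each citing document once and ignore self-citations
        # and citations that point outside the initial set.
        for dst in set(links.get(src, ())):
            if dst in in_set and dst != src:
                cited[dst] += 1
    # sorted() is stable, so ties keep their initial relative order.
    return sorted(initial, key=lambda doc: -cited[doc])


initial = ["a", "b", "c", "d"]
links = {"a": ["c"], "b": ["c", "d"], "d": ["c", "x"]}
reranked = rerank_by_local_citations(initial, links)
```

Here "c" is cited by three documents in the set and "d" by one, so both move ahead of "a" and "b"; the link to "x" is ignored because "x" is not in the initial set.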
205 Citations
30 Claims
-
1. A method of generating documents based on a search query, comprising:
-
obtaining an initial set of documents relevant to the search query;
assigning relevance scores to the documents based on cross references between the documents within the initial set; and
sorting the documents based on the assigned relevance scores.
-
2. The method of claim 1, [...]:
-
calculating a local score value for at least two of the documents in the initial set, the local score value quantifying an amount that the at least two documents are supported by other documents in the initial set of documents; and
generating refined relevance scores for the at least two documents in the initial set based on the local score values and the initial relevance scores.
-
-
3. The method of claim 1, wherein the initial set of documents are associated with initial relevance scores, and wherein assigning the relevance scores includes:
-
calculating a local score value for at least two of the documents in the initial set based on the cross references to the at least two documents by other documents in the initial set of documents; and
generating refined relevance scores for the at least two documents in the initial set based on the local score values and the initial relevance scores.
-
-
4. The method of claim 2, wherein calculating the local score value for a particular one of the at least two documents in the initial set of documents further includes:
-
forming a sub-set of documents from the initial set of documents as the sub-set of documents that cross reference the particular one of the at least two documents, and removing at least one document from the sub-set that is affiliated with the particular one of the at least two documents.
-
-
5. The method of claim 4, further comprising:
removing, for at least one pair of documents in the sub-set that are affiliated, one of the documents in the pair that has a lower initial relevance score than the other of the documents in the pair.
-
6. The method of claim 4, wherein a predefined number of the documents in the sub-set are used to calculate the local score value.
-
7. The method of claim 4, wherein the local score value is calculated for the particular one of the at least two documents as:
-
$$\mathrm{LocalScore}(x) = \sum_{i=1}^{k} \mathrm{OldScore}(\mathrm{BackSet}(i))^{m}$$
where OldScore(x) refers to the initial relevance score value for one of the at least two documents, BackSet refers to the sub-set of documents, the sum is taken over the first k documents in BackSet, where k is a predefined number, and m is a predetermined constant.
-
-
8. The method of claim 2, wherein generating the refined relevance scores is based on taking a product based on the local score values and the initial relevance scores.
-
9. The method of claim 2, wherein generating the refined relevance score further includes:
calculating the refined relevance scores as
-
10. The method of claim 9, further including:
setting MaxLS to a predetermined threshold value when MaxLS is below the threshold value.
-
11. The method of claim 1, wherein obtaining the initial set of relevant documents includes obtaining the initial set based on a matching of terms in the search query to a corpus.
-
27. The method of claim 1, wherein the cross references are hyperlinks.
-
12. A method of responding to a search query from a user, the method comprising:
-
receiving the search query from the user;
generating a list of relevant documents based on search terms of the query;
generating relevance scores for the documents in the list of relevant documents based on cross references between the documents in the list; and
returning a set of relevant documents to the user, the set being sorted based on the relevance scores.
-
13. The method of claim 12, [...]:
-
calculating a local score value for at least two of the documents in the list of relevant documents, the local score value quantifying an amount that the at least two documents are cross referenced by other documents in the list of relevant documents; and
calculating refined relevance scores for the documents in the initial set based on the local score values and the initial relevance scores.
-
-
14. The method of claim 13, wherein calculating the local score value for a particular one of the documents in the list of relevant documents further includes:
-
forming a sub-set of documents from the list of relevant documents as the sub-set of documents that cross reference the particular one of the documents, and removing at least one document from the sub-set that is affiliated with the particular one of the documents.
-
-
15. The method of claim 14, further comprising:
removing, for at least one pair of documents in the sub-set that are affiliated, one of the documents in the pair that has a lower initial relevance score than the other of the documents in the pair.
-
16. The method of claim 14, wherein a predefined number of the documents in the sub-set are used to calculate the local score value.
-
17. The method of claim 13, wherein calculating the relevance scores is based on taking a product based on the local score values and the initial relevance scores.
-
28. The method of claim 12, wherein the cross references are hyperlinks.
-
18. A system for identifying documents relevant to a search query comprising:
-
means for obtaining an initial set of relevant documents from a corpus based on a matching of terms in the search query to the corpus;
means for assigning refined relevance scores to the set of relevant documents based on an amount that the documents of the initial set are cross referenced by other documents in the initial set of documents; and
means for sorting the documents in the initial set based on the refined relevance scores.
-
19. The system of claim 18, [...]:
-
means for determining an initial relevance score for each document in the initial set of relevant documents.
-
-
20. The system of claim 18, wherein the means for assigning further includes:
-
means for forming a sub-set of documents from the initial set of relevant documents as the sub-set of documents that cross reference a particular one of the documents in the initial set of relevant documents, and means for removing documents from the sub-set that are affiliated with the particular one of the documents in the initial set of relevant documents.
-
-
29. The system of claim 18, wherein the cross references are hyperlinks.
-
21. A computer-readable medium storing instructions for causing at least one processor to perform a method that identifies documents relevant to a search query, the method comprising:
-
obtaining an initial set of documents relevant to the search query;
assigning refined relevance scores to the documents based on cross references between the documents within the initial set; and
sorting the documents based on the refined relevance scores.
-
22. The computer-readable medium of claim 21, [...]:
-
calculating a local score value for at least two of the documents in the initial set, the local score value quantifying an amount that the at least two documents are cross referenced by other documents in the initial set of documents; and
generating the refined relevance scores for the documents in the initial set based on the local score values and the initial relevance scores.
-
-
23. The computer-readable medium of claim 21, wherein calculating the local score value for a particular one of the initial set of documents further includes:
-
forming a sub-set of documents from the initial set of documents as the sub-set of documents that contain a hyperlink to the particular one of the documents, and removing documents from the sub-set that are from a same host or from an affiliated host as the particular one of the documents.
-
-
24. The computer-readable medium of claim 23, further comprising:
removing, for pairs of documents in the sub-set that are affiliated, one of the documents in the pair that has a lower refined relevance score than the other of the documents in the pair.
-
25. The computer-readable medium of claim 23, wherein a predefined number of the documents in the sub-set are used to calculate the local score value.
-
26. The computer-readable medium of claim 23, wherein obtaining the initial set of relevant documents includes obtaining the initial set based on a matching of terms in the search query to a corpus.
-
30. The computer-readable medium of claim 21, wherein the cross references are hyperlinks.
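The scoring steps recited across the claims (forming the sub-set of citing documents, dropping affiliated ones, summing over the first k, and combining with the initial scores as a product) can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: "affiliated" is approximated as "same host", the sub-set is ordered by initial score before the first k are taken, and the constants `a`, `b`, the normalisation by the maximum initial score, and the `min_max_ls` floor value are all hypothetical choices.

```python
from urllib.parse import urlsplit


def host(url):
    return urlsplit(url).hostname or ""


def back_set(target, old_score, links):
    """Documents in the initial set that link to `target`.

    Assumption: documents on the target's host are treated as affiliated
    and dropped, and only the highest-scored document per host is kept.
    """
    best_per_host = {}
    for src, outs in links.items():
        if src == target or src not in old_score or target not in outs:
            continue
        h = host(src)
        if h == host(target):
            continue
        if h not in best_per_host or old_score[src] > old_score[best_per_host[h]]:
            best_per_host[h] = src
    # Assumption: order by initial score so "the first k" are the strongest.
    return sorted(best_per_host.values(), key=lambda d: old_score[d], reverse=True)


def local_score(target, old_score, links, k=20, m=1.0):
    """Sum of OldScore ** m over the first k documents of the back set."""
    return sum(old_score[d] ** m for d in back_set(target, old_score, links)[:k])


def refine(old_score, links, k=20, m=1.0, a=1.0, b=1.0, min_max_ls=1.5):
    """Combine local and initial scores as a product (one possible form)."""
    ls = {d: local_score(d, old_score, links, k, m) for d in old_score}
    # Floor on the maximum local score, in the spirit of claim 10.
    max_ls = max(max(ls.values()), min_max_ls)
    max_os = max(old_score.values())
    new = {d: (a + ls[d] / max_ls) * (b + old_score[d] / max_os) for d in old_score}
    return sorted(old_score, key=lambda d: new[d], reverse=True), new


old_score = {
    "http://a.example/1": 0.9,
    "http://b.example/1": 0.8,
    "http://b.example/2": 0.5,
    "http://c.example/1": 0.4,
}
links = {
    "http://a.example/1": {"http://c.example/1"},
    "http://b.example/1": {"http://c.example/1"},
    "http://b.example/2": {"http://c.example/1"},
}
order, new_score = refine(old_score, links)
```

In this toy data the lowest-scored page, `http://c.example/1`, is cited from two distinct hosts and moves to the top of `order`; the second page on `b.example` does not add to its local score because only one document per host is counted.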
Specification