Ranking search results by reranking the results based on local inter-connectivity
2 Assignments
0 Petitions
Abstract
A search engine for searching a corpus improves the relevancy of the results by refining a standard relevancy score based on the interconnectivity of the initially returned set of documents. The search engine obtains an initial set of relevant documents by matching a user's search terms to an index of a corpus. A re-ranking component in the search engine then refines the initially returned document rankings so that documents that are frequently cited in the initial set of relevant documents are preferred over documents that are less frequently cited within the initial set.
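The notion of being "frequently cited in the initial set" can be illustrated with a minimal sketch. It assumes a hypothetical `links` mapping from each document identifier to the documents it cites; the function name and the data are invented for illustration and are not taken from the patent.

```python
from collections import Counter

def in_set_citation_counts(links, initial_set):
    """Count, for each document in initial_set, how many other documents
    in initial_set cite it. Citations from outside the set are ignored."""
    members = set(initial_set)
    counts = Counter({doc: 0 for doc in initial_set})
    for source, targets in links.items():
        if source not in members:
            continue                      # only documents in the initial set vote
        for target in set(targets):       # count each citing document once
            if target in members and target != source:
                counts[target] += 1
    return counts

# Hypothetical link data: "z" is not in the initial set, so its links are ignored.
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": [],
    "z": ["a", "c"],
}
counts = in_set_citation_counts(links, ["a", "b", "c"])
```

Here "c" is cited by two documents of the initial set, "b" by one, and "a" by none, so a re-ranking step of the kind described would favor "c".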
522 Citations
14 Claims
-
1. A method of identifying documents relevant to a search query, comprising:
-
obtaining an initial set of relevant documents from a corpus;
ranking the initial set of documents to obtain a relevance score for each document in the initial set of documents;
calculating a local score value for at least two of the documents in the initial set, the local score value quantifying an amount that the at least two documents are referenced by other documents in the initial set of documents; and
refining the relevance scores for the documents in the initial set based on the local score values.
2. The method of claim 1, wherein calculating the local score value for a particular one of the relevant documents includes:
forming a sub-set of documents from the initial set of documents as the sub-set of documents that contain a hyperlink to the particular one of the relevant document, and removing documents from the sub-set that are from the same host or from an affiliated host as the particular one of the relevant documents.
-
-
3. The method of claim 2, further comprising:
removing, for each pair of documents in the sub-set that are from the same host or an affiliated host, one of the documents in the pair that has a lower relevance score than the other of the documents in the pair.
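The sub-set construction described in claims 2 and 3 could be sketched as follows. The claims do not define "affiliated host", so `host_key` below is a hypothetical stand-in that groups URLs by the last two labels of the hostname. The names `build_back_set`, `out_links`, and `old_score`, and the choice to return the result sorted by relevance score, are likewise assumptions made for illustration.

```python
from urllib.parse import urlparse

def host_key(url):
    """Hypothetical affiliation key: the last two labels of the hostname.
    A real affiliation test would need more care (e.g. 'co.uk' suffixes)."""
    host = urlparse(url).hostname or ""
    return ".".join(host.split(".")[-2:])

def build_back_set(target, initial_set, out_links, old_score):
    """Return the documents in initial_set that link to `target`, after the
    two host-based filtering steps, ordered by descending relevance score."""
    target_key = host_key(target)
    best_per_host = {}
    for doc in initial_set:
        if doc == target or target not in out_links.get(doc, ()):
            continue                      # keep only documents that link to target
        key = host_key(doc)
        if key == target_key:
            continue                      # claim 2: drop same or affiliated host as target
        kept = best_per_host.get(key)
        if kept is None or old_score[doc] > old_score[kept]:
            best_per_host[key] = doc      # claim 3: keep the higher-scoring document
    return sorted(best_per_host.values(), key=lambda d: old_score[d], reverse=True)

# Hypothetical data for illustration only.
target = "http://x.example.com/page"
initial_set = [
    target,
    "http://www.example.com/other",
    "http://a.site-one.org/1",
    "http://b.site-one.org/2",
    "http://site-two.net/3",
    "http://site-three.net/4",
]
out_links = {
    "http://www.example.com/other": [target],
    "http://a.site-one.org/1": [target],
    "http://b.site-one.org/2": [target],
    "http://site-two.net/3": [target],
    "http://site-three.net/4": [],
}
old_score = dict(zip(initial_set, [0.9, 0.8, 0.4, 0.7, 0.5, 0.6]))
back_set = build_back_set(target, initial_set, out_links, old_score)
```

In this example the page on www.example.com is dropped because it shares a host group with the target, and of the two site-one.org pages only the higher-scoring one survives.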
-
4. The method of claim 1, wherein the local score values are based on the relevance scores.
-
5. The method of claim 3, wherein a predefined number of the documents in the sub-set are used to calculate the local score value.
-
6. The method of claim 3, wherein the local score value is calculated for the particular one of the relevant documents as:
\[ \mathrm{LocalScore}(x) = \sum_{i=1}^{k} \mathrm{OldScore}(\mathrm{BackSet}(i))^{m} \]
where OldScore(x) refers to the relevance score value for the particular document, BackSet refers to the sub-set of documents, the sum is taken over the first k documents in BackSet, where k is a predefined number, and m is a predetermined constant.
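One reading of the terms defined in claim 6 can be sketched in a few lines. It assumes that m acts as an exponent on each summed relevance score; the function name and the default values of `k` and `m` are illustrative and are not taken from the claims.

```python
def local_score(back_set, old_score, k=20, m=2.0):
    """Sum OldScore(d) ** m over the first k documents of back_set.
    Treating m as an exponent is an assumption; k and m are illustrative
    defaults, not values from the claims."""
    return sum(old_score[d] ** m for d in back_set[:k])

# Hypothetical scores: only the first k = 2 documents contribute.
old_score = {"p": 0.5, "q": 0.4, "r": 0.1}
score = local_score(["p", "q", "r"], old_score, k=2, m=2.0)  # approximately 0.41
```

Under this reading, a larger m makes the local score more sensitive to the highest-scoring citing documents, while k caps how many citing documents can contribute.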
-
-
7. The method of claim 6, wherein refining the relevance scores is based on taking a product based on the local score values and the relevance score values.
-
8. The method of claim 6, wherein refining the relevance score values for the documents further includes:
recalculating the relevance score values for the documents as
[equation not reproduced in the source text]
-
9. The method of claim 8, further including:
setting MaxLS to a predetermined threshold value when MaxLS is below the threshold value.
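Claims 7 and 9 can be illustrated with a hedged sketch. The recalculation formula of claim 8 is not reproduced in this text, so the normalized product used below, the constants `a` and `b`, the normalizer `max_os`, and the threshold `min_max_ls` are all assumptions chosen for illustration. Only the use of a product and the floor on MaxLS come from the claims.

```python
def refine_scores(old_score, local, a=1.0, b=1.0, min_max_ls=1.5):
    """Combine local and relevance scores as a product of normalized terms.
    The (a + LS/MaxLS) * (b + OS/MaxOS) form, a, b, and min_max_ls are
    assumptions made for illustration."""
    max_ls = max(local.values(), default=0.0)
    if max_ls < min_max_ls:
        max_ls = min_max_ls               # claim 9: floor MaxLS at a threshold
    max_os = max(old_score.values(), default=0.0) or 1.0   # avoid dividing by zero
    return {
        doc: (a + local.get(doc, 0.0) / max_ls) * (b + old_score[doc] / max_os)
        for doc in old_score
    }

# Hypothetical scores: strong local support lifts "b" above "a".
old = {"a": 1.0, "b": 0.5}
strong = refine_scores(old, {"a": 0.0, "b": 3.0})
# Weak local support: the floor on MaxLS keeps "b" below "a".
weak = refine_scores(old, {"a": 0.0, "b": 0.3})
```

The floor matters when every local score is small: without it, dividing by a tiny MaxLS would let weak local evidence swing the ranking as much as strong evidence does.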
-
10. The method of claim 1, wherein obtaining the initial set of relevant documents from the corpus includes obtaining the initial set based on a matching of terms in the search query to the corpus.
-
11. A method of responding to a search query from a user, the method comprising:
-
receiving the search query from the user;
generating a list of relevant documents based on search terms of the query, each document in the list being associated with a relevance score corresponding to a relevance of the document;
calculating a local score for documents in the list of relevant documents, the local score quantifying an amount of inter-connectivity between documents in the list of relevant documents;
refining the relevance score based on the calculated local scores; and
returning a list of relevant documents to the user, the list being sorted based on the refined relevance scores.
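The steps of claim 11 can be strung together in a toy end-to-end sketch. Everything concrete here is an assumption for illustration: relevance is a simple count of query-term occurrences, the local score sums the relevance of listed documents that link to a document, and refinement uses a normalized product. The names `respond`, `corpus`, and `links` are hypothetical.

```python
def respond(query, corpus, links):
    """Return document ids sorted by a refined relevance score (toy version)."""
    terms = query.lower().split()
    # Generate the list: relevance = number of query-term occurrences.
    relevance = {}
    for doc_id, text in corpus.items():
        words = text.lower().split()
        hits = sum(words.count(t) for t in terms)
        if hits:
            relevance[doc_id] = float(hits)
    # Local score: summed relevance of listed documents that link to the document.
    local = {doc_id: 0.0 for doc_id in relevance}
    for source in relevance:
        for target in set(links.get(source, ())):
            if target in relevance and target != source:
                local[target] += relevance[source]
    # Refine: product of normalized local and relevance terms (assumed form).
    max_ls = max(local.values(), default=0.0) or 1.0
    max_os = max(relevance.values(), default=0.0) or 1.0
    refined = {
        d: (1 + local[d] / max_ls) * (1 + relevance[d] / max_os)
        for d in relevance
    }
    # Return the list sorted by refined score, best first.
    return sorted(refined, key=refined.get, reverse=True)

# Hypothetical corpus and link graph.
corpus = {
    "d1": "python search ranking tutorial search",
    "d2": "ranking notes",
    "d3": "search basics",
    "d4": "cooking recipes",
}
links = {"d1": ["d3"], "d2": ["d3"], "d3": [], "d4": ["d2"]}
results = respond("search ranking", corpus, links)
```

In this example "d1" has the highest term-match score, but "d3" is linked to by both other listed documents and ends up first; the link from "d4" is ignored because "d4" is not in the list of relevant documents.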
-
-
12. A system comprising:
-
a server connected to a network, the server receiving search queries from users via the network, the server including;
at least one processor;
a database of a corpus; and
a memory operatively coupled to the processor, the memory storing program instructions that when executed by the processor, cause the processor to;
generate an initial list of relevant documents from the corpus based on a matching of terms in the search query to the corpus, rank the generated list of documents to obtain a relevance score value for each document in the generated list of documents, calculate a local score value for the documents in the generated list, the local score value quantifying an amount that the documents are referenced by other documents in the generated list of documents, and refine the relevance score values for the documents in the generated list based on the local score values.
-
-
13. A system for identifying documents relevant to a search query comprising:
-
means for obtaining an initial set of relevant documents from a corpus based on a matching of terms in the search query to the corpus;
means for determining a relevance score for each document in the initial set of documents;
means for determining a local score value for the documents in the initial set, the local score value quantifying an amount that the documents are referenced by other documents in the initial set of documents; and
means for refining the relevance scores for the documents in the initial set based on the local score values.
-
-
14. A computer-readable medium storing instructions for causing at least one processor to perform a method that identifies documents relevant to a search query, the method comprising:
-
identifying a set of relevant documents from a corpus based on the search query;
ranking the set of documents to obtain a relevance score for each document in the set of documents;
calculating a local score value for the documents in the set, the local score value quantifying an amount that the documents are referenced by other documents in the set of documents; and
refining the relevance scores for the documents in the set based on the local score values.
-
Specification