Ranking based on reference contexts
First Claim
1. A method performed by a device, comprising:
identifying a link in a first document, the link being associated with a second document;
analyzing a first portion of text to the left of the link in the first document;
analyzing a second portion of text to the right of the link in the first document;
identifying a first rare word from the text in the first portion, where the first rare word is identified as a rare word based on a frequency of occurrence of the first rare word in a set of documents;
identifying a second rare word from the text in the second portion, where the second rare word is identified as a rare word based on a frequency of occurrence of the second rare word in the set of documents;
creating a context identifier based only on the first and second rare words; and
ranking the second document within a list of search results based on the context identifier.
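The steps of claim 1 up to creating the context identifier can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the tokenizer, the five-word window, the document-frequency threshold `max_df`, the rule of picking the rare word nearest the link, and all function names are introduced here for illustration. How the identifier feeds the ranking step is left out.

```python
import re
from collections import Counter

def document_frequency(docs):
    """Count, for each word, how many documents in the set contain it."""
    df = Counter()
    for text in docs:
        df.update(set(re.findall(r"[a-z0-9]+", text.lower())))
    return df

def nearest_rare_word(words, df, max_df):
    """Return the first word in `words` whose document frequency is at most
    max_df (an assumed threshold). Words absent from the set count as 0."""
    for w in words:
        if df.get(w, 0) <= max_df:
            return w
    return None

def context_identifier(tokens, link_index, df, window=5, max_df=1):
    """Build a (left rare word, right rare word) pair around the link token."""
    left = tokens[max(0, link_index - window):link_index]
    right = tokens[link_index + 1:link_index + 1 + window]
    # Scan outward from the link on both sides (an assumed selection rule).
    first = nearest_rare_word(list(reversed(left)), df, max_df)
    second = nearest_rare_word(right, df, max_df)
    return (first, second)
```

With a document set of "the quick brown fox", "the lazy dog", and "the fox and the dog", and a first document tokenized as `["the", "quick", "fox", "[LINK]", "the", "lazy", "dog"]`, calling `context_identifier(tokens, 3, df)` yields `("quick", "lazy")`, since "fox" and "the" appear in more than one document of the set.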
Abstract
A system ranks documents based on contexts associated with the documents. The system identifies a reference in a first document, where the reference is associated with a second document. The system analyzes a portion of the first document associated with the reference, identifies a rare word (or words) from the portion, creates a context identifier based on the rare word(s), and ranks the second document based on the context identifier.
17 Claims
12. A system, comprising:
a memory to store instructions; and
a processor to execute the instructions to implement:
means for identifying a link in a first document, the link being associated with a second document;
means for analyzing a first portion of the first document located to the left of the link in the first document;
means for analyzing a second portion of the first document located to the right of the link in the first document;
means for identifying a first rarest word from the first portion of the first document;
means for identifying a second rarest word from the second portion of the first document;
means for creating a context identifier based only on the first rarest word and the second rarest word; and
means for ranking the second document based on the context identifier.
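Claim 12 recites the rarest word in each portion rather than any word that qualifies as rare. One possible reading, assuming rarity is measured with a document-frequency table as in claim 1 (claim 12 itself does not say how), is to take the lowest-frequency word in each window. The names `rarest_word` and `context_identifier_rarest`, the `df` mapping, and the tie-breaking rule are illustrative assumptions.

```python
def rarest_word(words, df):
    """Return the word with the lowest document frequency.

    Ties go to the earliest word; words missing from `df` count as 0.
    """
    if not words:
        return None
    return min(words, key=lambda w: df.get(w, 0))

def context_identifier_rarest(left_words, right_words, df):
    """Pair the rarest word on each side of the link."""
    return (rarest_word(left_words, df), rarest_word(right_words, df))
```

Unlike a threshold test, this selection always returns a word when the window is non-empty, even if every word in it is common.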
15. A system, comprising:
a memory to store instructions; and
a processor to execute the instructions to implement:
a document analyzing component to:
identify a reference in a first document, the reference being associated with a second document,
analyze a first portion of the first document located to the left of the reference in the first document,
analyze a second portion of the first document located to the right of the reference in the first document,
identify a first rare word or rare phrase from the first portion of the first document;
identify a second rare word or rare phrase from the second portion of the first document; and
create a context identifier based only on the first rare word or rare phrase and the second rare word or rare phrase; and
a document ranking component to rank the second document based on the context identifier.
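Claim 15 allows the context identifier to be built from rare phrases as well as rare words. A rough sketch of that idea follows, assuming a phrase is a pair of adjacent words and that a single frequency table `freq` covers both words and space-joined two-word phrases. Those choices, the `max_freq` threshold, and the function names are assumptions made here, not details from the claim.

```python
def candidates(words):
    """Return single words plus adjacent two-word phrases from a window."""
    unigrams = list(words)
    bigrams = [" ".join(pair) for pair in zip(words, words[1:])]
    return unigrams + bigrams

def rare_term(words, freq, max_freq):
    """Pick the least frequent word or phrase, provided its frequency is at
    most max_freq (an assumed threshold); otherwise return None."""
    terms = candidates(words)
    if not terms:
        return None
    best = min(terms, key=lambda t: freq.get(t, 0))
    return best if freq.get(best, 0) <= max_freq else None
```

Note that a term missing from `freq` is treated as frequency 0, so in practice the table would need to cover the phrases of interest for the comparison to be meaningful.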
17. A method performed by a device, comprising:
determining a plurality of different contexts associated with references to a document, the determining the plurality of different contexts comprising:
parsing a plurality of first documents to identify the references to the document,
analyzing first portions of text to the left of the references in the plurality of first documents,
analyzing second portions of text to the right of the references in the plurality of first documents, and
identifying the plurality of different contexts based on the text in the first and second portions, where identifying the plurality of different contexts comprises:
identifying first rare words from the text in the first portions,
identifying second rare words from the text in the second portions, and
creating context identifiers based on the first and second rare words, the context identifiers corresponding to the plurality of different contexts, where the first and second rare words are identified based on a frequency of occurrence of the first and second rare words in a set of documents; and
ranking the document within a list of search results based on the plurality of different contexts associated with the references, where ranking the document includes:
generating a ranking score based on the plurality of different contexts, and
using the ranking score as one of a plurality of factors when ranking the document.
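The ranking step of claim 17 can be illustrated with a short sketch. The claim does not give a formula, so everything numeric here is an assumption: the score is taken to be the logarithm of one plus the number of distinct context identifiers, and it is combined with a placeholder `base_score` (standing in for the other ranking factors) through a hypothetical `weight`.

```python
import math

def context_score(context_ids):
    """Score that grows with the number of distinct contexts
    (log damping is an assumed choice)."""
    distinct = {cid for cid in context_ids if cid is not None}
    return math.log1p(len(distinct))

def rank(results, contexts_by_doc, weight=0.5):
    """Order results by base score plus a weighted context score.

    results: list of (doc_id, base_score) pairs
    contexts_by_doc: dict mapping doc_id to a list of context identifiers
    """
    def total(item):
        doc_id, base = item
        return base + weight * context_score(contexts_by_doc.get(doc_id, []))
    return [doc_id for doc_id, _ in sorted(results, key=total, reverse=True)]
```

Under these assumptions, a document referenced in three different contexts can overtake one with a slightly higher base score that is always referenced in the same context, because repeated identical context identifiers are counted once.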
Specification