Ranking based on reference contexts

US 8,577,893 B1
Filed: 03/15/2004
Issued: 11/05/2013
Est. Priority Date: 03/15/2004
Status: Active Grant

Abstract

A system ranks documents based on contexts associated with the documents. The system identifies a reference in a first document, where the reference is associated with a second document. The system analyzes a portion of the first document associated with the reference, identifies a rare word (or words) from the portion, creates a context identifier based on the rare word(s), and ranks the second document based on the context identifier.
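
The rare-word step described in the abstract can be illustrated with a minimal sketch. It assumes a simple regex tokenizer, a document-frequency table built from a small in-memory corpus, and a rule that treats the word with the lowest document frequency as the rare word. The names `tokenize`, `document_frequencies`, and `rarest_word` are hypothetical and do not come from the patent.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase word tokens; a deliberately simple stand-in tokenizer."""
    return re.findall(r"[a-z0-9']+", text.lower())

def document_frequencies(corpus):
    """Count, for each word, how many documents in the corpus contain it."""
    df = Counter()
    for doc in corpus:
        df.update(set(tokenize(doc)))
    return df

def rarest_word(portion, df):
    """Return the word in `portion` with the lowest document frequency.

    Assumption of this sketch: words missing from the table count as
    frequency 0, and ties go to the earliest word in the portion.
    """
    words = tokenize(portion)
    if not words:
        return None
    return min(words, key=lambda w: df[w])

# Toy "set of documents" used only for illustration.
corpus = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the cat visited the zeppelin museum",
    "the dog sat on the mat",
]
df = document_frequencies(corpus)
word = rarest_word("the dog sat on the zeppelin", df)
```

Here `word` is the least frequent term of the analyzed portion under the toy table. A production system would presumably build the table from a far larger document set.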

17 Citations

17 Claims

1. A method performed by a device, comprising:
- identifying a link in a first document, the link being associated with a second document;
  
  analyzing a first portion of text to the left of the link in the first document;
  
  analyzing a second portion of text to the right of the link in the first document;
  
  identifying a first rare word from the text in the first portion, where the first rare word is identified as a rare word based on a frequency of occurrence of the first rare word in a set of documents;
  
  identifying a second rare word from the text in the second portion, where the second rare word is identified as a rare word based on a frequency of occurrence of the second rare word in the set of documents;
  
  creating a context identifier based only on the first and second rare words; and
  
  ranking the second document within a list of search results based on the context identifier.
- - 2. The method of claim 1, where identifying a first rare word and identifying a second rare word comprise:
    - comparing words in the first and second portions to a table that identifies occurrences of a plurality of words in the set of documents, and
      
      determining which of the words in the first and second portions occurred least often in the set of documents based on the table.
  - 3. The method of claim 1, where creating a context identifier comprises:
    - hashing the first and second rare words to create the context identifier.
  - 4. The method of claim 1, where the first document comprises a plurality of first documents that include a link to the second document;
    - where creating a context identifier comprises:
      
      creating a plurality of context identifiers associated with the plurality of first documents; and
      
      where the ranking the second document comprises:
      
      generating a ranking score for the second document based on the context identifiers.
  - 5. The method of claim 4, where generating a ranking score for the second document comprises:
    - ranking the second document based on a total number of the context identifiers.
  - 6. The method of claim 4, further comprising:
    - determining a number of occurrences of the context identifiers in association with the link as context counts.
  - 7. The method of claim 6, where generating a ranking score for the second document comprises:
    - ranking the second document based on a distribution of the context counts associated with the context identifiers.
  - 8. The method of claim 7, where ranking the second document based on a distribution of the context counts comprises:
    - identifying a one of the context identifiers based on a distribution of the context counts, and
      
      ranking the second document while reducing an impact of the one of the context identifiers.
  - 9. The method of claim 6, where generating a ranking score for the second document comprises:
    - ranking the second document based on a history of distribution of the context counts associated with the context identifiers.
  - 10. The method of claim 1, where ranking the second document comprises:
    - generating a ranking score based on the context identifier, and
      
      using the ranking score as one of a plurality of factors when ranking the second document.
  - 11. The method of claim 1, where each of the first and second portions comprises a plurality of words.
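
Claims 1 and 3 through 8 can be read together as a small pipeline: take a window of text on each side of the link, pick the rarest word on each side, hash the pair into a context identifier, then score the linked document from the distribution of identifiers across linking documents. The sketch below is one possible reading under stated assumptions, not the claimed implementation. The `LINK` placeholder token, the five-word window, the truncated SHA-1 hash, and the per-identifier `cap` used to reduce the impact of a dominant identifier are all illustrative choices that the claims do not specify.

```python
import hashlib
from collections import Counter

def rarest(words, df):
    """Least frequent word per the table; unseen words count as 0."""
    return min(words, key=lambda w: df.get(w, 0)) if words else None

def context_id(tokens, link_index, df, window=5):
    """Hash the rarest word on each side of the link into one identifier."""
    left = tokens[max(0, link_index - window):link_index]
    right = tokens[link_index + 1:link_index + 1 + window]
    first, second = rarest(left, df), rarest(right, df)
    # The identifier depends only on the two rare words.
    key = f"{first}\x00{second}".encode("utf-8")
    return hashlib.sha1(key).hexdigest()[:16]

def context_score(context_ids, cap=2):
    """Score a linked-to document from the context IDs of its inbound links.

    Each identifier contributes at most `cap`, so one identifier that
    dominates the distribution of context counts has limited impact.
    """
    counts = Counter(context_ids)
    return sum(min(n, cap) for n in counts.values())

# Hypothetical frequency table and linking documents.
df = {"the": 900, "best": 300, "pizza": 40, "in": 800, "brooklyn": 12,
      "buy": 70, "cheap": 60, "pills": 9, "here": 500, "now": 450}
doc_a = "the best pizza in LINK brooklyn here now".split()
spam = "buy cheap pills LINK here now".split()

id_a = context_id(doc_a, doc_a.index("LINK"), df)
id_spam = context_id(spam, spam.index("LINK"), df)
ids = [id_a] + [id_spam] * 50   # one context repeated 50 times
score = context_score(ids)
```

With this toy data, fifty links that share one context contribute no more than the cap, so `score` stays close to the number of distinct contexts rather than the raw link count.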

12. A system, comprising:
- a memory to store instructions; and
  
  a processor to execute the instructions to implement:
  
  means for identifying a link in a first document, the link being associated with a second document;
  
  means for analyzing a first portion of the first document located to the left of the link in the first document;
  
  means for analyzing a second portion of the first document located to the right of the link in the first document;
  
  means for identifying a first rarest word from the first portion of the first document;
  
  means for identifying a second rarest word from the second portion of the first document;
  
  means for creating a context identifier based only on the first rarest word and the second rarest word; and
  
  means for ranking the second document based on the context identifier.
- - 13. The system of claim 12, where the means for creating a context identifier comprises:
    - means for hashing the first and second rare words to create the context identifier.
  - 14. The system of claim 12, where the means for ranking the second document comprises:
    - means for generating a ranking score based on the context identifier, and
      
      means for using the ranking score as one of a plurality of factors when ranking the second document.

15. A system, comprising:
- a memory to store instructions; and
  
  a processor to execute the instructions to implement:
  
  a document analyzing component to:
  
  identify a reference in a first document, the reference being associated with a second document,
  
  analyze a first portion of the first document located to the left of the reference in the first document,
  
  analyze a second portion of the first document located to the right of the reference in the first document,
  
  identify a first rare word or rare phrase from the first portion of the first document;
  
  identify a second rare word or rare phrase from the second portion of the first document; and
  
  create a context identifier based only on the first rare word or rare phrase and the second rare word or rare phrase; and
  
  a document ranking component to rank the second document based on the context identifier.
- - 16. The system of claim 15, where the first portion and the second portion are areas of text, and where the first rare word or rare phrase and the second rare word or rare phrase are identified based on a frequency of occurrence of the first and second rare words or rare phrases in a set of documents.
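
Claims 15 and 16 extend the rare-word idea to rare phrases identified by frequency of occurrence. A minimal sketch, assuming phrases are contiguous runs of one or two words and that a precomputed frequency table keyed by word tuples is available, might look like the following. The helper names and the restriction to table entries are assumptions made for illustration.

```python
def candidates(words, max_len=2):
    """All contiguous phrases of 1..max_len words, as tuples."""
    return [
        tuple(words[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(words) - n + 1)
    ]

def rarest_term(words, freq, max_len=2):
    """Pick the least frequent word or phrase that the table knows about.

    Restricting to known entries is an assumption of this sketch: it keeps
    arbitrary unseen word pairs from always winning as "rarest".
    """
    known = [c for c in candidates(words, max_len) if c in freq]
    return min(known, key=lambda c: freq[c]) if known else None

# Hypothetical frequencies for single words and two-word phrases.
freq = {
    ("new",): 5000, ("york",): 900, ("bagel",): 120, ("shop",): 700,
    ("new", "york"): 850, ("bagel", "shop"): 15,
}
term = rarest_term("new york bagel shop".split(), freq)
```

In this toy table the two-word phrase is rarer than any of its component words, so it is selected over the single words.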

17. A method performed by a device, comprising:
- determining a plurality of different contexts associated with references to a document, the determining the plurality of different contexts comprising:
  
  parsing a plurality of first documents to identify the references to the document,
  
  analyzing first portions of text to the left of the references in the plurality of first documents,
  
  analyzing second portions of text to the right of the references in the plurality of first documents, and
  
  identifying the plurality of different contexts based on the text in the first and second portions, where identifying the plurality of different contexts comprises:
  
  identifying first rare words from the text in the first portions,
  
  identifying second rare words from the text in the second portions, and
  
  creating context identifiers based on the first and second rare words, the context identifiers corresponding to the plurality of different contexts, where the first and second rare words are identified based on a frequency of occurrence of the first and second rare words in a set of documents; and
  
  ranking the document within a list of search results based on the plurality of different contexts associated with the references, where ranking the document includes:
  
  generating a ranking score based on the plurality of different contexts, and
  
  using the ranking score as one of a plurality of factors when ranking the document.
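
The final step of claim 17, using the context-based score as one of several ranking factors, can be sketched as a weighted blend. The second factor (`text_relevance`), the logarithmic damping, and the weight of 0.3 are hypothetical; the claim only requires that the context score be one factor among a plurality.

```python
import math

def combined_score(text_relevance, distinct_contexts, context_weight=0.3):
    """Blend a context-diversity signal with another relevance factor.

    The log damping and the weight are illustrative choices only.
    """
    context_signal = math.log1p(distinct_contexts)
    return (1 - context_weight) * text_relevance + context_weight * context_signal

def rank(results):
    """Sort (doc_id, text_relevance, distinct_contexts) tuples, best first."""
    return sorted(results, key=lambda r: combined_score(r[1], r[2]), reverse=True)

results = [
    ("doc_a", 0.80, 1),    # referenced in a single repeated context
    ("doc_b", 0.75, 40),   # referenced in many different contexts
    ("doc_c", 0.90, 0),    # no analyzed references
]
order = [doc_id for doc_id, _, _ in rank(results)]
```

Under these assumed weights, a document referenced in many different contexts can outrank one with slightly higher text relevance but little contextual variety.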

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Patterson, Anna; Haahr, Paul
Primary Examiner(s): Xue, Belinda
Application Number: US10/800,006
Time in Patent Office: 3,522 Days
Field of Search: 707/6, 707/7
US Class Current: 707/748
CPC Class Codes: G06F 16/3344 using natural language anal...
