Method and system for adapting search results to personal information needs

US 7,630,976 B2
Filed: 05/10/2005
Issued: 12/08/2009
Est. Priority Date: 05/10/2005
Status: Expired due to Fees

2 Assignments

0 Petitions

Abstract

A method and system for adapting search results of a query to the information needs of the user submitting the query is provided. A search system analyzes click-through triplets indicating that a user submitted a query and that the user selected a document from the results of the query. To overcome the large size and sparseness of the click-through data, the search system, when presented with an input triplet comprising a user, a query, and a document, determines a probability that the user will find the input document important by smoothing the click-through triplets. The search system then orders documents of the result based on the probability of their importance to the input user.
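
As an illustration of the click-through triplets the abstract describes, the following minimal Python sketch tallies the kinds of counts a smoothing step would need. The log contents, the function name `count_triplets`, and the choice of which counts to keep are assumptions made for this example; the patent text does not prescribe them.

```python
from collections import Counter

# Hypothetical click-through log: one (user, query, document) triplet per click.
clicks = [
    ("u1", "jaguar", "d_car"),
    ("u1", "jaguar", "d_car"),
    ("u2", "jaguar", "d_cat"),
    ("u2", "python", "d_snake"),
]

def count_triplets(triplets):
    """Tally clicks per (user, query, doc), (user, query), (query, doc), and query."""
    uqd, uq, qd, q = Counter(), Counter(), Counter(), Counter()
    for user, query, doc in triplets:
        uqd[(user, query, doc)] += 1
        uq[(user, query)] += 1
        qd[(query, doc)] += 1
        q[query] += 1
    return uqd, uq, qd, q

uqd, uq, qd, q = count_triplets(clicks)
```

Most (user, query, document) combinations never occur in such a log, which is the sparseness that the smoothing in the claims is meant to address.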

51 Citations

5 Claims

1. A computer-readable storage medium containing instructions for controlling a computer system to calculate relevance of a document to a user, by a method comprising:
- providing click-through data generated when users submitted queries to a search engine and selected a document from results provided by the search engine;
  
  identifying user, query, and document triplets from the click-through data, each triplet indicating that the user of the triplet submitted the query of the triplet and the user selected the document of the triplet from results of the query provided by the search engine;
  
  identifying user clusters of users and query clusters of queries such that each user is in only one user cluster and each query is in only one query cluster;
  
  receiving from a user a query;
  
  searching for documents to be provided as results of the received query;
  
  for each document of the results of the received query, determining a probability that the user from whom the query was received will find the document relevant by performing a smoothing of the identified triplets to account for sparseness of the triplets and calculating the probability based on the smoothed triplets, the smoothing including:
  
  - smoothing via backoff by:
    - when the identified triplets include a triplet for the user, query, and document, setting a first probability based on a discounted count of the number of identified triplets for the user, query, and document and the number of triplets for the user and query; and
    - when the identified triplets do not include a triplet for the user, query, and document, setting the first probability based on the number of identified triplets for the query and the document and the number of identified triplets for the document and based on a normalization constant;
  - when the identified triplets include a triplet for the query and document, smoothing via clustering by setting a second probability based on a probability that a user in the user cluster that includes the user from whom the query was received selects the document from the query; and
  - when the identified triplets do not include a triplet for the query and document, smoothing via content similarity by:
    - identifying the query cluster to which the query is most similar; and
    - setting the second probability based on a probability that a user selects the document from a query that is in the query cluster; and
  - combining the first probability and the second probability into an overall probability of the document; and
  
  displaying an indication of the documents to the user from whom the query was received in an order based on the combined overall probabilities of the documents.
- - 2. The computer-readable medium of claim 1, including processing the triplets to calculate various counts.
  - 3. The computer-readable medium of claim 1, wherein the probability is the probability of the input document given the input user and input query when the user, query, and document are in an identified triplet and is the probability of the input document given the input query otherwise.
  - 4. The computer-readable medium of claim 1 wherein the probability is based on similarity between the input document and document clusters identified based on relationships between users and queries.
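
The backoff branch of claim 1 can be sketched roughly as below. This is only one plausible reading, not the patented implementation: the absolute-discount value `DISCOUNT`, the Katz-style normalization constant `alpha`, the use of the per-query click total as the denominator of the backed-off estimate (chosen so that it is a conditional probability of the document given the query, as claim 3 describes), and the linear interpolation in `combine` are all assumptions introduced for illustration.

```python
from collections import Counter

DISCOUNT = 0.5  # hypothetical absolute discount, 0 < DISCOUNT < 1

# Hypothetical click-through triplets (user, query, document).
clicks = [
    ("u1", "jaguar", "d_car"),
    ("u1", "jaguar", "d_car"),
    ("u2", "jaguar", "d_cat"),
    ("u2", "python", "d_snake"),
]

def backoff_probability(user, query, doc, triplets, discount=DISCOUNT):
    """First probability: discounted P(doc | user, query), backing off to P(doc | query)."""
    # Counts are rebuilt on every call to keep the sketch short; cache them in practice.
    uqd = Counter(triplets)
    uq = Counter((u, q) for u, q, _ in triplets)
    qd = Counter((q, d) for _, q, d in triplets)
    q_total = Counter(q for _, q, _ in triplets)

    def p_doc_given_query(d):
        return qd[(query, d)] / q_total[query] if q_total[query] else 0.0

    if uq[(user, query)] == 0:
        # This user never issued the query: rely on the query-level estimate alone.
        return p_doc_given_query(doc)

    if uqd[(user, query, doc)] > 0:
        # Seen triplet: discounted relative frequency.
        return (uqd[(user, query, doc)] - discount) / uq[(user, query)]

    # Unseen triplet: share the mass freed by discounting among unseen documents.
    seen = [d for (u, q, d) in uqd if u == user and q == query]
    reserved = discount * len(seen) / uq[(user, query)]
    unseen_mass = 1.0 - sum(p_doc_given_query(d) for d in seen)
    alpha = reserved / unseen_mass if unseen_mass > 0 else 0.0  # normalization constant
    return alpha * p_doc_given_query(doc)

def combine(p_first, p_second, weight=0.5):
    """Assumed combination rule: linear interpolation of the two probabilities."""
    return weight * p_first + (1.0 - weight) * p_second

p_seen = backoff_probability("u1", "jaguar", "d_car", clicks)
p_unseen = backoff_probability("u1", "jaguar", "d_cat", clicks)
```

With this toy log, the document that `u1` actually clicked keeps most of the probability mass, while the document clicked only by another user for the same query receives the remainder, so the two estimates still sum to one for that user and query.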

5. A computing device with a processor and memory for calculating relevance of a document, comprising:
- a click-through data store;
  
  a component that identifies user, query, and document triplets from the click through data;
  
  a component that identifies user clusters of users and document clusters of documents such that each user is in only one user cluster and each document is in only one document cluster;
  
  a component that receives an input user, an input query, and input documents, the input documents representing results of the input query submitted by the input user; and
  
  for each input document, determining a probability that the input user will find the input document relevant by performing a smoothing by performing:
  
  when the same input user, input query, and input document triplet was identified in the click-through data, a first backoff smoothing by setting a first probability that is a discounted probability of when the input user submits the input query, the input user selects the input document as indicated by the identified triplets;
  
  when only the same input user and input query were identified in a triplet of the click-through data, a second backoff smoothing by setting the first probability that is a normalized probability of when a user submits the input query, that user selects the input document as indicated by the identified triplets;
  
  when both the input query and input document were identified in a triplet of the click-through data, a clustering smoothing by setting a second probability based on a probability that a user in the user cluster that includes the input user selects the input document from the input query as indicated by the identified triplets; and
  
  when both the input query and input document were not identified in a triplet of the click-through data, a content similarity smoothing by:
  
  identifying a document cluster to which the input document is most similar; and
  
  setting the second probability based on a probability that a user selects a document of the document cluster from the input query as indicated by the identified triplets; and
  
  combining the first probability and the second into an overall probability of the input document to account for sparseness of the identified triplets.
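
The clustering and content-similarity branches of claim 5 might look something like the following Python sketch. The hard cluster assignments in `user_cluster` and `doc_cluster`, the term sets in `cluster_terms`, and the use of Jaccard similarity to find the most similar document cluster are all hypothetical choices for this example; the claim does not name a clustering algorithm or a similarity measure.

```python
# Hypothetical click-through triplets and hard cluster assignments.
triplets = [
    ("u1", "jaguar", "d_car"),
    ("u1", "jaguar", "d_car"),
    ("u2", "jaguar", "d_cat"),
    ("u3", "jaguar", "d_cat"),
]
user_cluster = {"u1": 0, "u2": 0, "u3": 1}          # each user in exactly one cluster
doc_cluster = {"d_car": "cars", "d_cat": "animals"}  # each document in exactly one cluster
cluster_terms = {
    "cars": {"car", "engine", "speed"},
    "animals": {"cat", "wild", "animal"},
}

def cluster_probability(user, query, doc):
    """P(doc | user's cluster, query), estimated from clicks by users in that cluster."""
    target = user_cluster[user]
    docs = [d for u, q, d in triplets if q == query and user_cluster.get(u) == target]
    return docs.count(doc) / len(docs) if docs else 0.0

def jaccard(a, b):
    """Jaccard similarity of two term sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def content_similarity_probability(query, doc_terms):
    """P(any doc of the most similar document cluster | query)."""
    best = max(cluster_terms, key=lambda c: jaccard(doc_terms, cluster_terms[c]))
    docs = [d for _, q, d in triplets if q == query]
    if not docs:
        return 0.0
    return sum(1 for d in docs if doc_cluster.get(d) == best) / len(docs)

def second_probability(user, query, doc, doc_terms):
    """Use cluster smoothing if (query, doc) was ever clicked, else content similarity."""
    if any(q == query and d == doc for _, q, d in triplets):
        return cluster_probability(user, query, doc)
    return content_similarity_probability(query, doc_terms)

p_clustered = second_probability("u2", "jaguar", "d_car", {"car", "engine"})
p_content = second_probability("u2", "jaguar", "d_new", {"car", "dealer"})
```

Here `u2` never clicked `d_car`, but shares a cluster with `u1`, so the cluster estimate is nonzero; `d_new` has no clicks at all, so its estimate comes from the document cluster whose terms it most resembles.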

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Ma, Wei-Ying; Chen, Zheng; Zhang, Benyu; Zeng, Hua-Jun; Jiang, Xue-Mei; Xue, Gui-Rong
Primary Examiner(s): Jalil; Neveen Abel
Assistant Examiner(s): Chbouki; Tarek
Application Number: US11/125,839
Publication Number: US 20060259480A1
Time in Patent Office: 1,673 Days
Field of Search: 707/1, 707/3
US Class Current: 1/1
CPC Class Codes:
G06F 16/9535   Search customisation based ...
G06F 16/9558   Details of hyperlinks; Mana...
Y10S 707/99935   Query augmenting and refini...
