Search ranking estimation

US 7,765,178 B1
Filed: 10/06/2005
Issued: 07/27/2010
Est. Priority Date: 10/06/2004
Status: Expired due to Fees

Abstract

A searcher can be configured to improve relevance ranking of search results through iterative weighting of search ranking results. A Search Auto Categorizer (SAC) operates on a base query to return a probabilistic distribution of leaf categories of a taxonomy in which relevant products may reside. A Search Logic Unit (SLU) can compute a relevance of any particular leaf category to the base query. The SLU can then determine an initial relevance of a particular product to the query based on the probabilistic distribution and the relevance of leaf category to query. The SLU weights the relevance of a product to the query to establish an updated probabilistic distribution. The SLU then repeats the relevance and weighting until convergence upon a relevance list.
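
The loop described in the abstract can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `iterate_relevance`, the dictionary layout, the `1 / (1 + score)` weighting, and the default threshold are hypothetical choices made only to show the control flow (prior distribution from the SAC, relevance from the SLU, reweighting, repeat until the distribution stops changing).

```python
from typing import Dict

# Hypothetical inputs; names and data layout are assumptions for illustration.
# prior[c]    : SAC probability that leaf category c holds relevant products
# match[c][d] : SLU relevance of product d (matching the query) within category c

def iterate_relevance(
    prior: Dict[str, float],
    match: Dict[str, Dict[str, float]],
    threshold: float = 1e-6,
    max_iters: int = 100,
) -> Dict[str, float]:
    """Return product -> relevance from the last pass before convergence."""
    probs = dict(prior)
    relevance: Dict[str, float] = {}
    for _ in range(max_iters):
        # Relevance of each product: sum over categories of P(category) * match score.
        relevance = {}
        for cat, docs in match.items():
            for doc, score in docs.items():
                relevance[doc] = relevance.get(doc, 0.0) + probs.get(cat, 0.0) * score
        # Updated category mass: weighted relevance summed within each category.
        # The weight 1 / (1 + score) is an assumed example that shrinks as the
        # match score grows; the source does not specify a formula.
        mass = {
            cat: sum(relevance[d] / (1.0 + s) for d, s in docs.items())
            for cat, docs in match.items()
        }
        total = sum(mass.values())
        if total == 0.0:
            break  # nothing relevant anywhere; keep the current result
        updated = {cat: m / total for cat, m in mass.items()}
        change = max(abs(updated[c] - probs.get(c, 0.0)) for c in updated)
        probs = updated
        if change < threshold:
            break
    return relevance
```

For instance, `iterate_relevance({"a": 0.7, "b": 0.3}, {"a": {"x": 0.9, "y": 0.2}, "b": {"y": 0.8}})` returns a score for products `x` and `y`; sorting that dictionary by value would give the relevance list.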

20 Claims

1. In a computerized search system in which queries are submitted by users who receive, in response, a list of documents selected from a corpus of documents wherein the list comprises documents deemed responsive to a user's query, a method of determining relevance of the documents comprising:
- obtaining, at a server computer comprising a processor, the query from a user;
  
  determining initial probabilities that at least one leaf category of a taxonomy contains documents relevant to the query, at least one of the initial probabilities being non-zero, wherein the at least one leaf category contains indexed documents predetermined to be related to one another and the initial probabilities are numeric values between zero and one;
  
  determining a relevance of the documents matching the query in each leaf category having non-zero initial probability; and
  
  determining a relevance of documents to the query based on the initial probabilities of the at least one leaf category and the relevance of the documents matching the query, wherein determining the relevance of documents to the query comprises:
  
  for each particular leaf category containing a particular document, determining a weighted relevance value by multiplying a determined relevance of the particular document matching the query by the initial probability that the particular leaf category includes relevant documents;
  
  generating updated probabilities that the nodes of the taxonomy contain relevant documents by weighting each of the relevance of documents to the query, wherein weights used to generate the updated probabilities decay monotonically with the probability that a document matching the query resides in the particular node;
  
  determining an updated relevance of documents to the query based on the updated probabilities and the probability that a document matching the query resides in the particular node; and
  
  summing the weighted relevance values to determine the relevance of the particular document to the query.
- Dependent claims (2 to 11):
  - 2. The method of claim 1, wherein determining the relevance of the documents matching the query comprises determining a probability that a particular document matching the query resides in a particular leaf category.
  - 3. The method of claim 1, wherein determining the relevance of documents to the query comprises:
    - determining a relevance value by multiplying a probability of a document matching the query in a particular leaf category by the probability that the particular leaf category includes relevant documents; and
      
      summing all relevance values within the particular leaf category.
  - 4. The method of claim 1, further comprising:
    - generating updated probabilities that the leaf categories of the taxonomy contain relevant documents, the updated probabilities being based on the relevance of documents to the query; and
      
      determining an updated relevance of documents to the query based on the updated probabilities that the leaf categories of the taxonomy contain relevant documents and the relevance of the documents matching the query.
  - 5. The method of claim 4, wherein generating the updated probabilities that the leaf categories of the taxonomy contain relevant documents comprises computing a weighted relevance of documents to the query.
  - 6. The method of claim 5, wherein computing the weighted relevance of documents to the query comprises multiplying each value of the relevance of documents to the query by a weighting factor less than one.
  - 7. The method of claim 5, wherein computing the weighted relevance of documents to the query comprises multiplying each value of the relevance of documents to the query by a weighting factor based on the value of the relevance of documents to the query.
  - 8. The method of claim 4, further comprising repeatedly generating the updated probabilities that the leaf categories of the taxonomy contain relevant documents based on the previous updated probabilities that the leaf categories of the taxonomy contain relevant documents and re-determining the updated relevance of documents to the query.
  - 9. The method of claim 8, wherein generating the updated probabilities that the leaf categories of the taxonomy contain relevant documents ceases when a change in the updated probabilities is less than a predetermined convergence threshold.
  - 10. The method of claim 1, further comprising repeatedly generating the updated probabilities of the nodes based on the previous updated probabilities of the nodes and re-determining the updated relevance of documents to the query.
  - 11. The method of claim 10, wherein generating the updated probabilities of the nodes ceases when a change in the updated probabilities is less than a predetermined convergence threshold.
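
The weighted sum of claims 1 and 3 and the stopping rule of claims 9 and 11 can be written compactly as below. The notation is an assumption introduced here for illustration (the claims define no symbols), and reading "a change in the updated probabilities" as the largest per-category change is one possible interpretation, not something the claims state.

```latex
% Assumed notation:
% P_t(c \mid q): probability at iteration t that leaf category c holds documents relevant to query q
% r(d \mid c, q): relevance of document d, matching q, within leaf category c
R_t(d \mid q) = \sum_{c \,:\, d \in c} P_t(c \mid q)\, r(d \mid c, q)

% Worked example: d sits in two leaf categories with P_0 = 0.7, 0.3 and r = 0.2, 0.8
R_0(d \mid q) = 0.7 \times 0.2 + 0.3 \times 0.8 = 0.38

% Stopping rule with a predetermined convergence threshold \varepsilon
\max_{c} \left| P_{t+1}(c \mid q) - P_t(c \mid q) \right| < \varepsilon
```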

12. In a computerized search system in which queries are submitted by users who receive, in response, a list of documents selected from a corpus of documents wherein the list comprises documents deemed responsive to a user's query, an apparatus for determining relevance of the documents comprising:
- a search engine configured to obtain the query from a user;
  
  a Search Auto Categorizer (SAC) configured to determine initial probabilities that at least one leaf category of a taxonomy contains documents relevant to the query, at least one of the initial probabilities being non-zero, wherein the at least one leaf category contains indexed documents predetermined to be related to one another and the initial probabilities are numeric values; and
  
  a Search Logic Unit (SLU) configured to determine a relevance of documents matching the query in each leaf category having non-zero initial probability, and determine a relevance of documents to the query based on the initial probabilities of the at least one leaf category generated by the SAC and the relevance of the documents matching the query, wherein the SLU is configured to determine the relevance of documents to the query by:
  
  for each particular leaf category containing a particular document, determining a weighted relevance value by multiplying a determined relevance of the particular document matching the query by the initial probability that the particular leaf category includes relevant documents;
  
  generating updated probabilities that the nodes of the taxonomy contain relevant documents by weighting each of the relevance of documents to the query, wherein weights used to generate the updated probabilities decay monotonically with the probability that a document matching the query resides in the particular node;
  
  determining an updated relevance of documents to the query based on the updated probabilities and the probability that a document matching the query resides in the particular node; and
  
  summing the weighted relevance values to determine the relevance of the particular document to the query.
- Dependent claims (13 to 18):
  - 13. The apparatus of claim 12, wherein the SAC is further configured to determine for each leaf category, a probability that the documents relevant to the query reside in the leaf category.
  - 14. The apparatus of claim 12, wherein the SLU determines the relevance of the documents matching the query by determining for each leaf category a probability that the documents matching the query reside in the leaf category.
  - 15. The apparatus of claim 12, wherein the SLU determines the relevance of documents to the query by multiplying a particular leaf category initial probability by the relevance of the documents matching the query in the particular leaf category.
  - 16. The apparatus of claim 12, wherein the SLU further generates updated probabilities that the leaf categories of the taxonomy contain relevant documents based on the relevance of documents to the query, and determines an updated relevance of documents to the query based on the updated probabilities and the relevance of the documents matching the query.
  - 17. The apparatus of claim 16, wherein the SLU determines the updated probabilities by weighting each of the relevance of documents to the query and summing the weighted relevance within each leaf category.
  - 18. The apparatus of claim 16, wherein the SLU repeatedly generates the updated probabilities based on the previous updated probabilities and re-determines the updated relevance of documents to the query until the updated probabilities converge to within a predetermined threshold.

19. In a computerized search system in which queries are submitted by users who receive, in response, a list of documents selected from a corpus of documents wherein the list comprises documents deemed responsive to a user's query, an apparatus for determining relevance of the documents comprising:
- means for obtaining the query from a user;
  
  means for determining initial probabilities that nodes of a taxonomy contain documents relevant to the query, at least one of the initial probabilities being non-zero, wherein the initial probabilities are numeric values;
  
  means for determining a relevance of documents matching the query in each node having non-zero initial probability;
  
  means for determining a relevance of documents to the query based on the initial probabilities of the nodes and the relevance of the documents matching the query;
  
  means for generating updated probabilities that the nodes of the taxonomy contain relevant documents based on the relevance of documents to the query by weighting each of the relevance of documents to the query, wherein weights used for generating the updated probabilities decrease monotonically based on the probability that a document matching the query resides in the particular node; and
  
  means for determining an updated relevance of documents to the query based on the updated probabilities of the nodes and the probability that a document matching the query resides in the particular node.
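
The independent claims require only that the weights decrease monotonically with the probability that a matching document resides in a node; they do not fix a functional form. The Python sketch below shows two hypothetical candidates and a simple grid check of the monotonicity property. The function names, the 0.5 slope, and the `rate` parameter are assumptions made for illustration.

```python
import math
from typing import Callable, List

# Hypothetical candidate weights; the claims do not prescribe a functional form.
def linear_weight(p: float) -> float:
    """Falls linearly from 1.0 at p = 0 to 0.5 at p = 1."""
    return 1.0 - 0.5 * p

def exponential_weight(p: float, rate: float = 2.0) -> float:
    """Falls exponentially from 1.0 at p = 0; `rate` is an assumed tuning knob."""
    return math.exp(-rate * p)

def is_monotonically_decreasing(
    weight: Callable[[float], float], steps: int = 100
) -> bool:
    """Check on an evenly spaced grid over [0, 1] that the weight never increases."""
    values: List[float] = [weight(i / steps) for i in range(steps + 1)]
    return all(later <= earlier for earlier, later in zip(values, values[1:]))
```

Both candidates pass `is_monotonically_decreasing`, while an increasing function such as `lambda p: p` does not. A grid check like this is only a sanity test, not a proof over the whole interval.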

20. In a computerized search system in which queries are submitted by users who receive, in response, a list of documents selected from a corpus of documents wherein the list comprises documents deemed responsive to a user's query, an apparatus for determining relevance of the documents comprising:
- an indexer configured to create an indexed taxonomy from the corpus of documents, the indexed taxonomy comprising at least one high level category, the at least one high level category having at least one sub-category, and the at least one sub-category having at least one leaf category, wherein the at least one leaf category contains indexed documents predetermined to be related to one another;
  
  a search engine configured to obtain the query from a user;
  
  a Search Auto Categorizer (SAC) configured to determine initial probabilities that at least one of the leaf categories contains documents relevant to the query, at least one of the initial probabilities being non-zero, wherein the initial probabilities are numeric values between zero and one;
  
  a Search Logic Unit (SLU) configured to determine a relevance of the documents matching the query in each leaf category having non-zero initial probability, and determine a relevance of documents to the query based on the initial probabilities of the at least one leaf category and the relevance of the documents matching the query, wherein the SLU is configured to determine the relevance of documents to the query by:
  
  for each particular leaf category containing a particular document, determining a weighted relevance value by multiplying a determined relevance of the particular document matching the query by the initial probability that the particular leaf category includes relevant documents;
  
  generating updated probabilities that the nodes of the taxonomy contain relevant documents by weighting each of the relevance of documents to the query, wherein weights used to generate the updated probabilities decay monotonically with the probability that a document matching the query resides in the particular node;
  
  determining an updated relevance of documents to the query based on the updated probabilities and the probability that a document matching the query resides in the particular node; and
  
  summing the weighted relevance values to determine the relevance of the particular document to the query.
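
The three-level taxonomy that claim 20 attributes to the indexer (high level category, sub-category, leaf category holding indexed documents) could be represented along the following lines. This is a sketch only: the `Category` class, the helper names, and the sample categories and document IDs are invented for illustration and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List

@dataclass
class Category:
    """One node of the taxonomy; a node with no children is a leaf category."""
    name: str
    children: List["Category"] = field(default_factory=list)
    documents: List[str] = field(default_factory=list)  # populated on leaves only

    def is_leaf(self) -> bool:
        return not self.children

def leaf_categories(node: Category) -> Iterator[Category]:
    """Yield every leaf category under `node`, depth first."""
    if node.is_leaf():
        yield node
    else:
        for child in node.children:
            yield from leaf_categories(child)

def index_by_leaf(root: Category) -> Dict[str, List[str]]:
    """Map each leaf category name to the documents indexed under it."""
    return {leaf.name: list(leaf.documents) for leaf in leaf_categories(root)}

# Example data is invented for illustration.
taxonomy = Category("Electronics", children=[
    Category("Cameras", children=[
        Category("Digital SLR", documents=["doc-1", "doc-2"]),
        Category("Point and Shoot", documents=["doc-3"]),
    ]),
])
```

Calling `index_by_leaf(taxonomy)` yields the per-leaf document lists over which a categorizer could then assign initial probabilities.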

Current Assignee: Connexity, Inc.
Original Assignee: Shopzilla Incorporated
Inventors: Dutton, Keith A.; Roizen, Igor
Primary Examiner(s): Ali, Mohammad
Assistant Examiner(s): Smith, Brannon W
Application Number: US11/245,601
Time in Patent Office: 1,755 Days
Field of Search: 707/3, 707/5, 707/7, 706/12, 706/20
US Class Current: 1/1
CPC Class Codes:
- G06F 16/3346   using probabilistic model
- G06F 16/951   Indexing; Web crawling tech...
- Y10S 707/971   Federated
- Y10S 707/99935   Query augmenting and refini...
- Y10S 707/99937   Sorting
