Ranking functions using an incrementally-updatable, modified naïve Bayesian query classifier
Abstract
Methods of ranking documents on a network using an incrementally-updatable system are disclosed. Computer readable storage media having stored computer-executable instructions for performing a method of ranking documents on a network using an incrementally-updatable system are also disclosed. Further, computing devices containing at least one application module comprising application code for performing methods of ranking documents on a network using an incrementally-updatable system are disclosed.
20 Claims
1. A computer readable storage medium having stored computer-executable instructions that when executed by a computer cause the computer to:
- rank documents on a network in response to a user inputted search query comprising one or more search query terms utilizing an incrementally-updatable query classifier for ranking the documents based on usage data;
- display documents to the user ranked by the query classifier based on usage data comprising pre-calculated values #(wi, Asset) and log[#(wi, Asset)] stored for each of the search query terms and pre-calculated values #(Asset), log[#(Asset)] and Σ#(wi, Asset) stored for each of the documents; and
- update the usage data in response to the user selecting a document for viewing by:
  - updating count values #(Asset), #(wi, Asset) and Σ#(wi, Asset),
  - calculating values log[#(Asset)] and log[#(wi, Asset)], and
  - storing updated usage data replacing the pre-calculated values,
wherein:
- #(Asset) represents a number of times that a given document on the network is selected for viewing,
- log[#(Asset)] represents a log of #(Asset),
- #(wi, Asset) represents a number of times that a given document on the network and a search query term, wi, of the search query are matched,
- log[#(wi, Asset)] represents a log of #(wi, Asset), and
- Σ#(wi, Asset) represents a sum of the number of times that a given document on the network and a search query term, wi, of the search query are matched.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
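The update step recited in claim 1 can be pictured with a minimal Python sketch. It is an illustration only, not the patented implementation: the class name `UsageData`, its attribute names, the in-memory dictionaries, and the document id `doc-A` are all hypothetical, and the natural logarithm is assumed because the claim does not name a base.

```python
import math
from collections import defaultdict


class UsageData:
    """Hypothetical in-memory store for the counts and logs named in claim 1."""

    def __init__(self):
        self.asset_count = defaultdict(int)       # #(Asset), per document
        self.term_asset_count = defaultdict(int)  # #(wi, Asset), per (term, document)
        self.term_sum = defaultdict(int)          # Σ#(wi, Asset), per document
        self.log_asset = {}                       # log[#(Asset)]
        self.log_term_asset = {}                  # log[#(wi, Asset)]

    def record_selection(self, query_terms, asset):
        """Update the counts and refresh the cached logs after a selection."""
        self.asset_count[asset] += 1
        self.log_asset[asset] = math.log(self.asset_count[asset])
        for term in query_terms:
            key = (term, asset)
            self.term_asset_count[key] += 1
            self.term_sum[asset] += 1
            # Replace the previously stored log with the recalculated value.
            self.log_term_asset[key] = math.log(self.term_asset_count[key])


store = UsageData()
store.record_selection(["install", "printer"], "doc-A")
store.record_selection(["printer", "driver"], "doc-A")
```

Only the entries touched by a selection are recalculated, which is one plausible reading of "incrementally-updatable": the logs are cached so that ranking does not have to recompute them at query time.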
10. A computer implemented method of incrementally updating a query classifier component in a search engine of a computer, said method comprising:
- determining count values #(Asset), #(wi, Asset) and Σ#(wi, Asset), wherein #(Asset) represents a number of times that a given document on the network is selected for viewing, #(wi, Asset) represents a number of times that a given document on the network and a search query term, wi, of the search query are matched, and Σ#(wi, Asset) represents a sum of the number of times that a given document on the network and a search query term, wi, of the search query are matched;
- calculating values log[#(Asset)] and log[#(wi, Asset)], wherein log[#(Asset)] represents a log of #(Asset) and log[#(wi, Asset)] represents a log of #(wi, Asset);
- storing the count values #(Asset), #(wi, Asset) and Σ#(wi, Asset) and calculated values log[#(Asset)] and log[#(wi, Asset)] in a database of the computer, wherein the values #(wi, Asset) and log[#(wi, Asset)] are stored for search query terms and the values #(Asset), log[#(Asset)] and Σ#(wi, Asset) are stored for documents;
- displaying documents to one or more users ranked by the query classifier based on previously stored count values and calculated values in response to user inputted search queries received by the search engine of the computer;
- receiving responses during a time period from the one or more users selecting documents for viewing; and
- updating the stored count values and calculated values by adding new data collected during the time period to the previously stored count values #(Asset), #(wi, Asset) and Σ#(wi, Asset) and the previously stored calculated values log[#(Asset)] and log[#(wi, Asset)].
Dependent claims: 11, 12, 13, 14, 15, 16, 17
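The final step of claim 10, adding the data collected during a time period to the previously stored values, might look like the sketch below. The function name `merge_period`, the use of `collections.Counter` in place of a database, and the sample document ids are assumptions made for illustration; the claim itself does not specify a storage layout.

```python
import math
from collections import Counter


def merge_period(stored, new):
    """Add one period's counts to the stored counts and recompute their logs.

    `stored` and `new` are Counters keyed by a document id or a
    (term, document) pair. Only keys touched during the period are
    recalculated; all other stored logs stay valid as they are.
    """
    stored.update(new)  # Counter.update adds counts rather than replacing them
    return {key: math.log(stored[key]) for key in new if stored[key] > 0}


# Previously stored #(Asset) values and the selections seen in a new period.
asset_counts = Counter({"doc-A": 3, "doc-B": 1})
period_counts = Counter({"doc-A": 2, "doc-C": 1})
refreshed_logs = merge_period(asset_counts, period_counts)
```

The same function can be applied to the #(wi, Asset) counts. Because the work is proportional to the size of the new period's data rather than to the full history, the classifier never needs to be retrained from scratch in this sketch.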
18. A computing device comprising a processing unit executing at least one application module stored in memory on the computing device, wherein the at least one application module comprises application code executable by the processing unit of the computing device for performing a method of ranking documents on a network based on document relevance to a user inputted search query, said method comprising the steps of:
- utilizing formula (I) to determine a document relevance score for each document; and
- ranking documents in descending order based on the document relevance score for each document;
wherein formula (I) comprises [formula (I) is not reproduced in this text], wherein:
- P(Asset|Query) represents a probability of returning a given document, Asset, given a particular user inputted search query, Query;
- NQ is the number of terms in the search query;
- V is the size of the vocabulary of the network;
- #(T) is the total number of search queries that have been processed;
- #(Asset) represents a number of times that a given document on the network is selected for viewing;
- log[#(Asset)] represents a log of #(Asset);
- #(wi, Asset) represents a number of times that a given document on the network and a search query term, wi, of the search query are matched;
- log[#(wi, Asset)] represents a log of #(wi, Asset); and
- Σ#(wi, Asset) represents a sum of the number of times that a given document on the network and a search query term, wi, of the search query are matched.
Dependent claims: 19, 20
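Because formula (I) itself does not appear in this text, the sketch below does not implement it. It substitutes a textbook naive Bayes log-score with add-one smoothing, built only from the quantities the claim defines (#(Asset), #(wi, Asset), Σ#(wi, Asset), #(T), and V), to show how a score per document leads to a descending ranking. The function names, argument names, and sample data are hypothetical.

```python
import math


def score(query_terms, asset, asset_count, term_asset_count, term_sum,
          total_queries, vocab_size):
    """Textbook smoothed naive Bayes log-score; a stand-in, NOT formula (I)."""
    # Log prior: log[#(Asset)] - log[#(T)]
    s = math.log(asset_count[asset]) - math.log(total_queries)
    # Smoothed denominator shared by every query term for this document.
    log_denom = math.log(term_sum[asset] + vocab_size)
    for term in query_terms:
        s += math.log(term_asset_count.get((term, asset), 0) + 1) - log_denom
    return s


def rank(query_terms, asset_count, term_asset_count, term_sum,
         total_queries, vocab_size):
    """Return document ids in descending order of relevance score."""
    return sorted(
        asset_count,
        key=lambda a: score(query_terms, a, asset_count, term_asset_count,
                            term_sum, total_queries, vocab_size),
        reverse=True,
    )


asset_count = {"doc-A": 4, "doc-B": 4}
term_asset_count = {("printer", "doc-A"): 3, ("printer", "doc-B"): 1}
term_sum = {"doc-A": 5, "doc-B": 5}
ranked = rank(["printer"], asset_count, term_asset_count, term_sum,
              total_queries=8, vocab_size=10)
```

With equal priors, the document matched more often with the query term ranks first. The claimed formula is described as a modified classifier, so its exact terms may differ from this stand-in.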
wherein:
- λ is a weighting multiplier having a value equal to or less than 1.0; and
- t is an integer representing an age of a count value component.
20. The computing device of claim 19, wherein λ is less than 1.0.
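The formula that uses λ and t is not reproduced in this text, so the following is only one plausible reading: each count value component is multiplied by λ raised to its age t, so that older usage data contributes less when λ is below 1.0. The function name `aged_count`, the list layout (index 0 holding the newest component), and the exponential form are assumptions, not the claimed formula.

```python
def aged_count(components, lam):
    """Sum count components, weighting the component of age t by lam ** t.

    `components[t]` is the count collected t periods ago (t = 0 is newest).
    Assumed form only; the claimed weighting formula is not shown in the text.
    """
    if lam > 1.0:
        raise ValueError("the weighting multiplier must be at most 1.0")
    return sum((lam ** t) * count for t, count in enumerate(components))


recent_first = [4, 2, 1]                   # counts aged t = 0, 1, 2
decayed = aged_count(recent_first, 0.5)    # 4*1 + 2*0.5 + 1*0.25
undecayed = aged_count(recent_first, 1.0)  # λ = 1.0 leaves counts unweighted
```

Under this reading, a λ of exactly 1.0 reduces to the plain counts, while the case recited in claim 20 (λ less than 1.0) gradually discounts older selections.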
Specification