Learning a document ranking using a loss function with a rank pair or a query parameter
First Claim
1. A computer system for generating a ranking function to rank relevance of a document to a query, comprising:
- a processor; and
a memory for storinga collection of queries, resultant documents, and relevance of each resultant document to its query, the collection being generated by submitting the queries to a search engine with search results for each query being the resultant documents for that query and inputting the relevance of each resultant document to its query; and
an application program for execution by the processor comprising;
a component that trains a ranking function using the resultant documents and their relevances by weighting incorrect rankings of relevant resultant documents more heavily than incorrect rankings of not relevant resultant documents so that the ranking function more correctly ranks relevant resultant documents than it does not relevant resultant documents, wherein a different weighting is used for each rank pair where a rank pair represents a combination of two different relevance classifications, the ranking function being trained byfor each resultant document, generating a feature vector of features for the resultant document,for each query, generating ordered pairs of resultant documents with different relevances;
for each feature, initializing a current weighting parameter for the feature, the current weighting parameters forming the ranking function; and
modifying the current weighting parameters of the ranking function by iteratively applying the ranking function with current weighting parameters to the feature vectors of each pair of resultant documents and when the ranking for the resultant documents of a pair is in error, adjusting the weighting parameters by comparing an evaluation measure of incorrect rankings of documents to an evaluation measure of correct rankings of the documents, wherein the weighting is set to an average of differences between the evaluation measure of the correct ranking and the evaluation measure of the incorrect rankings, such that an error in ranking is weighted more heavily when a resultant document with a higher relevance is ranked incorrectly than when a resultant document with a lower relevance is ranked incorrectly; and
a component that ranks relevance of a document to a query that is not part of the collectionwherein the component that trains the ranking function uses a gradient descent algorithm.
2 Assignments
0 Petitions
Accused Products
Abstract
A method and system for generating a ranking function to rank the relevance of documents to a query is provided. The ranking system learns a ranking function from training data that includes queries, resultant documents, and relevance of each document to its query. The ranking system learns a ranking function using the training data by weighting incorrect rankings of relevant documents more heavily than the incorrect rankings of not relevant documents so that more emphasis is placed on correctly ranking relevant documents. The ranking system may also learn a ranking function using the training data by normalizing the contribution of each query to the ranking function so that it is independent of the number of relevant documents of each query.
92 Citations
8 Claims
-
1. A computer system for generating a ranking function to rank relevance of a document to a query, comprising:
-
a processor; and a memory for storing a collection of queries, resultant documents, and relevance of each resultant document to its query, the collection being generated by submitting the queries to a search engine with search results for each query being the resultant documents for that query and inputting the relevance of each resultant document to its query; and an application program for execution by the processor comprising; a component that trains a ranking function using the resultant documents and their relevances by weighting incorrect rankings of relevant resultant documents more heavily than incorrect rankings of not relevant resultant documents so that the ranking function more correctly ranks relevant resultant documents than it does not relevant resultant documents, wherein a different weighting is used for each rank pair where a rank pair represents a combination of two different relevance classifications, the ranking function being trained by for each resultant document, generating a feature vector of features for the resultant document, for each query, generating ordered pairs of resultant documents with different relevances; for each feature, initializing a current weighting parameter for the feature, the current weighting parameters forming the ranking function; and modifying the current weighting parameters of the ranking function by iteratively applying the ranking function with current weighting parameters to the feature vectors of each pair of resultant documents and when the ranking for the resultant documents of a pair is in error, adjusting the weighting parameters by comparing an evaluation measure of incorrect rankings of documents to an evaluation measure of correct rankings of the documents, wherein the weighting is set to an average of differences between the evaluation measure of the correct ranking and the evaluation measure of the incorrect rankings, such that an error in ranking is weighted more heavily when a resultant document with a higher relevance is ranked incorrectly than when a resultant document with a lower relevance is ranked incorrectly; and a component that ranks relevance of a document to a query that is not part of the collection wherein the component that trains the ranking function uses a gradient descent algorithm. - View Dependent Claims (2, 3, 4)
-
-
5. A computer-readable storage medium containing instructions for controlling a computing system to generate a ranking function to rank relevance of a document to a query, by a method comprising:
-
providing a collection of queries, resultant documents, and relevance of each resultant document to its query, the collection being generated by submitting the queries to a search engine with search results for each query being the resultant documents for that query and inputting the relevance of each resultant document to its query; and training a ranking function using the resultant documents and their relevances by weighting incorrect rankings of relevant resultant documents more heavily than incorrect rankings of not relevant resultant documents so that the ranking function more correctly ranks relevant resultant documents than it does not relevant resultant documents, wherein a different weighting is used for each rank pair where a rank pair represents a combination of two different relevance classifications, the ranking function being trained by for each resultant document, generating a feature vector of features for the resultant document, for each query, generating ordered pairs of resultant documents with different relevances; for each feature, initializing a current weighting parameter for the feature, the current weighting parameters forming the ranking function; and modifying the current weighting parameters of the ranking function by iteratively applying the ranking function with current weighting parameters to the feature vectors of each pair of resultant documents and when the ranking for the resultant documents of a pair is in error, adjusting the weighting parameters by comparing an evaluation measure of incorrect rankings of documents to an evaluation measure of correct rankings of the documents, wherein the weighting is set to an average of differences between the evaluation measure of the correct rankings and the evaluation measure of the incorrect rankings, such that an error in ranking is weighted more heavily when a resultant document with a higher relevance is ranked incorrectly than when a resultant document with a lower relevance is ranked incorrectly; and ranking relevance of a document to a query that is not part of the collection wherein the training of the ranking function uses a gradient descent algorithm. - View Dependent Claims (6, 7, 8)
-
Specification