Method and apparatus for machine learning a document relevance function
Abstract
Provided is a method and computer program product for determining a document relevance function for estimating a relevance score of a document in a database with respect to a query. For each of a plurality of test queries, a respective set of result documents is collected. For each test query, a subset of the documents in the respective result set is selected, and a set of training relevance scores is assigned to documents in the subset. In one embodiment, at least some of the training relevance scores are assigned by human subjects who determine individual relevance scores for submitted documents with respect to the corresponding queries. Finally, a relevance function is determined based on the plurality of test queries, the subsets of documents, and the sets of training relevance scores.
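The steps summarized in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the toy `DATABASE`, the term-overlap `features`, the `toy_judge` standing in for human subjects, uniform random sampling of the subset, and the linear model fitted by gradient descent are all assumptions introduced here, since this section does not specify how the subset is chosen or what form the relevance function takes.

```python
import random

# Hypothetical toy database (doc id -> text); not taken from the patent.
DATABASE = {
    "d1": "machine learning for document ranking",
    "d2": "learning to rank search results",
    "d3": "home cooking made simple",
    "d4": "ranking documents with machine learning models",
    "d5": "home gardening tips",
}

# Hypothetical judgments standing in for scores assigned by human subjects.
RELEVANT = {"machine learning": {"d1", "d4"}, "home cooking": {"d3"}}


def toy_judge(query, doc_id):
    """Return a training relevance score for a (query, document) pair."""
    return 1.0 if doc_id in RELEVANT.get(query, set()) else 0.0


def features(query, doc_text):
    """Assumed features: fraction of query terms found in the doc, plus a bias."""
    q_terms = query.split()
    d_terms = set(doc_text.split())
    overlap = sum(t in d_terms for t in q_terms) / len(q_terms)
    return [overlap, 1.0]


def collect_result_set(query):
    """Collect a result set for a test query (here: every document)."""
    return list(DATABASE)


def select_subset(result_set, k, rng):
    """Select a subset of the result set (here: uniform random sampling)."""
    return rng.sample(result_set, min(k, len(result_set)))


def fit_relevance_function(examples, epochs=2000, lr=0.1):
    """Fit linear weights by stochastic gradient descent on squared error."""
    w = [0.0, 0.0]
    for _ in range(epochs):
        for x, y in examples:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]

    def relevance(query, doc_text):
        return sum(wi * xi for wi, xi in zip(w, features(query, doc_text)))

    return relevance


def train(test_queries, judge, k=4, seed=0):
    """Run the collect / select / score / fit steps over all test queries."""
    rng = random.Random(seed)
    examples = []
    for query in test_queries:
        subset = select_subset(collect_result_set(query), k, rng)
        for doc_id in subset:
            score = judge(query, doc_id)
            examples.append((features(query, DATABASE[doc_id]), score))
    return fit_relevance_function(examples)


relevance = train(["machine learning", "home cooking"], toy_judge)
query = "machine learning"
ranked = sorted(DATABASE, key=lambda d: relevance(query, DATABASE[d]), reverse=True)
print(ranked)
```

The final lines apply the learned function to order documents for a query. Any regression method could replace the linear fit; it is used here only because it keeps the sketch short.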
56 Claims
1. A method for determining a document relevance function for estimating a relevance score of a document in a database with respect to a query, comprising:
(a) collecting a respective result set of documents from the database for each of a plurality of test queries;
(b) for each test query of the plurality of test queries, selecting a subset of the documents in the respective result set; and
assigning a set of training relevance scores to the documents in the subset; and
(c) determining a relevance function based on the plurality of test queries, the subsets of documents, and the sets of training relevance scores;
(d) outputting a list of one or more documents ordered based on the determined relevance function.
Dependent claims: 2 to 28.
29. A computer program product for use in conjunction with a computer system, the computer program product comprising a computer readable storage medium and a computer program mechanism therein, the computer program mechanism comprising:
(a) a collecting module for collecting a respective result set of documents from the database for each of a plurality of test queries;
(b) a sampling module for selecting, for each test query of the plurality of test queries, a subset of the documents in the respective result set;
(c) a scoring module for assigning a set of training relevance scores to the documents in each selected subset; and
(d) a relevance function generation module for determining a relevance function based on the plurality of test queries, the subsets of documents, and the sets of training relevance scores.
Dependent claims: 30 to 56.
Specification