Learning a document ranking function using fidelity-based error measurements
First Claim
1. A method in a computing device for determining loss between a target probability and a model probability for documents when training a ranking function based on training data, the training data including documents and the target probability of relative relevance of pairs of documents to queries, the model probability being generated by a ranking function that ranks documents, the method comprising:
- training the ranking function by repeating the following until a calculated loss is below a threshold loss;
selecting a new ranking function by modifying a previous ranking function to reduce the calculated loss;
applying the new ranking function to the pairs of documents of the training data to provide new rankings of the documents based on the queries;
calculating by the computing device a model probability from the new rankings of the documents; and
calculating by the computing device a loss between the calculated model probability and the target probability to indicate a difference between the new ranking of a pair of documents represented by the calculated model probability and a ranking of the pair of documents represented by the target probability, the loss varying between 0 and 1 and the loss being 0 when the calculated model probability is the same as the target probability
wherein the calculated loss is a fidelity loss and
wherein the fidelity-based loss is represented by the following equation:
$$F_{ij} = 1 - \left(\sqrt{P_{ij}^{*} \cdot P_{ij}} + \sqrt{(1 - P_{ij}^{*}) \cdot (1 - P_{ij})}\right)$$
where Fij represents the fidelity loss, Pij* represents the target probability for documents i and j, and Pij represents the calculated model probability for documents i and j.
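As a minimal sketch of the loss this claim describes, the Python function below evaluates the fidelity loss using the equation written out in claim 14, then checks the two properties the claim states: the loss is 0 when the two probabilities match, and it stays between 0 and 1. The function name `fidelity_loss` and the clamping of rounding error are illustrative assumptions, not part of the patent text.

```python
import math


def fidelity_loss(p_target: float, p_model: float) -> float:
    """Fidelity loss F_ij = 1 - (sqrt(P* * P) + sqrt((1 - P*) * (1 - P))).

    `fidelity_loss` is a hypothetical name chosen for this sketch.
    """
    if not (0.0 <= p_target <= 1.0 and 0.0 <= p_model <= 1.0):
        raise ValueError("probabilities must lie in [0, 1]")
    overlap = math.sqrt(p_target * p_model) + math.sqrt(
        (1.0 - p_target) * (1.0 - p_model)
    )
    # Clamp tiny negative values that floating-point rounding can produce.
    return max(0.0, 1.0 - overlap)


if __name__ == "__main__":
    # Identical probabilities give (numerically) zero loss.
    print(fidelity_loss(0.7, 0.7))
    # Completely opposed probabilities give the maximum loss.
    print(fidelity_loss(1.0, 0.0))
```

Unlike a cross-entropy style loss, this quantity stays finite even when the model probability reaches 0 or 1, which is consistent with the claim's statement that the loss varies between 0 and 1.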
2 Assignments
0 Petitions
Abstract
A method and system for generating a ranking function using a fidelity-based loss between a target probability and a model probability for a pair of documents is provided. A fidelity ranking system generates a fidelity ranking function that ranks the relevance of documents to queries. The fidelity ranking system operates to minimize a fidelity loss between pairs of documents of training data. The fidelity loss may be derived from “fidelity” as used in the field of quantum physics. The fidelity ranking system may use a learning technique in conjunction with a fidelity loss when generating the ranking function. After the fidelity ranking system generates the fidelity ranking function, it uses the fidelity ranking function to rank the relevance of documents to queries.
56 Citations
19 Claims
1. A method in a computing device for determining loss between a target probability and a model probability for documents when training a ranking function based on training data, the training data including documents and the target probability of relative relevance of pairs of documents to queries, the model probability being generated by a ranking function that ranks documents, the method comprising:
- training the ranking function by repeating the following until a calculated loss is below a threshold loss;
selecting a new ranking function by modifying a previous ranking function to reduce the calculated loss;
applying the new ranking function to the pairs of documents of the training data to provide new rankings of the documents based on the queries;
calculating by the computing device a model probability from the new rankings of the documents; and
calculating by the computing device a loss between the calculated model probability and the target probability to indicate a difference between the new ranking of a pair of documents represented by the calculated model probability and a ranking of the pair of documents represented by the target probability, the loss varying between 0 and 1 and the loss being 0 when the calculated model probability is the same as the target probability
wherein the calculated loss is a fidelity loss and
wherein the fidelity-based loss is represented by the following equation:
$$F_{ij} = 1 - \left(\sqrt{P_{ij}^{*} \cdot P_{ij}} + \sqrt{(1 - P_{ij}^{*}) \cdot (1 - P_{ij})}\right)$$
where Fij represents the fidelity loss, Pij* represents the target probability for documents i and j, and Pij represents the calculated model probability for documents i and j.
- View Dependent Claims (2, 3, 4, 5, 6)
7. A method in a computing device for determining loss between a target probability and a model probability for a pair of documents, the model probability being generated by a ranking function that ranks documents, the method comprising:
- applying the ranking function to the pair of documents to provide rankings of the documents;
- calculating a model probability from the rankings of the documents; and
- calculating by the computing device a fidelity loss between the calculated model probability and the target probability, the fidelity loss varying between 0 and 1 and the loss being 0 when the calculated model probability is the same as the target probability
wherein the calculating of a model probability applies a logistic function to the ranking of the documents
wherein the logistic function is represented by the following equation:
$$P_{ij} = \frac{e^{o_{ij}}}{1 + e^{o_{ij}}}$$
where Pij represents the probability that document i is ranked higher than document j and oij represents the difference between outputs of a fidelity ranking function for document i and document j as represented by f(di)−f(dj) with f(di) being the output of the fidelity ranking function for document i.
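The mapping from ranking-function outputs to a model probability can be sketched as follows. This assumes the standard logistic form e^o / (1 + e^o) applied to the score difference oij = f(di) − f(dj); the function name `pair_probability` and the two-branch evaluation for numerical stability are illustrative choices, not taken from the patent.

```python
import math


def pair_probability(score_i: float, score_j: float) -> float:
    """Probability that document i ranks above document j.

    Applies the logistic function to o_ij = f(d_i) - f(d_j). Both branches
    compute e^o / (1 + e^o); they differ only in which form avoids overflow.
    """
    o_ij = score_i - score_j
    if o_ij >= 0:
        # Equivalent form 1 / (1 + e^-o), safe for large positive o.
        return 1.0 / (1.0 + math.exp(-o_ij))
    e = math.exp(o_ij)  # safe for large negative o (underflows to 0.0)
    return e / (1.0 + e)


if __name__ == "__main__":
    print(pair_probability(2.0, 2.0))  # equal scores: no preference
    print(pair_probability(3.0, 1.0))  # higher score for i: above one half
```

Swapping the two documents gives the complementary probability, so Pij + Pji = 1 under this form.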
8. A computing device for generating a ranking function for documents, the ranking function indicating a ranking of documents based on relevance of the documents to a query, the system comprising:
- a processor; and
- a memory with computer-executable instructions that implement
a component that provides features of documents and indications of target probabilities of relative rankings of the relevance of pairs of documents to queries;
a component that calculates a fidelity loss between a model probability and a target probability for a pair of documents, the probabilities indicating a probability of relative ranking of the documents of the pair; and
a component that generates the ranking function by operating to minimize the fidelity loss between the model probabilities derived from the ranking of documents and the target probabilities
wherein the model probability is derived by applying a logistic function to the ranking of the documents, wherein the logistic function is represented by the following equation:
$$P_{ij} = \frac{e^{o_{ij}}}{1 + e^{o_{ij}}}$$
where Pij represents the probability that document i is ranked higher than document j and oij represents the difference between outputs of a fidelity ranking function for document i and document j as represented by f(di)−f(dj) with f(di) being the output of the fidelity ranking function for document i, and
wherein the fidelity loss is represented by the following equation:
$$F_{ij} = 1 - \left(\sqrt{P_{ij}^{*} \cdot P_{ij}} + \sqrt{(1 - P_{ij}^{*}) \cdot (1 - P_{ij})}\right)$$
where Fij represents the fidelity loss, Pij* represents the target probability for documents i and j, and Pij represents the calculated model probability for documents i and j.
- View Dependent Claims (9, 11, 12, 13)
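Minimizing the fidelity loss requires knowing how the loss changes with the score difference oij. The sketch below composes the logistic function with the fidelity loss and compares a hand-derived derivative against a finite-difference estimate. The derivative expression is worked out here from the two formulas and is not quoted from the patent; the standard logistic form and all function names are assumptions for illustration.

```python
import math


def sigmoid(o: float) -> float:
    """Logistic function e^o / (1 + e^o), evaluated without overflow."""
    if o >= 0:
        return 1.0 / (1.0 + math.exp(-o))
    e = math.exp(o)
    return e / (1.0 + e)


def pair_loss(p_target: float, o_ij: float) -> float:
    """Fidelity loss as a function of the score difference o_ij."""
    p = sigmoid(o_ij)
    return 1.0 - (
        math.sqrt(p_target * p) + math.sqrt((1.0 - p_target) * (1.0 - p))
    )


def pair_loss_grad(p_target: float, o_ij: float) -> float:
    """Hand-derived dF/do, using dP/do = P * (1 - P):

    dF/do = 0.5 * sqrt(P(1-P)) * (sqrt((1-P*) P) - sqrt(P* (1-P)))
    """
    p = sigmoid(o_ij)
    return 0.5 * math.sqrt(p * (1.0 - p)) * (
        math.sqrt((1.0 - p_target) * p) - math.sqrt(p_target * (1.0 - p))
    )


def numeric_grad(p_target: float, o_ij: float, eps: float = 1e-6) -> float:
    """Central finite-difference estimate of dF/do, used as a cross-check."""
    return (
        pair_loss(p_target, o_ij + eps) - pair_loss(p_target, o_ij - eps)
    ) / (2.0 * eps)


if __name__ == "__main__":
    for p_star, o in [(1.0, 0.0), (0.5, 1.5), (0.2, -2.0)]:
        print(p_star, o, pair_loss_grad(p_star, o), numeric_grad(p_star, o))
```

The derivative is zero when the model probability equals the target probability, matching the claim language that the loss is 0 at that point.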
10. A computing device for generating a ranking function for documents, the ranking function indicating a ranking of documents based on relevance of the documents to a query, the system comprising:
- a processor; and
- a memory with computer-executable instructions that implement
a component that provides features of documents and indications of target probabilities of relative rankings of the relevance of pairs of documents to queries;
a component that calculates a fidelity loss between a model probability and a target probability for a pair of documents, the probabilities indicating a probability of relative ranking of the documents of the pair; and
a component that generates the ranking function by operating to minimize the fidelity loss between the model probabilities derived from the ranking of documents and the target probabilities
wherein the fidelity loss varies between 0 and 1 and the fidelity loss is 0 when the model probability is the same as the target probability and
wherein the fidelity-based loss is represented by the following equation:
$$F_{ij} = 1 - \left(\sqrt{P_{ij}^{*} \cdot P_{ij}} + \sqrt{(1 - P_{ij}^{*}) \cdot (1 - P_{ij})}\right)$$
where Fij represents the fidelity loss, Pij* represents the target probability for documents i and j, and Pij represents the calculated model probability for documents i and j.
14. A computing device for determining loss between a target probability and a model probability for documents when training a ranking function based on training data, the training data including documents and the target probability of relative relevance of pairs of documents to queries, the model probability being generated by a ranking function that ranks documents, comprising:
- a memory storing computer-executable instructions of;
a component that trains the ranking function by repeating the following until a calculated loss is below a threshold loss;
selecting a new ranking function by modifying a previous ranking function to reduce the calculated loss;
applying the new ranking function to the pairs of documents of the training data to provide new rankings of the documents based on the queries;
calculating by the computing device a model probability from the new rankings of the documents; and
calculating by the computing device a loss between the calculated model probability and the target probability to indicate a difference between the new ranking of a pair of documents represented by the calculated model probability and a ranking of the pair of documents represented by the target probability, the loss varying between 0 and 1 and the loss being 0 when the calculated model probability is the same as the target probability; and
- a processor for executing the computer-executable instructions stored in the memory
wherein the calculated loss is a fidelity loss and
wherein the fidelity-based loss is represented by the following equation:
$$F_{ij} = 1 - \left(\sqrt{P_{ij}^{*} \cdot P_{ij}} + \sqrt{(1 - P_{ij}^{*}) \cdot (1 - P_{ij})}\right)$$
where Fij represents the fidelity loss, Pij* represents the target probability for documents i and j, and Pij represents the calculated model probability for documents i and j.
- View Dependent Claims (15, 16, 17, 18, 19)
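The repeat-until-threshold loop in this claim can be illustrated with a small sketch. The claims do not say how the ranking function is modified at each step, so this example substitutes plain gradient descent on a linear scoring function f(d) = w · x, with the logistic function assumed for the model probability. The names `train`, `lr`, and `max_iters`, the toy feature vectors, and the iteration cap are all assumptions made for illustration.

```python
import math


def sigmoid(o):
    """Logistic function e^o / (1 + e^o), evaluated without overflow."""
    if o >= 0:
        return 1.0 / (1.0 + math.exp(-o))
    e = math.exp(o)
    return e / (1.0 + e)


def fidelity(p_star, p):
    """Fidelity loss between target probability p_star and model probability p."""
    overlap = math.sqrt(p_star * p) + math.sqrt((1 - p_star) * (1 - p))
    return max(0.0, 1.0 - overlap)


def score(w, x):
    """Assumed linear ranking function f(d) = w . x over document features x."""
    return sum(wk * xk for wk, xk in zip(w, x))


def mean_loss(w, pairs):
    """Average fidelity loss over (features_i, features_j, target_prob) pairs."""
    total = 0.0
    for xi, xj, p_star in pairs:
        total += fidelity(p_star, sigmoid(score(w, xi) - score(w, xj)))
    return total / len(pairs)


def train(pairs, n_features, threshold=0.01, lr=0.5, max_iters=5000):
    """Repeat: modify the ranking function, re-rank, recompute the loss."""
    w = [0.0] * n_features
    loss = mean_loss(w, pairs)
    iters = 0
    while loss >= threshold and iters < max_iters:
        grad = [0.0] * n_features
        for xi, xj, p_star in pairs:
            p = sigmoid(score(w, xi) - score(w, xj))
            # Derivative of the fidelity loss w.r.t. the score difference.
            d_loss = 0.5 * math.sqrt(p * (1 - p)) * (
                math.sqrt((1 - p_star) * p) - math.sqrt(p_star * (1 - p))
            )
            for k in range(n_features):
                grad[k] += d_loss * (xi[k] - xj[k]) / len(pairs)
        # "Select a new ranking function" by stepping against the gradient.
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
        loss = mean_loss(w, pairs)
        iters += 1
    return w, loss, iters


if __name__ == "__main__":
    # Toy pairs: target 1.0 means the first document should rank higher.
    toy_pairs = [
        ([1.0, 0.0], [0.0, 1.0], 1.0),
        ([2.0, 0.5], [0.5, 1.0], 1.0),
        ([0.2, 1.0], [1.0, 0.1], 0.0),
    ]
    weights, final_loss, steps = train(toy_pairs, n_features=2)
    print(weights, final_loss, steps)
```

The `max_iters` cap is a safeguard added for the sketch: the claim's loop ends only when the loss falls below the threshold, which may never happen on training data that the chosen ranking function cannot fit.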
Specification