Retrieval systems and methods employing probabilistic cross-media relevance feedback
Abstract
In a retrieval application, a document relevance scoring function comprises a weighted combination of scoring components including at least one of a pseudo-relevance scoring component and a cross-media relevance scoring component. Weights of the document relevance scoring function are optimized to generate a trained document relevance scoring function. The optimizing is respective to a set of training documents including at least some multimedia training documents and a set of training queries and corresponding training document relevance annotations. A retrieval operation is performed for an input query respective to a database using the trained document relevance scoring function to retrieve one or more documents from the database.
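The weighted combination described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the component names (`pseudo_relevance`, `cross_media`), the component values, and the weights are hypothetical placeholders, and how each component score is actually computed is left out.

```python
from typing import Dict, List, Tuple


def combined_score(weights: Dict[str, float], components: Dict[str, float]) -> float:
    """f(q,d) for one (query, document) pair as a weighted sum of component scores."""
    return sum(weights[name] * components[name] for name in weights)


def rank_documents(
    weights: Dict[str, float],
    scored_docs: Dict[str, Dict[str, float]],
) -> List[Tuple[str, float]]:
    """Return (document id, f(q,d)) pairs sorted from most to least relevant."""
    scores = {doc: combined_score(weights, comps) for doc, comps in scored_docs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical weights and per-document component scores for a single query.
weights = {"pseudo_relevance": 0.5, "cross_media": 0.25}
scored_docs = {
    "doc_a": {"pseudo_relevance": 0.4, "cross_media": 0.8},
    "doc_b": {"pseudo_relevance": 0.6, "cross_media": 0.0},
}
ranking = rank_documents(weights, scored_docs)
```

In this toy setting the retrieval step amounts to returning the top entries of `ranking`; the weights are the quantities that the training step would optimize.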
22 Claims
1. A non-transitory storage medium storing instructions executable by a digital processor to perform a method comprising:
optimizing weights of a document relevance scoring function to generate a trained document relevance scoring function ƒ(q,d) where q denotes a query and d denotes a document, wherein the document relevance scoring function comprises a weighted combination of scoring components including at least one pseudo-relevance scoring component and at least one cross-media relevance scoring component, and the optimizing is respective to a set of training documents including at least some multimedia training documents and a set of training queries and corresponding training document relevance annotations, the optimizing comprising optimizing a distribution-matching objective function respective to matching between: a distribution
Dependent claims: 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14
7. A method comprising:
optimizing weights of a document relevance scoring function to generate a trained document relevance scoring function ƒ(q,d) where q denotes a query and d denotes a document, wherein the document relevance scoring function comprises a weighted combination of scoring components including at least one pseudo-relevance scoring component and at least one cross-media relevance scoring component, and the optimizing is respective to a set of training documents including at least some multimedia training documents and a set of training queries and corresponding training document relevance annotations, the optimizing including optimizing a distribution-matching objective function respective to matching between:
a distribution p_q(d) of document relevance computed using the document relevance scoring function ƒ(q,d) for the set of training queries and the set of training documents D, and
a distribution p_q*(d) of the training document relevance annotations corresponding to the set of training queries wherein p_q*(d) is uniform over a set of documents R_q that are relevant to training query q and zero for all other training documents,
the optimizing further including:
for each training query of the set of training queries, computing training document relevance values for the training documents using the document relevance scoring function ƒ(q,d); and
scaling the computed training document relevance values using training query-dependent scaling factors, wherein the optimizing also optimizes the training query-dependent scaling factors; and
performing a retrieval operation for an input query respective to a database using the trained document relevance scoring function to retrieve one or more documents from the database;
wherein the optimizing and the performing are performed by a digital processor.
Dependent claims: 19, 20, 21, 22
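The two distributions named in claim 7 can be sketched for a single training query as follows. The claim fixes the target distribution (uniform over the relevant set, zero elsewhere) but does not state how the computed distribution is obtained from the scaled scores or which divergence is used. The softmax normalization, the multiplicative scaling factor `alpha_q`, and the KL divergence below are therefore assumptions made for illustration, as are all function names and score values.

```python
import math
from typing import Dict, Iterable, Set


def predicted_distribution(scores: Dict[str, float], alpha_q: float) -> Dict[str, float]:
    """Computed distribution over training documents for one query.

    ASSUMPTION: a softmax of alpha_q * f(q,d); the claim only says the scores
    are scaled by a query-dependent factor.
    """
    scaled = {d: alpha_q * s for d, s in scores.items()}
    top = max(scaled.values())  # subtract the max for numerical stability
    exps = {d: math.exp(v - top) for d, v in scaled.items()}
    total = sum(exps.values())
    return {d: e / total for d, e in exps.items()}


def target_distribution(doc_ids: Iterable[str], relevant: Set[str]) -> Dict[str, float]:
    """Annotation distribution: uniform over the relevant set, zero elsewhere."""
    return {d: (1.0 / len(relevant) if d in relevant else 0.0) for d in doc_ids}


def kl_divergence(target: Dict[str, float], predicted: Dict[str, float]) -> float:
    """KL(target || predicted). ASSUMPTION: one possible distribution-matching objective."""
    return sum(t * math.log(t / predicted[d]) for d, t in target.items() if t > 0.0)


# Hypothetical scores f(q,d) for three training documents; d1 and d2 are relevant.
scores = {"d1": 2.0, "d2": 1.0, "d3": -1.0}
relevant = {"d1", "d2"}
p = predicted_distribution(scores, alpha_q=1.0)
p_star = target_distribution(scores, relevant)
objective = kl_divergence(p_star, p)
```

Under these assumptions the objective shrinks as the relevant documents receive more of the probability mass, which is the behaviour a distribution-matching criterion is meant to reward.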
15. An apparatus comprising:
a digital processor configured to train a document relevance scoring function to generate a trained document relevance scoring function ƒ(q,d) where q denotes a query and d denotes a document, wherein the document relevance scoring function comprises a weighted linear combination of scoring components including at least one pseudo-relevance scoring component and at least one cross-media relevance scoring component, the training adjusts weights of the weighted linear combination of scoring components, and the training is respective to a set of training documents including at least some multimedia training documents and a set of training queries and corresponding training document relevance annotations, wherein the digital processor is configured to train the document relevance scoring function by a process including:
for each training query of the set of training queries, computing training document relevance values for the training documents using the document relevance scoring function ƒ(q,d);
scaling the computed training document relevance values using training query-dependent scaling factors; and
adjusting (i) weights of the weighted linear combination of scoring components and (ii) the training query-dependent scaling factors to optimize a distribution-matching objective function measuring an aggregate similarity between the computed training document relevance values and the corresponding training document relevance annotations, wherein the distribution-matching objective function is respective to matching between (1) a distribution p_q(d) of document relevance computed using the document relevance scoring function ƒ(q,d) for the set of training queries and the set of training documents D, and (2) a distribution p_q*(d) of the training document relevance annotations corresponding to the set of training queries wherein p_q*(d) is uniform over a set of documents R_q that are relevant to training query q and zero for all other training documents.
Dependent claims: 16, 17, 18
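The training process recited in claim 15 (compute scores per query, scale them, then adjust both the weights and the per-query scaling factors) can be sketched with plain gradient descent. The claim does not name an optimizer or the exact form of the objective, so the softmax normalization, the cross-entropy loss (which differs from the KL divergence to a uniform target only by a constant), the learning rate, the step count, and the toy data are all assumptions for illustration.

```python
import math
from typing import Dict, List, Set, Tuple

# One training query: per-document component scores plus the annotated relevant set.
QueryData = Tuple[Dict[str, List[float]], Set[str]]


def softmax(values: List[float]) -> List[float]:
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def linear_score(w: List[float], x: List[float]) -> float:
    """Weighted linear combination of the component scores for one document."""
    return sum(wk * xk for wk, xk in zip(w, x))


def total_loss(w: List[float], alphas: Dict[str, float], data: Dict[str, QueryData]) -> float:
    """Sum over queries of the cross-entropy between the uniform target and the softmax."""
    loss = 0.0
    for q, (feats, relevant) in data.items():
        docs = sorted(feats)
        p = softmax([alphas[q] * linear_score(w, feats[d]) for d in docs])
        loss -= sum(math.log(p[i]) for i, d in enumerate(docs) if d in relevant) / len(relevant)
    return loss


def train(data: Dict[str, QueryData], n_components: int, steps: int = 200, lr: float = 0.05):
    """Jointly adjust the shared weights w and one scaling factor alpha_q per query."""
    w = [0.0] * n_components
    alphas = {q: 1.0 for q in data}
    for _ in range(steps):
        grad_w = [0.0] * n_components
        grad_alpha = {q: 0.0 for q in data}
        for q, (feats, relevant) in data.items():
            docs = sorted(feats)
            f = [linear_score(w, feats[d]) for d in docs]
            p = softmax([alphas[q] * fd for fd in f])
            for i, d in enumerate(docs):
                target = 1.0 / len(relevant) if d in relevant else 0.0
                err = p[i] - target  # gradient of the loss w.r.t. the scaled score
                grad_alpha[q] += err * f[i]
                for k in range(n_components):
                    grad_w[k] += alphas[q] * err * feats[d][k]
        w = [wk - lr * gk for wk, gk in zip(w, grad_w)]
        alphas = {q: alphas[q] - lr * grad_alpha[q] for q in data}
    return w, alphas


# Hypothetical toy data: two queries, two scoring components per document.
toy_data: Dict[str, QueryData] = {
    "q1": ({"a": [1.0, 0.2], "b": [0.1, 0.9], "c": [0.0, 0.1]}, {"a"}),
    "q2": ({"x": [0.8, 0.1], "y": [0.2, 0.3], "z": [0.1, 0.0]}, {"x", "y"}),
}
initial_loss = total_loss([0.0, 0.0], {q: 1.0 for q in toy_data}, toy_data)
w, alphas = train(toy_data, n_components=2)
final_loss = total_loss(w, alphas, toy_data)
```

Only the weights `w` would be kept for retrieval in this sketch; the scaling factors `alphas` exist per training query and serve to make scores comparable across queries while fitting.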
Specification