Determination of relationships between collections of disparate media types
Abstract
Architecture that automatically determines relationships between vector spaces of disparate media types, and outputs ranker signals based on these relationships, all in a single process. The architecture improves search result relevance by simultaneously clustering queries and documents, and enables the training of a model for creating one or more ranker signals using simultaneous clustering of queries and documents in their respective spaces.
20 Claims
1. A system, comprising:
a relationship component that automatically determines relationships between disparate collections of media types, the collections and relationships computed in a single process and employed as part of processing a query to return documents at query time, the relationship component computing vectors for the collections, the vectors including query vectors and document vectors both of which are generated based on a collection-wise probabilistic algorithm and are processed with a similarity function to create a combined model of query-document labeled data; and
a processor that executes computer-executable instructions associated with at least the relationship component,
wherein the relationship component employs a cost function that defines the relationships between the disparate collections of media based on truth data, the defined relationships being usable in the processing of a query, and
wherein the collections are clusters that are concurrently created as query clusters and document clusters, and the relationship component computes the relationships between the query clusters and document clusters.
Dependent claims: 2, 3, 4, 5, 6, 7.

8. A system, comprising:
a relationship component that automatically determines relationships between disparate collections of media types, the collections and relationships computed in a single process and employed as part of processing a query to return documents at query time, the relationship component computing a total weight of each query in a query collection, and a probability that the query belongs in a given query collection; and
a processor that executes computer-executable instructions associated with at least the relationship component,
wherein the relationship component employs a cost function that defines the relationships between the disparate collections of media based on truth data, the defined relationships being usable in the processing of a query, and
wherein the collections are clusters that are concurrently created as query clusters and document clusters, and the relationship component computes the relationships between the query clusters and document clusters.

9. A system, comprising:
a relationship component that automatically determines relationships between disparate collections of media types, the collections and relationships computed in a single process and employed as part of processing a query to return documents at query time, the relationship component computing a total weight of each document in a document collection, and a probability that the document belongs in a given document collection; and
a processor that executes computer-executable instructions associated with at least the relationship component,
wherein the relationship component employs a cost function that defines the relationships between the disparate collections of media based on truth data, the defined relationships being usable in the processing of a query, and
wherein the collections are clusters that are concurrently created as query clusters and document clusters, and the relationship component computes the relationships between the query clusters and document clusters.

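Claims 8 and 9 recite a probability that a query (or document) belongs in a given collection, together with a total weight. The claims do not state how these quantities are computed, so the following is only an illustrative sketch under stated assumptions: membership is modeled as a softmax over negative squared distances to hypothetical cluster centroids, and the weight is taken to be the sum of membership probabilities per cluster. The function names, the centroids, and the `temperature` parameter are all hypothetical and do not come from the patent.

```python
import math

def membership_probabilities(vector, centroids, temperature=1.0):
    """Assumed model: softmax over negative squared distances to each centroid.

    Returns one probability per cluster; the probabilities sum to 1.
    """
    scores = [
        -sum((v - c) ** 2 for v, c in zip(vector, centroid)) / temperature
        for centroid in centroids
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cluster_weights(vectors, centroids):
    """Assumed reading of 'total weight': summed membership mass per cluster."""
    weights = [0.0] * len(centroids)
    for vec in vectors:
        for k, p in enumerate(membership_probabilities(vec, centroids)):
            weights[k] += p
    return weights

# Hypothetical two-dimensional query vectors and two query clusters.
queries = [[0.0, 0.0], [1.0, 1.0], [0.9, 1.1]]
query_centroids = [[0.0, 0.0], [1.0, 1.0]]

probs = [membership_probabilities(q, query_centroids) for q in queries]
weights = cluster_weights(queries, query_centroids)
```

The same two functions could be applied unchanged to document vectors and document centroids, which mirrors the parallel wording of claims 8 and 9.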
10. A method, comprising:
processing a query of a single, first media type to return documents of a different media type, as part of a training phase;
converting the query into collections of multi-dimensional query vectors and the documents into collections of multi-dimensional document vectors;
automatically computing relationships between the query vectors and the document vectors for relevancy of the query to a given document;
computing the relationship based on vector probabilities; and
utilizing a processor that executes instructions stored in memory to perform at least one of the acts of processing, converting, or computing.
Dependent claims: 11, 12, 13, 14, 15, 16.

17. A method, comprising:
as part of a training phase, processing a query of a single, first media type to return documents of a different media type;
converting the query into clusters of multi-dimensional query vectors and the documents into clusters of multi-dimensional document vectors, the query and document vectors having elements of probabilities;
automatically computing relationships between the query vectors and the document vectors based on vector probabilities;
computing a probability that the query and a document simultaneously belong to a same cluster based on the relationships;
applying a cost function to measure similarity between expected true labels and predicted labels as the relationships; and
utilizing a processor that executes instructions stored in memory to perform at least one of the acts of processing, converting, computing relationships, or computing a probability.
Dependent claims: 18, 19, 20.

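Claim 17 recites query and document vectors whose elements are probabilities, relationships computed between them, and a cost function comparing expected true labels with predicted labels. The claim does not fix a particular formula, so the sketch below is one possible reading and not the patented method: the relationship is assumed to be a matrix `link` connecting query clusters to document clusters, the predicted label is assumed to be the bilinear form of the two probability vectors through that matrix, and the cost is assumed to be binary cross-entropy. All names and numeric values are hypothetical.

```python
import math

def predicted_relevance(query_probs, doc_probs, link):
    """Assumed form: sum over i, j of q[i] * link[i][j] * d[j].

    With probability vectors and link entries in [0, 1], the result is in [0, 1].
    """
    return sum(
        query_probs[i] * link[i][j] * doc_probs[j]
        for i in range(len(query_probs))
        for j in range(len(doc_probs))
    )

def cross_entropy_cost(examples, link, eps=1e-12):
    """Assumed cost: mean binary cross-entropy between true and predicted labels."""
    total = 0.0
    for query_probs, doc_probs, label in examples:
        p = predicted_relevance(query_probs, doc_probs, link)
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return total / len(examples)

# Hypothetical labeled pairs: (query cluster probs, document cluster probs, true label).
examples = [
    ([0.9, 0.1], [0.8, 0.2], 1),  # relevant pair
    ([0.9, 0.1], [0.1, 0.9], 0),  # non-relevant pair
]

# Two candidate relationship matrices between query clusters and document clusters.
link_aligned = [[0.9, 0.1], [0.1, 0.9]]
link_crossed = [[0.1, 0.9], [0.9, 0.1]]

cost_aligned = cross_entropy_cost(examples, link_aligned)
cost_crossed = cross_entropy_cost(examples, link_crossed)
```

Under these assumptions, a training procedure would search for the `link` matrix with the lowest cost on the truth data; here the aligned matrix fits the two labeled pairs better than the crossed one.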
Specification