Generating query suggestions using contextual information
First Claim
1. A system comprising:
- a computer-readable medium coupled to the one or more computers having instructions stored thereon which, when executed by the one or more computers, cause the one or more computers to perform operations comprising:
receiving an original query;
selecting one or more documents responsive to the original query;
generating a term vector for each document, each term vector being a vector of highest-weighted terms selected from the respective document;
generating a search query centroid from the term vectors, the search query centroid being a vector of terms, the terms in the search query centroid being the most common terms among the terms in the term vectors;
searching a centroid repository for previously stored centroids matching the search query centroid;
calculating a dot product of each previously stored centroid and the search query centroid, the dot product indicating a degree of similarity between each previously stored centroid and the search query centroid;
sorting the previously stored centroids by the respective dot products to produce a ranked list of centroids, where the most highly-ranked centroids most closely match the search query centroid;
converting each of a first number of the most highly-ranked centroids into a candidate query;
examining the candidate queries in a ranked order;
adding each candidate query to a set of suggestions if the respective candidate query contains a threshold number of terms that are not included in the original query; and
providing the set of suggestions in response to the original query.
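The term-vector and centroid steps recited in this claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `term_vector` and `query_centroid`, the parameters `top_k` and `size`, whitespace tokenization, and the use of raw term frequency as the term weight are all assumptions made for illustration. The claim only requires "highest-weighted terms" per document and the "most common terms" across the term vectors.

```python
from collections import Counter


def term_vector(document_text, top_k=5):
    """Return the top_k highest-weighted terms of one document.

    Assumption: the weight is the raw term frequency; the claim does not
    say how terms are weighted.
    """
    counts = Counter(document_text.lower().split())
    # Sort by descending weight, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:top_k])


def query_centroid(term_vectors, size=3):
    """Build a centroid from the terms shared by the most term vectors.

    Assumption: each centroid weight is the fraction of term vectors
    that contain the term.
    """
    doc_freq = Counter()
    for vec in term_vectors:
        doc_freq.update(vec.keys())
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return {term: count / len(term_vectors) for term, count in ranked[:size]}


# Hypothetical documents standing in for the results of an original query.
docs = ["jaguar car speed car", "jaguar car engine", "jaguar cat habitat"]
vectors = [term_vector(d) for d in docs]
centroid = query_centroid(vectors, size=2)
```

In this toy run, `jaguar` appears in all three term vectors and `car` in two, so those two terms form the search query centroid.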
Abstract
A search engine receives a query from an end-user. The search engine executes the query on a content database and identifies a set of matching content. The search engine utilizes the matching content to generate a query vector describing the end-user query. The search engine searches a repository of other vectors, called “centroids,” to produce a ranked set of centroids matching the query vector. These centroids are converted into search queries and form a set of candidate queries. The search engine filters the candidate queries to identify ones that are likely to be meaningful to the end-user. The selected candidate queries are returned to the end-user as query suggestions.
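The ranking and filtering stages described in the abstract can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patent's implementation: centroids are modeled as `dict` objects mapping terms to weights, the names `dot` and `suggest` and the parameters `first_n` and `min_new_terms` are hypothetical, and a centroid is converted to a candidate query by joining its terms in descending weight order, which this excerpt does not specify.

```python
def dot(a, b):
    """Dot product of two sparse term vectors stored as dicts."""
    return sum(weight * b.get(term, 0.0) for term, weight in a.items())


def suggest(original_query, query_centroid, repository, first_n=3, min_new_terms=1):
    """Rank stored centroids against the query centroid and filter candidates."""
    # A higher dot product means a closer match to the query centroid.
    ranked = sorted(repository, key=lambda c: dot(c, query_centroid), reverse=True)
    original_terms = set(original_query.lower().split())
    suggestions = []
    # Examine the first_n best centroids in ranked order.
    for stored in ranked[:first_n]:
        # Assumed conversion: list the terms by descending weight.
        terms = sorted(stored, key=lambda t: (-stored[t], t))
        new_terms = [t for t in terms if t not in original_terms]
        # Keep the candidate only if it adds enough terms to the original query.
        if len(new_terms) >= min_new_terms:
            suggestions.append(" ".join(terms))
    return suggestions


# Hypothetical repository of previously stored centroids.
repository = [
    {"jaguar": 1.0, "cat": 0.8},
    {"jaguar": 1.0, "car": 1.0, "dealer": 0.5},
    {"python": 1.0},
    {"jaguar": 0.9},
]
result = suggest("jaguar", {"jaguar": 1.0, "car": 0.5}, repository)
```

Here the stored centroid containing only `jaguar` ranks third but is dropped, because it contributes no term that the original query lacks.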
12 Claims
1. A system comprising:
- a computer-readable medium coupled to the one or more computers having instructions stored thereon which, when executed by the one or more computers, cause the one or more computers to perform operations comprising:
receiving an original query;
selecting one or more documents responsive to the original query;
generating a term vector for each document, each term vector being a vector of highest-weighted terms selected from the respective document;
generating a search query centroid from the term vectors, the search query centroid being a vector of terms, the terms in the search query centroid being the most common terms among the terms in the term vectors;
searching a centroid repository for previously stored centroids matching the search query centroid;
calculating a dot product of each previously stored centroid and the search query centroid, the dot product indicating a degree of similarity between each previously stored centroid and the search query centroid;
sorting the previously stored centroids by the respective dot products to produce a ranked list of centroids, where the most highly-ranked centroids most closely match the search query centroid;
converting each of a first number of the most highly-ranked centroids into a candidate query;
examining the candidate queries in a ranked order;
adding each candidate query to a set of suggestions if the respective candidate query contains a threshold number of terms that are not included in the original query; and
providing the set of suggestions in response to the original query.
(Dependent claims: 2, 3, 4)
5. A computer program product having a computer-readable storage medium having executable computer program instructions recorded thereon for providing suggestions to a client computer, the computer program instructions configured to implement a method comprising:
receiving an original query;
selecting one or more documents responsive to the original query;
generating a term vector for each document, each term vector being a vector of highest-weighted terms selected from the respective document;
generating a search query centroid from the term vectors, the search query centroid being a vector of terms, the terms in the search query centroid being the most common terms among the terms in the term vectors;
searching a centroid repository for previously stored centroids matching the search query centroid;
calculating a dot product of each previously stored centroid and the search query centroid, the dot product indicating a degree of similarity between each previously stored centroid and the search query centroid;
sorting the previously stored centroids by the respective dot products to produce a ranked list of centroids, where the most highly-ranked centroids most closely match the search query centroid;
converting each of a first number of the most highly-ranked centroids into a candidate query;
examining the candidate queries in a ranked order;
adding each candidate query to a set of suggestions if the respective candidate query contains a threshold number of terms that are not included in the original query; and
providing the set of suggestions in response to the original query.
(Dependent claims: 6, 7, 8)
9. A method for providing suggestions to a client computer, comprising:
receiving an original query;
selecting one or more documents responsive to the query;
generating a term vector for each document, each term vector being a vector of highest-weighted terms selected from the respective document;
generating a search query centroid from the term vectors, the search query centroid being a vector of terms, the terms in the search query centroid being the most common terms among the terms in the term vectors;
searching a centroid repository for previously stored centroids matching the search query centroid;
calculating a dot product of each previously stored centroid and the search query centroid, the dot product indicating a degree of similarity between each previously stored centroid and the search query centroid;
sorting the previously stored centroids by the respective dot products to produce a ranked list of centroids, where the most highly-ranked centroids most closely match the search query centroid;
converting each of a first number of the most highly-ranked centroids into a candidate query;
examining the candidate queries in a ranked order;
adding each candidate query to a set of suggestions if the respective candidate query contains a threshold number of terms that are not included in the original query; and
providing the set of suggestions in response to the original query.
(Dependent claims: 10, 11, 12)
Specification