Estimating confidence for query revision models
First Claim
1. A computer-implemented method comprising:
receiving an original query term from a user;
identifying commonly entered query terms from session data, the session data including a record of each of multiple past sessions of search activity by multiple different other users, each past session including a sequence of queries executed by a respective other user, each commonly entered query term being a query term occurring in a query after the original query term occurs in an earlier query in at least one of the past sessions of the other users;
determining, by one or more processors, a frequency of occurrence of each commonly entered query term in the past sessions as a successor query term to the original query term;
retaining one or more candidate query terms, the candidate query terms being the commonly entered query terms whose frequency of occurrence as the successor query term satisfies a first threshold;
determining a quality score of the original query term and of each of the candidate query terms based on user click data which specifies an extent to which, in the past sessions, the other users interacted with (i) a search result resulting from executing the queries using the original query term and (ii) a search result resulting from executing the queries using the candidate query terms;
retaining one or more improved query terms, each improved query term being the candidate query term having a quality score that exceeds the quality score of the original query term;
determining an expected utility for each improved query term based on multiplying a difference, in the quality score of the improved query terms over the quality score of the original query term, by the frequency of occurrence of the improved query term as the successor query term; and
providing a link to second search results, each of the second search results being associated with one or more of the improved query terms, and each of the second search results being associated with at least a portion of the improved query terms having the expected utility that satisfies a second threshold.
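The steps recited in the claim can be sketched in a few lines of Python. This is only an illustrative reading under stated assumptions, not the patented implementation: the function names, the `Suggestion` record, the normalization of frequency to a fraction of sessions, the precomputed `quality` mapping (standing in for a score derived from click data), and the choice of `>=` for "satisfies a threshold" are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    term: str
    frequency: float
    quality_gain: float
    expected_utility: float


def successor_frequencies(original, sessions):
    """Fraction of sessions containing `original` in which each term appears later.

    Assumption: frequency is normalized by the number of sessions that contain
    the original term; the claim does not fix a particular normalization.
    """
    counts = Counter()
    sessions_with_original = 0
    for queries in sessions:
        if original not in queries:
            continue
        sessions_with_original += 1
        first = queries.index(original)
        # Count each successor term at most once per session.
        for term in set(queries[first + 1:]) - {original}:
            counts[term] += 1
    if sessions_with_original == 0:
        return {}
    return {term: n / sessions_with_original for term, n in counts.items()}


def suggest_revisions(original, sessions, quality, min_frequency, min_utility):
    """Apply the frequency threshold, the quality comparison, and the utility threshold."""
    base_quality = quality.get(original, 0.0)
    suggestions = []
    for term, freq in successor_frequencies(original, sessions).items():
        if freq < min_frequency:      # first threshold: candidate query terms
            continue
        gain = quality.get(term, 0.0) - base_quality
        if gain <= 0:                 # keep only improved query terms
            continue
        utility = gain * freq         # expected utility = quality difference x frequency
        if utility >= min_utility:    # second threshold
            suggestions.append(Suggestion(term, freq, gain, utility))
    return sorted(suggestions, key=lambda s: s.expected_utility, reverse=True)


# Hypothetical session logs and quality scores, for illustration only.
sessions = [
    ["jaguar", "jaguar car", "jaguar xk"],
    ["jaguar", "jaguar car"],
    ["jaguar", "jaguar cat"],
    ["jaguar car", "jaguar"],
]
quality = {"jaguar": 0.25, "jaguar car": 0.75, "jaguar xk": 0.5, "jaguar cat": 0.125}

result = suggest_revisions("jaguar", sessions, quality, min_frequency=0.25, min_utility=0.1)
for s in result:
    print(s.term, s.expected_utility)
```

In this made-up data, "jaguar cat" is dropped because its quality score is lower than that of the original term, and "jaguar xk" is dropped because its expected utility falls below the second threshold, so only "jaguar car" would be linked.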
Abstract
An information retrieval system includes a query revision architecture that integrates multiple different query revisers, each implementing one or more query revision strategies. A revision server receives a user's query, and interfaces with the various query revisers, each of which generates one or more potential revised queries. The revision server evaluates the potential revised queries, and selects one or more of them to provide to the user. A session-based reviser suggests one or more revised queries, given a first query, by calculating an expected utility for the revised query. The expected utility is calculated as the product of a frequency of occurrence of the query pair and an increase in quality of the revised query over the first query.
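The expected utility described in the abstract can be written compactly as follows. The symbols are assumptions introduced here for illustration: q is the first query, q' the revised query, f(q, q') the frequency of occurrence of the query pair, and Q(·) the quality score. The numbers in the second line are hypothetical.

```latex
\[
  \mathrm{EU}(q \to q') \;=\; f(q, q') \,\bigl( Q(q') - Q(q) \bigr)
\]
% Hypothetical illustration: f(q, q') = 0.5, Q(q') = 0.75, Q(q) = 0.25
\[
  \mathrm{EU}(q \to q') \;=\; 0.5 \times (0.75 - 0.25) \;=\; 0.25
\]
```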
17 Claims
10. A system comprising:
one or more computers; and
a computer-readable medium coupled to the one or more computers having instructions stored thereon which, when executed by the one or more computers, cause the one or more computers to perform operations comprising:
receiving an original query term from a user,
identifying commonly entered query terms from session data, the session data including a record of each of multiple past sessions of search activity by multiple different other users, each past session including a sequence of queries executed by a respective other user, each commonly entered query term being a query term occurring in a query after the original query term occurs in an earlier query in at least one of the past sessions of the other users,
determining a frequency of occurrence of each commonly entered query term in the past sessions as a successor query term to the original query term,
retaining one or more candidate query terms, the candidate query terms being the commonly entered query terms whose frequency of occurrence as the successor term satisfies a first threshold,
determining a quality score of the original query term and of each of the candidate query terms based on user click data which specifies an extent to which, in the past sessions, the other users interacted with (i) a search result resulting from executing the queries using the original query term and (ii) a search result resulting from executing the queries using the candidate query terms,
retaining one or more improved query terms, each improved query term being the candidate query term having a quality score that exceeds the quality score of the original query term,
determining an expected utility for each improved query term based on multiplying a difference, in the quality score of the improved query terms over the quality score of the original query term, by the frequency of occurrence of the improved query term as the successor query term; and
providing a link to second search results, each of the second search results being associated with one or more of the improved query terms, and each of the second search results being associated with at least a portion of the improved query terms having the expected utility that satisfies a second threshold.
14. A computer storage medium encoded with a computer program, the program comprising instructions that when executed by data processing apparatus cause the data processing apparatus to perform operations comprising:
receiving an original query term from a user;
identifying commonly entered query terms from session data, the session data including a record of each of multiple past sessions of search activity by multiple different other users, each past session including a sequence of queries executed by a respective other user, each commonly entered query term being a query term occurring in a query after the original query term occurs in an earlier query in at least one of the past sessions of the other users;
determining a frequency of occurrence of each commonly entered query term in the past sessions as the successor query term to the original query term;
retaining one or more candidate query terms, the candidate query terms being the commonly entered query terms whose frequency of occurrence as the successor query term satisfies a first threshold;
determining a quality score of the original query term and of each of the candidate query terms based on user click data which specifies an extent to which, in the past sessions, the other users interacted with (i) a search result resulting from executing the queries using the original query term and (ii) a search result resulting from executing the queries using the candidate query terms;
retaining one or more improved query terms, each improved query term being the candidate query term having a quality score that exceeds the quality score of the original query term; and
determining an expected utility for each improved query term based on multiplying a difference, in the quality score of the improved query terms over the quality score of the original query term, by the frequency of occurrence of the improved query term as the successor query term; and
providing a link to second search results, each of the second search results being associated with one or more of the improved query terms, and each of the second search results being associated with at least a portion of the improved query terms having the expected utility that satisfies a second threshold.
Specification