Estimating confidence for query revision models
Abstract
An information retrieval system includes a query revision architecture that integrates multiple different query revisers, each implementing one or more query revision strategies. A revision server receives a user's query, and interfaces with the various query revisers, each of which generates one or more potential revised queries. The revision server evaluates the potential revised queries, and selects one or more of them to provide to the user. A session-based reviser suggests one or more revised queries, given a first query, by calculating an expected utility for the revised query. The expected utility is calculated as the product of a frequency of occurrence of the query pair and an increase in quality of the revised query over the first query.
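The product described in the last sentence of the abstract can be illustrated with a minimal sketch. The function name and the sample numbers are assumptions chosen for illustration and do not come from the patent.

```python
def expected_utility(pair_frequency: float, quality_increase: float) -> float:
    """Expected utility of a revised query.

    pair_frequency:   how often the second query follows the first query,
                      as a fraction of occurrences of the first query.
    quality_increase: quality score of the second query minus that of the first.
    """
    return pair_frequency * quality_increase


# Hypothetical inputs: the revision follows the first query 10% of the time
# and raises the quality score by 0.3.
utility = expected_utility(0.10, 0.30)
```

A revision that is common but barely better, or much better but rarely used, both yield a low product, which is presumably why the two factors are multiplied rather than thresholded separately.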
18 Claims
-
1. A method for suggesting a revised query, given a first query, comprising:
-
calculating a frequency for a query pair comprised of the first query and a second query;
calculating an increase in quality for the second query over the first query;
calculating an expected utility for the second query using the frequency for the query pair and the increase in quality for the second query; and
determining if the expected utility exceeds a threshold.
-
-
11. A method for suggesting a revised query, given a first query, comprising:
-
generating a table of query occurrences during a given time period;
generating a table of query pair occurrences during the given time period;
calculating a frequency for a query pair as occurrence of the query pair as a fraction of occurrence of the first query;
generating a table of quality scores for each query occurrence in query pairs with frequencies exceeding a frequency threshold;
calculating the increase in quality for the second query as the difference between a quality score for the second query of the query pair and a quality score for the first query of the query pair;
calculating an expected utility for the second query as a product of the frequency for the query pair and the increase in quality for the second query; and
identifying the second query in the query pair as a revised query if the expected utility exceeds a utility threshold.
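The steps of claim 11 can be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed implementation: the session format, the choice to treat consecutive queries in a session as a "query pair", the precomputed `quality` mapping, and all function and parameter names are hypothetical.

```python
from collections import Counter


def suggest_revisions(sessions, quality, freq_threshold, utility_threshold):
    """Return (first_query, second_query, expected_utility) tuples.

    sessions: list of sessions, each a list of query strings in order (assumed format).
    quality:  dict mapping query -> quality score (assumed to be precomputed).
    """
    query_counts = Counter()  # table of query occurrences
    pair_counts = Counter()   # table of query pair occurrences
    for queries in sessions:
        query_counts.update(queries)
        # Assumption: a pair is a query immediately followed by another one.
        pair_counts.update(zip(queries, queries[1:]))

    revisions = []
    for (first, second), n_pair in pair_counts.items():
        if first == second:
            continue  # a repeated query is not a revision
        # Frequency of the pair as a fraction of occurrences of the first query.
        frequency = n_pair / query_counts[first]
        if frequency <= freq_threshold:
            continue
        increase = quality[second] - quality[first]
        utility = frequency * increase
        if utility > utility_threshold:
            revisions.append((first, second, utility))
    return revisions
```

For example, with four hypothetical sessions in which "a" is followed by "b" twice and by "c" once, the pair ("a", "b") has frequency 2/4 and the pair ("a", "c") has frequency 1/4; which of them survive depends on the quality scores and the two thresholds passed in.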
-
-
12. A method for suggesting a revised query, given a first query, comprising:
-
logging query data generated from user sessions;
generating from the query data a table of query occurrences during a given time period;
generating from the query data a table of query pair occurrences during the given time period;
calculating a frequency for a query pair as occurrence of the query pair as a fraction of occurrence of the first query;
generating from the query data a table of quality scores for each query occurrence in query pairs with frequencies exceeding 1%, wherein the quality scores are based on the duration of a first click on a search result, following an S-curve applied to the duration between the first click and a subsequent click, with longer clicks approaching a quality score of 1;
calculating an increase in quality as the difference between a quality score for the second query of the query pair and a quality score for the first query of the query pair;
calculating an expected utility for the second query as a product of the frequency for the query pair and the increase in quality for the second query;
marking the second query in the query pair as the revised query if the expected utility exceeds 0.02; and
wherein the expected utility is used as a confidence measure for ranking the revised query.
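Claim 12 fixes the thresholds (1% and 0.02) and describes the quality score as an S-curve over click duration. A minimal sketch follows, assuming a logistic function as the S-curve; the claim does not specify the curve's form, so the `midpoint_s` and `steepness` parameters are hypothetical, as are the function names and the candidate tuple layout. How per-occurrence scores are aggregated into a per-query score is also not stated in the claim, so the sketch takes per-query scores as inputs.

```python
import math

FREQ_THRESHOLD = 0.01     # 1%, from the claim
UTILITY_THRESHOLD = 0.02  # from the claim


def click_quality(duration_s, midpoint_s=30.0, steepness=0.1):
    """Map a click duration in seconds to a score in (0, 1).

    Assumed logistic S-curve: longer clicks approach a score of 1.
    midpoint_s and steepness are illustrative values, not from the patent.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (duration_s - midpoint_s)))


def rank_revisions(candidates):
    """Rank second queries by expected utility, used as the confidence measure.

    candidates: list of (second_query, frequency, first_quality, second_quality).
    """
    kept = []
    for second, frequency, q_first, q_second in candidates:
        if frequency <= FREQ_THRESHOLD:
            continue
        utility = frequency * (q_second - q_first)
        if utility > UTILITY_THRESHOLD:
            kept.append((second, utility))
    # Highest expected utility first.
    return sorted(kept, key=lambda item: item[1], reverse=True)
```

Sorting by the expected utility itself reflects the final clause of the claim, which uses that value as the confidence measure for ranking.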
-
-
13. A computer program product for suggesting a revised query, given a first query, the computer program product comprising:
-
a computer-readable medium; and
computer program code, coded on the medium, for:
calculating a frequency for a query pair comprised of the first query and a second query;
calculating an increase in quality for the second query over the first query;
calculating an expected utility for the second query using the frequency for the query pair and the increase in quality for the second query; and
determining if the expected utility exceeds a threshold.
-
-
14. A computer program product for suggesting a revised query, given a first query, the computer program product comprising:
-
a computer-readable medium; and
computer program code, coded on the medium, for:
generating a table of query occurrences during a given time period;
generating a table of query pair occurrences during the given time period;
calculating a frequency for a query pair as occurrence of the query pair as a fraction of occurrence of the first query;
generating a table of quality scores for each query occurrence in query pairs with frequencies exceeding a frequency threshold;
calculating the increase in quality for the second query as the difference between a quality score for the second query of the query pair and a quality score for the first query of the query pair;
calculating an expected utility for the second query as a product of the frequency for the query pair and the increase in quality for the second query; and
identifying the second query in the query pair as a revised query if the expected utility exceeds a utility threshold.
-
-
15. A computer program product for suggesting a revised query, given a first query, the computer program product comprising:
-
a computer-readable medium; and
computer program code, coded on the medium, for:
logging query data generated from user sessions;
generating from the query data a table of query occurrences during a given time period;
generating from the query data a table of query pair occurrences during the given time period;
calculating a frequency for a query pair as occurrence of the query pair as a fraction of occurrence of the first query;
generating from the query data a table of quality scores for each query occurrence in query pairs with frequencies exceeding 1%, wherein the quality scores are based on the duration of a first click on a search result, following an S-curve applied to the duration between the first click and a subsequent click, with longer clicks approaching a quality score of 1;
calculating an increase in quality as the difference between a quality score for the second query of the query pair and a quality score for the first query of the query pair;
calculating an expected utility for the second query as a product of the frequency for the query pair and the increase in quality for the second query;
marking the second query in the query pair as the revised query if the expected utility exceeds 0.02; and
wherein the expected utility is used as a confidence measure for ranking the revised query.
-
-
16. A system for providing revised queries for an original query using a plurality of query revision strategies, the system comprising:
-
means for calculating a frequency for a query pair comprised of the first query and a second query;
means for calculating an increase in quality for the second query over the first query;
means for calculating an expected utility for the second query using the frequency for the query pair and the increase in quality for the second query; and
means for determining if the expected utility exceeds a threshold.
-
-
17. A system for providing revised queries for an original query using a plurality of query revision strategies, the system comprising:
-
means for generating a table of query occurrences during a given time period;
means for generating a table of query pair occurrences during the given time period;
means for calculating a frequency for a query pair as occurrence of the query pair as a fraction of occurrence of the first query;
means for generating a table of quality scores for each query occurrence in query pairs with frequencies exceeding a frequency threshold;
means for calculating the increase in quality for the second query as the difference between a quality score for the second query of the query pair and a quality score for the first query of the query pair;
means for calculating an expected utility for the second query as a product of the frequency for the query pair and the increase in quality for the second query; and
means for identifying the second query in the query pair as a revised query if the expected utility exceeds a utility threshold.
-
-
18. A system for providing revised queries for an original query using a plurality of query revision strategies, the system comprising:
-
means for logging query data generated from user sessions;
means for generating from the query data a table of query occurrences during a given time period;
means for generating from the query data a table of query pair occurrences during the given time period;
means for calculating a frequency for a query pair as occurrence of the query pair as a fraction of occurrence of the first query;
means for generating from the query data a table of quality scores for each query occurrence in query pairs with frequencies exceeding 1%, wherein the quality scores are based on the duration of a first click on a search result, following an S-curve applied to the duration between the first click and a subsequent click, with longer clicks approaching a quality score of 1;
means for calculating an increase in quality as the difference between a quality score for the second query of the query pair and a quality score for the first query of the query pair;
means for calculating an expected utility for the second query as a product of the frequency for the query pair and the increase in quality for the second query;
means for marking the second query in the query pair as the revised query if the expected utility exceeds 0.02; and
wherein the expected utility is used as a confidence measure for ranking the revised query.
-
Specification