Predictive stemming for web search with statistical machine translation models
Abstract
Techniques for determining when and how to transform words in a query to return the most relevant search results while minimizing computational overhead are provided. A dictionary is generated based upon words used in a specified number of previous most frequent search queries and comprises lists of transformations that may include variants based upon the stems of words, synonyms, and abbreviation expansions. When a query is received from a user, candidate queries are generated based upon replacing particular words in the query with a transformation of the particular words. Candidate queries are selected that have a high probability of returning relevant results by computing values of the query using language model scoring and translation scoring. The selected candidate queries and the original query are executed to return search results. The search results are displayed to the user with the words in the original query and the transformed words in bold.
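
The dictionary-building step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: `toy_stem`, `build_dictionary`, and the sample query log are hypothetical, the stemmer is a crude suffix stripper standing in for a real one, and only stem-based variants are covered (synonyms and abbreviation expansions are left out).

```python
from collections import Counter, defaultdict

def toy_stem(word):
    # Hypothetical stand-in for a real stemmer: strips "ing" or a trailing "s".
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def build_dictionary(query_log, top_n):
    """Group the words of the top_n most frequent queries by stem."""
    frequent = [query for query, _ in Counter(query_log).most_common(top_n)]
    by_stem = defaultdict(set)
    for query in frequent:
        for word in query.lower().split():
            by_stem[toy_stem(word)].add(word)
    # A stem with a single surface form offers nothing to transform into.
    return {stem: forms for stem, forms in by_stem.items() if len(forms) > 1}

# Hypothetical query log; the counts are invented for illustration.
log = (["cheap flights"] * 3) + (["cheap flight"] * 2) + ["book hotel", "booking hotels"]
dictionary = build_dictionary(log, top_n=2)
print(dictionary)
```

With `top_n=2` only the two most frequent queries contribute, so "book" and "booking" never enter the dictionary. Limiting the dictionary to frequent queries is what keeps lookups cheap at query time.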
32 Claims
1. A method, comprising:
- receiving a particular query comprising a plurality of words;
- determining stems to at least one of the words in the particular query;
- based on the stems of the plurality of words in the particular query, determining whether one or more stems of particular words in the particular query occurs in a dictionary comprising one or more transformations based upon stems of words;
- selecting, from the dictionary, one or more transformations of the one or more stems of the particular words;
- generating at least one candidate query that includes a transformation of one or more particular words;
- computing a value for each candidate query;
- selecting at least one candidate query to execute based upon the computed value for each candidate query;
- executing the particular query and the at least one selected candidate query to generate search results across a plurality of documents; and
- displaying at least a portion of the search results,
wherein the method is performed by one or more computing devices.
Dependent claims: 2-14.
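
The steps of claim 1 can be sketched end to end as follows. Everything here is an assumption made for illustration: the `DICTIONARY` contents, the `BIGRAM_COUNTS` table, the one-rule stemmer, and the use of summed log bigram counts as the "value" of a candidate. The claim itself does not specify how the value is computed.

```python
import math

def toy_stem(word):
    # Hypothetical stand-in for a real stemmer: strips a trailing "s" only.
    return word[:-1] if word.endswith("s") and len(word) > 3 else word

# Hypothetical dictionary: stem -> known surface forms.
DICTIONARY = {"hotel": ["hotel", "hotels"], "flight": ["flight", "flights"]}

# Hypothetical bigram counts, a crude stand-in for a language model.
BIGRAM_COUNTS = {
    ("flights", "hotel"): 5,
    ("flight", "hotels"): 20,
    ("hotel", "paris"): 60,
    ("hotels", "paris"): 40,
}

def candidates(query):
    """Replace one word at a time with each dictionary variant of its stem."""
    words = query.lower().split()
    out = []
    for i, word in enumerate(words):
        for variant in DICTIONARY.get(toy_stem(word), []):
            if variant != word:
                out.append(" ".join(words[:i] + [variant] + words[i + 1:]))
    return out

def value(query):
    """Score a query as the sum of add-one-smoothed log bigram counts."""
    words = query.split()
    return sum(math.log(BIGRAM_COUNTS.get(pair, 0) + 1)
               for pair in zip(words, words[1:]))

def select(query, k=1):
    """Keep the k highest-valued candidate queries."""
    return sorted(candidates(query), key=value, reverse=True)[:k]

def search(query, documents):
    """Run the original query plus selected candidates; a document matches
    when it contains every term of at least one executed query."""
    hits = []
    for q in [query.lower()] + select(query):
        terms = set(q.split())
        for doc in documents:
            if terms <= set(doc.split()) and doc not in hits:
                hits.append(doc)
    return hits

# Hypothetical document collection.
DOCS = ["flight hotel paris package", "paris flight and hotels", "flights to london"]
print(select("flight hotel paris"))
print(search("flight hotel paris", DOCS))
```

In this toy setup the second document is found only through the selected candidate, because it contains "hotels" rather than "hotel". Scoring before executing keeps the number of extra queries small.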
15. A method, comprising:
- receiving a particular query from a user;
- based on the particular query, determining whether one or more particular words in the particular query is able to be transformed;
- determining one or more transformed forms of the one or more particular words;
- determining whether using the one or more transformed forms of the one or more particular words has a probability to produce relevant search results that is higher than a specified threshold;
- in response to determining that transforming the one or more particular words has a probability to produce relevant search results that is higher than a specified threshold, using a particular word and a transformed word for the particular word within a version of the particular query;
- generating search results across a plurality of documents based on executing the particular query and the version of the particular query; and
- displaying at least a portion of the search results;
wherein the method is performed by one or more computing devices.
Dependent claims: 16.
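
The threshold test in claim 15 can be sketched as follows. The `TRANSFORM_PROB` table, its numbers, the default threshold of 0.5, and the `(word OR variant)` syntax are all hypothetical. The claim only requires that the word and its transformed form both appear in a version of the query when the estimated probability exceeds the threshold.

```python
# Hypothetical estimates of the probability that replacing a word with a
# variant yields relevant results; the numbers are invented.
TRANSFORM_PROB = {
    ("hotel", "hotels"): 0.8,
    ("apple", "apples"): 0.1,
}

def expand(query, threshold=0.5):
    """Build a version of the query in which each word whose variant clears
    the threshold becomes '(word OR variant)'."""
    parts = []
    for word in query.lower().split():
        accepted = [variant for (source, variant), prob in TRANSFORM_PROB.items()
                    if source == word and prob > threshold]
        if accepted:
            parts.append(f"({word} OR {' OR '.join(accepted)})")
        else:
            parts.append(word)
    return " ".join(parts)

def queries_to_run(query, threshold=0.5):
    """Return the original query, plus the expanded version if it differs."""
    version = expand(query, threshold)
    return [query] if version == query.lower() else [query, version]

print(queries_to_run("cheap hotel"))
print(queries_to_run("apple store"))
```

The second query is left alone because the variant "apples" falls below the threshold, so no extra query is executed for it.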
17. A machine-readable storage medium carrying one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform a method comprising:
- receiving a particular query comprising a plurality of words;
- determining stems to at least one of the words in the particular query;
- based on the stems of the plurality of words in the particular query, determining whether one or more stems of particular words in the particular query occurs in a dictionary comprising one or more transformations based upon stems of words;
- selecting, from the dictionary, one or more transformations of the one or more stems of the particular words;
- generating at least one candidate query that includes a transformation of one or more particular words;
- computing a value for each candidate query;
- selecting at least one candidate query to execute based upon the computed value for each candidate query;
- executing the particular query and the at least one selected candidate query to generate search results across a plurality of documents; and
- displaying at least a portion of the search results.
Dependent claims: 18-30.
31. A machine-readable storage medium carrying one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform a method comprising:
- receiving a particular query from a user;
- based on the particular query, determining whether one or more particular words in the particular query is able to be transformed;
- determining one or more transformed forms of the one or more particular words;
- determining whether using the one or more transformed forms of the one or more particular words has a probability to produce relevant search results that is higher than a specified threshold;
- in response to determining that transforming the one or more particular words has a probability to produce relevant search results that is higher than a specified threshold, using a particular word and a transformed word for the particular word within a version of the particular query;
- generating search results across a plurality of documents based on executing the particular query and the version of the particular query; and
- displaying at least a portion of the search results.
Dependent claims: 32.
Specification