Statistical Machine Translation Based Search Query Spelling Correction
First Claim
1. A method implemented by one or more computing devices, the method comprising:
logging search data regarding searches performed by clients;
ascertaining error patterns based on query correction pairs described by the logged search data;
developing query correction models based at least in part upon the ascertained error patterns; and
translating an input search query to a corrected search query using the query correction models.
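The four steps of claim 1 can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the sample log in `LOGGED_PAIRS`, the function names, the position-wise term alignment, and the relative-frequency model are not taken from the patent, which describes richer substring-level patterns and multiple models.

```python
from collections import Counter, defaultdict

# Hypothetical logged search data: (query as typed, query the user accepted).
LOGGED_PAIRS = [
    ("britny spears", "britney spears"),
    ("britny spears", "britney spears"),
    ("britny speers", "britney spears"),
    ("new yrok hotels", "new york hotels"),
    ("cheap hotels", "cheap hotels"),
]

def ascertain_error_patterns(pairs):
    """Count term-level (typed, corrected) patterns from query correction pairs."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        src_terms, tgt_terms = source.split(), target.split()
        if len(src_terms) != len(tgt_terms):
            continue  # this simple position-wise alignment cannot handle the pair
        for typed, corrected in zip(src_terms, tgt_terms):
            counts[typed][corrected] += 1
    return counts

def develop_model(counts):
    """Turn counts into relative-frequency estimates of P(corrected | typed)."""
    model = {}
    for typed, targets in counts.items():
        total = sum(targets.values())
        model[typed] = {t: n / total for t, n in targets.items()}
    return model

def translate(query, model):
    """Replace each term with its most probable correction; keep unseen terms."""
    out = []
    for term in query.split():
        options = model.get(term)
        out.append(max(options, key=options.get) if options else term)
    return " ".join(out)

MODEL = develop_model(ascertain_error_patterns(LOGGED_PAIRS))
print(translate("britny speers", MODEL))
```

Terms that never appear in the log pass through unchanged, so the sketch only corrects errors it has already observed.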
Abstract
Statistical Machine Translation (SMT) based search query spelling correction techniques are described herein. In one or more implementations, search data regarding searches performed by clients may be logged. The logged data includes query correction pairs that may be used to ascertain error patterns indicating how misspelled substrings may be translated to corrected substrings. The error patterns may be used to determine suggestions for an input query and to develop query correction models used to translate the input query to a corrected query. In one or more implementations, probabilistic features from multiple query correction models are combined to score different correction candidates. One or more top scoring correction candidates may then be exposed as suggestions for selection by a user and/or provided to a search engine to conduct a corresponding search using the corrected query version(s).
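One way to picture how query correction pairs yield substring-level error patterns is sketched below. It uses Python's `difflib.SequenceMatcher` as a stand-in for the character alignment the abstract refers to; that substitution, the `context` width, and the sample pairs are all assumptions for illustration, not the method described in the patent.

```python
from collections import Counter
from difflib import SequenceMatcher

def substring_patterns(misspelled, corrected, context=1):
    """Yield (wrong_substring, right_substring) for each edited region.

    `context` characters on each side are included so that a pattern
    carries some of its surroundings; the width is an illustrative choice.
    """
    matcher = SequenceMatcher(None, misspelled, corrected, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged regions carry no error pattern
        wrong = misspelled[max(0, i1 - context): i2 + context]
        right = corrected[max(0, j1 - context): j2 + context]
        yield wrong, right

def count_patterns(pairs):
    """Aggregate pattern frequencies over all logged correction pairs."""
    counts = Counter()
    for misspelled, corrected in pairs:
        counts.update(substring_patterns(misspelled, corrected))
    return counts

# Hypothetical correction pairs taken from a search log.
PAIRS = [("britny", "britney"), ("britny", "britney"), ("speers", "spears")]
PATTERNS = count_patterns(PAIRS)
print(PATTERNS.most_common())
```

Pattern counts of this kind could then be normalized into probabilities, which is the role the translation model plays in the scoring described in the claims.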
20 Claims
9. One or more computer-readable storage media storing instructions that, when executed via the one or more processors, implement a correction module to perform operations comprising:
parsing an input search query to obtain individual search terms;
generating query candidates corresponding to the input query using error patterns to find potential spelling corrections for the search terms;
ranking the query candidates one to another in accordance with one or more query correction models developed based on the error patterns;
picking one or more of the query candidates as suggestions according to the ranking; and
outputting the query candidates that are picked as suggestions for selection by a user.
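The parse, generate, rank, and pick sequence of claim 9 might look roughly like the following. The `ERROR_PATTERNS` table, its probabilities, the `top_n` cutoff, and the product-of-probabilities ranking are hypothetical placeholders; the claim itself leaves the ranking to "one or more query correction models".

```python
from itertools import product

# Hypothetical term-level error patterns with P(correction | typed term).
ERROR_PATTERNS = {
    "britny": {"britney": 0.9, "brittany": 0.1},
    "speers": {"spears": 0.8, "speers": 0.2},
}

def generate_candidates(query, patterns):
    """Parse the query into terms and expand each term into its alternatives."""
    terms = query.split()
    # A term with no known pattern keeps itself as its only option.
    options = [sorted(patterns.get(t, {t: 1.0}).items()) for t in terms]
    for combo in product(*options):
        score = 1.0
        for _, prob in combo:
            score *= prob
        yield " ".join(word for word, _ in combo), score

def suggest(query, patterns, top_n=2):
    """Rank candidates by score and return the top_n as suggestions."""
    ranked = sorted(
        generate_candidates(query, patterns),
        key=lambda cand: cand[1],
        reverse=True,
    )
    return [text for text, _ in ranked[:top_n]]

print(suggest("britny speers", ERROR_PATTERNS))
```

Because the candidate set is a Cartesian product over per-term options, it grows quickly with query length; a practical system would presumably prune it, for example with a beam search.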
16. A system comprising:
one or more processors;
one or more computer-readable storage media storing instructions that, when executed via the one or more processors, implement a correction module to perform operations to translate an input search query to a corrected query including:
receiving the input search query;
deriving candidates for the corrected query based on the input search query;
for each candidate that is derived, computing a score for the candidate as a weighted combination of multiple features including at least:
translation probabilities defined by a translation model according to character alignment of misspelled character strings with corrections of the misspelled character strings indicated by logged search data;
probabilities for a next word in a word sequence given a designated number of preceding words in the word sequence encoded in a word-based language model; and
probabilities for a next character in a character sequence given a designated number of preceding characters in the character sequence encoded in a character-based language model; and
designating one of the candidates as the corrected query based on a comparison of the computed scores one to another.
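The weighted combination in claim 16 can be sketched as a sum of weighted log-probabilities from three features. The toy corpus, the `TRANSLATION` table, the weights, the add-one smoothing, and the n-gram orders (word bigram, character trigram) are all assumptions chosen to keep the example small; the claim only requires "a designated number" of preceding words or characters.

```python
import math
from collections import Counter

class NgramLM:
    """Add-one smoothed n-gram model over any token sequence (words or characters)."""

    def __init__(self, sequences, n):
        self.n = n
        self.ngrams, self.contexts, self.vocab = Counter(), Counter(), set()
        for seq in sequences:
            for ctx, tok in self._events(seq):
                self.ngrams[(ctx, tok)] += 1
                self.contexts[ctx] += 1
                self.vocab.add(tok)

    def _events(self, seq):
        """Yield (preceding n-1 tokens, next token) for each position."""
        padded = ["<s>"] * (self.n - 1) + list(seq)
        for i in range(self.n - 1, len(padded)):
            yield tuple(padded[i - self.n + 1:i]), padded[i]

    def logprob(self, seq):
        v = len(self.vocab) + 1  # one extra slot for unseen tokens
        return sum(
            math.log((self.ngrams[(ctx, tok)] + 1) / (self.contexts[ctx] + v))
            for ctx, tok in self._events(seq)
        )

# Hypothetical corpus of well-formed queries used to train both language models.
CLEAN_QUERIES = ["britney spears", "britney spears tour", "new york hotels"]
WORD_LM = NgramLM([q.split() for q in CLEAN_QUERIES], n=2)
CHAR_LM = NgramLM(CLEAN_QUERIES, n=3)

# Hypothetical translation probabilities for (typed term, corrected term).
TRANSLATION = {("britny", "britney"): 0.9, ("britny", "brittany"): 0.1,
               ("speers", "spears"): 0.8}

def translation_logprob(query, candidate):
    """Sum term-level log-probabilities; assumes equal term counts."""
    total = 0.0
    for typed, corrected in zip(query.split(), candidate.split()):
        default = 0.5 if typed == corrected else 1e-6  # illustrative defaults
        total += math.log(TRANSLATION.get((typed, corrected), default))
    return total

WEIGHTS = {"translation": 1.0, "word_lm": 1.0, "char_lm": 0.5}  # hypothetical

def score(query, candidate):
    """Weighted combination of the three feature log-probabilities."""
    features = {
        "translation": translation_logprob(query, candidate),
        "word_lm": WORD_LM.logprob(candidate.split()),
        "char_lm": CHAR_LM.logprob(candidate),
    }
    return sum(WEIGHTS[name] * value for name, value in features.items())

def correct(query, candidates):
    """Designate the highest-scoring candidate as the corrected query."""
    return max(candidates, key=lambda cand: score(query, cand))

CANDIDATES = ["britny speers", "britney spears", "brittany spears"]
print(correct("britny speers", CANDIDATES))
```

Working in log space turns the product of probabilities into a sum, so each weight acts as an exponent on its model's probability. In a real system the weights would presumably be tuned on held-out data rather than set by hand.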
Specification