SYSTEMS AND METHODS FOR IMPROVED SPELL CHECKING
Abstract
The present invention leverages iterative transformations of search query strings along with statistics extracted from search query logs and/or web data to provide possible alternative spellings for the search query strings. This provides a spell checking means that can be influenced to provide individualized suggestions for each user. By utilizing search query logs, the present invention can account for substrings not found in a lexicon but still acceptable as a search query of interest. This allows a means to provide a higher quality proposal for alternative spellings, beyond the content of the lexicon. One instance of the present invention operates at a substring level by utilizing word unigram and/or bigram statistics extracted from query logs combined with an iterative search. This provides substantially better spelling alternatives for a given query than employing only substring matching. Other instances can receive input data from sources other than a search query input.
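The iterative idea in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes a hypothetical `log_counts` table of word unigram counts taken from a query log, uses plain Levenshtein distance as the transformation measure, and omits the bigram statistics. The function names `edit_distance` and `correct_query` and the parameters `max_dist` and `max_iters` are illustrative only.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def correct_query(query: str, log_counts: Counter,
                  max_dist: int = 1, max_iters: int = 5) -> str:
    """Repeatedly replace each word with a nearby, more frequent log word.

    Hypothetical sketch: each pass only allows small edits (max_dist),
    so a badly misspelled word may need several passes to reach a
    frequent form. Iteration stops when a pass changes nothing.
    """
    words = query.lower().split()
    for _ in range(max_iters):
        changed = False
        for i, w in enumerate(words):
            best = w
            for cand, n in log_counts.items():
                # Accept only candidates seen more often than the current best.
                if n > log_counts[best] and edit_distance(w, cand) <= max_dist:
                    best = cand
            if best != w:
                words[i] = best
                changed = True
        if not changed:
            break
    return " ".join(words)


# Assumed toy query-log counts, not real data.
toy_counts = Counter({"amazon": 50, "amzon": 5})
print(correct_query("amzn", toy_counts))
```

With these toy counts, the first pass can only reach "amzon" (one edit away), and a second pass moves from "amzon" to the more frequent "amazon". A word that is already the most frequent form within reach is left unchanged, even if it appears in no lexicon.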
123 Citations
40 Claims
1-20. (canceled)
-
21. A system that facilitates spell checking, comprising:
a component that receives input data containing text; and
a spell checking component that identifies a set of potentially misspelled substrings in the text and proposes at least one alternative spelling for the substring set based on at least one query log;
the query log comprising data utilized by users to query a data collection over a time frame, wherein the identification of a misspelled substring is based at least in part on the frequency of occurrence of the substring in the at least one query log.
Dependent claims: 22, 23, 24, 25, 26, 27, 29, 30, 31
28. The system of claim 28, the substring co-occurrence statistics for the list of stop words with content further comprising a substring bigram with stop-word-sequence-skipping counts.
32. A method of facilitating spell checking, comprising:
receiving input data containing text;
identifying a set of potentially misspelled substrings in the text, wherein the identification of a misspelled substring is based at least in part on the prevalence of the substring in at least one query log;
the query log comprising data utilized by users to query a data collection over a time frame; and
proposing at least one alternative spelling for the substring set.
Dependent claims: 33, 34, 35, 36, 37
38. A system that facilitates spell checking queries to a search engine, comprising:
means for receiving input data containing text; and
means for identifying a set of potentially misspelled substrings in the text and proposing at least one alternative spelling for the substring set based on at least one query log;
the query log comprising data utilized by users to query a data collection over a time frame, wherein the identification of a misspelled substring is based at least in part on the number of occurrences of the substring in the at least one query log.
Dependent claims: 39, 40
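The independent claims share one element: a substring is flagged as potentially misspelled based at least in part on how often it occurs in a query log. The sketch below shows one simple way such a test could look. It is an assumption for illustration, not the claimed method: words stand in for substrings, and the helper names `build_counts` and `flag_rare_words` and the `min_count` threshold are hypothetical.

```python
from collections import Counter


def build_counts(queries):
    """Count how often each word appears across logged queries."""
    counts = Counter()
    for q in queries:
        counts.update(q.lower().split())
    return counts


def flag_rare_words(text, log_counts, min_count=5):
    """Return the words of `text` seen fewer than `min_count` times in the log.

    Hypothetical rule: low frequency in the log marks a word as a
    candidate misspelling. The threshold is an arbitrary assumption.
    """
    return [w for w in text.lower().split() if log_counts[w] < min_count]


# Assumed toy log, not real data.
toy_log = ["britney spears"] * 6 + ["britny spears"]
toy_counts = build_counts(toy_log)
print(flag_rare_words("Britny Spears", toy_counts))
```

A word that never appears in the log has a count of zero and is flagged as well. A frequency test of this kind only nominates candidates; choosing an alternative spelling is a separate step.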
-
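Claim 28 refers to a substring bigram with stop-word-sequence-skipping counts. One plausible reading, sketched here under stated assumptions, is that runs of stop words are dropped before adjacent pairs are counted, so that content words separated only by stop words still form a bigram. The stop-word list and the function name `skip_bigram_counts` are hypothetical, and words stand in for substrings.

```python
from collections import Counter

# Illustrative stop-word list; the source does not give one.
STOP_WORDS = frozenset({"the", "of", "a", "in", "for"})


def skip_bigram_counts(queries, stop_words=STOP_WORDS):
    """Count adjacent content-word pairs after removing stop words."""
    counts = Counter()
    for q in queries:
        content = [w for w in q.lower().split() if w not in stop_words]
        # zip pairs each content word with its successor.
        counts.update(zip(content, content[1:]))
    return counts


# Assumed toy log, not real data.
toy_log = ["lord of the rings", "lord rings", "the lord of rings"]
print(skip_bigram_counts(toy_log)[("lord", "rings")])
```

Under this reading, all three toy queries contribute to the same ("lord", "rings") pair, which concentrates evidence that would otherwise be split across several stop-word variants.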
Specification