Spell checking system including a phonetic speller
First Claim
1. A computer-implemented method of ranking replacement target strings for a misspelled source string, the computer-implemented method comprising:
- converting the misspelled source string into a source phoneme sequence using a letter-to-sound system;
utilizing a computer processor that is a component of the computer to traverse at least one phoneme-based trie structure so as to select a plurality of different candidate phoneme sequences based on a comparison of phonemes in the phoneme-based trie structure to the source phoneme sequence but without doing a direct comparison of every component of the plurality of different candidate phoneme sequences to a component of the source phoneme sequence;
generating a count for each different candidate phoneme sequence in said plurality of different candidate phoneme sequences, the count being indicative of a quantity of edit operations required to transform the candidate phoneme sequence into the source phoneme sequence;
utilizing the computer processor to select a limited number of the plurality of different candidate phoneme sequences based at least in part on said counts, the limited number being less than all of the different candidate phoneme sequences included in said plurality of different candidate phoneme sequences;
utilizing the computer processor to select a first set of replacement target strings, each replacement target string in the first set being selected based on direct correspondence to a candidate phoneme sequences in said limited number of different candidate phoneme sequences;
utilizing the computer processor to traverse at least one letter-based trie structure so as to select a plurality of different candidate letter sequences, wherein selecting the plurality of different candidate letter sequences comprises traversing the letter-based trie structure so as to identify the plurality of different candidate letter sequences without doing a direct comparison of every component of the plurality of different candidate letter sequences with every component of the misspelled source string;
utilizing the computer processor to select a limited number of the plurality of different candidate letter sequences based at least in part on a count of a quantity of edit operations required to transform each different candidate letter sequence included in the limited number of different candidate letter sequences into the misspelled source string, the limited number of different candidate letter sequences being less than all of the different candidate letter sequences included in said plurality of different candidate letter sequences;
utilizing the computer processor to select a second set of replacement target strings, each replacement target string in the second set being one of the candidate letter sequences in said limited number of the plurality of different candidate letter sequences; and
utilizing the computer processor to rank the replacement strings in the first and/or second sets based on a summation of the count of the quantity of edit operations required to transform a particular different candidate phoneme sequence included in the limited number of the plurality of different candidate phoneme sequences into the source phoneme sequence plus the count of the quantity of edit operations required to transform a particular different candidate letter sequence included in the limited number of the plurality of different candidate letter sequences into the misspelled source string.
2 Assignments
0 Petitions
Accused Products
Abstract
A spell checking system includes a letter spelling engine. The letter spelling engine is configured to select a plurality of candidate letter target strings that closely match a misspelled source string. The spell checking system includes a phoneme spelling engine. The phoneme spelling engine is configured to select a plurality of candidate phoneme target strings that closely match the misspelled source string. A ranker module is configured to combine the candidate letter target strings and the candidate phoneme target strings into a combined list of candidate target strings. The ranker module is also configured to rank the list of candidate target strings to provide a list of best candidate target strings for the misspelled source string.
-
Citations
15 Claims
-
1. A computer-implemented method of ranking replacement target strings for a misspelled source string, the computer-implemented method comprising:
-
converting the misspelled source string into a source phoneme sequence using a letter-to-sound system; utilizing a computer processor that is a component of the computer to traverse at least one phoneme-based trie structure so as to select a plurality of different candidate phoneme sequences based on a comparison of phonemes in the phoneme-based trie structure to the source phoneme sequence but without doing a direct comparison of every component of the plurality of different candidate phoneme sequences to a component of the source phoneme sequence; generating a count for each different candidate phoneme sequence in said plurality of different candidate phoneme sequences, the count being indicative of a quantity of edit operations required to transform the candidate phoneme sequence into the source phoneme sequence; utilizing the computer processor to select a limited number of the plurality of different candidate phoneme sequences based at least in part on said counts, the limited number being less than all of the different candidate phoneme sequences included in said plurality of different candidate phoneme sequences; utilizing the computer processor to select a first set of replacement target strings, each replacement target string in the first set being selected based on direct correspondence to a candidate phoneme sequences in said limited number of different candidate phoneme sequences; utilizing the computer processor to traverse at least one letter-based trie structure so as to select a plurality of different candidate letter sequences, wherein selecting the plurality of different candidate letter sequences comprises traversing the letter-based trie structure so as to identify the plurality of different candidate letter sequences without doing a direct comparison of every component of the plurality of different candidate letter sequences with every component of the misspelled source string; utilizing the computer processor to select a limited number of the plurality of different candidate letter sequences based at least in part on a count of a quantity of edit operations required to transform each different candidate letter sequence included in the limited number of different candidate letter sequences into the misspelled source string, the limited number of different candidate letter sequences being less than all of the different candidate letter sequences included in said plurality of different candidate letter sequences; utilizing the computer processor to select a second set of replacement target strings, each replacement target string in the second set being one of the candidate letter sequences in said limited number of the plurality of different candidate letter sequences; and utilizing the computer processor to rank the replacement strings in the first and/or second sets based on a summation of the count of the quantity of edit operations required to transform a particular different candidate phoneme sequence included in the limited number of the plurality of different candidate phoneme sequences into the source phoneme sequence plus the count of the quantity of edit operations required to transform a particular different candidate letter sequence included in the limited number of the plurality of different candidate letter sequences into the misspelled source string. - View Dependent Claims (2, 4, 5, 6, 7, 8, 9, 10, 11, 12)
-
-
3. The computer-implemented method of 1, wherein selecting the plurality of different candidate letter sequences comprises parsing the misspelled source string to obtain a source letter sequence string.
-
13. A computer-implemented spell checking system for ranking replacement target strings for a misspelled source string, the computer-implemented method comprising:
-
a phoneme spelling engine that selects sets of the replacement target strings for ranking by; converting the misspelled source string into a source phoneme sequence using a letter-to-sound system; utilizing a computer processor that is a component of the computer to traverse at least one phoneme-based trie structure so as to select a plurality of different candidate phoneme sequences based on a comparison of phonemes in the phoneme-based trie structure to the source phoneme sequence but without doing a direct comparison of every component of the plurality of different candidate phoneme sequences to a component of the source phoneme sequence; generating a count for each different candidate phoneme sequence in said plurality of different candidate phoneme sequences, the count being indicative of a quantity of edit operations required to transform the candidate phoneme sequence into the source phoneme sequence; utilizing the computer processor to select a limited number of the plurality of different candidate phoneme sequences based at least in part on said counts, the limited number being less than all of the different candidate phoneme sequences included in said plurality of different candidate phoneme sequences; utilizing the computer processor to select a first set of replacement target strings, each replacement target string in the first set being selected based on direct correspondence to candidate phoneme sequences in said limited number of different candidate phoneme sequences; a letter spell checking engine that utilizes the computer processor to select the replacement target strings that closely match the misspelled source target string by; utilizing the computer processor to traverse at least one letter-based trie structure so as to select a plurality of different candidate letter sequences, wherein selecting the plurality of different candidate letter sequences comprises traversing the letter-based trie structure so as to identify the plurality of different candidate letter sequences without doing a direct comparison of every component of the plurality of different candidate letter sequences with every component of the misspelled source string; utilizing the computer processor to select a limited number of the plurality of different candidate letter sequences based at least in part on a count of a quantity of edit operations required to transform each different candidate letter sequence included in the limited number of different candidate letter sequences into the misspelled source string, the limited number of different candidate letter sequences being less than all of the different candidate letter sequences included in said plurality of different candidate letter sequences; utilizing the computer processor to select a second set of replacement target strings, each replacement target string in the second set being one of the candidate letter sequences in said limited number of the plurality of different candidate letter sequences; and a ranker module utilizing the computer processor to rank the replacement strings in the first and/or second sets based on a summation of the count of the quantity of edit operations required to transform a particular different candidate phoneme sequence included in the limited number of the plurality of different candidate phoneme sequences into the source phoneme sequence plus the count of the quantity of edit operations required to transform a particular different candidate letter sequence included in the limited number of the plurality of different candidate letter sequences into the misspelled source string. - View Dependent Claims (14, 15)
-
Specification