Spell checker with arbitrary length string-to-string transformations to improve noisy channel spelling correction
First Claim
1. An apparatus, comprising:
a computing processor;
an electronic memory coupled with the computing processor;
an input device to receive an entered string s, and
a program in the memory for execution on the computing processor to determine a probability P(s|w) expressing how likely a word w was to have been incorrectly entered as the string s, by partitioning the word w and the string s into different numbers of segments that define varying lengths of character sequences and computing probabilities for various partitionings, as follows:
where Part(w) is a set of possible ways of partitioning the word w, Part(s) is a set of possible ways of partitioning the string s, R is a particular partition of the word w, and T is a particular partition of the string s.
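The expression introduced by "as follows" is not present in the claim text as reproduced here. One form consistent with the definitions of Part(w), Part(s), R, and T, and with the string-to-string error model summarized in the abstract, is sketched below. It is an assumption, not the claim's verbatim language; T_i and R_i denote the i-th segments of T and R, and |R| denotes the number of segments in R.

```latex
P(s \mid w) \;=\; \sum_{R \in \mathrm{Part}(w)} P(R \mid w)
  \sum_{\substack{T \in \mathrm{Part}(s) \\ |T| = |R|}}
  \prod_{i=1}^{|R|} P(T_i \mid R_i)
```

Read this way, each partition R of the word is paired with every partition T of the string that has the same number of segments, and the probability of each pairing is the product of the segment-to-segment edit probabilities.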
Abstract
A spell checker based on the noisy channel model has a source model and an error model. The source model determines how likely a word w in a dictionary is to have been generated. The error model determines how likely the word w was to have been incorrectly entered as the string s (e.g., mistyped or incorrectly interpreted by a speech recognition system) according to the probabilities of string-to-string edits. The string-to-string edits allow conversion of one arbitrary length character sequence to another arbitrary length character sequence.
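The interplay of the two models can be illustrated with a minimal Python sketch: each dictionary word w is scored by combining its source probability P(w) with the error probability P(s|w), and the candidates are ranked by that score. The function name `rank_corrections`, the toy dictionary, and the toy probabilities are all hypothetical and chosen only for illustration; they do not come from the patent.

```python
import math


def rank_corrections(s, source_prob, error_prob):
    """Rank dictionary words w by log P(w) + log P(s|w), best first.

    source_prob: dict mapping each dictionary word w to P(w) (source model).
    error_prob:  callable (s, w) -> P(s|w) (error model).
    Both are assumed inputs; how they are estimated is outside this sketch.
    """
    scored = []
    for w, p_w in source_prob.items():
        p_s_given_w = error_prob(s, w)
        # Skip candidates that either model considers impossible.
        if p_w > 0 and p_s_given_w > 0:
            scored.append((math.log(p_w) + math.log(p_s_given_w), w))
    return [w for _, w in sorted(scored, reverse=True)]


# Toy numbers, invented purely for illustration.
toy_source = {"physical": 0.6, "fiscal": 0.4}


def toy_error(s, w):
    table = {("fisikle", "physical"): 0.01, ("fisikle", "fiscal"): 0.002}
    return table.get((s, w), 0.0)


ranking = rank_corrections("fisikle", toy_source, toy_error)
```

Working in log space avoids underflow when many small probabilities are multiplied, which is a common design choice for this kind of scoring rather than something the abstract prescribes.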
10 Claims
1. An apparatus, comprising:
a computing processor;
an electronic memory coupled with the computing processor;
an input device to receive an entered string s, and
a program in the memory for execution on the computing processor to determine a probability P(s|w) expressing how likely a word w was to have been incorrectly entered as the string s, by partitioning the word w and the string s into different numbers of segments that define varying lengths of character sequences and computing probabilities for various partitionings, as follows:
where Part(w) is a set of possible ways of partitioning the word w, Part(s) is a set of possible ways of partitioning the string s, R is a particular partition of the word w, and T is a particular partition of the string s.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9.
10. A system, comprising:
means for receiving an entered string s, and
an input device to receive an entered string s, and
means for determining a probability P(s|w) expressing how likely a word w was to have been incorrectly entered as the string s, by partitioning the word w and the string s into different numbers of segments that define varying lengths of character sequences and computing probabilities for various partitionings, as follows:
where Part(w) is a set of possible ways of partitioning the word w, Part(s) is a set of possible ways of partitioning the string s, R is a particular partition of the word w, and T is a particular partition of the string.
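The partition-based computation that the claims describe can be sketched in Python under several stated assumptions: segments are contiguous and non-empty, a partition R of w is paired only with partitions T of s that have the same number of segments, the probability of a pairing is the product of per-segment edit probabilities, and P(R|w) defaults to a uniform distribution over Part(w). The names `partitions`, `error_probability`, and `toy_edit_prob` are hypothetical, and brute-force enumeration is used for clarity, not efficiency.

```python
from itertools import combinations
from math import prod


def partitions(text, k=None):
    """Yield every split of text into contiguous, non-empty segments.

    If k is given, yield only the splits that have exactly k segments.
    """
    n = len(text)
    counts = range(1, n + 1) if k is None else [k]
    for m in counts:
        if m < 1 or m > n:
            continue
        # Choose m - 1 cut positions among the n - 1 gaps between characters.
        for cuts in combinations(range(1, n), m - 1):
            bounds = (0, *cuts, n)
            yield tuple(text[a:b] for a, b in zip(bounds, bounds[1:]))


def error_probability(s, w, edit_prob, partition_prob=None):
    """Sum, over partitions R of w and equal-length partitions T of s,
    P(R|w) times the product of edit_prob(T_i, R_i).

    edit_prob(t, r) is an assumed callable returning P(t | r).
    partition_prob(R, w) is optional; uniform over Part(w) if omitted.
    """
    all_r = list(partitions(w))
    total = 0.0
    for r_part in all_r:
        p_r = partition_prob(r_part, w) if partition_prob else 1.0 / len(all_r)
        inner = sum(
            prod(edit_prob(t, r) for t, r in zip(t_part, r_part))
            for t_part in partitions(s, len(r_part))
        )
        total += p_r * inner
    return total


def toy_edit_prob(t, r):
    # Invented numbers: identical segments are likely, anything else is rare.
    return 0.9 if t == r else 0.1


p = error_probability("ab", "ab", toy_edit_prob)
```

With the toy numbers, the word "ab" has two partitions, ("ab",) and ("a", "b"), each weighted 0.5, so the result is 0.5 × 0.9 + 0.5 × 0.81. Enumerating all partitions grows exponentially with string length, so a practical system would more likely use dynamic programming over segment boundaries.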
Specification