Method for rule-based correction of spelling and grammar errors
Abstract
A computer implemented method, which does not require a stored dictionary, for correcting spelling errors in a sequence of words comprises: storing a plurality of spelling rules defined as regular expressions for matching a potentially illegal n-gram, which may comprise less than all letters in the word, and for replacing an illegal n-gram with a legal n-gram to return a corrected word; submitting a word from said sequence of words to the spelling rules; and replacing a word in the string of words with a corrected word.
21 Claims
1. A computer implemented method which does not require a stored dictionary for correcting spelling errors in a sequence of words, said method comprising the steps of:
a) storing a plurality of dictionary pretested for exceptions spelling rules defined as regular expressions for matching potentially illegal n-grams some of which comprise less than all letters in a word and for replacing an illegal n-gram with a legal n-gram to return a corrected word;
b) submitting a word from said sequence of words to the spelling rules; and
c) replacing a word in the string of words with a corrected word, wherein an exception list is associated with at least one regular expression or with the system as a whole to prevent n-gram replacement where the word matches an exception to the rule.
Dependent claims: 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 21
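By way of illustration only, steps a) to c) of claim 1 might be sketched in Python as follows. The rule set, the exception words, and the names `SpellingRule`, `RULES`, `GLOBAL_EXCEPTIONS`, `correct_word` and `correct_text` are hypothetical and are not taken from the specification; case handling is omitted for brevity.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SpellingRule:
    pattern: re.Pattern                  # matches a potentially illegal n-gram
    replacement: str                     # legal n-gram substituted for it
    exceptions: frozenset = frozenset()  # words this rule must leave untouched


# Hypothetical rules; note that no dictionary of correct words is stored.
RULES = [
    SpellingRule(re.compile(r"cieve"), "ceive"),        # "recieve" -> "receive"
    SpellingRule(re.compile(r"^teh$"), "the"),          # whole-word n-gram
    SpellingRule(re.compile(r"ei(?=nd)"), "ie",         # "freind" -> "friend"
                 frozenset({"reindeer"})),              # per-rule exception list
]

# Hypothetical exception list associated with the system as a whole.
GLOBAL_EXCEPTIONS = frozenset({"recievebuf"})


def correct_word(word, rules=RULES, global_exceptions=GLOBAL_EXCEPTIONS):
    """Step b): submit one word to the rules and return the corrected word."""
    if word.lower() in global_exceptions:
        return word
    for rule in rules:
        if word.lower() in rule.exceptions:
            continue  # the word matches an exception to this rule
        word = rule.pattern.sub(rule.replacement, word)
    return word


def correct_text(text):
    """Step c): replace each word in the sequence with its corrected form."""
    return re.sub(r"[A-Za-z]+", lambda m: correct_word(m.group(0)), text)
```

In this sketch the third rule would turn "reindeer" into "riendeer" if the exception list did not block it, which is the situation the exception list of step c) is meant to prevent.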
2. A computer implemented method which does not require a stored dictionary for correcting spelling errors and adjacent word grammar errors in a sequence of words, said method comprising the steps of:
a) storing a plurality of spelling and grammar rules pretested for exceptions defined as regular expressions for matching potentially illegal n-grams some of which comprise less than all letters in a word and given the context of one or more adjacent words replacing an illegal n-gram with a legal n-gram to return a corrected word;
b) submitting at least two adjacent words at a time from said sequence of words to said rules; and
c) replacing a word in the sequence of words with a corrected word, wherein an exception list is associated with at least one regular expression or with the system as a whole to prevent n-gram replacement where the word matches an exception to the rule.
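A minimal sketch of how adjacent words might be submitted two at a time, as in step b) of claim 2, is given below. The two context rules, the exception pairs, and the names `PAIR_RULES` and `correct_pairs` are assumptions made for illustration, and each replacement is assumed to preserve exactly one space between the two words.

```python
import re

# Hypothetical context rules over "word1 word2":
# (pattern, replacement, exception pairs the rule must not touch)
PAIR_RULES = [
    (re.compile(r"^a (?=[aeio])"), "an ", frozenset({"a one", "a once"})),
    (re.compile(r"^(could|should|would) of$"), r"\1 have", frozenset()),
]


def correct_pairs(text):
    """Slide a two-word window over the sequence and apply the context rules."""
    words = text.split()
    for i in range(len(words) - 1):
        pair = f"{words[i]} {words[i + 1]}"
        for pattern, replacement, exceptions in PAIR_RULES:
            if pair.lower() in exceptions:
                continue  # the pair matches an exception to this rule
            pair = pattern.sub(replacement, pair)
        # Write the (possibly corrected) pair back into the sequence.
        words[i], words[i + 1] = pair.split(" ", 1)
    return " ".join(words)
```

For example, `correct_pairs("she could of eaten a apple and a one")` returns `"she could have eaten an apple and a one"`; the final pair is left alone because it appears in the exception list of the first rule.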
12. The method according to claim 12, wherein the word boundary errors are selected from the group comprising missing spaces, inserted spaces, shifted spaces and combinations thereof.
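The three kinds of word boundary error named in claim 12 could, for instance, be expressed as regular expressions that span the space between words. The rules below and the names `BOUNDARY_RULES` and `fix_boundaries` are hypothetical examples, one per error type, and are not drawn from the specification.

```python
import re

# Hypothetical word-boundary rules, one for each error type in the claim.
BOUNDARY_RULES = [
    (re.compile(r"\balot\b"), "a lot"),         # missing space
    (re.compile(r"\bwith out\b"), "without"),   # inserted space
    (re.compile(r"\bo f(?=[a-z])"), "of "),     # shifted space: "o fthe" -> "of the"
]


def fix_boundaries(text):
    """Apply each boundary rule in turn across the whole sequence of words."""
    for pattern, replacement in BOUNDARY_RULES:
        text = pattern.sub(replacement, text)
    return text
```

With these assumed rules, `fix_boundaries("alot o fthe time we go with out it")` returns `"a lot of the time we go without it"`.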
19. A method which does not require a stored dictionary for correcting spelling errors in a sequence of words comprising the steps, some of which are implemented by a programmed computer, of:
a) generating spelling rules defined as regular expressions for matching illegal n-grams some of which comprise less than all letters in a word and for replacing an illegal n-gram with a legal n-gram to return a corrected word, said step of generating spelling rules comprising selecting as templates letters from errors in an error corpus and one or more letters of context to identify a set of rules, pretesting and pruning from the set of rules those that are too general, too specific or do not identify the cause of the error;
b) storing said set of spelling rules defined as regular expressions;
c) submitting a word from said sequence of words to the spelling rules; and
d) replacing a word in the sequence of words with a corrected word.
Dependent claims: 20
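One possible reading of the rule generation in step a) of claim 19 is sketched below. It assumes the error corpus is a list of (misspelling, correction) pairs, that "too general" means the candidate n-gram also occurs in a correctly spelled pretest word, and that "too specific" means the candidate carries more context than the shortest surviving candidate. The third pruning criterion (rules that do not identify the cause of the error) is not modeled. All function names are hypothetical, and the pretest word list is used only while generating rules, not while correcting.

```python
import re


def diff_span(wrong, right):
    """Strip the common prefix and suffix; return (start, bad_ngram, good_ngram)."""
    shortest = min(len(wrong), len(right))
    p = 0
    while p < shortest and wrong[p] == right[p]:
        p += 1
    s = 0
    while s < shortest - p and wrong[-1 - s] == right[-1 - s]:
        s += 1
    return p, wrong[p:len(wrong) - s], right[p:len(right) - s]


def candidate_rules(wrong, right, max_context=2):
    """Yield (illegal n-gram, legal n-gram) templates with 0..max_context letters of context."""
    start, bad, good = diff_span(wrong, right)
    end = start + len(bad)
    for left in range(max_context + 1):
        for right_ctx in range(max_context + 1):
            lo, hi = max(0, start - left), min(len(wrong), end + right_ctx)
            yield wrong[lo:hi], wrong[lo:start] + good + wrong[end:hi]


def generate_rules(error_corpus, pretest_words, max_context=2):
    """Pretest candidates against correct words and prune; return compiled rules."""
    rules = {}
    for wrong, right in error_corpus:
        survivors = [
            (bad, good)
            for bad, good in candidate_rules(wrong, right, max_context)
            if bad and not any(bad in w for w in pretest_words)  # prune too general
        ]
        if survivors:
            # Keep the shortest survivor; longer variants are treated as too specific.
            bad, good = min(survivors, key=lambda rule: len(rule[0]))
            rules[bad] = good
    return [(re.compile(re.escape(bad)), good) for bad, good in rules.items()]
```

With the assumed corpus `[("recieve", "receive"), ("decieve", "deceive")]` and pretest words such as "friend", "science", "species" and "believe", the candidates "ie", "cie" and "iev" are pruned as too general and both errors collapse into the single rule "ciev" to "ceiv", which also corrects the unseen word "percieve".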
Specification