System for text correction adaptive to the text being corrected
Abstract
A system is provided for correcting users' mistakes, including context-sensitive spelling errors and the like. The system uses an adaptive correction algorithm that is trained not only on a conventional training corpus but also on the text being corrected. This permits words to be corrected based on their particular usage in that text, taking advantage of the fact that the text to be corrected is already mostly correct.
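The combined-corpus idea described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the patented implementation: it assumes a naive Bayes scorer over nearby context words with add-one smoothing, a two-word context window, and a toy confusion set of "desert" and "dessert". All function names and sample sentences are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

WINDOW = 2  # context words taken on each side (assumed value)

def tokenize(text):
    """Lowercase the text and keep only alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())

def context_of(tokens, i, window=WINDOW):
    """Words within `window` positions of index i, excluding tokens[i]."""
    return tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]

def train(tokens, confusion_set):
    """Count each confusion-set word and the context words seen around it."""
    word_counts = Counter()
    context_counts = defaultdict(Counter)
    for i, tok in enumerate(tokens):
        if tok in confusion_set:
            word_counts[tok] += 1
            context_counts[tok].update(context_of(tokens, i))
    return word_counts, context_counts, len(set(tokens))

def log_score(candidate, context, model, confusion_set):
    """Naive Bayes log-probability of `candidate` given the context words."""
    word_counts, context_counts, vocab_size = model
    total = sum(word_counts[w] for w in confusion_set)
    logp = math.log((word_counts[candidate] + 1) / (total + len(confusion_set)))
    denom = sum(context_counts[candidate].values()) + vocab_size
    for c in context:
        logp += math.log((context_counts[candidate][c] + 1) / denom)
    return logp

def best_word(tokens, i, model, confusion_set):
    """Most probable confusion-set member for position i of `tokens`."""
    ctx = context_of(tokens, i)
    return max(sorted(confusion_set),
               key=lambda w: log_score(w, ctx, model, confusion_set))

confusion_set = {"desert", "dessert"}
conventional = tokenize("We ate dessert after dinner. The desert is dry and hot.")
# Target text: mostly correct usage, with one error in the last sentence.
target = tokenize("The chocolate dessert was sweet. The chocolate dessert was rich. "
                  "The chocolate dessert was gone. We shared the chocolate desert.")
last = len(target) - 1  # position of the suspect word "desert"

# Trained on the conventional corpus alone versus on the combined corpus.
before = best_word(target, last, train(conventional, confusion_set), confusion_set)
after = best_word(target, last, train(conventional + target, confusion_set), confusion_set)
print(before, after)
```

With these toy sentences, the model trained on the conventional corpus alone keeps "desert", while the model trained on the combined corpus prefers "dessert", because the target text itself mostly uses "dessert" after "the chocolate". No user feedback is involved at any point.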
22 Claims
1. A system for providing context sensitive word correction in which the context of a word in a sentence is utilized to determine which of several alternative or possible correctly-spelled words was intended, comprising:
- a conventional training corpus;
- a target text having errors in which the word usage is mostly correct, but sometimes incorrect;
- means for combining said conventional training corpus and said target text into a combined corpus;
- means coupled to said combined corpus for ascertaining the probability that a word in said target text is the correct word based on the occurrence of said word in said combined corpus, such that context sensitive word correction is trained not only on a conventional training corpus, but also on a target text which contains word usage errors whereby the correspondence of said training corpus to said target text need not be strong, and whereby there is no user supplied feedback to supervise the training process.
Dependent claims: 2, 3, 4, 5, 6
7. A method of correcting target text, comprising the steps of:
- analyzing uncorrected target text to create a training corpus; and
- applying the training corpus to the uncorrected target text to identify a first word as an erroneous word to be replaced to correct the uncorrected target text.
Dependent claims: 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
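The two steps of this method can be illustrated with a small Python sketch. It is an assumption-laden example rather than the claimed method itself: the "training corpus" is reduced to counts of (confusion word, following word) pairs taken from the uncorrected text, and a word is flagged when an alternative appears at least `min_ratio` times as often before the same following word. The names `build_corpus`, `find_errors`, and `min_ratio` are hypothetical.

```python
from collections import Counter

def build_corpus(tokens, confusion_set):
    """Step 1: analyze the uncorrected text, counting (word, next word) pairs."""
    counts = Counter()
    for word, nxt in zip(tokens, tokens[1:]):
        if word in confusion_set:
            counts[(word, nxt)] += 1
    return counts

def find_errors(tokens, confusion_set, counts, min_ratio=2.0):
    """Step 2: apply the counts to the same text and flag likely errors.

    Returns (index, first_word, suggested_word) tuples.
    """
    flagged = []
    for i in range(len(tokens) - 1):
        word, nxt = tokens[i], tokens[i + 1]
        if word not in confusion_set:
            continue
        for alt in sorted(confusion_set - {word}):
            # The alternative must clearly dominate in this context.
            if counts[(alt, nxt)] >= min_ratio * counts[(word, nxt)]:
                flagged.append((i, word, alt))
                break
    return flagged

confusion_set = {"peace", "piece"}
tokens = ("she ate a piece of cake he cut a piece of pie "
          "give me a peace of bread i want a piece of that").split()

counts = build_corpus(tokens, confusion_set)
flagged = find_errors(tokens, confusion_set, counts)
print(flagged)  # [(15, 'peace', 'piece')]
```

Because the text is mostly correct, the three correct uses of "piece of" outweigh the single "peace of", so only the erroneous word is flagged.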
19. A system for replacing a word within uncorrected target text with another word to correct the uncorrected target text, comprising:
- a training corpus created from uncorrected target text; and
- a buffer configured to store the uncorrected target text including a first word which is an erroneous word, and to replace, the first word with a second word, determined on the basis of the training corpus, to correct the uncorrected target text.
Dependent claims: 20, 21, 22
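The buffer element might look something like the following Python sketch. It is a hypothetical illustration: the class `CorrectionBuffer` and its methods are invented names, and the replacement words are hard-coded here, whereas in the claimed system the second word would be determined on the basis of the training corpus.

```python
import re

class CorrectionBuffer:
    """Holds the uncorrected text and rewrites individual words in place."""

    def __init__(self, text):
        self.text = text

    def words(self):
        """Return (start, end, word) spans for every word in the buffer."""
        return [(m.start(), m.end(), m.group())
                for m in re.finditer(r"[A-Za-z']+", self.text)]

    def replace(self, corrections):
        """Apply {word_index: second_word} replacements to the stored text."""
        spans = self.words()
        # Work from right to left so earlier character offsets stay valid.
        for idx in sorted(corrections, reverse=True):
            start, end, first = spans[idx]
            second = corrections[idx]
            if first[0].isupper():
                second = second.capitalize()  # keep sentence-initial capitals
            self.text = self.text[:start] + second + self.text[end:]

buf = CorrectionBuffer("Peace of cake, anyone? I want a peace of pie.")
# Word indices 0 and 7 are assumed to have been flagged by a detector.
buf.replace({0: "piece", 7: "piece"})
print(buf.text)  # Piece of cake, anyone? I want a piece of pie.
```

Replacing by character span rather than by re-joining tokens keeps the original punctuation and spacing of the target text untouched.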
Specification