Knowledge-based character recognition
Abstract
Character string recognition and identification is accomplished with a combined, multi-phase top-down and bottom-up process. Characters in an applied signal are recognized with a process that employs a knowledge source containing information about both the basic elements in the signal and strings of those basic elements. The knowledge source, which may be derived from a training corpus, includes word probabilities, word di-gram probabilities, statistics that relate the likelihood of words with particular character prefixes, and rewrite suggestions and their costs. Higher-level word n-grams, such as word tri-gram probabilities, can also be used. A mechanism is provided for accepting words that are not found in the knowledge base, as well as rewrite suggestions that are not in the knowledge base.
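As a rough illustration of how such a knowledge source might be derived from a training corpus, the Python sketch below counts words, adjacent word pairs (di-grams), and character prefixes. The function name `build_knowledge_source`, the dictionary keys, and the toy corpus are assumptions made for this example and are not taken from the patent; rewrite suggestions and their costs are omitted.

```python
from collections import Counter


def build_knowledge_source(corpus_words):
    """Derive word, word di-gram, and prefix statistics from a list of words."""
    total = len(corpus_words)
    word_counts = Counter(corpus_words)
    # Adjacent word pairs, in corpus order.
    bigram_counts = Counter(zip(corpus_words, corpus_words[1:]))
    # For every prefix, count the word occurrences that start with it.
    prefix_counts = Counter()
    for word, count in word_counts.items():
        for end in range(1, len(word) + 1):
            prefix_counts[word[:end]] += count
    return {
        "word_prob": {w: c / total for w, c in word_counts.items()},
        # Estimate of P(second word | first word).
        "bigram_prob": {
            (a, b): c / word_counts[a] for (a, b), c in bigram_counts.items()
        },
        "prefix_prob": {p: c / total for p, c in prefix_counts.items()},
    }


corpus = "the cat sat on the mat".split()
ks = build_knowledge_source(corpus)
```

A realistic corpus would be far larger, and the estimates would normally be smoothed so that unseen words and pairs do not receive zero probability.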
29 Claims
1. A method responsive to a collection of input characters comprising the steps of:
separating the collection of input characters into input character strings;
for each one of the input character strings, forming at least one set of candidate output strings, with most of the sets containing more than one candidate output string; and
based on probability measure related to the occurrence of input character combinations and probability measure related to the occurrence of an output string having a given string of characters, selecting one candidate output string from each of the sets as a determined output string of the set
wherein said step of forming a set of candidate output strings carries out an iterative search procedure with the aid of a rewrite probabilities table and a word probabilities table.
Dependent claims: 2 to 28.
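The candidate-forming step of claim 1 can be pictured with a small Python sketch. It is only one possible reading of "an iterative search procedure with the aid of a rewrite probabilities table and a word probabilities table", not the patented procedure itself. The function `candidate_outputs`, the `keep_prob` parameter, and the table contents are all hypothetical values chosen for illustration.

```python
def candidate_outputs(input_string, rewrite_prob, word_prob, keep_prob=0.9):
    """Return {candidate output string: score} for one input string.

    rewrite_prob maps (input fragment, output fragment) to a probability.
    keep_prob is an assumed probability that a character was read correctly.
    """
    candidates = {}
    # Each frontier entry: (position in input, output so far, probability).
    frontier = [(0, "", 1.0)]
    while frontier:
        pos, out, prob = frontier.pop()
        if pos == len(input_string):
            # Only outputs found in the word table become candidates.
            if out in word_prob:
                score = prob * word_prob[out]
                candidates[out] = max(score, candidates.get(out, 0.0))
            continue
        # Option 1: keep the next input character unchanged.
        frontier.append((pos + 1, out + input_string[pos], prob * keep_prob))
        # Option 2: apply every rewrite whose source matches at this position.
        for (src, dst), p in rewrite_prob.items():
            if src and input_string.startswith(src, pos):
                frontier.append((pos + len(src), out + dst, prob * p))
    return candidates


# Hypothetical tables: "rn" is sometimes a misread "m".
rewrite_prob = {("rn", "m"): 0.3}
word_prob = {"modern": 0.001, "modem": 0.0005}
cands = candidate_outputs("rnodern", rewrite_prob, word_prob)
best = max(cands, key=cands.get)
```

With these tables the input "rnodern" yields two candidates, "modern" and "modem", and "modern" scores higher. The search here is exhaustive and can grow exponentially with input length; a practical version would prune partial outputs, for example with the prefix statistics mentioned in the abstract.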
29. A method responsive to a collection of input characters comprising the steps of:
separating the collection of input characters into input character strings;
for each one of the input character strings, forming at least one set of candidate output strings, with most of the sets containing more than one candidate output string; and
based on probability measure related to the occurrence of input character combinations and probability measure related to the occurrence of an output string having a given string of characters, selecting one candidate output string from each of the sets as a determined output string of the set
wherein said step of selecting considers probability measures of occurrence of pairs of character strings wherein the first character string in the pair of character strings is a previously determined output string, and the second character string in the pair of character strings is a candidate output string in the set of candidate output strings that correspond to an input character string which is adjacent to an input character string that corresponds to the first character string.
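The selecting step of claim 29 weighs each candidate against the output string already determined for the adjacent input string. A minimal left-to-right sketch in Python follows. The function `select_outputs`, the `unseen_pair_prob` floor for pairs missing from the table, and all numeric values are assumptions for illustration; each candidate set is assumed to be non-empty.

```python
def select_outputs(candidate_sets, pair_prob, unseen_pair_prob=1e-6):
    """Pick one output string per candidate set, left to right.

    candidate_sets is a list of {candidate: score} dictionaries, one per
    input string. pair_prob maps (previous word, candidate) to a probability.
    """
    determined = []
    previous = None
    for candidates in candidate_sets:
        def score(word):
            # The first set has no predecessor, so only its own score counts.
            if previous is None:
                pair = 1.0
            else:
                pair = pair_prob.get((previous, word), unseen_pair_prob)
            return candidates[word] * pair

        best = max(candidates, key=score)
        determined.append(best)
        previous = best  # becomes the "previously determined output string"
    return determined


# Hypothetical candidate sets and word-pair probabilities.
candidate_sets = [
    {"the": 0.05},
    {"modern": 0.0002, "modem": 0.0003},
    {"art": 0.01, "ant": 0.01},
]
pair_prob = {
    ("the", "modern"): 0.02,
    ("the", "modem"): 0.001,
    ("modern", "art"): 0.1,
}
result = select_outputs(candidate_sets, pair_prob)
```

Here "modem" has the higher score on its own, but the pair ("the", "modern") is more probable than ("the", "modem"), so the sketch settles on "the modern art". The choice is greedy: each decision is fixed before the next set is examined.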
Specification