Cluster storage apparatus for post processing error correction of a character recognition machine
First Claim
1. A cluster storage apparatus for outputting groups of valid alpha words as potential candidates for the correct form of an alpha word misrecognized by an OCR machine, comprising:
a two-dimensional array of alpha word read only storage locations, each location having a group of alpha words arranged such that adjacent locations contain alpha words having similar OCR misread propensities;
means for assigning numeric values to the characters of the input alpha word based upon the read reliability of the characters;
a first-dimensional accessing means for addressing said locations based upon the values assigned to the characters of which the input alpha word is composed;
a second-dimensional accessing means for accessing said locations based upon the number of characters in said input alpha word;
said first-dimensional accessing means calculating the first-dimensional address as a magnitude ##EQU7## where LN is the numeric value assigned to each alpha character;
whereby an input alpha word which is potentially in error can be associated with that portion of the read only storage which contains potential candidates for the correct form of the input alpha word.
Abstract
A cluster storage apparatus is disclosed for outputting groups of valid alpha words as potential candidates for the correct form of an alpha word misrecognized by a character recognition machine. Groups of alpha words are arranged in the cluster storage apparatus such that adjacent locations contain alpha words having similar character recognition misread propensities. Alpha words that have been determined to be misrecognized are input to the cluster storage apparatus. Numerical values assigned to the characters of the input word are used to calculate the address of the group of valid alpha words having similar character recognition misread propensities. The cluster storage apparatus then outputs the accessed groups of alpha words for subsequent processing. The organization of the cluster storage apparatus minimizes the difference in address between alpha words with similar character recognition misread propensities by assigning high numeric values to highly reliable characters, as determined by measuring the character transfer function of the character recognition machine.
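The addressing idea in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the per-character values in `CHAR_VALUE` are invented, the magnitude is assumed to be a plain sum of those values (the patent's actual expression is not reproduced in this text), and the names `address`, `build_store`, `candidates`, and the `radius` parameter are hypothetical.

```python
from collections import defaultdict

# Hypothetical per-character values (invented, NOT from the patent).
# Characters assumed to be easily confused get small, close values, so a
# misread between them moves the computed address only slightly.
CHAR_VALUE = {"c": 1, "e": 2, "o": 3, "a": 4, "t": 20,
              "r": 25, "n": 30, "b": 40, "d": 45, "g": 50}

def address(word):
    """Return (magnitude, length) for a word.

    Assumption: the magnitude is the plain sum of the character values.
    """
    return sum(CHAR_VALUE[ch] for ch in word), len(word)

def build_store(dictionary):
    """Group valid words by their two-dimensional address."""
    store = defaultdict(list)
    for word in dictionary:
        store[address(word)].append(word)
    return store

def candidates(store, word, radius=2):
    """Collect valid words of the same length whose magnitude lies
    within `radius` of the input word's magnitude."""
    magnitude, length = address(word)
    found = []
    for m in range(magnitude - radius, magnitude + radius + 1):
        found.extend(store.get((m, length), []))
    return sorted(found)

DICTIONARY = ["cat", "cot", "eat", "ten", "bat", "dog"]
STORE = build_store(DICTIONARY)
```

With these invented values, `candidates(STORE, "cet")` returns `["cat", "cot"]`: the garbled word has magnitude 23, and the two valid words at magnitudes 24 and 25 fall inside the search radius, while "eat" at 26 does not. Characters missing from `CHAR_VALUE` raise a `KeyError` in this sketch.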
53 Citations
10 Claims
1. A cluster storage apparatus for outputting groups of valid alpha words as potential candidates for the correct form of an alpha word misrecognized by an OCR machine, comprising:
a two-dimensional array of alpha word read only storage locations, each location having a group of alpha words arranged such that adjacent locations contain alpha words having similar OCR misread propensities;
means for assigning numeric values to the characters of the input alpha word based upon the read reliability of the characters;
a first-dimensional accessing means for addressing said locations based upon the values assigned to the characters of which the input alpha word is composed;
a second-dimensional accessing means for accessing said locations based upon the number of characters in said input alpha word;
said first-dimensional accessing means calculating the first-dimensional address as a magnitude ##EQU7## where LN is the numeric value assigned to each alpha character;
whereby an input alpha word which is potentially in error can be associated with that portion of the read only storage which contains potential candidates for the correct form of the input alpha word.
3. A cluster storage apparatus for outputting groups of valid phoneme words as potential candidates for the correct form of a phoneme word misrecognized by a speech analyzer machine, comprising:
a two-dimensional array of phoneme word read only storage locations, each location having a group of phoneme words arranged such that adjacent locations contain phoneme words having similar speech analyzer misread propensities;
means for assigning numeric values to the characters of the input phoneme word based upon the read reliability of the characters;
a first-dimensional accessing means for addressing said locations based upon the values assigned to the characters of which the input phoneme word is composed;
a second-dimensional accessing means for accessing said locations based upon the number of characters in said input phoneme word;
said first-dimensional accessing means calculating the first-dimensional address as a magnitude ##EQU8## where LN is the numeric value assigned to each phoneme character;
whereby an input phoneme word which is potentially in error can be associated with that portion of the read only storage which contains potential candidates for the correct form of the input phoneme word.
5. A cluster storage apparatus for outputting groups of valid alpha words as potential candidates for the correct form of an alpha word mistyped on a keyboard machine, comprising:
a two-dimensional array of alpha word read only storage locations, each location having a group of alpha words arranged such that adjacent locations contain alpha words having similar keyboard typographical error propensities;
means for assigning numeric values to the characters of the input alpha word based upon the typographical error propensity of the characters;
a first-dimensional accessing means for addressing said locations based upon the values assigned to the characters of which the input alpha word is composed;
a second-dimensional accessing means for accessing said locations based upon the number of characters in said input alpha word;
said first-dimensional accessing means calculating the first-dimensional address as a magnitude ##EQU9## where LN is the numeric value assigned to each alpha character;
whereby an input alpha word which is potentially in error can be associated with that portion of the read only storage which contains potential candidates for the correct form of the input alpha word.
7. The post processing error correction system comprising:
a word generating source having a character transfer function which represents the error propensity of multicharacter words output thereby;
a binary reference matrix having an input line connected to the input of said word generating source, to detect invalid alpha words;
said binary reference matrix having an output control line carrying a binary signal which indicates whether the input alpha word is valid;
a gate means connected to said output from said word generating source and having a control input from said control output of said binary reference matrix, for gating the input alpha word from said word generating source onto a first output line in response to a signal on said control line from said binary reference matrix indicating that said alpha word is valid, and gating said input alpha word onto a second output line in response to a signal from said binary reference matrix control line indicating said alpha word is invalid;
a cluster storage apparatus having an input connected to said second output line from said gating means, to access from an associative memory therein, a group of correct alpha words which have some probability of having been confused with said invalid alpha words input on said second output line from said gate;
the regional context error correction apparatus having an input connected to said output from said gating means and having a second input connected to the output from said cluster storage apparatus for accepting said group of correct alpha words;
said cluster storage apparatus executing a conditional probability analysis to determine which one of the group of correct alpha words most closely corresponds to the invalid alpha word output by said word generating source;
said regional context error correction apparatus outputting the word which most closely corresponds to the invalid alpha word output by said word generating source;
whereby the most probable correct version of a garbled word output from said word generating source is determined.
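The data flow recited in claim 7 (validity check, gating, candidate lookup, conditional probability selection) can be sketched in Python as follows. This is only an illustration under stated assumptions: a set membership test stands in for the binary reference matrix, `same_length_lookup` is a hypothetical stand-in for the cluster storage apparatus, and the confusion probabilities and the 0.01 default are invented values, not measured transfer-function data.

```python
def is_valid(word, dictionary):
    """Stand-in for the binary reference matrix: True if the word is valid."""
    return word in dictionary

def best_candidate(garbled, group, confusion):
    """Pick the candidate most likely to have been read as `garbled`.

    `confusion[(true_ch, read_ch)]` is an assumed probability that
    `true_ch` is output as `read_ch`. Missing pairs default to 1.0 for
    identical characters and 0.01 otherwise (both invented defaults).
    """
    def likelihood(candidate):
        p = 1.0
        for true_ch, read_ch in zip(candidate, garbled):
            default = 1.0 if true_ch == read_ch else 0.01
            p *= confusion.get((true_ch, read_ch), default)
        return p
    return max(group, key=likelihood)

def post_process(word, dictionary, cluster_lookup, confusion):
    """Gate the word: pass valid words through, correct invalid ones."""
    if is_valid(word, dictionary):
        return word                      # first output line: word is valid
    group = cluster_lookup(word)         # second output line: fetch candidates
    if not group:
        return word                      # nothing to choose from; leave as-is
    return best_candidate(word, group, confusion)

DICTIONARY = {"cat", "cot", "dog"}

def same_length_lookup(word):
    """Hypothetical candidate lookup: all valid words of the same length."""
    return sorted(w for w in DICTIONARY if len(w) == len(word))

# Invented confusion probabilities: "a" is misread as "e" more often than "o" is.
CONFUSION = {("a", "e"): 0.2, ("o", "e"): 0.05}
```

With these invented probabilities, `post_process("cet", DICTIONARY, same_length_lookup, CONFUSION)` returns `"cat"`, since the likelihood of "cat" being read as "cet" (0.2) exceeds that of "cot" (0.05), while a valid word such as `"cot"` passes through unchanged.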
Specification