Word grouping accuracy value generation
Abstract
The present invention is a computer-implemented method for calculating word accuracy. Word grouping accuracy values (260) are calculated (212) by using the character accuracy values (250) calculated by an OCR program present in a computer system. The present invention preferably uses these character accuracy values (250) to create a word grouping accuracy value (260). Various methods are employed to calculate the word accuracy (260), including binarizing the character accuracy values (250), modified averaging of the character accuracy values (250), and creating fuzzy visual displays of word grouping accuracy values (260). The calculated word grouping accuracy values (260) are then adjusted based upon known OCR strengths and weaknesses, and based upon comparisons to stored word lists and the application of language rules. In a system with multiple character recognition techniques, the system can compare the accuracy values (260) of different versions of the word groupings to find the most accurate version. Then, the most accurate version of the word groupings is kept.
139 Citations
20 Claims
-
1. A computer-implemented method for generating word grouping accuracy values in a system wherein data are received, characters are recognized within the received data, character accuracy values are generated from the recognized characters, and word groupings are created from the recognized characters, comprising:
-
selecting one of the created word groupings;
obtaining character accuracy values for the characters within the selected word grouping;
decreasing the obtained character accuracy values responsive to the recognized character being a character known to be less accurately recognized by the character recognition technique;
increasing the obtained character accuracy values responsive to the recognized character being a character known to be more accurately recognized by the character recognition technique;
calculating a word grouping accuracy value based upon the obtained character accuracy values; and
repeating the selecting, obtaining, decreasing, increasing and calculating steps for each created word grouping.
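As an illustration only, the steps of claim 1 can be sketched in Python. The claim does not say which characters are less or more reliably recognized, how large the adjustments are, or how the word value is derived from the character values. The tables `LESS_RELIABLE` and `MORE_RELIABLE`, the adjustment sizes, the 0.0 to 1.0 scale, and the use of a plain mean are all assumptions made for this sketch.

```python
from statistics import mean

# Hypothetical adjustment tables: the claim only says some characters are
# "known to be" less or more accurately recognized by a given technique.
LESS_RELIABLE = {"l": 0.05, "1": 0.05, "I": 0.05, "O": 0.03, "0": 0.03}
MORE_RELIABLE = {"W": 0.02, "M": 0.02, "X": 0.02}


def adjust_character(char, accuracy):
    """Decrease or increase one character accuracy value, clamped to [0, 1]."""
    accuracy -= LESS_RELIABLE.get(char, 0.0)
    accuracy += MORE_RELIABLE.get(char, 0.0)
    return max(0.0, min(1.0, accuracy))


def word_grouping_accuracy(chars, accuracies):
    """Combine the adjusted character values (assumed here: plain mean)."""
    adjusted = [adjust_character(c, a) for c, a in zip(chars, accuracies)]
    return mean(adjusted)


def score_all(word_groupings):
    """Repeat the select, obtain, adjust, and calculate steps for every grouping."""
    return [(word, word_grouping_accuracy(word, accs)) for word, accs in word_groupings]
```

Each entry passed to `score_all` is assumed to be a `(word, character_accuracies)` pair produced by an upstream OCR step.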
decreasing the calculated word grouping accuracy value responsive to the selected word grouping contradicting one of the language rules.
-
-
4. The method of claim 1 in a system having one or more stored lists of word groupings, said method further comprising the steps of:
-
increasing the calculated word grouping accuracy value responsive to the selected word grouping matching one of the word groupings in the stored lists; and
decreasing the calculated word grouping accuracy value responsive to the selected word grouping not matching one of the word groupings in the stored lists.
-
-
5. The method of claim 4 wherein the stored lists of word groupings include lists of technical words, words found in a dictionary, foreign words, and trade words.
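The word-list adjustment of claims 4 and 5 might look like the following Python sketch. The list names, the case-insensitive lookup, and the `bonus` and `penalty` sizes are hypothetical; the claims only state that a match increases the value and a miss decreases it.

```python
def adjust_for_word_lists(word, accuracy, word_lists, bonus=0.05, penalty=0.05):
    """Raise the word accuracy on a list match, lower it otherwise.

    word_lists maps a list name (e.g. "dictionary", "technical") to a set
    of lowercase words. bonus and penalty are assumed values.
    """
    found = any(word.lower() in words for words in word_lists.values())
    adjusted = accuracy + bonus if found else accuracy - penalty
    return max(0.0, min(1.0, adjusted))


# Example stored lists (illustrative content only).
STORED_LISTS = {
    "dictionary": {"invoice", "total"},
    "technical": {"ocr", "binarize"},
}
```

With these lists, a correctly recognized "Invoice" gains confidence, while the misrecognition "lnvoice" loses it.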
-
6. The method of claim 1 further comprising the step of:
-
displaying the selected word grouping and the calculated word grouping accuracy value; and
the repeating step further comprises repeating the selecting, obtaining, calculating, and displaying steps for each created word grouping.
-
-
7. The method of claim 1 wherein the calculating a word grouping accuracy value step comprises the substeps of:
-
determining a minimum character accuracy value of the obtained character accuracy values; and
setting the word grouping accuracy value equal to the determined minimum character accuracy value.
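The minimum rule of claim 7 is small enough to sketch directly in Python. The function name and the choice to reject an empty grouping are assumptions of this sketch.

```python
def word_accuracy_minimum(char_accuracies):
    """Word grouping accuracy equals the least accurate character's value."""
    if not char_accuracies:
        raise ValueError("a word grouping needs at least one character")
    return min(char_accuracies)
```

Under this rule a single doubtful character dominates: values of 0.99, 0.62, and 0.97 yield a word accuracy of 0.62.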
-
-
8. The method of claim 1 in a system which uses an indexing list composed of word groupings to search for and retrieve documents, said method further comprising the step of:
adding to the indexing list word groupings whose accuracy values exceed a threshold accuracy level.
-
9. The method of claim 8 further comprising the step of:
responsive to none of the word groupings having an accuracy value which exceeds the threshold accuracy level, displaying an option to set the threshold accuracy level to a different value.
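A minimal Python sketch of the indexing behavior in claims 8 and 9 follows. The data layout (pairs of word and accuracy) and the wording of the prompt are assumptions; the claims only require that word groupings above the threshold are indexed and that an option to change the threshold is displayed when none qualify.

```python
def build_index(scored_words, threshold):
    """Keep only word groupings whose accuracy exceeds the threshold."""
    return [word for word, accuracy in scored_words if accuracy > threshold]


def index_or_prompt(scored_words, threshold):
    """Return (index, prompt); prompt is None unless the index is empty."""
    index = build_index(scored_words, threshold)
    if not index:
        prompt = (
            f"No word grouping exceeds the threshold {threshold}. "
            "Set a different threshold accuracy level?"
        )
        return index, prompt
    return index, None
```

A caller would display the returned prompt to the user and rerun `index_or_prompt` with the newly chosen threshold.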
-
10. The method of claim 1 wherein there are multiple threshold accuracy levels, and visual quality symbols associated with each level, said method further comprising the steps of:
-
comparing the obtained accuracy values of the recognized characters to the multiple threshold accuracy levels; and
assigning a visual identifier associated with a threshold closest in value to the recognized character's accuracy level.
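The closest-threshold lookup of claim 10 can be sketched as below. The three levels and their symbols in `QUALITY_LEVELS` are hypothetical; the claim does not specify how many levels exist or what the visual identifiers look like.

```python
# Hypothetical (threshold, symbol) pairs; any number of levels would work.
QUALITY_LEVELS = [(0.50, "?"), (0.80, "~"), (0.95, "+")]


def visual_identifier(accuracy, levels=QUALITY_LEVELS):
    """Return the symbol whose threshold is closest to the accuracy value."""
    _, symbol = min(levels, key=lambda level: abs(level[0] - accuracy))
    return symbol


def annotate(chars, accuracies, levels=QUALITY_LEVELS):
    """Pair every recognized character with its visual identifier."""
    return [(c, visual_identifier(a, levels)) for c, a in zip(chars, accuracies)]
```

When an accuracy value lies exactly midway between two thresholds, this sketch picks the level listed first; the claim does not address ties.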
-
-
11. A computer-implemented method for generating word grouping accuracy values in a system wherein data are received, characters are recognized within the received data, character accuracy values are generated from the recognized characters, and word groupings are created from the recognized characters, said method comprising the steps of:
-
selecting one of the created word groupings;
obtaining character accuracy values for the characters within the selected word grouping;
calculating a word grouping accuracy value based upon the obtained character accuracy values, comprising the substeps of:
selecting a character within the selected word grouping;
determining whether the character accuracy value of the selected character exceeds a threshold accuracy level;
responsive to the character accuracy value exceeding the threshold accuracy level, assigning a “one” to the character;
responsive to determining that the character accuracy value does not exceed a threshold accuracy level, assigning a “zero” to the character;
repeating the selecting, determining, assigning a “one”, and assigning a “zero” substeps for each character in the selected word grouping; and
determining a word grouping accuracy value by the logical combination of characters assigned a “one” and the total number of characters in the selected word grouping; and
repeating the selecting, obtaining, and calculating steps for each created word grouping.
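The binarization in claim 11 can be illustrated with a short Python sketch. The claim leaves the "logical combination" of the ones and the character count unspecified; this sketch assumes it is the fraction of characters assigned a one, which is only one possible reading.

```python
def binarize(char_accuracies, threshold):
    """Assign 1 to characters whose accuracy exceeds the threshold, else 0."""
    return [1 if accuracy > threshold else 0 for accuracy in char_accuracies]


def word_accuracy_binarized(char_accuracies, threshold):
    """Assumed combination: number of ones divided by the character count."""
    bits = binarize(char_accuracies, threshold)
    return sum(bits) / len(bits)
```

For character values 0.99, 0.70, 0.95, and 0.97 with a threshold of 0.90, three of four characters pass, giving 0.75 under this reading.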
-
-
13. A computer-implemented method for generating word grouping accuracy values in a system wherein data are received, characters are recognized within the received data, character accuracy values are generated from the recognized characters, and word groupings are created from the recognized characters, said method comprising the steps of:
-
selecting one of the created word groupings;
obtaining character accuracy values for the characters within the selected word grouping;
calculating a word grouping accuracy value based upon the obtained character accuracy values by the substeps of:
responsive to determining that all of the character accuracy values are at least equal to a threshold accuracy value, calculating the word grouping accuracy value of the selected word grouping by calculating an average of the character accuracy values;
responsive to determining that at least one of the character accuracy values is less than the threshold accuracy value, calculating the word grouping accuracy value by dividing the number of characters that do not have an accuracy value that exceeds the threshold by one hundred, and subtracting the result from the threshold accuracy value; and
repeating the selecting, obtaining, and calculating steps for each created word grouping.
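The modified averaging of claim 13 translates almost directly into code. The sketch below assumes accuracy values and the threshold are expressed on a 0.0 to 1.0 scale, which the claim does not state; the function name is hypothetical.

```python
def word_accuracy_modified_average(char_accuracies, threshold):
    """Average when every character meets the threshold, else penalize.

    Assumes accuracy values and threshold share a 0.0-1.0 scale.
    """
    if all(accuracy >= threshold for accuracy in char_accuracies):
        return sum(char_accuracies) / len(char_accuracies)
    # Count characters whose accuracy does not exceed the threshold,
    # following the claim's wording for the penalty branch.
    not_exceeding = sum(1 for accuracy in char_accuracies if not accuracy > threshold)
    return threshold - not_exceeding / 100
```

With a threshold of 0.95, the values 0.99, 0.98, and 0.97 average to 0.98, while the values 0.99, 0.80, 0.90, and 0.97 contain two weak characters and yield 0.95 - 2/100 = 0.93. The result in the penalty branch always falls below the threshold, so one weak character cannot be masked by strong neighbors.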
-
-
14. A computer-implemented method for generating word grouping accuracy values in a system with multiple character recognition techniques, wherein data are received, characters are recognized within the received data, character accuracy values are generated from the recognized characters, word groupings are created from the recognized characters, and multiple versions of word groupings are created, each version being a version created from characters recognized by one of the character recognition techniques, said method comprising the steps of:
-
selecting one of the created word groupings;
selecting one of the versions of the selected word grouping;
obtaining character accuracy values for the selected version of the word grouping;
calculating a word grouping accuracy value for the selected version of the word grouping based upon the obtained character accuracy values; and
repeating the selecting one of the versions, obtaining character accuracy value, and calculating a word grouping accuracy value steps until all of the versions of the selected word grouping have been selected;
determining a most accurate version of the selected word grouping; and
repeating the selecting one of the created word groupings, selecting one of the versions of the selected word groupings, obtaining character accuracy value, calculating a word grouping accuracy value, and determining the most accurate version steps until all of the word groupings have been selected.
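The version selection of claim 14 can be sketched as follows. The data layout (one `(text, character_accuracies)` pair per recognition technique) and the default scoring by plain mean are assumptions; any of the word accuracy calculations could be passed in as `score`.

```python
from statistics import mean


def most_accurate_version(versions, score=mean):
    """Pick the (text, char_accuracies) version with the highest word accuracy."""
    return max(versions, key=lambda version: score(version[1]))


def select_best_versions(word_groupings, score=mean):
    """For every word grouping, keep the text of its most accurate version."""
    return [most_accurate_version(versions, score)[0] for versions in word_groupings]


# Illustrative input: two word groupings, each recognized by two techniques.
groupings = [
    [("invoice", [0.97, 0.95, 0.96, 0.98, 0.94, 0.97, 0.99]),
     ("lnvoice", [0.61, 0.95, 0.96, 0.98, 0.94, 0.97, 0.99])],
    [("t0tal", [0.90, 0.55, 0.90, 0.90, 0.90]),
     ("total", [0.90, 0.93, 0.90, 0.90, 0.90])],
]
```

Calling `select_best_versions(groupings)` on this input keeps "invoice" and "total", since each has the higher mean character accuracy among its versions.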
determining whether the character accuracy value exceeds a threshold accuracy value;
responsive to determining the character accuracy value exceeds a threshold accuracy value, assigning a “one” to the character;
responsive to determining the character accuracy value does not exceed a threshold accuracy value, assigning a “zero” to the character;
repeating the obtaining, determining and assigning steps for each character in the selected word grouping; and
determining a word grouping accuracy value by a logical combination of characters assigned a “one” and the total number of characters in the selected word grouping.
-
-
17. The method of claim 14 in a system where there is a predefined set of language rules that govern word groupings, said method further comprising the step of:
decreasing the calculated word grouping accuracy value responsive to the selected word grouping contradicting one of the language rules.
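A language-rule penalty such as the one in claim 17 might be sketched as below. The claims do not enumerate the language rules; the two regular-expression patterns and the `penalty` size are invented purely for illustration.

```python
import re

# Hypothetical patterns, each matching a violation of an assumed language rule.
RULE_VIOLATIONS = [
    re.compile(r"[A-Za-z]\d[A-Za-z]"),  # digit sandwiched between letters
    re.compile(r"[a-z][A-Z][a-z]"),     # lone capital inside a lowercase word
]


def adjust_for_language_rules(word, accuracy, violations=RULE_VIOLATIONS, penalty=0.10):
    """Decrease the word accuracy if the word contradicts any assumed rule."""
    if any(pattern.search(word) for pattern in violations):
        accuracy -= penalty
    return max(0.0, accuracy)
```

Under these assumed rules, "c0st" is penalized because a digit appears between letters, while "cost" keeps its value.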
-
18. The method of claim 14 in a system where there is at least one stored list of word groupings, said method further comprising the steps of:
-
increasing the calculated word grouping accuracy value responsive to the selected word grouping matching one of the word groupings in the at least one stored list; and
decreasing the calculated word grouping accuracy value responsive to the selected word grouping not matching one of the word groupings in the at least one stored list.
-
-
19. A computer-readable medium containing a computer program that calculates word grouping accuracy values from data received in a document imaging system, wherein data are received, characters are recognized within the received data by performing a character recognition technique, character accuracy values are generated from the recognized characters, and word groupings are created from the recognized characters, and the program causes the processor to select one of the created word groupings, obtain character accuracy values for the characters within the selected word grouping, decrease the obtained character accuracy values responsive to the recognized character being a character known to be less accurately recognized by the character recognition technique, increase the obtained character accuracy values responsive to the recognized character being a character known to be more accurately recognized by the character recognition technique, calculate a word grouping accuracy value based upon the obtained character accuracy values, and repeat selecting, obtaining, and calculating for each created word grouping.
-
20. In a system for generating word grouping accuracy values wherein data are received, characters are recognized within the received data, character accuracy values are generated from the recognized characters, and word groupings are created from the recognized characters, a computer apparatus comprising:
-
a data receiver, for receiving data;
coupled to the data receiver, a first memory, for storing the received data;
coupled to the first memory, a central processing unit, for performing:
selecting one of the created word groupings, obtaining character accuracy values for the characters within the selected word grouping, decreasing the obtained character accuracy values responsive to the recognized character being a character known to be less accurately recognized by the character recognition technique, increasing the obtained character accuracy values responsive to the recognized character being a character known to be more accurately recognized by the character recognition technique, calculating a word grouping accuracy value based upon the obtained character accuracy values, and repeating the selecting, obtaining, decreasing, increasing, and calculating steps for each created word grouping; and
coupled to the central processing unit, RAM, for temporarily storing the created word groupings.
-
Specification