Method and apparatus for displaying regions in a document image having a low recognition confidence
Abstract
A document image that is the source of Optical Character Recognition (OCR) output is displayed. Recognition confidence parameters are determined for regions of the document image corresponding to words in the OCR output. The regions are displayed in a manner (e.g., highlighted in various colors) that is indicative of the respective recognition confidence parameter. Preferably, a user can select a region of the displayed document image. When the region is selected, text of the OCR output corresponding to the selected region is displayed in a pop-up menu.
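The display behavior summarized in the abstract can be illustrated with a minimal sketch. It assumes a correlation table already exists and shows two operations: choosing a highlight color from a recognition confidence parameter, and looking up the OCR text for a selected point in the image. The names `TableEntry`, `highlight_color`, and `word_at`, the 0-to-1 confidence scale, and the color bands are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# (left, top, right, bottom) in image pixel coordinates; an assumed convention.
Box = Tuple[int, int, int, int]


@dataclass
class TableEntry:
    """One row of a hypothetical correlation table: a word and its image region."""
    word: str
    region: Box
    confidence: float  # assumed scale: 0.0 (unreliable) to 1.0 (reliable)


def highlight_color(confidence: float) -> Optional[str]:
    """Map a confidence value to a highlight color.

    The bands below are illustrative assumptions, not values from the patent.
    """
    if confidence < 0.5:
        return "red"
    if confidence < 0.8:
        return "yellow"
    return None  # confident enough: leave the region unhighlighted


def word_at(table: List[TableEntry], x: int, y: int) -> Optional[str]:
    """Return the OCR word whose region contains the selected point, if any."""
    for entry in table:
        left, top, right, bottom = entry.region
        if left <= x < right and top <= y < bottom:
            return entry.word
    return None


table = [
    TableEntry("rnodern", (0, 0, 70, 20), 0.35),
    TableEntry("text", (80, 0, 120, 20), 0.95),
]
for entry in table:
    print(entry.region, highlight_color(entry.confidence))
print(word_at(table, 10, 5))  # text that a pop-up menu could display
```

A linear scan is enough for a page-sized table; a spatial index would only matter for much larger documents.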
12 Claims
1. A method of OCR output error detection, comprising the steps of:
recognizing a plurality of characters in a document image;
determining words from a sequence of said plurality of characters;
determining regions of the document image that correspond to said words;
correlating said words to said regions of said document image in a correlation table;
determining a recognition confidence parameter for a plurality of words in said correlation table;
defining a threshold level for said recognition confidence parameter; and
displaying the regions of the document image containing a word having a recognition confidence parameter greater than said threshold level.
2. The method of claim 1, further comprising the steps of:
receiving input that selects a region in the document image;
determining a word from said correlation table that corresponds to said selected region; and
displaying the word corresponding to said region.
3. The method of claim 2, wherein the step of displaying the word includes the step of displaying the word in a pop-up menu.
4. The method of claim 1, further comprising the steps of:
determining a color for the regions having a recognition confidence parameter less than said threshold value; and
displaying the regions of the document image having said color.
5. An apparatus for OCR output error detection, comprising:
an OCR device for recognizing a plurality of characters in a document image;
means for determining words from a sequence of said plurality of characters;
means for determining regions of the document image that correspond to said words;
means for correlating said words to said regions of said document image in a correlation table;
means for determining a recognition confidence parameter for a plurality of words in said correlation table;
means for defining a threshold level for said recognition confidence parameter; and
a display for displaying the regions of the document image containing a word having a recognition confidence parameter greater than said threshold level.
6. The apparatus of claim 5, further comprising:
a cursor control for receiving input that selects a region in the document image; and
means for determining a word from said correlation table that corresponds to said selected region;
wherein the display displays the word corresponding to said region.
7. The apparatus of claim 6, wherein the display displays the word corresponding to said region in a pop-up menu.
8. The apparatus of claim 5, further comprising:
means for determining a color for the regions having a recognition confidence parameter less than said threshold value;
wherein the display displays the regions of the document image having said color.
9. A computer readable medium having sequences of instructions for OCR output error detection, said sequences of instructions including sequences of instructions for performing the steps of:
recognizing a plurality of characters in a document image;
determining words from a sequence of said plurality of characters;
determining regions of the document image that correspond to said words;
correlating said words to said regions of said document image in a correlation table;
determining a recognition confidence parameter for a plurality of words in said correlation table;
defining a threshold level for said recognition confidence parameter; and
displaying the regions of the document image containing a word having a recognition confidence parameter greater than said threshold level.
10. The computer readable medium of claim 9, wherein said sequences of instructions further include the steps of:
receiving input that selects a region in the document image;
determining a word from said correlation table that corresponds to said selected region; and
displaying the word corresponding to said region.
11. The computer readable medium of claim 10, wherein the step of displaying the word includes the step of displaying the word in a pop-up menu.
12. The computer readable medium of claim 9, wherein said sequences of instructions further include the steps of:
determining a color for the regions having a recognition confidence parameter less than said threshold value; and
displaying the regions of the document image having said color.
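The steps recited in the independent claims can be followed in a minimal sketch: recognized characters are grouped into words, each word's region is taken as the union of its character boxes, the words and regions are stored in a correlation table, and the table is filtered against a threshold. Several details are assumptions not stated in the claims: the `Char` and `Entry` types, whitespace as the word delimiter, and the minimum character score as the word-level recognition confidence parameter. Because claims 1, 5, and 9 recite "greater than" the threshold while claims 4, 8, and 12 recite "less than", the comparison is left as a parameter here.

```python
import operator
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# (left, top, right, bottom) in image pixel coordinates; an assumed convention.
Box = Tuple[int, int, int, int]


@dataclass
class Char:
    """A recognized character with its box and a hypothetical per-character score."""
    text: str
    box: Box
    confidence: float


@dataclass
class Entry:
    """One row of the correlation table: word, image region, word-level parameter."""
    word: str
    region: Box
    confidence: float


def union(boxes: Sequence[Box]) -> Box:
    """Smallest box enclosing all the given boxes."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


def build_correlation_table(chars: Sequence[Char]) -> List[Entry]:
    """Group characters into words at whitespace and correlate words to regions."""
    table: List[Entry] = []
    run: List[Char] = []
    sentinel = Char(" ", (0, 0, 0, 0), 1.0)  # flushes the final word
    for ch in list(chars) + [sentinel]:
        if ch.text.isspace():
            if run:
                table.append(Entry(
                    word="".join(c.text for c in run),
                    region=union([c.box for c in run]),
                    # Assumption: a word is only as reliable as its weakest character.
                    confidence=min(c.confidence for c in run),
                ))
                run = []
        else:
            run.append(ch)
    return table


def regions_to_display(
    table: Sequence[Entry],
    threshold: float,
    compare: Callable[[float, float], bool] = operator.gt,
) -> List[Box]:
    """Regions whose parameter satisfies `compare(parameter, threshold)`."""
    return [e.region for e in table if compare(e.confidence, threshold)]


chars = [
    Char("c", (0, 0, 10, 20), 0.9), Char("a", (10, 0, 20, 20), 0.4),
    Char("t", (20, 2, 30, 22), 0.95), Char(" ", (30, 0, 40, 20), 1.0),
    Char("d", (40, 0, 50, 20), 0.9), Char("o", (50, 0, 60, 20), 0.8),
    Char("g", (60, 0, 70, 20), 0.85),
]
table = build_correlation_table(chars)
print(regions_to_display(table, 0.5))               # "greater than" reading
print(regions_to_display(table, 0.5, operator.lt))  # "less than" reading
```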
Specification