System and method for enhanced character recognition accuracy by adaptive probability weighting
Abstract
A data processing system and method are disclosed for selecting which one of several character recognition programs should be used to optimize the accuracy of recognizing characters in a field in an image of a document form. The character form and field type of particular characters are taken into account, and an optimized selection is made in real time among the several candidate character recognition programs that could be applied. The resulting character recognition operation has its accuracy maximized for reading a wide variety of character forms and field types on preprinted forms.
25 Claims
1. In a data processing system, including a scanner, a plurality of character recognition means, a recognition station processor and a monitoring and correction station processor, a method for selecting which one of said plurality of character recognition means to use for recognizing a character in a field in a document image, comprising the steps of:
generating a character form confidence factor in the recognition station processor, for each of said plurality of character recognition means;
generating a field type confidence factor in the recognition station processor, for each of said plurality of character recognition means;
inputting an adaptive probability weighting factor from said monitoring and correction station processor to said recognition station processor, for each of said plurality of character recognition means;
generating a first guess character and first confidence value and a second guess character and second confidence value, using each of said plurality of character recognition means;
computing a recognition means choice confidence factor in said recognition station processor, as a product of said character form confidence factor, said field type confidence factor, and said adaptive probability weighting factor, times a difference between said first confidence value and said second confidence value; and
selecting one of said plurality of character recognition means in said data processing system, having a maximum value for said recognition means choice confidence factor.
Dependent claims: 2, 3, 4, 5
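The selection steps of claim 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `EngineResult` structure, the engine names, and every numeric value are hypothetical and do not come from the patent; only the product-times-difference rule and the maximum selection follow the claim wording.

```python
from dataclasses import dataclass


@dataclass
class EngineResult:
    """Output of one recognition engine for one character (hypothetical structure)."""
    name: str
    first_guess: str
    first_conf: float
    second_guess: str
    second_conf: float
    char_form_factor: float   # character form confidence factor
    field_type_factor: float  # field type confidence factor
    adaptive_weight: float    # adaptive probability weighting factor


def choice_confidence(r: EngineResult) -> float:
    """Product of the three factors times the gap between first and second confidence."""
    return (r.char_form_factor * r.field_type_factor * r.adaptive_weight
            * (r.first_conf - r.second_conf))


def select_engine(results):
    """Return the result whose choice confidence factor is largest."""
    return max(results, key=choice_confidence)


# Made-up values for two engines reading the same character.
results = [
    EngineResult("engine_a", "8", 0.90, "B", 0.70, 0.8, 0.9, 1.0),
    EngineResult("engine_b", "B", 0.85, "8", 0.40, 0.7, 0.8, 0.9),
]
best = select_engine(results)
print(best.name, best.first_guess)
```

With these made-up numbers, engine_a reports the higher first confidence, but engine_b wins the selection because the gap between its first and second guesses is much wider.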
6. In a data processing system, including a plurality of character recognition means, a recognition station processor and a monitoring and correction station processor, a method for selecting which one of said plurality of character recognition means to use for recognizing a character in a field in a document image, comprising the steps of:
inputting an adaptive probability weighting factor from said monitoring and correction station processor to said recognition station processor for each of said plurality of character recognition means;
generating a first guess character and first confidence value and a second guess character and second confidence value in said character recognition means, using each of said plurality of character recognition means;
computing a recognition means choice confidence factor in said recognition station processor, as a function of a product comprising an adaptive probability weighting factor times a difference between said first confidence value and said second confidence value; and
selecting in said recognition station processor one of said plurality of character recognition means, having a maximum value for said recognition means choice confidence factor.
7. In a data processing system, including a plurality of character recognition means and a recognition station processor, a method for selecting which one of said plurality of character recognition means to use for recognizing a character in a field in a document image, comprising the steps of:
generating a character form confidence factor in said recognition station processor, for each of said plurality of character recognition means;
generating a field type confidence factor in said recognition station processor, for each of said plurality of character recognition means;
generating a first guess character and first confidence value and a second guess character and second confidence value using each of said plurality of character recognition means;
computing a recognition means choice confidence factor in said recognition station processor, as a product of said character form confidence factor and said field type confidence factor, times a difference between said first confidence value and said second confidence value; and
selecting in said recognition station processor one of said plurality of character recognition means having a maximum value for said recognition means choice confidence factor.
8. In a data processing system, including a plurality of character recognition means, a recognition station processor and a monitoring and correction station processor, a method for selecting which one of said plurality of character recognition means to use for recognizing a character in a field in a document image, comprising the steps of:
inputting an adaptive probability weighting factor from said monitoring and correction station processor to said recognition station processor for each of said plurality of character recognition means;
generating a first guess character and first confidence value and a second guess character and second confidence value using each of said plurality of character recognition means;
computing a recognition means choice confidence factor in said recognition station processor as a product of said adaptive probability weighting factor, times a difference between said first confidence value and said second confidence value; and
selecting in said recognition station processor one of said plurality of character recognition means having a maximum value for said recognition means choice confidence factor.
Dependent claims: 9, 10, 11, 12
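The method claims differ mainly in which factors enter the product. Written out in notation that is assumed here for illustration and does not appear in the claims (engine index i, character form confidence factor F^cf, field type confidence factor F^ft, adaptive probability weighting factor P, first and second confidence values c_1 and c_2), the variants read roughly as follows. Claim 6 recites only "a function of" the claim 8 product, so it is not given a separate line.

```latex
\begin{align*}
C_i &= F^{\mathrm{cf}}_i \, F^{\mathrm{ft}}_i \, P_i \, (c_{i,1} - c_{i,2}) && \text{(claim 1)} \\
C_i &= F^{\mathrm{cf}}_i \, F^{\mathrm{ft}}_i \, (c_{i,1} - c_{i,2}) && \text{(claim 7)} \\
C_i &= P_i \, (c_{i,1} - c_{i,2}) && \text{(claim 8)} \\
i^{*} &= \arg\max_i \, C_i && \text{(selection step, all variants)}
\end{align*}
```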
13. In a data processing system, including a document image input means, a forms recognition means, a field extraction means, a recognition station processor, a monitoring and correction station processor and a plurality of character recognition means, a method for selecting which one of said plurality of character recognition means to use for a character in a field in a document image, comprising the steps of:
generating a character form confidence factor in said recognition station processor, for each of said plurality of character recognition means;
generating a field type confidence factor in said recognition station processor for each of said plurality of character recognition means;
generating an adaptive probability weighting factor in said recognition station processor, for each of said plurality of character recognition means;
generating a first guess character and first confidence value and a second guess character and second confidence value using each of said plurality of character recognition means;
computing an OCR engine choice confidence factor in said recognition station processor, as a product of said character form confidence factor, said field type confidence factor, and said adaptive probability weighting factor, times a difference between said first confidence value and said second confidence value;
selecting in said recognition station processor one of said plurality of character recognition means having a maximum value for said OCR engine choice confidence factor;
accumulating an error count for one of said plurality of character recognition means in said monitoring and correction station processor;
computing a new value for said adaptive probability weighting factor in said monitoring and correction station processor, for each of said plurality of character recognition means, by modifying said adaptive probability weighting factor with a value derived from said error count.
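Claim 13 says only that the new weighting factor is obtained by modifying the old one "with a value derived from said error count"; it does not state a formula. The sketch below therefore uses an assumed update rule, blending the old weight toward the observed accuracy, purely to make the feedback loop concrete. The function name, the `rate` parameter, and all counts are hypothetical.

```python
def update_weight(weight, error_count, chars_processed, rate=0.5):
    """Assumed rule (not from the patent): move the weight toward observed accuracy."""
    if chars_processed <= 0:
        return weight  # nothing observed yet, keep the old weight
    accuracy = 1.0 - error_count / chars_processed
    return (1.0 - rate) * weight + rate * accuracy


# Made-up error counts, as they might be accumulated at the monitoring
# and correction station for each engine.
weights = {"engine_a": 1.0, "engine_b": 1.0}
errors = {"engine_a": 5, "engine_b": 20}
processed = {"engine_a": 100, "engine_b": 100}

new_weights = {
    name: update_weight(weights[name], errors[name], processed[name])
    for name in weights
}
print(new_weights)
```

Under this assumed rule, the engine with more logged errors ends up with a lower weight, which in turn lowers its choice confidence factor in later selections.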
14. In a data processing system, including a plurality of character recognition means, an apparatus for selecting which one of said plurality of character recognition means to use for a character in a field in a document image, comprising:
a memory for storing a character form confidence factor for each of said plurality of character recognition means;
said memory storing a field type confidence factor for each of said plurality of character recognition means;
said memory storing an adaptive probability weighting factor for each of said plurality of character recognition means;
each of said plurality of character recognition means generating a first guess character and first confidence value and a second guess character and second confidence value;
a first processor means coupled to said memory and to at least one of said plurality of character recognition means, for computing a recognition means choice confidence factor, as a product of said character form confidence factor, said field type confidence factor, and said adaptive probability weighting factor, times a difference between said first confidence value and said second confidence value; and
said first processor means including means for selecting one of said plurality of character recognition means in said data processing system, having a maximum value for said recognition means choice confidence factor.
Dependent claims: 15, 16, 17, 18
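The memory recited in claim 14 can be pictured as a set of lookup tables read by the processor. The sketch below is a hypothetical layout, not the patent's: keying the factors by character form and field type, the fallback value of 1.0 for missing entries, and all names and numbers are assumptions made for illustration.

```python
class FactorMemory:
    """Hypothetical stand-in for the claimed memory holding per-engine factors."""

    def __init__(self):
        self.char_form = {}   # (engine, character form) -> confidence factor
        self.field_type = {}  # (engine, field type) -> confidence factor
        self.adaptive = {}    # engine -> adaptive probability weighting factor

    def choice_confidence(self, engine, char_form, field_type,
                          first_conf, second_conf):
        """Look up the stored factors and apply the claimed product rule."""
        f_cf = self.char_form.get((engine, char_form), 1.0)    # assumed default
        f_ft = self.field_type.get((engine, field_type), 1.0)  # assumed default
        p_w = self.adaptive.get(engine, 1.0)                   # assumed default
        return f_cf * f_ft * p_w * (first_conf - second_conf)


memory = FactorMemory()
memory.char_form[("engine_a", "hand_print")] = 0.6
memory.field_type[("engine_a", "numeric")] = 0.9
memory.adaptive["engine_a"] = 0.95

score = memory.choice_confidence("engine_a", "hand_print", "numeric", 0.9, 0.5)
print(round(score, 4))
```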
19. In a data processing system, including a plurality of character recognition means, a recognition station processor, and a monitoring and correction station processor, an apparatus for selecting which one of said plurality of character recognition means to use for a character in a field in a document image, comprising:
each of said plurality of character recognition means generating a first guess character and first confidence value and a second guess character and second confidence value;
means for inputting an adaptive probability weighting factor from said monitoring and correction station processor to said recognition station processor for each of said plurality of character recognition means;
a first processor means coupled to said memory and to each of said plurality of character recognition means, for computing a recognition means choice confidence factor, as a function of a product comprising an adaptive probability weighting factor times a difference between said first confidence value and said second confidence value;
selection means included in said first processor means, for selecting one of said plurality of character recognition means in said data processing system, having a maximum value for said recognition means choice confidence factor.
20. In a data processing system, including a plurality of character recognition means, an apparatus for selecting which one of said plurality of character recognition means to use for a character in a field in a document image, comprising:
a memory for storing a character form confidence factor for each of said plurality of character recognition means;
said memory storing a field type confidence factor for each of said plurality of character recognition means;
each of said plurality of character recognition means generating a first guess character and first confidence value and a second guess character and second confidence value;
a first processor means coupled to said memory and to each of said plurality of character recognition means, for computing a recognition means choice confidence factor, as a product of said character form confidence factor and said field type confidence factor, times a difference between said first confidence value and said second confidence value; and
selection means coupled to said first processor means, for selecting one of said plurality of character recognition means in said data processing system, having a maximum value for said recognition means choice confidence factor.
21. In a data processing system, including a plurality of character recognition means, an apparatus for selecting which one of said plurality of character recognition means to use for a character in a field in a document image, comprising:
a memory for storing an adaptive probability weighting factor for each of said plurality of character recognition means;
each of said plurality of character recognition means generating a first guess character and first confidence value and a second guess character and second confidence value;
a first processor means coupled to said memory and to each of said plurality of character recognition means, for computing a recognition means choice confidence factor, as a product of said adaptive probability weighting factor, times a difference between said first confidence value and said second confidence value;
selection means coupled to said first processor means, for selecting one of said plurality of character recognition means in said data processing system, having a maximum value for said recognition means choice confidence factor.
Dependent claims: 22, 23, 24, 25
Specification