Optical character recognition (OCR) engines having confidence values for text types
Abstract
An image of a known text sample having a text type is generated. The image of the known text sample is input into each OCR engine of a number of OCR engines. Output text corresponding to the image of the known text sample is received from each OCR engine. For each OCR engine, the output text received from the OCR engine is compared with the known text sample, to determine a confidence value of the OCR engine for the text type of the known text sample.
19 Claims
1. A method comprising:
for each known text sample of a plurality of known text samples, each known text sample having a text type,
  generating, by a processor, an image of the known text sample;
  for each optical character recognition (OCR) engine of a plurality of OCR engines,
    inputting the image of the known text sample, by the processor, into the OCR engine;
    receiving output text corresponding to the image of the known text sample, by the processor, from the OCR engine; and
    comparing the output text received from the OCR engine with the known text sample, by the processor, to determine a confidence value of the OCR engine for the text type of the known text sample.
Dependent claims: 2-12
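The calibration loop recited in claim 1 can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the claim does not say how the comparison yields a confidence value, so the sketch assumes a character-level similarity ratio (Python's `difflib.SequenceMatcher`) averaged per engine and text type. The names `render_image`, `calibrate`, `engine_a`, and `engine_b` are hypothetical, and the "image" is a plain string stand-in so the example runs without an imaging or OCR library.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Tuple

Image = str  # assumption: stand-in for pixel data so the sketch is self-contained
OcrEngine = Callable[[Image], str]


def render_image(text: str) -> Image:
    """Hypothetical renderer; a real system would rasterize the text."""
    return text


def calibrate(
    samples: List[Tuple[str, str]],
    engines: Dict[str, OcrEngine],
) -> Dict[Tuple[str, str], float]:
    """Return a confidence value per (engine name, text type).

    Assumption: confidence is the mean similarity between each engine's
    output and the known text, over all samples of that text type.
    """
    scores: Dict[Tuple[str, str], List[float]] = defaultdict(list)
    for known_text, text_type in samples:
        image = render_image(known_text)
        for name, engine in engines.items():
            output_text = engine(image)
            similarity = SequenceMatcher(None, output_text, known_text).ratio()
            scores[(name, text_type)].append(similarity)
    return {key: sum(vals) / len(vals) for key, vals in scores.items()}


# Toy engines (hypothetical): each makes one characteristic mistake.
def engine_a(image: Image) -> str:
    return image.replace("0", "O")  # confuses zero with capital O


def engine_b(image: Image) -> str:
    return image.replace("l", "1")  # confuses lowercase L with one


samples = [("hello world", "prose"), ("2020-10-05", "date")]
engines = {"A": engine_a, "B": engine_b}
confidence = calibrate(samples, engines)
```

With these toy engines, engine A reproduces the prose sample exactly but degrades on the date sample, and engine B shows the opposite pattern, so each ends up with a different confidence value per text type.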
13. A non-transitory computer-readable data storage medium having a computer program stored thereon for execution by a processor to perform a method comprising:
  receiving an image of unknown text having a text type;
  inputting the image of the unknown text into each optical character recognition (OCR) engine of a plurality of OCR engines, each OCR having a confidence value for the text type;
  receiving output text corresponding to the image of the unknown text from each OCR engine; and
  where the output text received from each OCR engine is not identical, selecting the output text to use as at least provisionally correct for the unknown text, based on the confidence values of the OCR engines for the text type of the unknown text.
Dependent claims: 14-16
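The selection step in claim 13 can be illustrated with a short sketch. The claim only requires that the choice be "based on the confidence values"; picking the output of the single highest-confidence engine is one possible policy and is an assumption here, as are the function name `select_output` and the dictionary shapes.

```python
from typing import Dict


def select_output(outputs: Dict[str, str], confidence: Dict[str, float]) -> str:
    """Pick the provisionally correct text for one image of unknown text.

    outputs:    engine name -> output text for the image
    confidence: engine name -> confidence value for the image's text type
    Assumed policy: when the engines disagree, trust the engine with the
    highest confidence value for this text type.
    """
    distinct = set(outputs.values())
    if len(distinct) == 1:
        # All engines agree, so there is nothing to arbitrate.
        return next(iter(distinct))
    best_engine = max(outputs, key=lambda name: confidence[name])
    return outputs[best_engine]


# Usage with made-up values for a "date" text type.
chosen = select_output(
    outputs={"A": "2O2O-1O-O5", "B": "2020-10-05"},
    confidence={"A": 0.6, "B": 0.95},
)
```

Other policies, such as confidence-weighted voting across engines, would also fit the claim language; the sketch uses the simplest one.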
17. A computing system comprising:
  a processor;
  a computer-readable data storage medium to store an image of unknown text having a text type;
  a plurality of optical character recognition (OCR) engines executable by the processor, each OCR engine having a confidence value for the text type, each OCR engine to generate output text corresponding to the image of the unknown text; and
  logic executable by the processor to, where the output text received from each OCR engine is not identical, select the output text to use as at least provisionally correct for the unknown text, based on the confidence values of the OCR engines for the text type of the unknown text.
Dependent claims: 18-19
Specification