OPTICAL CHARACTER RECOGNITION
Abstract
Optical character recognition is described in various implementations. In one example implementation, a method may include receiving a plurality of optical character recognition (OCR) outputs provided by a respective plurality of OCR engines, each of the plurality of OCR outputs being representative of text depicted in a portion of an electronic image. The method may also include identifying a document context associated with the electronic image, and generating an output character set by applying a character resolution model to resolve differences among the plurality of OCR outputs. The character resolution model may define a probability of character recognition accuracy for each of the plurality of OCR engines given the identified document context. The method may also include updating the character resolution model to generate an updated character resolution model such that subsequent generating of output character sets are based on the updated character resolution model.
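The resolution step described above can be sketched as accuracy-weighted voting. This is a minimal illustration, not the patented implementation: the `MODEL` table, its probabilities, the engine names, and the `resolve` function are all hypothetical, and the sketch assumes the OCR outputs have already been aligned character by character.

```python
from collections import defaultdict

# Hypothetical model: P(engine reads a character correctly | document context).
# The numbers are invented for illustration.
MODEL = {
    ("invoice", "engine_a"): 0.95,
    ("invoice", "engine_b"): 0.80,
    ("invoice", "engine_c"): 0.70,
}


def resolve(outputs, context, model, default=0.5):
    """Pick, per position, the character with the largest summed engine weight.

    Assumes all outputs are already aligned and of equal length.
    """
    lengths = {len(text) for text in outputs.values()}
    if len(lengths) != 1:
        raise ValueError("outputs must be aligned to equal length")
    resolved = []
    for position in range(lengths.pop()):
        scores = defaultdict(float)
        for engine, text in outputs.items():
            # Each engine votes for its character, weighted by its accuracy
            # estimate for the identified document context.
            scores[text[position]] += model.get((context, engine), default)
        resolved.append(max(scores, key=scores.get))
    return "".join(resolved)


outputs = {"engine_a": "T0tal", "engine_b": "Total", "engine_c": "Total"}
print(resolve(outputs, "invoice", MODEL))  # prints "Total"
```

At the second position the two weaker engines together (0.80 + 0.70) outweigh the stronger one (0.95), so `o` is chosen over `0`. Summing probabilities is only one possible combination rule; a log-odds weighting would be another.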
15 Claims
1. A method comprising:
receiving, at a computing system, a plurality of optical character recognition (OCR) outputs provided by a respective plurality of OCR engines, each of the plurality of OCR outputs being representative of text depicted in a portion of an electronic image;
identifying, using the computing system, a document context associated with the electronic image;
generating, using the computing system, an output character set by applying a character resolution model to resolve differences among the plurality of OCR outputs, the character resolution model defining a probability of character recognition accuracy for each of the plurality of OCR engines given the identified document context; and
updating, using the computing system, the character resolution model to generate an updated character resolution model such that subsequent generating of output character sets are based on the updated character resolution model.
Dependent claims: 2, 3, 4, 5, 6, 7.
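The updating step of claim 1 can be illustrated with a simple counting scheme. The claim does not state what drives the update, so this sketch assumes that verified text (for example, from a human reviewer) is available for some samples. The `AccuracyModel` class, its method names, and the Laplace smoothing are hypothetical choices, not taken from the patent.

```python
from collections import defaultdict


class AccuracyModel:
    """Tracks per-(context, engine) accuracy as smoothed correct/total counts."""

    def __init__(self):
        self.correct = defaultdict(int)
        self.total = defaultdict(int)

    def probability(self, context, engine):
        key = (context, engine)
        # Laplace smoothing: a pair with no observations starts at 0.5.
        return (self.correct[key] + 1) / (self.total[key] + 2)

    def update(self, context, outputs, verified_text):
        """Fold one verified sample into the counts (assumes aligned text)."""
        for engine, text in outputs.items():
            key = (context, engine)
            for got, expected in zip(text, verified_text):
                self.total[key] += 1
                self.correct[key] += int(got == expected)


model = AccuracyModel()
model.update("invoice", {"engine_a": "T0tal", "engine_b": "Total"}, "Total")
print(model.probability("invoice", "engine_a"))  # 5/7, about 0.714
print(model.probability("invoice", "engine_b"))  # 6/7, about 0.857
```

Because `probability` reads the current counts, any output character set generated after an `update` call is based on the updated estimates.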
8. A system comprising:
a processor resource;
a document analysis module, executable on the processor resource, to identify a document context associated with an electronic image;
a conflict resolution module, executable on the processor resource, to receive a plurality of optical character recognition (OCR) outputs provided by a respective plurality of OCR engines, each of the plurality of OCR outputs being representative of text depicted in a portion of the electronic image, and to generate an output document based on a character resolution model and the plurality of OCR outputs, the character resolution model defining a probability of character recognition accuracy for each of the plurality of OCR engines given the identified document context; and
a model updater module, executable on the processor resource, to generate an updated character resolution model for subsequent use by the conflict resolution module.
Dependent claims: 9, 10, 11, 12, 13, 14.
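One way to picture how the three modules of claim 8 fit together is a small pipeline object that holds the current model and hands it to each module in turn. Everything here is an assumption made for illustration: the `OcrPipeline` class, the stand-in `analyze`, `resolve`, and `update` functions, and in particular the choice to update the model from agreement with the system's own output.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

Model = Dict[Tuple[str, str], float]   # (context, engine) -> accuracy estimate
Outputs = Dict[str, str]               # engine name -> recognized text


@dataclass
class OcrPipeline:
    analyze: Callable[[bytes], str]                      # document analysis module
    resolve: Callable[[Outputs, str, Model], str]        # conflict resolution module
    update: Callable[[Model, str, Outputs, str], Model]  # model updater module
    model: Model = field(default_factory=dict)

    def process(self, image: bytes, outputs: Outputs) -> str:
        context = self.analyze(image)
        document = self.resolve(outputs, context, self.model)
        # The updated model replaces the old one for the next call.
        self.model = self.update(self.model, context, outputs, document)
        return document


# Stand-in modules, for illustration only.
def analyze(image: bytes) -> str:
    return "invoice"


def resolve(outputs: Outputs, context: str, model: Model) -> str:
    # Trust the engine with the highest estimate for this context.
    best = max(outputs, key=lambda engine: model.get((context, engine), 0.5))
    return outputs[best]


def update(model: Model, context: str, outputs: Outputs, document: str) -> Model:
    # Move each estimate toward 1.0 on agreement with the output, else toward 0.0.
    updated = dict(model)
    for engine, text in outputs.items():
        old = updated.get((context, engine), 0.5)
        updated[(context, engine)] = 0.9 * old + 0.1 * float(text == document)
    return updated


pipeline = OcrPipeline(analyze, resolve, update, model={("invoice", "engine_a"): 0.9})
print(pipeline.process(b"", {"engine_a": "Total", "engine_b": "T0tal"}))  # prints "Total"
```

Keeping the modules as plain callables means any of them can be swapped out without touching the others, which mirrors the separation the claim draws between analysis, resolution, and updating.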
15. A computer-readable storage medium storing instructions that, when executed, cause a processor resource to:
receive a plurality of optical character recognition (OCR) outputs provided by a respective plurality of OCR engines, each of the plurality of OCR outputs being representative of text depicted in a portion of an electronic image;
identify a document context associated with the electronic image;
generate an output character set by applying a character resolution model to resolve differences among the plurality of OCR outputs, the character resolution model defining a probability of character recognition accuracy for each of the plurality of OCR engines given the identified document context; and
update the character resolution model to generate an updated character resolution model such that subsequent generating of output character sets are based on the updated character resolution model.
Specification