Image-based character recognition
First Claim
1. A computer-implemented method, comprising:
under control of one or more computer systems configured with executable instructions, obtaining an image captured by at least one camera of a computing device;
analyzing the image to locate a region of text in the image;
selecting a subset of the image associated with the region of text;
binarizing the subset of the image associated with the region of text to yield a binarized region;
communicating the binarized region to a first text recognition engine and at least one second text recognition engine;
recognizing the text of the binarized region with the first text recognition engine to yield first recognized text;
recognizing, independently, the text of the binarized region with the at least one second text recognition engine to yield second recognized text, the first text recognition engine being different relative to the at least one second text recognition engine;
determining a first confidence score for the first recognized text from the first text recognition engine and at least a second confidence score for the second recognized text from the at least one second text recognition engine; and
applying a linear combination function of the first confidence score and the second confidence score to generate a consensus string of text comprising at least a portion of at least one of the first recognized text from the first text recognition engine or the second recognized text from the at least one second text recognition engine.
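The final step of the claim can be illustrated with a minimal sketch. The claim does not say how the linear combination is formed, so the weights `w1` and `w2`, the `Word` type, and the position-by-position alignment of the two engines' outputs are all assumptions made for illustration only. Here the sign of `w1 * c1 - w2 * c2` decides which engine's word enters the consensus string.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    confidence: float  # assumed to lie in [0, 1]


def consensus(first, second, w1=0.5, w2=0.5):
    """Build a consensus string from two aligned lists of recognized words.

    Assumption: both engines return the same number of words, in the same order.
    """
    chosen = []
    for a, b in zip(first, second):
        # Linear combination of the two confidence scores; its sign picks the word.
        score = w1 * a.confidence - w2 * b.confidence
        chosen.append(a.text if score >= 0 else b.text)
    return " ".join(chosen)


# Hypothetical outputs of two different engines for the same binarized region.
first = [Word("Call", 0.9), Word("555-O199", 0.4)]
second = [Word("Ca11", 0.3), Word("555-0199", 0.8)]
print(consensus(first, second))  # Call 555-0199
```

The result draws one word from each engine, which matches the claim's wording that the consensus string comprises "at least a portion of at least one of" the two recognized texts.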
Abstract
Various embodiments enable a device to perform tasks such as processing an image to recognize and locate text in the image, and providing the recognized text to an application executing on the device for performing a function (e.g., calling a number, opening an internet browser, etc.) associated with the recognized text. In at least one embodiment, processing the image includes substantially simultaneously or concurrently processing the image with at least two recognition engines, such as at least two optical character recognition (OCR) engines, running in a multithreaded mode. In at least one embodiment, the recognition engines can be tuned so that their respective processing speeds are roughly the same. Utilizing multiple recognition engines enables processing latency to be close to that of using only one recognition engine.
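The multithreaded mode described in the abstract might look roughly like the following sketch. The two engine functions are hypothetical stand-ins that return fixed results, not real OCR libraries, and the `(name, text, confidence)` tuple shape is an assumption. Each engine receives the same region and runs on its own worker thread.

```python
from concurrent.futures import ThreadPoolExecutor


def engine_a(region):
    # Hypothetical stand-in for a first OCR engine.
    return ("engine_a", "HELLO", 0.82)


def engine_b(region):
    # Hypothetical stand-in for a second, different OCR engine.
    return ("engine_b", "HELL0", 0.61)


def recognize_concurrently(region, engines):
    """Submit the same region to every engine and collect results in engine order."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = [pool.submit(engine, region) for engine in engines]
        return [future.result() for future in futures]


results = recognize_concurrently([[0, 1], [1, 0]], [engine_a, engine_b])
for name, text, confidence in results:
    print(name, text, confidence)
```

With engines tuned to similar speeds, the total wait is set by the slower engine rather than by the sum of both, which is the latency point the abstract makes. In CPython this overlap only materializes when the engines release the global interpreter lock, as native OCR libraries typically would.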
21 Claims
1. A computer-implemented method, comprising:
under control of one or more computer systems configured with executable instructions, obtaining an image captured by at least one camera of a computing device;
analyzing the image to locate a region of text in the image;
selecting a subset of the image associated with the region of text;
binarizing the subset of the image associated with the region of text to yield a binarized region;
communicating the binarized region to a first text recognition engine and at least one second text recognition engine;
recognizing the text of the binarized region with the first text recognition engine to yield first recognized text;
recognizing, independently, the text of the binarized region with the at least one second text recognition engine to yield second recognized text, the first text recognition engine being different relative to the at least one second text recognition engine;
determining a first confidence score for the first recognized text from the first text recognition engine and at least a second confidence score for the second recognized text from the at least one second text recognition engine; and
applying a linear combination function of the first confidence score and the second confidence score to generate a consensus string of text comprising at least a portion of at least one of the first recognized text from the first text recognition engine or the second recognized text from the at least one second text recognition engine.
Dependent claims: 2, 3, 4, 5
6. A computer-implemented method, comprising:
under control of one or more computer systems configured with executable instructions, analyzing an image to locate a region of text in the image;
selecting a subset of the image associated with the region of text in the image;
binarizing the subset of the image associated with the region of text in the image to yield a binarized region;
communicating the binarized region to a first recognition engine and a second recognition engine;
recognizing the text in the image with the first recognition engine to yield first recognized text;
recognizing the text in the image with the second recognition engine to yield second recognized text, the first recognition engine being different relative to the second recognition engine;
determining a first confidence score for the first recognized text from the first recognition engine and a second confidence score for the second recognized text from the second recognition engine; and
applying a linear combination function of the first confidence score and the second confidence score to generate a consensus string of text comprising at least a portion of at least one of the first recognized text from the first recognition engine or the second recognized text from the second recognition engine.
Dependent claims: 7, 8, 9, 10, 11, 12
13. A computing device, comprising:
a processor;
a display screen; and
memory including instructions that, when executed by the processor, cause the computing device to:
analyze an image to locate a region of text in the image;
select a subset of the image associated with the region of text in the image;
binarize the subset of the image associated with the region of text in the image to yield a binarized region;
communicate the binarized region to a first text recognition engine and a second text recognition engine;
recognize the text in the image with a first text recognition engine to yield first recognized text and the second text recognition engine to yield second recognized text, the first text recognition engine being different relative to the second text recognition engine;
determine a first confidence score for the first recognized text from the first text recognition engine and a second confidence score for the second recognized text from the second text recognition engine; and
apply a linear combination function of the first confidence score and the second confidence score to generate a consensus string of text comprising at least a portion of at least one of the first recognized text from the first text recognition engine or the second recognized text from the second text recognition engine.
Dependent claims: 14, 15, 16
17. A non-transitory computer-readable storage medium storing instructions that, when executed by at least one processor of a computing device, cause the computing device to:
analyze an image to locate a region of text in the image;
select a subset of the image associated with the region of text in the image;
binarize the subset of the image associated with the region of text in the image to yield a binarized region;
communicate the binarized region to a first recognition engine and a second recognition engine;
recognize the text in the image with a first text recognition engine to yield first recognized text and a second text recognition engine to yield second recognized text, the first text recognition engine being different relative to the second text recognition engine;
determine a first confidence score for the first recognized text from the first text recognition engine and a second confidence score for the second recognized text from the second text recognition engine; and
apply a linear combination function of the first confidence score and the second confidence score to generate a consensus string of text comprising at least a portion of at least one of the first recognized text from the first text recognition engine or the second recognized text from the second text recognition engine.
Dependent claims: 18, 19, 20, 21
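The "select a subset" and "binarize" steps shared by all four independent claims can be sketched as follows. The claims do not name a thresholding method, so the mean-intensity threshold, the nested-list grayscale image, and the helper names `crop` and `binarize` are assumptions chosen to keep the example self-contained.

```python
def crop(image, top, left, height, width):
    """Select the subset of a grayscale image covered by a text bounding box."""
    return [row[left:left + width] for row in image[top:top + height]]


def binarize(region):
    """Map each pixel to 1 (dark, likely text) or 0 (light, likely background).

    Assumption: the threshold is the mean intensity of the region.
    """
    pixels = [p for row in region for p in row]
    threshold = sum(pixels) / len(pixels)
    return [[1 if p < threshold else 0 for p in row] for row in region]


# Hypothetical 4x4 grayscale image with a dark stroke inside a light border.
image = [
    [250, 250, 250, 250],
    [250, 20, 240, 250],
    [250, 30, 235, 250],
    [250, 250, 250, 250],
]
region = crop(image, top=1, left=1, height=2, width=2)
print(binarize(region))  # [[1, 0], [1, 0]]
```

Binarizing once and handing the same binarized region to every engine means the shared preprocessing is not repeated per engine.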
Specification