Method and system for detecting and recognizing text in images
Abstract
Various embodiments of the present invention relate to a method, system, and computer program product for detecting and recognizing text in images captured by cameras and scanners. First, a series of image-processing techniques is applied to detect text regions in the image. The detected text regions then pass through different processing stages that reduce blurring and the negative effects of variable lighting, producing multiple images that are versions of the same text region. Some of these versions are sent to a character-recognition system. The text recognized from each version is then combined into a single result, which is the detected text.
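The abstract does not name the technique used to counter variable lighting. As an illustration only, the sketch below uses local-mean (adaptive) thresholding, a common choice that is swapped in here as an assumption and is not taken from the patent. The function name `local_mean_threshold` and the parameters `radius` and `offset` are hypothetical.

```python
def local_mean_threshold(gray, radius=1, offset=0):
    """Binarize a grayscale image (list of rows of ints).

    A pixel becomes foreground (1) when it is darker than the mean of its
    (2*radius+1)-square neighbourhood minus `offset`. This is an assumed
    technique for illustration, not the method described in the patent.
    """
    rows, cols = len(gray), len(gray[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            # Collect the neighbourhood, clipped at the image border.
            window = [
                gray[ny][nx]
                for ny in range(max(0, y - radius), min(rows, y + radius + 1))
                for nx in range(max(0, x - radius), min(cols, x + radius + 1))
            ]
            mean = sum(window) / len(window)
            out[y][x] = 1 if gray[y][x] < mean - offset else 0
    return out


# Left half brightly lit, right half in shadow; one dark stroke pixel in each.
gray = [
    [200, 200, 200, 100, 100, 100],
    [200, 120, 200, 100,  20, 100],
    [200, 200, 200, 100, 100, 100],
]
binary = local_mean_threshold(gray, radius=1, offset=30)
```

Because each pixel is compared with its own neighbourhood, both stroke pixels are picked out even though the one in the bright half (120) is lighter than the shadowed background (100). A single global threshold could not separate them.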
19 Claims
1. A computer-implemented method comprising:
under the control of one or more computer systems configured with executable instructions, obtaining an input image that includes at least one image variation;
filtering and segmenting the input image;
creating a mask of the image from a plurality of bounding boxes, each bounding box at least partially enclosing a connected component;
intersecting the filtered and segmented input image with the mask to produce a first output image;
separately processing the filtered and segmented input image to create a binary chip for each of the plurality of bounding boxes to produce a second output image;
separately recognizing text in the first output image and each binary chip of the second output image using an optical character recognizer; and
combining the separately recognized text from the first output image and the second output image to produce a single output.
Dependent claims: 2, 3, 4
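The mask, intersection, and binary-chip steps of claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: images are plain lists of rows, connected components use 4-connectivity, and all function names (`connected_component_boxes`, `mask_from_boxes`, `intersect`, `binary_chips`) are hypothetical. The filtering, segmentation, and optical character recognition steps are left out.

```python
from collections import deque


def connected_component_boxes(binary):
    """Return one (top, left, bottom, right) box per 4-connected blob of 1-pixels."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill, tracking the extent of the component.
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def mask_from_boxes(shape, boxes):
    """Build a mask that is 1 inside every bounding box and 0 elsewhere."""
    rows, cols = shape
    mask = [[0] * cols for _ in range(rows)]
    for top, left, bottom, right in boxes:
        for y in range(top, bottom + 1):
            for x in range(left, right + 1):
                mask[y][x] = 1
    return mask


def intersect(image, mask):
    """Keep image pixels where the mask is set; zero out everything else."""
    return [[p if m else 0 for p, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


def binary_chips(binary, boxes):
    """Crop one small binary image (a 'chip') per bounding box."""
    return [[row[left:right + 1] for row in binary[top:bottom + 1]]
            for top, left, bottom, right in boxes]


segmented = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
filtered = [
    [9, 8, 7, 6, 5],
    [4, 3, 2, 1, 9],
    [8, 7, 6, 5, 4],
]
boxes = connected_component_boxes(segmented)
mask = mask_from_boxes((3, 5), boxes)
first_output = intersect(filtered, mask)   # whole image, non-text areas zeroed
chips = binary_chips(segmented, boxes)     # one crop per bounding box
```

In this sketch the first output keeps the full image geometry while the chips isolate each component, which mirrors the two separately recognized outputs that the claim later combines.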
5. A computer-implemented method comprising:
under the control of one or more computer systems configured with executable instructions, processing an input image using a first process to produce a first output image, the first output image including a plurality of bounding boxes, each bounding box at least partially enclosing a connected component;
separately processing the first output using a second process to generate a binary chip of each bounding box to produce a second output image;
recognizing text in the first output image and in the binary chip of each bounding box; and
comparing the recognized text from the plurality of bounding boxes of the first output image to corresponding binary chips of the second output image to produce a consensus output for text in the input image.
Dependent claims: 6, 7, 8, 9, 10, 11
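The claim does not say how the consensus is reached. One possible rule, shown here purely as an assumption, is to accept text on which both recognitions agree and otherwise take the candidate with the higher recognizer confidence. The function name `consensus` and the `(text, confidence)` result format are hypothetical and do not correspond to any particular OCR library.

```python
def consensus(first_results, chip_results):
    """Merge per-bounding-box OCR results from two processing paths.

    Each argument is a list of (text, confidence) pairs, one per bounding box,
    in the same box order. Agreement wins outright; on disagreement the
    higher-confidence candidate is kept (an assumed tie-break rule).
    """
    merged = []
    for (text_a, conf_a), (text_b, conf_b) in zip(first_results, chip_results):
        if text_a == text_b:
            merged.append(text_a)
        elif conf_a >= conf_b:
            merged.append(text_a)
        else:
            merged.append(text_b)
    # Drop boxes where nothing was recognized, then join in reading order.
    return " ".join(word for word in merged if word)


first = [("HELLO", 0.91), ("W0RLD", 0.40), ("", 0.0)]
chips = [("HELLO", 0.88), ("WORLD", 0.75), ("!", 0.60)]
result = consensus(first, chips)
```

Other combination rules, such as character-level voting across more than two versions, would fit the same interface.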
12. A computing system, comprising:
at least one processor; and
memory including instructions that, when executed by the at least one processor, cause the computing system to:
process an input image using a first process to produce a first output image, the first output image including a plurality of bounding boxes, each bounding box at least partially enclosing a connected component;
separately process the first output image using a second process to generate a binary chip of each bounding box to produce a second output image;
recognize text in the first output image and in the binary chip of each bounding box; and
compare the recognized text from the plurality of bounding boxes of the first output image to corresponding binary chips of the second output image to produce a consensus output for text in the input image.
Dependent claims: 13, 14, 15
16. A non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to:
obtain an input image that includes at least one image defect;
process the input image using a first processing technique to produce a first output image, the first output image including a plurality of bounding boxes, each bounding box at least partially enclosing a connected component;
separately process the first output image using a second process to generate a binary chip of each bounding box to produce a second output image;
recognize text in the first output image and in the binary chip of each bounding box; and
compare the recognized text from the plurality of bounding boxes of the first output image to corresponding binary chips of the second output image to produce a consensus output for text in the input image.
Dependent claims: 17, 18, 19
Specification