Method and system for detecting and recognizing text in images
Abstract
Various embodiments of the present invention relate to a method, system and computer program product for detecting and recognizing text in the images captured by cameras and scanners. First, a series of image-processing techniques is applied to detect text regions in the image. Subsequently, the detected text regions pass through different processing stages that reduce blurring and the negative effects of variable lighting. This results in the creation of multiple images that are versions of the same text region. Some of these multiple versions are sent to a character-recognition system. The resulting texts from each of the versions of the image sent to the character-recognition system are then combined to a single result, wherein the single result is detected text.
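The abstract does not say which processing stages produce the multiple versions of a text region. As a minimal sketch only, assuming a gray-level chip stored as rows of 0-255 integers, one simple lighting correction (linear contrast stretching) followed by two different binarization thresholds yields two versions of the same region. The function names and the choice of thresholds are illustrative assumptions, not taken from the patent.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 intensities (assumed representation)


def stretch_contrast(chip: Gray) -> Gray:
    """Linearly rescale intensities to span 0-255 to offset uneven lighting."""
    flat = [p for row in chip for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A flat chip has no contrast to stretch; return a copy unchanged.
        return [row[:] for row in chip]
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in chip]


def binarize(chip: Gray, threshold: int) -> Gray:
    """Mark pixels at or above the threshold with 1, all others with 0."""
    return [[1 if p >= threshold else 0 for p in row] for row in chip]


def make_versions(chip: Gray) -> List[Gray]:
    """Return two binary versions of one chip (hypothetical choice of stages)."""
    stretched = stretch_contrast(chip)
    pixel_count = sum(len(row) for row in stretched)
    mean = sum(p for row in stretched for p in row) // pixel_count
    # Version 1 uses a fixed mid-range threshold, version 2 the chip's own mean.
    return [binarize(stretched, 128), binarize(stretched, mean)]
```

Each version would then be passed to the character recognizer separately, so that a character lost under one threshold may still be recovered under the other.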
25 Claims
1. A computer-implemented method for detecting and recognizing text in an image, the method comprising:
under the control of one or more computer systems configured with executable instructions, processing an input image to form an output image by filtering and segmenting the input image and intersecting the filtered and segmented input image with a mask created from a plurality of bounding boxes, each bounding box enclosing a connected component, each connected component including a plurality of pixels comprising the image and connected on the basis of a predetermined pixel intensity and predefined distance between the pixels, the output image comprising text regions detected by said filtering, segmenting, and intersecting;
separately processing the input image to create at least one binary chip, each binary chip comprising one detected text region;
recognizing the text in each binary chip from the detected text region comprising the binary chip using an optical character recognizer;
separately and independently recognizing the text from the text regions comprising the output image resulting from processing the input image using the optical character recognizer; and
logically combining the separately recognized text by taking the logical OR of the recognized text to form a single output producing detected text.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
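The detection step of claim 1 can be illustrated with a small sketch. It assumes one possible reading of the claim language: two pixels belong to the same connected component when both meet a minimum intensity and lie within a fixed Chebyshev distance of each other. The filtered and segmented image is taken as a ready-made input, since the claim does not specify those operations. All names and parameters below are hypothetical.

```python
from collections import deque
from typing import List, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # top, left, bottom, right (inclusive)


def connected_component_boxes(img: Image, min_intensity: int, max_dist: int) -> List[Box]:
    """Return one bounding box per connected component, in scan order."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes: List[Box] = []
    for r in range(h):
        for c in range(w):
            if seen[r][c] or img[r][c] < min_intensity:
                continue
            # Breadth-first search from this seed pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                # Visit every qualifying pixel within max_dist of (y, x).
                for ny in range(max(0, y - max_dist), min(h, y + max_dist + 1)):
                    for nx in range(max(0, x - max_dist), min(w, x + max_dist + 1)):
                        if not seen[ny][nx] and img[ny][nx] >= min_intensity:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def mask_from_boxes(boxes: List[Box], h: int, w: int) -> Image:
    """Build a 0/1 mask that is 1 inside every bounding box."""
    mask = [[0] * w for _ in range(h)]
    for top, left, bottom, right in boxes:
        for y in range(top, bottom + 1):
            for x in range(left, right + 1):
                mask[y][x] = 1
    return mask


def intersect(segmented: Image, mask: Image) -> Image:
    """Keep segmented pixels only where the mask is set."""
    return [[s if m else 0 for s, m in zip(srow, mrow)]
            for srow, mrow in zip(segmented, mask)]
```

A larger `max_dist` merges nearby strokes into one box, which is one way separate characters of a word could end up in a single text region.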
13. A system for detecting and recognizing text in an image, the system comprising:
a text-detection module, the text-detection module configured to detect text regions in an image by filtering and segmenting the image and intersecting the filtered and segmented image with a mask created from a plurality of bounding boxes, each bounding box enclosing a connected component, each connected component including a plurality of pixels comprising the image and connected on the basis of a predetermined pixel intensity and predefined distance between the pixels to form an output image and to form at least one gray-level image chip;
a chip-processing module, the chip-processing module configured to process the at least one gray-level image chip to form at least one binary chip; and
an optical character-recognizer module, the optical character-recognizer module configured to;
recognize the text in the at least one binary chip using an optical character-recognizer;
separately and independently recognize the text in the output image using the optical character-recognizer; and
logically combining the separately recognized text by taking the logical OR of the recognized text to form a single output producing detected text.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
21. A non-transitory computer program product for use with a computer, the computer program product comprising a computer usable medium having a computer-readable program code embodied therein for detecting and recognizing text in an image, the computer program product performing:
processing an image to form an output image by filtering and segmenting the image and intersecting the filtered and segmented image with a mask created from a plurality of bounding boxes, each bounding box enclosing a connected component, each connected component including a plurality of pixels comprising the image and connected on the basis of a predetermined pixel intensity and predefined distance between the pixels, the output image comprising text regions detected by said filtering, segmenting, and intersecting;
processing the image to create at least one binary chip, each of the at least one binary chips comprising one detected text region;
recognizing the text in each of the at least one binary chips using an optical character recognizer;
separately and independently recognizing the text from the text regions comprising the output image resulting from processing the input image using the optical character recognizer; and
logically combining the separately recognized text by taking the logical OR of the recognized text to form a single output producing detected text.
Dependent claims: 22, 23, 24, 25
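All three independent claims end by taking the "logical OR" of the separately recognized text, but the claims do not define that operation for strings. One plausible reading, offered here only as an assumption, is a union: a word appears in the single output if either recognition path produced it. The sketch below uses a hypothetical helper `combine_or` that matches words exactly and preserves first-seen order.

```python
from typing import Iterable, List


def combine_or(*results: Iterable[str]) -> List[str]:
    """Union of recognized words across OCR passes, in first-seen order."""
    combined: List[str] = []
    seen = set()
    for words in results:
        for word in words:
            # A word is kept if ANY pass recognized it (OR semantics).
            if word not in seen:
                seen.add(word)
                combined.append(word)
    return combined


# Words recognized from the binary chips and from the output image, respectively.
from_chips = ["EXIT", "ONLY"]
from_output_image = ["ONLY", "AHEAD"]
detected_text = combine_or(from_chips, from_output_image)
```

Under this reading, text missed by one recognition path is still reported as long as the other path found it, at the cost of also passing through any false positives from either path.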
Specification