System and method for detecting text in real-world color images
First Claim
1. A method of detecting text in real-world images comprising:
dividing an image representing a real-world scene into one or more regions;
calculating a cascade of classifiers, the cascade comprising a plurality of stages, each stage including one or more weak classifiers, the plurality of stages organized to start out with classifiers that are most useful for ruling out non-text regions of the image;
feeding the one or more regions into the cascade; and
removing regions of the image classified as the non-text regions from the cascade prior to completion of the cascade to avoid subsequent processing of the removed regions;
utilizing a binarization process including classifying individual pixels as one of: non-text, light potential-text, and dark potential-text based on one or more factors including:
a number of pixels in the connected component;
a number of pixels on the border of the connected component;
a height of the connected component;
a width of the connected component;
a ratio of the height of the connected component to the width of the connected component;
a ratio of the pixels in the connected component to the width of the connected component multiplied by the height of the connected component;
a local size of text in the connected component;
outputting binarization output data.
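The geometric factors listed in the claim can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the patented implementation: the representation of a connected component as a set of (row, col) coordinates, the 4-neighbour definition of a border pixel, and the names `ComponentFeatures` and `component_features` are all hypothetical. The "local size of text" factor is left out because the claim does not say how it is measured.

```python
from dataclasses import dataclass


@dataclass
class ComponentFeatures:
    pixel_count: int     # number of pixels in the connected component
    border_count: int    # pixels with at least one 4-neighbour outside the component
    height: int          # bounding-box height in pixels
    width: int           # bounding-box width in pixels
    aspect_ratio: float  # height / width
    fill_ratio: float    # pixel_count / (width * height)


def component_features(pixels):
    """Compute the claim's geometric factors for one connected component.

    `pixels` is a set of (row, col) tuples (an assumed representation).
    """
    if not pixels:
        raise ValueError("component must contain at least one pixel")
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    # Assumption: a border pixel is one whose up/down/left/right neighbour
    # is missing from the component.
    border = sum(
        1
        for (r, c) in pixels
        if any(
            n not in pixels
            for n in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
        )
    )
    return ComponentFeatures(
        pixel_count=len(pixels),
        border_count=border,
        height=height,
        width=width,
        aspect_ratio=height / width,
        fill_ratio=len(pixels) / (width * height),
    )
```

A binarization step could compare these values against limits to decide whether the pixels of a component remain potential text; the claim leaves the actual decision rule open.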
2 Assignments
0 Petitions
Abstract
A method and apparatus for detecting text in real-world images comprises calculating a cascade of classifiers, the cascade comprising a plurality of stages, each stage including one or more weak classifiers, the plurality of stages organized to start out with classifiers that are most useful for ruling out non-text regions, and removing regions classified as non-text regions from the cascade prior to completion of the cascade, to further speed up processing.
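The early-rejection idea in the abstract can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes each stage sums the scores of its weak classifiers and compares the sum with a stage threshold, and the name `run_cascade` and the (weak_classifiers, threshold) stage format are hypothetical.

```python
def run_cascade(regions, stages):
    """Pass regions through a cascade and drop rejected regions early.

    `stages` is a list of (weak_classifiers, threshold) pairs. Each weak
    classifier is a callable mapping a region to a numeric score. A region
    survives a stage when the summed score reaches the threshold (an
    assumed combination rule).

    Returns (surviving_regions, weak_classifier_evaluations).
    """
    survivors = list(regions)
    evaluations = 0
    for weak_classifiers, threshold in stages:
        kept = []
        for region in survivors:
            score = sum(clf(region) for clf in weak_classifiers)
            evaluations += len(weak_classifiers)
            if score >= threshold:
                kept.append(region)
        # Rejected regions are never seen by later stages.
        survivors = kept
        if not survivors:
            break
    return survivors, evaluations
```

With three regions and two single-classifier stages, a region rejected by the first stage costs one evaluation instead of two, which is why the stages that rule out the most non-text regions are placed first.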
45 Citations
20 Claims
1. A method of detecting text in real-world images comprising:
dividing an image representing a real-world scene into one or more regions;
calculating a cascade of classifiers, the cascade comprising a plurality of stages, each stage including one or more weak classifiers, the plurality of stages organized to start out with classifiers that are most useful for ruling out non-text regions of the image;
feeding the one or more regions into the cascade; and
removing regions of the image classified as the non-text regions from the cascade prior to completion of the cascade to avoid subsequent processing of the removed regions;
utilizing a binarization process including classifying individual pixels as one of: non-text, light potential-text, and dark potential-text based on one or more factors including:
a number of pixels in the connected component;
a number of pixels on the border of the connected component;
a height of the connected component;
a width of the connected component;
a ratio of the height of the connected component to the width of the connected component;
a ratio of the pixels in the connected component to the width of the connected component multiplied by the height of the connected component;
a local size of text in the connected component;
outputting binarization output data.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9.
10. A system for detecting text in real-world images comprising:
a processor including:
a dividing logic to divide an image into one or more regions;
a calculating logic to calculate a cascade of classifiers, the cascade comprising a plurality of stages, each stage including one or more weak classifiers, wherein the plurality of stages is organized to start out with classifiers that are most useful for ruling out non-text regions;
a feeding logic to feed the one or more regions into the cascade;
remove non-text image regions logic to remove image regions classified as the non-text regions from the cascade prior to completion of the cascade, to avoid subsequent processing of the removed regions;
binarization logic including logic to classify individual pixels as one of: non-text, light potential-text, and dark potential-text; and
a training system including:
a feed logic to feed training images into the cascade;
a comparison logic to compare classifier results to known training image results; and
an adapting logic to adapt one or more of an order of stages in cascade of classifiers, an order of classifiers in the stages, one or more classifiers confidence level thresholds, and the classifiers by selecting features for each classifier that reduce the number of false positive and false negative detections by a reduced number of tests.
Dependent claims: 11, 12.
13. A non-transitory computer readable medium storing instructions thereon which, when executed by a system, cause the system to perform a method comprising:
dividing an image into one or more regions;
calculating a cascade of classifiers, the cascade comprising a plurality of stages, each stage including one or more weak classifiers, wherein the plurality of stages is organized to start out with classifiers that are most useful for ruling out non-text regions of the image;
receiving training images;
feeding the training images into the cascade;
comparing classifier results to known training image results;
adapting one or more of an order of stages in the cascade, an order of classifiers in the stages, one or more classifier confidence level thresholds, and the classifiers by selecting features for each classifier that reduce a number of false positive and false negative detections by a reduced number of tests;
feeding the one or more regions into the cascade; and
removing regions of the image classified as the non-text regions from the cascade prior to completion of the cascade to avoid subsequent processing of the removed regions.
Dependent claims: 14, 15.
16. A method of detecting text in real-world images comprising:
dividing an image representing a real-world scene into one or more regions;
calculating a cascade of classifiers, the cascade comprising a plurality of stages, each stage including one or more weak classifiers, the plurality of stages organized to start out with classifiers that are most useful for ruling out non-text regions of the image;
receiving training images;
feeding the training images into the cascade;
comparing classifier results to known training image results; and
adapting one or more of an order of stages in the cascade, an order of classifiers in the stages, one or more classifier confidence level thresholds, and the classifiers by selecting features for each classifier that reduce a number of false positive and false negative detections by a reduced number of tests;
feeding the one or more regions into the cascade; and
removing regions of the image classified as the non-text regions from the cascade prior to completion of the cascade to avoid subsequent processing of the removed regions; and
displaying a result.
Dependent claims: 17, 18, 19, 20.
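Claims 10, 13, and 16 recite comparing classifier results with known training results and adapting, among other things, the classifier confidence level thresholds. One way to picture the threshold part is the sketch below. It is an assumption-laden illustration, not the procedure the patent describes: it uses a common cascade-training heuristic (pick the highest stage threshold that still keeps a target fraction of known text samples, then measure the resulting error rates), and the names `pick_stage_threshold`, `error_rates`, and `min_detection_rate` are hypothetical.

```python
import math


def pick_stage_threshold(text_scores, min_detection_rate=0.99):
    """Return the largest threshold that keeps the target share of text samples.

    `text_scores` are stage scores for training regions known to contain
    text. A sample is kept when its score is >= the threshold.
    """
    if not text_scores:
        raise ValueError("need at least one text training score")
    ordered = sorted(text_scores, reverse=True)
    needed = max(1, math.ceil(min_detection_rate * len(ordered)))
    return ordered[needed - 1]


def error_rates(text_scores, non_text_scores, threshold):
    """Compare stage decisions with the known labels.

    Returns (false_negative_rate, false_positive_rate).
    """
    false_negatives = sum(s < threshold for s in text_scores)
    false_positives = sum(s >= threshold for s in non_text_scores)
    return (
        false_negatives / len(text_scores),
        false_positives / len(non_text_scores),
    )
```

Lowering `min_detection_rate` raises the threshold, which rejects more non-text regions at that stage but also discards more true text; the other adaptations named in the claims (stage order, classifier order, feature selection) are not covered by this sketch.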
Specification