Recognizing text in a multicolor image
Abstract
A method and apparatus for recognizing text in a multicolor image stored in a computer. The image is separated into multiple blocks, and the color distributions of each of the blocks are analyzed. The blocks having two main colors are identified, and two-color blocks having similar colors are grouped into two-color zones. The two colors in each zone are converted to black and white to produce a black and white image. Text is identified in the two-color zones by performing optical character recognition of the black and white image.
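The first and last stages described in the abstract (block separation, two-color detection, and black-and-white conversion) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the function names, the exact-match color histogram, the `coverage` threshold, and the choice of the more common color as background are all hypothetical.

```python
from collections import Counter


def split_into_blocks(image, block_size):
    """Yield (block_row, block_col, pixels) for each tile of the image.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    """
    height, width = len(image), len(image[0])
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            pixels = [image[y][x]
                      for y in range(top, min(top + block_size, height))
                      for x in range(left, min(left + block_size, width))]
            yield top // block_size, left // block_size, pixels


def two_main_colors(pixels, coverage=0.9):
    """Return (most_common, second_most_common) if the two colors together
    account for at least `coverage` of the block, else None.

    Assumption: exact color matching; a real system would likely quantize.
    """
    counts = Counter(pixels).most_common(2)
    if len(counts) < 2:
        return None
    (color1, n1), (color2, n2) = counts
    if (n1 + n2) / len(pixels) < coverage:
        return None
    return color1, color2


def sq_dist(a, b):
    """Squared Euclidean distance between two colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def binarize(pixels, background, foreground):
    """Map each pixel to 0 (black) if nearer the foreground color,
    otherwise to 255 (white)."""
    return [0 if sq_dist(p, foreground) < sq_dist(p, background) else 255
            for p in pixels]
```

The binarized output would then be handed to an OCR engine, which is outside the scope of this sketch.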
24 Claims
1. A computer-implemented method recognizing text in a multicolor image stored in a computer, the multicolor image comprising a plurality of pixels, each pixel having an associated color, the method comprising:
- separating the image into multiple blocks;
- analyzing color distributions of the pixels in each of the blocks;
- identifying blocks having two main colors;
- grouping two-color blocks having similar colors into two-color zones;
- identifying text in the two-color zones;
- mapping the pixels of each block to a three-dimensional color space;
- defining, for each two-color block, a cylinder that encloses the pixels, the cylinder having a height and a radius;
- classifying a block as a text block if the ratio of the radius to the height is less than a predefined value,
wherein the text identifying step is performed in the text blocks.
Dependent claims: 2, 3, 4, 5, 6, 7
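The cylinder test recited in claim 1 can be illustrated with a short sketch. The claim does not say how the cylinder is constructed, so the following is an assumption: the axis runs through the block's two main colors, the height is the extent of the pixels along that axis, and the radius is the largest perpendicular distance of any pixel from it. The function names and the `max_ratio` default stand in for the claim's "predefined value" and are hypothetical.

```python
import math


def cylinder_ratio(pixels, color_a, color_b):
    """Return radius / height of a cylinder enclosing `pixels` in RGB space.

    Assumption: the cylinder axis is the line through color_a and color_b.
    """
    axis = [b - a for a, b in zip(color_a, color_b)]
    length = math.sqrt(sum(c * c for c in axis))
    if length == 0:
        raise ValueError("the two main colors must differ")
    unit = [c / length for c in axis]

    projections = []
    radius = 0.0
    for p in pixels:
        rel = [pc - ac for pc, ac in zip(p, color_a)]
        t = sum(r * u for r, u in zip(rel, unit))   # position along the axis
        perp_sq = sum(r * r for r in rel) - t * t   # squared distance from the axis
        radius = max(radius, math.sqrt(max(perp_sq, 0.0)))
        projections.append(t)

    height = max(projections) - min(projections)
    return radius / height if height > 0 else math.inf


def is_text_block(pixels, color_a, color_b, max_ratio=0.25):
    """Classify a block as text when its cylinder is long and thin."""
    return cylinder_ratio(pixels, color_a, color_b) < max_ratio
```

Under this construction, pixels that lie along the line between the two main colors give a thin cylinder and a small ratio, while a block with many unrelated colors gives a wide cylinder and fails the test.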
8. A computer program residing on a computer-readable medium for recognizing text in a multicolor image, the multicolor image comprising a plurality of pixels, each pixel having an associated color, the computer program comprising instructions for causing the computer to:
- separate the image into multiple blocks;
- analyze color distributions of each of the blocks;
- identify blocks having two main colors;
- group two-color blocks having similar colors into two-color zones; and
- identify text in the two-color zones;
- map the pixels of each block to a three-dimensional color space;
- define, for each two-color block, a cylinder that encloses the pixels, the cylinder having a height and a radius; and
- classify a block as a text block if the ratio of the radius to the height is less than a predefined value,
wherein the text is identified in the text blocks.
Dependent claims: 9, 10, 11, 12, 13, 14
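The step of grouping two-color blocks having similar colors into two-color zones could be sketched as a flood fill over the block grid. The claims require only that grouped blocks have similar colors; requiring the blocks to be adjacent, the per-channel tolerance `tol`, and all function names here are assumptions made for illustration.

```python
from collections import deque


def color_close(a, b, tol):
    """True if two colors differ by at most `tol` in every channel."""
    return all(abs(x - y) <= tol for x, y in zip(a, b))


def similar_pair(p, q, tol):
    """True if two color pairs match in either order."""
    return ((color_close(p[0], q[0], tol) and color_close(p[1], q[1], tol)) or
            (color_close(p[0], q[1], tol) and color_close(p[1], q[0], tol)))


def group_into_zones(two_color_blocks, tol=30):
    """Group adjacent two-color blocks with similar color pairs.

    `two_color_blocks` maps (row, col) -> (color1, color2).
    Returns a list of sets of (row, col) positions, one set per zone.
    """
    unvisited = set(two_color_blocks)
    zones = []
    while unvisited:
        start = unvisited.pop()
        zone = {start}
        queue = deque([start])
        while queue:
            r, c = queue.popleft()
            for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if nb in unvisited and similar_pair(
                        two_color_blocks[(r, c)], two_color_blocks[nb], tol):
                    unvisited.remove(nb)
                    zone.add(nb)
                    queue.append(nb)
        zones.append(zone)
    return zones
```

Each resulting zone shares roughly one foreground and one background color, so it can be converted to black and white as a unit.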
15. Apparatus to recognize text in a multicolor image, the multicolor image comprising a plurality of pixels, each pixel having an associated color, the apparatus comprising:
- storage medium to store the image; and
- a processor operatively coupled to the storage medium and configured to:
  - separate the image into multiple blocks;
  - analyze color distributions of each of the blocks;
  - identify blocks having two main colors;
  - group two-color blocks having similar colors into two-color zones; and
  - identify text in the two-color zones;
  - map the pixels of each block to a three-dimensional color space;
  - define, for each two-color block, a cylinder that encloses the pixels, the cylinder having a height and a radius; and
  - classify a block as a text block if the ratio of the radius to the height is less than a predefined value,
wherein the text identifying step is performed in the text blocks.
Dependent claims: 16, 17
18. A computer-implemented method recognizing text in a multicolor image stored in a computer, the multicolor image comprising a plurality of pixels, each pixel having an associated color, the method comprising:
- separating the image into multiple blocks;
- analyzing color distributions of each of the blocks;
- identifying blocks having two main colors;
- grouping two-color blocks having similar colors into two-color zones;
- identifying text in the two-color zones;
- representing each block as a vector in a three-dimensional color space, the vector originating at a point in a first group of pixels corresponding to a first color and terminating at a point in a second group of pixels corresponding to a second color;
- identifying clusters of vectors that point generally in the same directions; and
- marking blocks corresponding to clusters that contain more than a predefined number of vectors as text blocks.
Dependent claims: 19, 20
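The vector-clustering test of claim 18 can be sketched as follows. The claim does not name a clustering method, so this uses a simple greedy scheme as a stand-in: each block's vector is compared by cosine similarity with the first vector of every existing cluster. The function names, the `min_cos` threshold, and the `min_vectors` default (the claim's "predefined number") are hypothetical, and the caller is assumed to order each block's two colors consistently.

```python
import math


def unit(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def mark_text_blocks(block_colors, min_cos=0.95, min_vectors=3):
    """Return the ids of blocks whose color vectors fall in a large cluster.

    `block_colors` maps block_id -> (first_color, second_color).
    A block's vector runs from its first color to its second color.
    """
    clusters = []  # each entry: (seed_direction, [block_ids])
    for block_id, (first, second) in block_colors.items():
        if first == second:
            continue  # no direction can be defined for a one-color block
        direction = unit(tuple(s - f for f, s in zip(first, second)))
        for seed, members in clusters:
            # Vectors "point generally in the same direction" when the
            # cosine of the angle between them is close to 1.
            if sum(a * b for a, b in zip(seed, direction)) >= min_cos:
                members.append(block_id)
                break
        else:
            clusters.append((direction, [block_id]))

    text_blocks = set()
    for _, members in clusters:
        if len(members) > min_vectors:  # "more than a predefined number"
            text_blocks.update(members)
    return text_blocks
```

The intuition is that many blocks on a page of text share the same ink-to-paper color transition, so their vectors cluster, whereas blocks from photographs point in scattered directions.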
21. A computer program residing on a computer-readable medium for recognizing text in a multicolor image, the multicolor image comprising a plurality of pixels, each pixel having an associated color, the computer program comprising instructions for causing the computer to:
- separate the image into multiple blocks;
- analyze color distributions of each of the blocks;
- identify blocks having two main colors;
- group two-color blocks having similar colors into two-color zones;
- identify text in the two-color zones;
- represent each block as a vector in a three-dimensional color space, the vector originating at a point in a first group of pixels corresponding to a first color and terminating at a point in a second group of pixels corresponding to a second color;
- identify clusters of vectors that point generally in the same directions; and
- mark blocks corresponding to clusters that contain more than a predefined number of vectors as text blocks.
Dependent claims: 22, 23
24. Apparatus to recognize text in a multicolor image, the multicolor image comprising a plurality of pixels, each pixel having an associated color, the apparatus comprising:
- storage medium to store the image; and
- a processor operatively coupled to the storage medium and configured to:
  - separate the image into multiple blocks;
  - analyze color distributions of each of the blocks;
  - identify blocks having two main colors;
  - group two-color blocks having similar colors into two-color zones;
  - identify text in the two-color zones;
  - represent each block as a vector in a three-dimensional color space, the vector originating at a point in a first group of pixels corresponding to a first color and terminating at a point in a second group of pixels corresponding to a second color;
  - identify clusters of vectors that point generally in the same directions; and
  - mark blocks corresponding to clusters that contain more than a predefined number of vectors as text blocks.
Specification