Intelligent detection of text on a page
Abstract
A technique for segmenting an image into text areas and non-text areas in which an image is stored with the following information per pixel: gray scale intensity (4 bits) and an indication of whether the pixel is neutral or color (1 bit). The image, e.g. a scanned RGB image, is converted to 0-15 levels of intensity and has a neutral/color indication bit assigned to each pixel. The technique proceeds in three phases as follows: Tile the image by square blocks, e.g. 6×6 or 8×8 for 600 dpi images, and store information about each block in a buffer; sweep the buffer left to right three tile rows at a time and make a preliminary decision for every tile-block in the middle row; examine the decision made in the previous step in a context block, e.g. a 3×3 block, and make revisions if necessary.
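The per-pixel encoding and tiling described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `encode_pixel` and `tile_image`, the channel-mean intensity formula, and the `chroma_tol` threshold are hypothetical. The source specifies only a 4-bit intensity, a 1-bit neutral/color flag, and square tiles such as 8×8.

```python
def encode_pixel(r, g, b, chroma_tol=24):
    """Map an 8-bit RGB pixel to (intensity 0-15, color bit)."""
    # Assumption: intensity is the plain channel mean, quantized to 4 bits.
    intensity = ((r + g + b) // 3) >> 4
    # Assumption: a pixel counts as "color" when its channels differ by
    # more than chroma_tol; otherwise it is "neutral" (bit = 0).
    is_color = 1 if max(r, g, b) - min(r, g, b) > chroma_tol else 0
    return intensity, is_color


def tile_image(pixels, block=8):
    """Split a 2-D list of encoded pixels into block x block tiles.

    Returns a list of tile rows; each tile is a list of `block` pixel rows.
    Partial tiles at the right and bottom edges are dropped (an assumption
    made here for simplicity).
    """
    rows, cols = len(pixels), len(pixels[0])
    tiles = []
    for top in range(0, rows - block + 1, block):
        tile_row = []
        for left in range(0, cols - block + 1, block):
            tile = [pixels[y][left:left + block] for y in range(top, top + block)]
            tile_row.append(tile)
        tiles.append(tile_row)
    return tiles


if __name__ == "__main__":
    # A 16x16 all-white page yields a 2x2 grid of 8x8 tiles.
    page = [[encode_pixel(255, 255, 255)] * 16 for _ in range(16)]
    grid = tile_image(page, block=8)
    print(len(grid), len(grid[0]), grid[0][0][0][0])
```

Storing the tiles as a list of tile rows keeps the later three-rows-at-a-time sweep a matter of simple indexing.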
21 Claims
-
1. A method for segmenting an image into text areas and non-text areas, comprising the computer aided steps of:
-
tiling said image by blocks;
wherein each block is typed as to text content, ranked for intensity, and rated as neutral or color;
storing information about each block in a buffer;
sweeping said buffer a predetermined number of tile rows at a time;
making a preliminary decision whether a portion of the image being swept is a text area or a non-text area of said image for every tile-block in a middle row of said sweep;
examining said preliminary decision in a context block;
making revisions to said preliminary determination as necessary based upon said examination step; and
outputting image information which segments said image into text areas and non-text areas.
gray scale intensity; and
an indication of whether said pixel is neutral or color.
-
-
6. The method of claim 1, wherein said image is a scanned RGB image.
-
7. The method of claim 1, further comprising the steps of:
-
converting said image to M levels of intensity; and
assigning a neutral/color indication bit to each pixel.
-
-
8. An apparatus for segmenting images into text and non-text areas, comprising:
-
a buffer for storing input images;
means for tiling said image in tile-blocks;
means for determining an intensity value for each pixel in said image relative to a threshold level;
means for classifying each tile-block as to text content type and intensity; and
means for rating each tile-block as neutral or color.
means for sweeping groups of tile-blocks.
-
-
11. The apparatus of claim 10, wherein sets of three consecutive tile-rows are swept from left to right; and
wherein each three-row-stripe is cut into segments and a decision is made whether a portion of the image being swept is a text area or a non-text area of said image for each segment.
-
12. The apparatus of claim 10, said sweeping means comprising:
-
a left pointer that marks a beginning of a segment;
a right pointer that marks said segment's end; and
means for advancing both pointers to a point just beyond a segment each time a segment is cut out;
wherein said right pointer is advanced one tile-block at a time so a decision can be made whether to cut out a segment;
wherein said right pointer is advanced if a cut-out is not supported; and
wherein a decision is made as to the type of tile-block corresponding to said segment once the segment is cut out.
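The two-pointer sweep recited in claim 12 can be illustrated with a short sketch. It is an assumption-laden example, not the claimed apparatus: the function `sweep_segments` and the caller-supplied predicate `should_cut` are hypothetical names, and forcing a cut at the last column is a simplification introduced here.

```python
def sweep_segments(n_columns, should_cut):
    """Cut one stripe of n_columns tile columns into segments.

    should_cut(left, right) is a hypothetical predicate that decides
    whether the columns left..right (inclusive) form a complete segment.
    Returns a list of (left, right) inclusive column ranges.
    """
    segments = []
    left = 0   # marks the beginning of the current segment
    right = 0  # marks the current candidate end of the segment
    while right < n_columns:
        at_end = right == n_columns - 1
        # Assumption: reaching the end of the row always forces a cut.
        if should_cut(left, right) or at_end:
            segments.append((left, right))
            # Advance both pointers to just beyond the cut segment.
            left = right = right + 1
        else:
            # Cut not supported yet: advance the right pointer by one tile-block.
            right += 1
    return segments


if __name__ == "__main__":
    # Toy rule: cut whenever a segment reaches three columns in width.
    print(sweep_segments(7, lambda l, r: r - l + 1 >= 3))
```

In practice the predicate would inspect the tile-block attributes in the three rows of the stripe; here it only looks at the segment width to keep the pointer mechanics visible.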
-
-
13. The apparatus of claim 10, wherein said means for sweeping implements a break process that sets one of four attributes: NON-TEXT, CORE, SPACE, and SIZE; and
wherein said process breaks as NON-TEXT if the number of non-text tile-blocks in a current column is three, the number of neutral tile-blocks is three, the number of non-text blocks in a current segment exceeds four, or the number of color tile-blocks exceeds three;
the process breaks as CORE if the number of tile-blocks with type BLACK-CORE exceeds three less than the segment area and the segment width exceeds the threshold T_core_limit;
the process breaks as SPACE if the number of blocks in a current column with type WHITE equals three, the segment width exceeds the threshold T_segment_width and the number of tile-blocks in a current column with a white_rows_cols indicator equals three, or the end-of-line is reached; and
the process breaks as SIZE if the segment width exceeds a threshold T_size.
-
14. The apparatus of claim 12, wherein an attribute for the type of tile-block is attached in a cut segment.
-
15. The apparatus of claim 12, further comprising:
means for making a preliminary decision as to whether tile-blocks corresponding to a segment are text or non-text.
-
16. The apparatus of claim 15, wherein if said preliminary decision is that a current tile-block is non-text but said current tile-block is surrounded by text, then said current tile-block is identified as also being text.
-
17. The apparatus of claim 15, wherein if said preliminary decision is that a current tile-block is text but at least half of the tile-blocks surrounding said current tile-block are non-text, then said current tile-block is identified as also being non-text.
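The revision rules of claims 16 and 17 can be sketched over a grid of preliminary decisions. This is one possible reading, with assumptions labeled: the name `revise_labels` is hypothetical, the context is taken to be the 3×3 neighborhood mentioned in the abstract, "surrounded by text" is read as every available neighbor being text, and tile-blocks on the border use only the neighbors that exist.

```python
def revise_labels(labels):
    """Revise preliminary text/non-text decisions using a 3x3 context.

    labels is a 2-D list of booleans (True = text). A new grid is
    returned; every decision is based on the original labels, not on
    already-revised ones (an assumption made here).
    """
    rows, cols = len(labels), len(labels[0])
    revised = [row[:] for row in labels]
    for y in range(rows):
        for x in range(cols):
            # Collect the surrounding tile-blocks, excluding the center.
            neighbors = [
                labels[j][i]
                for j in range(max(0, y - 1), min(rows, y + 2))
                for i in range(max(0, x - 1), min(cols, x + 2))
                if (j, i) != (y, x)
            ]
            if not neighbors:
                continue
            text_count = sum(neighbors)
            non_text_count = len(neighbors) - text_count
            if not labels[y][x] and text_count == len(neighbors):
                # Claim 16: non-text surrounded by text becomes text.
                revised[y][x] = True
            elif labels[y][x] and 2 * non_text_count >= len(neighbors):
                # Claim 17: text with at least half non-text neighbors
                # becomes non-text.
                revised[y][x] = False
    return revised


if __name__ == "__main__":
    hole = [[True, True, True], [True, False, True], [True, True, True]]
    print(revise_labels(hole))
```

Working from a copy keeps the result independent of scan order, which a streaming implementation over a three-row buffer might not guarantee.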
-
18. The apparatus of claim 8, wherein said means for classifying identifies a type WHITE if substantially all the pixels in a tile-block are at or near maximum intensity;
a type SMALL-TEXT if a majority of pixels contain ink and a range of lightest to darkest exceeds a predefined range;
a type BLACK-CORE if, on average, there is significant dark gray in substantially all the pixels;
a type OUTLINE if there is significant contrast and there are large relative differences in the light areas to the dark areas; and
a type NOT-TEXT if said tile-block includes pixels containing significant grayscale information.
-
19. The apparatus of claim 8, wherein said means for determining intensity attributes a one-of-four level code for intensity of each tile-block, wherein intensity is initially set to zero;
the intensity is set to 1 if as few as one pixel has an intensity greater than threshold T1 or if the total amount of ink in pixels with intensity darker than threshold T1 is greater than a predetermined amount;
the intensity is set to 2 if the average pixel intensity is in a middle range; and
the intensity is set to 3 if there is no ink in any pixel that exceeds threshold T1.
-
20. A method for segmenting an image into text and non-text areas, comprising the steps of:
-
tiling said image by blocks to classify pixel information therein;
typing each tile-block for text content;
ranking each tile-block for intensity;
rating each tile-block as neutral or color;
sweeping said image to determine the identity of text areas and non-text areas therein; and
revising said determination as necessary in accordance with a predetermined scheme.
-
Specification