Method and apparatus for character recognition
First Claim
1. A computer-implemented method of selecting blocks of pixels from pixel image data comprising, in order, the steps of:
outlining contours of connected components in the pixel data;
forming a rectangle around each connected component outlined in said outlining step;
forming a hierarchical tree based on the outlined connected components, and designating as a descendent a connected component which is within a rectangle formed around another connected component;
a first connecting step in which rectangles are selectably connected widthwisely based on size and proximity to other rectangles to form text lines;
a second connecting step in which a plurality of text lines formed in the first connecting step are selectably connected vertically based on size and proximity to other formed text lines to form text blocks, at least one formed text block having a plurality of formed text lines; and
modifying the hierarchical tree based on the first and second connecting steps.
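The first three steps of the claim (finding connected components, forming a rectangle around each, and designating as a descendent any component that lies within another component's rectangle) can be illustrated with a minimal sketch. It makes several assumptions that are not in the claim: the image is a list of rows of 0/1 values, components are found by an 8-connected flood fill rather than by the contour tracing the claim recites, and the hypothetical helpers `bounding_rects`, `contains`, and `build_tree` pick the smallest enclosing rectangle as the parent.

```python
from collections import deque


def bounding_rects(image):
    """Return one (top, left, bottom, right) rectangle per 8-connected
    component of 1-pixels, in raster-scan order of first pixel found."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    rects = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                # Grow the rectangle to cover every pixel of the component.
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            rects.append((top, left, bottom, right))
    return rects


def contains(outer, inner):
    """True if rectangle `inner` lies within a different rectangle `outer`."""
    return (outer != inner
            and outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def build_tree(rects):
    """Map each rectangle index to its parent index (None for a root).
    Assumption: the parent is the smallest rectangle that encloses it."""
    def area(r):
        return (r[2] - r[0] + 1) * (r[3] - r[1] + 1)

    parents = {}
    for i, inner in enumerate(rects):
        enclosing = [j for j, outer in enumerate(rects)
                     if j != i and contains(outer, inner)]
        parents[i] = min(enclosing, key=lambda j: area(rects[j])) if enclosing else None
    return parents


# A frame with a single dot inside it: the dot becomes a descendent of the frame.
image = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
rects = bounding_rects(image)
tree = build_tree(rects)
```

In this toy image the frame's rectangle encloses the dot's rectangle, so the dot is recorded as a child of the frame; real pages would typically produce frames, tables, or pictures as parents of the text inside them.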
Abstract
In a character recognition system or the like, method and apparatus for selecting blocks of pixels from pixel image data so as to permit identification and grouping of similarly-typed pixels, such as text-type pixels and non-text-type pixels. Pixel image data is input and, if the pixel image data is not binary image data, it is converted into binary pixel image data. Blocks of pixel image data are selected by outlining contours of connected components in the pixel image data, determining whether the outlined connected components include text units or non-text units based on the size of the outlined connected components, selectively connecting text units widthwisely to form text lines based on proximity of adjacent text units, and selectively connecting text lines vertically to form text blocks based on proximity of adjacent text lines and on the position of non-text units between text lines. A hierarchical tree is formed based on the outlined connected components.
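As a rough illustration of the pipeline the abstract describes, the sketch below binarizes a grayscale image, separates text units from non-text units by size, and connects text units widthwise into text lines by proximity. Everything specific here is an assumption made for illustration: the threshold of 128, the 40-pixel size limits, the 10-pixel gap, the `(top, left, bottom, right)` rectangle layout, and the function names `binarize`, `is_text_unit`, and `connect_into_lines` are hypothetical and are not taken from the patent.

```python
def binarize(gray, threshold=128):
    """Convert rows of 0-255 gray values to binary: 1 = dark (ink), 0 = background.
    The fixed threshold is an illustrative assumption."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]


def is_text_unit(rect, max_height=40, max_width=40):
    """Size-based classification: small rectangles are treated as text units.
    The limits are placeholder values, not figures from the patent."""
    top, left, bottom, right = rect
    return (bottom - top + 1) <= max_height and (right - left + 1) <= max_width


def connect_into_lines(text_rects, max_gap=10):
    """Greedily merge text rectangles, left to right, into text-line rectangles
    when they overlap vertically and the horizontal gap is at most max_gap."""
    lines = []
    for rect in sorted(text_rects, key=lambda r: r[1]):
        for i, line in enumerate(lines):
            overlaps_vertically = rect[0] <= line[2] and line[0] <= rect[2]
            gap = rect[1] - line[3]
            if overlaps_vertically and gap <= max_gap:
                # Extend the line's rectangle to cover the new text unit.
                lines[i] = (min(line[0], rect[0]), min(line[1], rect[1]),
                            max(line[2], rect[2]), max(line[3], rect[3]))
                break
        else:
            lines.append(rect)
    return lines


# Four character-sized rectangles and one large (non-text) rectangle.
rects = [(0, 0, 9, 7), (0, 10, 9, 17), (0, 60, 9, 67),
         (20, 0, 29, 7), (0, 100, 200, 300)]
text_units = [r for r in rects if is_text_unit(r)]
lines = connect_into_lines(text_units)
```

The two adjacent characters on the first row merge into one line rectangle, while the distant character and the character on the lower row stay separate, which mirrors the idea of connecting only units that are close enough to one another.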
67 Claims
1. A computer-implemented method of selecting blocks of pixels from pixel image data comprising, in order, the steps of:
outlining contours of connected components in the pixel data;
forming a rectangle around each connected component outlined in said outlining step;
forming a hierarchical tree based on the outlined connected components, and designating as a descendent a connected component which is within a rectangle formed around another connected component;
a first connecting step in which rectangles are selectably connected widthwisely based on size and proximity to other rectangles to form text lines;
a second connecting step in which a plurality of text lines formed in the first connecting step are selectably connected vertically based on size and proximity to other formed text lines to form text blocks, at least one formed text block having a plurality of formed text lines; and
modifying the hierarchical tree based on the first and second connecting steps.
Dependent claims: 2-23, 59, 60
24. An apparatus for selecting blocks of pixels from pixel image data, said apparatus under control of a computer program, said apparatus comprising:
outlining means for outlining contours of connected components in the pixel data;
forming means for forming a rectangle around each connected component outlined by said outlining means;
tree forming means for forming a hierarchical tree based on the outlined connected components, and for designating as a descendent a connected component which is within a rectangle formed around another connected component;
first connecting means for selectably connecting rectangles widthwisely based on size and proximity to other rectangles to form text lines;
second connecting means for vertically connecting a plurality of text lines formed by said first connecting means, based on size and proximity to other formed text lines, to form blocks, at least one formed text block having a plurality of formed text lines; and
modifying means for modifying the hierarchical tree based on the first and second connections.
Dependent claims: 25-46, 61, 62
47. A computer-implemented method of selecting blocks of pixels from pixel image data comprising, in order, the steps of:
searching for connected components in the pixel data;
classifying the connected components into text components and non-text components;
forming a hierarchical tree based on the classified connected components;
forming a rectangle around each connected component and designating as a descendent a connected component which is within a rectangle formed around another connected component;
grouping text components into horizontal text lines;
grouping the grouped horizontal text lines vertically into text blocks, at least one grouped text block having a plurality of grouped horizontal text lines; and
modifying the hierarchical tree based on the grouping steps.
Dependent claims: 48-58, 63-65
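The last two steps of this claim, grouping horizontal text lines vertically into text blocks and modifying the hierarchical tree to reflect the grouping, might look something like the following sketch. It assumes text lines are already available as `(top, left, bottom, right)` rectangles, uses a hypothetical 8-pixel maximum vertical gap, and represents the tree as nested dictionaries; none of these choices, nor the names `group_lines_into_blocks` and `blocks_to_tree`, come from the patent.

```python
def group_lines_into_blocks(lines, max_gap=8):
    """Group text-line rectangles, top to bottom, into text blocks.
    A line joins a block when it overlaps the block's last line horizontally
    and the vertical gap to that line is at most max_gap (assumed value)."""
    blocks = []
    for line in sorted(lines, key=lambda r: r[0]):
        for block in blocks:
            last = block[-1]
            overlaps_horizontally = line[1] <= last[3] and last[1] <= line[3]
            gap = line[0] - last[2]
            if overlaps_horizontally and gap <= max_gap:
                block.append(line)
                break
        else:
            blocks.append([line])
    return blocks


def blocks_to_tree(blocks):
    """Rebuild the tree so that each text block is a node whose children
    are its text lines (a simplified stand-in for 'modifying' the tree)."""
    def union(rects):
        return (min(r[0] for r in rects), min(r[1] for r in rects),
                max(r[2] for r in rects), max(r[3] for r in rects))

    return {
        "type": "root",
        "children": [
            {
                "type": "text_block",
                "rect": union(block),
                "children": [{"type": "text_line", "rect": line} for line in block],
            }
            for block in blocks
        ],
    }


# Three closely spaced lines in one column, a far-away line below them,
# and a line in a second column to the right.
lines = [(0, 0, 9, 100), (14, 0, 23, 90), (28, 0, 37, 95),
         (80, 0, 89, 100), (0, 200, 9, 300)]
blocks = group_lines_into_blocks(lines)
tree = blocks_to_tree(blocks)
```

Here the three closely spaced lines form one block with a plurality of text lines, while the distant line and the line in the other column each end up in a block of their own.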
66. Computer-executable process steps stored on a computer readable medium, said process steps for selecting blocks of pixels from pixel image data, said process steps comprising:
an outlining step to outline contours of connected components in the pixel data;
a first forming step to form a rectangle around each connected component outlined in the outlining step;
a second forming step to form a hierarchical tree based on the outlined connected components, wherein the second forming step further comprises a designating step to designate as a descendent a connected component which is within a rectangle formed around another connected component;
a first connecting step to selectably connect rectangles widthwisely based on size and proximity to other rectangles to form text lines;
a second connecting step to selectably connect vertically a plurality of text lines formed in the first connecting step based on size and proximity to other formed text lines to form text blocks, at least one formed text block having a plurality of formed text lines; and
a modifying step to modify the hierarchical tree based on the first and second connecting steps.
Dependent claims: 67
Specification