Segmentation of text, picture and lines of a document image
First Claim
1. In a character recognition system, a method for segmenting portions of a medium into text and non-text types, said method comprising the steps of:
a) compressing a bit mapped representation of said medium, said compressing said bit mapped representation of said medium including
a.i) providing said bit mapped representation of said medium to a compression means,
a.ii) compressing group of N scanlines of said bit mapped representation into a corresponding compressed scanline, and
a.iii) constructing a compressed representation of said medium from said compressed scanlines;
b) providing said compressed representation of said medium to a run length extraction and classification means, said compressed representation comprised of one or more scanlines;
c) extracting run lengths from each scanline of said compressed representation of said medium;
d) creating a run length record for each extracted run length, each run length record including a classification of the corresponding run length as short, medium or long based on its length;
e) constructing rectangles from said run length records, said rectangles representing a portion of said medium;
f) determining a skew of said rectangles;
g) correcting for skew of said rectangles;
h) classifying each of said rectangles as type image, vertical line, horizontal line or unknown; and
i) merging rectangles of type UNKNOWN into one or more text blocks.
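Steps c) and d) of the claim can be illustrated with a minimal sketch. It is not the patented implementation: the names `RunLength`, `classify_length` and `extract_run_lengths` are hypothetical, a scanline is assumed to be a sequence of 0/1 pixels with 1 meaning black, and the thresholds `SHORT_MAX` and `MEDIUM_MAX` are invented for illustration because the claim gives no values.

```python
from dataclasses import dataclass

# Hypothetical thresholds in pixels; the claim does not specify any values.
SHORT_MAX = 2
MEDIUM_MAX = 60


@dataclass
class RunLength:
    row: int    # index of the scanline the run was found on
    start: int  # column of the first black pixel
    end: int    # column of the last black pixel (inclusive)
    kind: str   # "short", "medium" or "long"


def classify_length(length: int) -> str:
    """Classify a run by its length using the assumed thresholds."""
    if length <= SHORT_MAX:
        return "short"
    if length <= MEDIUM_MAX:
        return "medium"
    return "long"


def extract_run_lengths(scanline, row):
    """Return one RunLength record per maximal run of 1-pixels."""
    records = []
    start = None
    for col, pixel in enumerate(scanline):
        if pixel and start is None:
            start = col  # a new run begins
        elif not pixel and start is not None:
            records.append(
                RunLength(row, start, col - 1, classify_length(col - start))
            )
            start = None
    if start is not None:  # run reaches the end of the scanline
        n = len(scanline)
        records.append(RunLength(row, start, n - 1, classify_length(n - start)))
    return records
```

Storing the classification in the record at extraction time, as step d) describes, means later stages can filter runs by kind without recomputing lengths.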
Abstract
In a character recognition system, a method and apparatus for segmenting a document image into areas containing text and non-text. Document segmentation in the present invention generally comprises the steps of: providing a bit-mapped representation of the document image; extracting run lengths for each scanline from the bit-mapped representation of the document image; constructing rectangles from the run lengths; initially classifying each of the rectangles as either text or non-text; correcting for the skew in the rectangles; merging associated text into one or more text blocks; and logically ordering the text blocks.
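The step of constructing rectangles from the run lengths might look roughly like the sketch below. The grouping rule is an assumption, since the abstract does not state one: a run joins a rectangle when it lies on the same or the next scanline and overlaps the rectangle horizontally. The names `Rect` and `build_rectangles` are hypothetical, runs are assumed to be `(row, start, end)` tuples, and rectangles bridged by a later run are not merged with each other.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    left: int
    top: int
    right: int
    bottom: int


def build_rectangles(runs):
    """Group (row, start, end) runs into bounding rectangles.

    Assumed rule: a run extends the first rectangle whose bottom edge is
    on the same or the previous scanline and whose horizontal extent
    overlaps the run; otherwise the run starts a new rectangle.
    """
    rects = []
    for row, start, end in sorted(runs):  # process scanlines top to bottom
        for rect in rects:
            adjacent = row - rect.bottom <= 1
            overlaps = start <= rect.right and end >= rect.left
            if adjacent and overlaps:
                rect.left = min(rect.left, start)
                rect.right = max(rect.right, end)
                rect.bottom = row
                break
        else:
            rects.append(Rect(start, row, end, row))
    return rects
```

For example, `build_rectangles([(0, 2, 5), (1, 3, 8)])` grows a single rectangle spanning columns 2 to 8 over rows 0 to 1.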
198 Citations
10 Claims
-
2. In a character recognition system, a method for segmenting portions of a medium into text and non-text types, said method comprising the steps of:
-
a) providing a bit mapped representation of said medium to a compression means;
b) compressing groups of N scanlines of said bit mapped representation into a corresponding compressed scanline by performing the steps of:
b1) examining corresponding bytes of said N scanlines and assigning a first or second logical value to a corresponding bit of a corresponding byte of a temporary compressed scanline according to the rules: assigning a first logical value if any corresponding bits of said corresponding bytes of said N scanlines has said first logical value; assigning a second logical value if none of said bits of said corresponding bits of said corresponding bytes of said N scanlines has said first logical value;
b2) assigning all bits in a corresponding byte of a compressed scanline said first logical value if any bits in a corresponding byte in said temporary compressed scanline has said first logical value; and
b3) assigning all bits in a corresponding byte of a compressed scanline said second logical value if no bits in said corresponding byte in said temporary compressed scanline has said first logical value;
c) constructing a compressed representation of said medium from said compressed scanlines;
d) providing said compressed representation of said medium to a run length extraction and classification means, said representation comprised of one or more scanlines;
e) extracting run lengths from each scanline of said compressed representation of said medium;
f) creating a run length record for each extracted run length, each run length record including a classification of the corresponding run length as short, medium or long based on its length;
g) constructing rectangles from said run length records, said rectangles representing a portion of said medium;
h) determining a skew of said rectangles;
i) correcting for skew of said rectangles;
j) classifying each of said rectangles as type image, vertical line, horizontal line or unknown; and
k) merging rectangles of type UNKNOWN into one or more text blocks.
- View Dependent Claims (3, 4, 5, 6, 7, 8)
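The compression rules in steps b1) to b3) of claim 2 can be read as a bitwise OR followed by a per-byte expansion. The sketch below follows that reading under stated assumptions: the "first logical value" is taken to be 1, scanlines are `bytes` objects of equal length, and a final group with fewer than N scanlines is compressed as-is. The function name `compress_scanlines` is hypothetical.

```python
def compress_scanlines(scanlines, n):
    """Compress each group of n scanlines into one compressed scanline."""
    compressed = []
    for i in range(0, len(scanlines), n):
        group = scanlines[i:i + n]
        width = len(group[0])
        # Step b1: a bit of the temporary scanline is 1 if the corresponding
        # bit is 1 in any scanline of the group (bitwise OR of the bytes).
        temp = bytearray(width)
        for line in group:
            for j, byte in enumerate(line):
                temp[j] |= byte
        # Steps b2 and b3: a byte with any set bit becomes all ones (0xFF),
        # a byte with no set bit becomes all zeros (0x00).
        compressed.append(bytes(0xFF if b else 0x00 for b in temp))
    return compressed
```

Under this reading, every compressed byte is uniformly set or clear, so runs in a compressed scanline always begin and end on byte boundaries.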
-
-
9. An apparatus for segmenting portions of a medium into text and non-text types, said apparatus comprising:
-
a) scanline compression means for compressing N scanlines of a bit-mapped representation of a document image into one compressed scanline;
b) run length extraction means coupled to said compression means, said run length extraction means for extracting and classifying run lengths from each compressed scanline, and further for storing each of said run lengths in a run length storage means;
c) rectangle construction means coupled to said run length storage means, said rectangle extraction means for constructing and classifying rectangles and storing in a rectangle storage means;
e) skew correction means coupled to said rectangle storage means, said skew correction means for correcting a skew angle of said rectangles;
f) rectangle classification means coupled to said rectangle storage means, said rectangle classification means for assigning a classification to each of said rectangles as type image, vertical line, horizontal line or unknown;
g) merging means coupled to said rectangle storage means, said rectangle classification means for merging rectangles of the type unknown into text blocks and storing in said rectangle storage means; and
h) coordinate resolution means coupled to said rectangle storage means, said coordinate resolution means for resolving rectangle and block information in said rectangle storage means with real coordinate addresses of said bit-mapped representation of said document image.
- View Dependent Claims (10)
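As a rough illustration of the rectangle classification means in element f), the sketch below assigns one of the four claimed types from a rectangle's width and height. The claim does not say how the classification is decided, so the size-based rules, the threshold values and the name `classify_rectangle` are all assumptions made for this example.

```python
# Hypothetical thresholds in pixels; the claim gives no criteria or values.
LINE_MAX_THICKNESS = 3
LINE_MIN_LENGTH = 50
IMAGE_MIN_SIDE = 100


def classify_rectangle(width: int, height: int) -> str:
    """Return "image", "vertical line", "horizontal line" or "unknown"."""
    if height <= LINE_MAX_THICKNESS and width >= LINE_MIN_LENGTH:
        return "horizontal line"  # long and very flat
    if width <= LINE_MAX_THICKNESS and height >= LINE_MIN_LENGTH:
        return "vertical line"    # tall and very narrow
    if width >= IMAGE_MIN_SIDE and height >= IMAGE_MIN_SIDE:
        return "image"            # large in both dimensions
    return "unknown"              # left for the merging means to group as text
```

Rectangles that match none of the non-text rules stay "unknown", which matches the claim's element g), where rectangles of that type are merged into text blocks.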
-
Specification