Method and apparatus for separating document image object types
7 Assignments
0 Petitions
Abstract
A method and apparatus are provided for segmenting a binary document image so as to assign each image object to one of three types: CHARACTER-type objects, STROKE-type objects, and LARGE-BITMAP-type objects. The method makes use of a contour tracing technique, statistical analysis of contour features, a thinning technique, and image morphology.
78 Citations
22 Claims
1. A method for separating types of objects present in an image, the method comprising steps of:
a) inputting the image having objects including character type objects, stroke type objects and blob type objects;
b) generating a first bitmap representing the image;
c) determining which of the objects of the image are of the character type by comparing predetermined decision criteria to data obtained from the first bitmap by:
i) performing boundary contour tracing on the objects represented in the first bitmap to obtain a contour of each of the objects,
ii) measuring a width and height of each contour based on a boundary box thereof,
iii) measuring a perimeter of the each contour,
iv) measuring an area of the each contour,
v) determining a ratio of the perimeter to the area for the each contour, and
vi) measuring a wiggliness of the each contour;
d) separating character type objects from the first bitmap to obtain a second bitmap, having only characters represented therein, and a third bitmap;
e) performing N−1 thinning steps on the third bitmap to obtain a fourth bitmap;
f) copying the fourth bitmap to obtain a fifth bitmap;
g) performing another thinning step on the fourth bitmap;
h) removing all non-interior pixels of the fifth bitmap to obtain a sixth bitmap;
i) performing an image morphology based dilation on the sixth bitmap to restore pixels eroded by the thinning and removing steps and obtain a seventh bitmap;
j) performing a bitwise boolean operation between the first bitmap and the seventh bitmap to obtain an eighth bitmap having only blob type objects represented therein;
k) performing a bitwise boolean operation between the fourth bitmap and the eighth bitmap to obtain a ninth bitmap; and
l) performing a tracing operation on the ninth bitmap to obtain a tenth bitmap having only stroke type objects represented therein.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
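The contour measurements recited in step c) of claim 1 (bounding-box width and height, perimeter, area, and perimeter-to-area ratio) can be illustrated with a minimal Python sketch. It assumes a contour has already been traced and is available as a closed list of (x, y) vertices; the function name `contour_features`, the Euclidean perimeter, and the shoelace area are illustrative choices, not details taken from the patent.

```python
import math

def contour_features(contour):
    """Measure a closed contour given as a list of (x, y) vertices.

    Hypothetical helper: the claims name the measurements but not how
    they are computed, so the formulas below are assumptions.
    """
    xs = [x for x, _ in contour]
    ys = [y for _, y in contour]
    width = max(xs) - min(xs)      # bounding-box width
    height = max(ys) - min(ys)     # bounding-box height

    perimeter = 0.0
    twice_area = 0.0
    n = len(contour)
    for i in range(n):
        x0, y0 = contour[i]
        x1, y1 = contour[(i + 1) % n]              # wrap around to close the contour
        perimeter += math.hypot(x1 - x0, y1 - y0)  # Euclidean edge length
        twice_area += x0 * y1 - x1 * y0            # shoelace formula term

    area = abs(twice_area) / 2.0
    ratio = perimeter / area if area else math.inf
    return {"width": width, "height": height,
            "perimeter": perimeter, "area": area, "ratio": ratio}

# A 4-by-2 rectangle: perimeter 12, area 8, ratio 1.5.
features = contour_features([(0, 0), (4, 0), (4, 2), (0, 2)])
```

Thin, elongated shapes yield a high perimeter-to-area ratio while compact shapes yield a low one, which is presumably why the ratio serves as one of the decision inputs; any thresholds applied to these values would be the "predetermined decision criteria" that the claim leaves unspecified.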
9. A method for separating types of objects present in an image, the method comprising steps of:
a) inputting the image having objects including character type objects, stroke type objects and blob type objects;
b) generating a first bitmap representing the image;
c) determining which of the objects of the image are of the character type by comparing predetermined decision criteria to data obtained from the first bitmap;
d) separating character type objects from the first bitmap to obtain a second bitmap, having only characters represented therein, and a third bitmap;
e) performing N−1 thinning steps on the third bitmap to obtain a fourth bitmap;
f) copying the fourth bitmap to obtain a fifth bitmap;
g) performing another thinning step on the fourth bitmap;
h) removing all non-interior pixels of the fifth bitmap to obtain a sixth bitmap;
i) performing an image morphology based dilation on the sixth bitmap to restore pixels eroded by the thinning and removing steps and obtain a seventh bitmap;
j) performing a bitwise boolean operation between the first bitmap and the seventh bitmap to obtain an eighth bitmap having only blob type objects represented therein;
k) performing a bitwise boolean operation between the fourth bitmap and the eighth bitmap to obtain a ninth bitmap; and
l) performing a tracing operation on the ninth bitmap to obtain a tenth bitmap having only stroke type objects represented therein.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17
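Steps h) through k) of claim 9 can be illustrated with a simplified Python sketch. It assumes a bitmap is a set of (x, y) foreground pixels, treats a pixel as "interior" when all eight neighbours are set, and skips the thinning steps entirely, so it only separates solid regions from one-pixel-wide lines. The boolean operations are taken to be AND and AND-NOT, which the claim does not state, and the stroke result is computed against the input bitmap rather than the thinned fourth bitmap. All function names are hypothetical.

```python
def neighbors8(p):
    """Return the eight neighbours of pixel p = (x, y)."""
    x, y = p
    return [(x + dx, y + dy)
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)]

def interior_pixels(bitmap):
    """Keep only pixels whose eight neighbours are all set (step h)."""
    return {p for p in bitmap if all(q in bitmap for q in neighbors8(p))}

def dilate(bitmap, steps=1):
    """Grow the bitmap by `steps` rounds of 8-connected dilation (step i)."""
    out = set(bitmap)
    for _ in range(steps):
        out |= {q for p in out for q in neighbors8(p)}
    return out

def split_blobs_and_strokes(first, restore_steps=1):
    """Return (blobs, strokes) for a bitmap with characters already removed."""
    sixth = interior_pixels(first)            # blob cores only
    seventh = dilate(sixth, restore_steps)    # restore the eroded rim
    blobs = first & seventh                   # assumed AND (step j)
    strokes = first - blobs                   # assumed AND-NOT (step k)
    return blobs, strokes

# A solid 6x6 block plus a one-pixel-wide horizontal line.
block = {(x, y) for x in range(6) for y in range(6)}
line = {(x, 10) for x in range(10)}
blobs, strokes = split_blobs_and_strokes(block | line)
```

In the claimed method the N thinning steps come first, so that strokes wider than one pixel are reduced to thin lines before the interior test, and the dilation must then restore correspondingly more pixels than the single round used here.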
i) performing boundary contour tracing on the objects represented in the first bitmap to obtain a contour of each of the objects,
ii) measuring a width and height of each contour based on a boundary box thereof,
iii) measuring a perimeter of the each contour,
iv) measuring an area of the each contour,
v) determining a ratio of the perimeter to the area for the each contour, and
vi) measuring a wiggliness of the each contour.
17. The method as set forth in claim 16 wherein the wiggliness of each contour is determined based on a sum-squared curvature of the contour.
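One plausible reading of "sum-squared curvature" in claim 17 is sketched below in Python: the curvature at each vertex of a closed polygonal contour is approximated by its signed turning angle, and the wiggliness is the sum of the squared angles. The discretisation, the absence of any length normalisation, and the function name `sum_squared_curvature` are assumptions made for illustration; the claim does not define them.

```python
import math

def sum_squared_curvature(contour):
    """Sum of squared turning angles over a closed list of (x, y) vertices."""
    n = len(contour)
    total = 0.0
    for i in range(n):
        x0, y0 = contour[i - 1]            # previous vertex (wraps at i == 0)
        x1, y1 = contour[i]                # current vertex
        x2, y2 = contour[(i + 1) % n]      # next vertex (wraps at the end)
        heading_in = math.atan2(y1 - y0, x1 - x0)
        heading_out = math.atan2(y2 - y1, x2 - x1)
        # Wrap the heading change into [-pi, pi) so left and right turns
        # of the same size contribute equally once squared.
        turn = (heading_out - heading_in + math.pi) % (2 * math.pi) - math.pi
        total += turn * turn
    return total

# A unit square has four 90-degree turns: 4 * (pi / 2) ** 2 == pi ** 2.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
wiggliness = sum_squared_curvature(square)
```

Under this definition a contour with many sharp direction changes scores higher than a smooth contour enclosing a similar region, and collinear intermediate vertices contribute nothing.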
18. A method for separating types of objects present in an image, the method comprising steps of:
a) inputting the image having objects including character type objects, stroke type objects and blob type objects;
b) generating a first bitmap representing the image;
c) determining which of the objects of the image are of the character type by comparing predetermined decision criteria to data obtained from the first bitmap;
d) separating character type objects from the first bitmap to obtain a second bitmap, having only characters represented therein, and a third bitmap; and
e) separating stroke type objects and blob type objects of the image, respectively, by selectively using techniques of thinning, dilation, and bitwise logical operations on at least one of the first and third bitmaps.
Dependent claims: 19
f) performing N−1 thinning steps on the third bitmap to obtain a fourth bitmap;
g) copying the fourth bitmap to obtain a fifth bitmap;
h) performing another thinning step on the fourth bitmap;
i) removing all non-interior pixels of the fifth bitmap to obtain a sixth bitmap;
j) performing an image morphology based dilation on the sixth bitmap to restore pixels eroded by the thinning and removing steps and obtain a seventh bitmap;
k) performing a bitwise boolean operation between the first bitmap and the seventh bitmap to obtain an eighth bitmap having only blob type objects represented therein;
l) performing a bitwise boolean operation between the fourth bitmap and the eighth bitmap to obtain a ninth bitmap; and
m) performing a tracing operation on the ninth bitmap to obtain a tenth bitmap having only stroke type objects represented therein.
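The claims do not say which thinning algorithm is used. As a stand-in, the Python sketch below applies the well-known Zhang-Suen thinning pass to a bitmap stored as a set of (x, y) foreground pixels, and repeats it a fixed number of times to mimic the "N−1 thinning steps" followed by "another thinning step". The choice of Zhang-Suen, the set representation, and the names `zhang_suen_pass` and `thin` are all assumptions.

```python
def zhang_suen_pass(bitmap):
    """One Zhang-Suen pass (two sub-iterations) over a set of (x, y) pixels."""
    pixels = set(bitmap)
    for first_subiteration in (True, False):
        to_delete = set()
        for (x, y) in pixels:
            # Neighbours clockwise from north; y grows downward.
            ring = [(x, y - 1), (x + 1, y - 1), (x + 1, y), (x + 1, y + 1),
                    (x, y + 1), (x - 1, y + 1), (x - 1, y), (x - 1, y - 1)]
            p = [1 if q in pixels else 0 for q in ring]
            n, e, s, w = p[0], p[2], p[4], p[6]
            count = sum(p)  # number of set neighbours
            # Number of 0 -> 1 transitions walking once around the ring.
            transitions = sum(1 for i in range(8)
                              if p[i] == 0 and p[(i + 1) % 8] == 1)
            if not (2 <= count <= 6 and transitions == 1):
                continue
            if first_subiteration:
                removable = n * e * s == 0 and e * s * w == 0
            else:
                removable = n * e * w == 0 and n * s * w == 0
            if removable:
                to_delete.add((x, y))
        pixels -= to_delete  # apply deletions after scanning every pixel
    return pixels

def thin(bitmap, passes):
    """Apply `passes` thinning passes in sequence."""
    result = set(bitmap)
    for _ in range(passes):
        result = zhang_suen_pass(result)
    return result

# A one-pixel-wide line is left untouched; a 3-pixel-thick bar loses pixels.
line = {(x, 0) for x in range(10)}
bar = {(x, y) for x in range(10) for y in range(3)}
thinned_line = thin(line, 1)
thinned_bar = thin(bar, 1)
```

With this stand-in, the fourth bitmap would correspond to `thin(third, N - 1)` and the extra step to one more call of `zhang_suen_pass`. Lines already one pixel wide survive unchanged while thicker regions keep shrinking, which is the property the later interior-pixel test relies on.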
20. A system for separating types of objects present in an input image having character type objects, stroke type objects and blob type objects, the system comprising:
means for generating a first bitmap representing the image;
means for determining which of the objects of the image are of the character type by comparing predetermined decision criteria to data obtained from the first bitmap;
means for separating character type objects from the first bitmap to obtain a second bitmap, having only characters represented therein, and a third bitmap; and
means for separating stroke type objects and blob type objects of the image, respectively, by selectively using techniques of thinning, dilation, and bitwise logical operations on at least one of the first and third bitmaps.
Dependent claims: 21, 22
Specification