Methods and systems for automatic detection of continuous-tone regions in document images
Abstract
Methods and systems for segmentation of digital mixed-content documents. Segmentation processes may include identification of text and background regions and identification of contone regions outside the text and background regions. Further analysis may be performed to identify additional text and background regions within the contone regions thereby identifying verified contone regions, which may then be divided into contone sub-regions.
2 Claims
1. An image segmentation method, said method comprising:
a) obtaining pixel attribute data for a mixed-content image, said pixel attribute data comprising at least one of a luminance data, a chrominance data and a hue data;
b) downsampling said pixel data;
c) filtering said pixel data to remove noise;
d) computing a local discriminating feature, selected from the group consisting of standard deviation and spread, to identify a text region in said image, wherein a region is identified as text when said feature is above a local feature threshold value;
e) analyzing a luminance histogram of said image to identify a background region in said image, wherein a region is identified as background when an initial maximum histogram bin containing the highest number of pixels exceeds a background threshold value;
f) verifying said background region analysis using region chrominance data;
g) labeling any background regions as such;
h) analyzing areas in said image outside any of said background regions and outside any of said text regions to identify contone regions;
i) verifying said contone regions using region properties, wherein said contone regions are eliminated when a contone region's area is smaller than the square of one tenth of the page width;
j) analyzing said contone regions to identify text regions present within said contone regions;
k) analyzing said contone regions to identify background regions present in said contone regions;
l) analyzing areas in said contone regions outside any of said background regions and outside any of said text regions to identify contone sub-regions;
m) repeating steps e-g until no further sub-regions are found; and
n) analyzing said contone regions and said contone sub-regions to identify pictorial contone regions and non-pictorial contone regions.
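For illustration only, steps d), e) and i) of claim 1 might be sketched in Python as follows. The claim fixes no numeric values, so the 64-bin histogram, the 30% background threshold, the text threshold passed by the caller and all function names are assumptions made for this sketch. Luminance is assumed to be 8-bit (0 to 255). Only the standard-deviation feature is shown, because the claim text does not define "spread".

```python
from statistics import pstdev


def is_text_block(block, feature_threshold):
    """Step d): flag a local block as text when its luminance standard
    deviation is above the local feature threshold (value is an assumption)."""
    return pstdev(block) > feature_threshold


def luminance_histogram(luma, num_bins=64):
    """Build a histogram of 8-bit luminance values (bin count is an assumption)."""
    hist = [0] * num_bins
    for value in luma:
        hist[value * num_bins // 256] += 1
    return hist


def background_bin(luma, num_bins=64, background_fraction=0.3):
    """Step e): return the index of the most populated bin when its pixel
    count exceeds the background threshold, otherwise None.
    The threshold is expressed here as a fraction of all pixels (assumption)."""
    hist = luminance_histogram(luma, num_bins)
    peak = max(range(num_bins), key=hist.__getitem__)
    if hist[peak] > background_fraction * len(luma):
        return peak
    return None


def is_too_small(region_area, page_width):
    """Step i): a contone region is eliminated when its area is smaller
    than the square of one tenth of the page width."""
    return region_area < (page_width / 10) ** 2


# Hypothetical data: a flat block, a high-contrast block, and a mostly white page.
flat_block = [128] * 16
edge_block = [0, 255, 0, 255]
page = [250] * 80 + [10] * 20

print(is_text_block(flat_block, 30))   # flat area: not text
print(is_text_block(edge_block, 30))   # strong local variation: text
print(background_bin(page))            # dominant bright bin is background
print(is_too_small(5000, 1000))        # 5000 < (1000 / 10) ** 2
```

With a page width of 1000 pixels, the minimum area under this reading of step i) is 100 × 100 = 10,000 pixels.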
2. An image segmentation method, said method comprising:
a) obtaining pixel attribute data for a mixed-content image;
b) identifying a text region in said image;
c) identifying a background region in said image, wherein said identifying comprises:
   i) calculating a luminance histogram for said image,
   ii) identifying a histogram bin containing a maximum number of values,
   iii) comparing said maximum number of values to a threshold value, and
   iv) classifying a pixel as background if said pixel corresponds to said histogram bin and said maximum number of values is greater than said threshold value;
d) analyzing areas in said image outside any of said background regions and outside any of said text regions to identify contone regions, wherein said analyzing comprises:
   i) calculating an area luminance histogram for said areas not classified as background or text,
   ii) determining a number of populated histogram bins whose pixel count exceeds a threshold value,
   iii) comparing said number of populated histogram bins to a bin number threshold value,
   iv) classifying said area as a contone region when said number of populated histogram bins exceeds said bin number threshold value;
e) analyzing said contone region to identify any text regions present within said contone regions;
f) analyzing said contone regions to identify any background regions present in said contone regions;
g) analyzing areas in said contone regions outside any of said background regions and outside any of said text regions to identify contone sub-regions;
h) repeating steps e-g until no further sub-regions are found;
i) removing any contone regions whose area is smaller than one tenth of the square of the page width.
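The contone test of step d) and the area filter of step i) of claim 2 could be sketched roughly as below. This is a minimal illustration under stated assumptions: the claim gives no numeric values, so the 64-bin histogram, the per-bin pixel count threshold, the bin number threshold and every function name are hypothetical. Luminance is assumed to be 8-bit (0 to 255).

```python
def populated_bin_count(luma, num_bins=64, min_pixels_per_bin=2):
    """Steps d) i-ii: build the area luminance histogram and count the
    bins whose pixel count exceeds a threshold (values are assumptions)."""
    hist = [0] * num_bins
    for value in luma:
        hist[value * num_bins // 256] += 1
    return sum(1 for count in hist if count > min_pixels_per_bin)


def is_contone(luma, bin_number_threshold=16):
    """Steps d) iii-iv: classify the area as contone when the number of
    populated bins exceeds the bin number threshold (value is an assumption)."""
    return populated_bin_count(luma) > bin_number_threshold


def keep_region(region_area, page_width):
    """Step i), read literally: remove a region whose area is smaller
    than one tenth of the square of the page width."""
    return region_area >= page_width ** 2 / 10


# Hypothetical data: a smooth gradient versus a two-level (bitonal) area.
gradient = list(range(256)) * 4
two_tone = [0] * 500 + [255] * 500

print(is_contone(gradient))        # many populated bins: contone
print(is_contone(two_tone))        # only two populated bins: not contone
print(keep_region(50_000, 1000))   # 50,000 < 1000 ** 2 / 10, so removed
```

A smooth gradient populates many histogram bins, while a bitonal area such as line art populates only two, which is the distinction the bin count is meant to capture.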
Specification