Document processing apparatus and method
Abstract
A document processing apparatus for segmenting a color document image into regions obtains a binary image by binarizing a color image, and extracts regions having different background colors from the color image to generate region information indicating the position and size of each extracted region. By making region segmentation on the basis of the binary image and region information, a region segmentation result that reflects the background colors can be obtained. In this way, region segmentation which can maintain region differences expressed by colors in a color document can be implemented.
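The first stages summarized in the abstract (binarizing the color image, then locating regions and recording each one's position and size) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the BT.601 luma weights, the use of 4-connected components, the `min_area` filter, and the function names `luminance`, `binarize`, and `component_boxes`.

```python
from collections import deque


def luminance(rgb):
    """Approximate luma of an (r, g, b) pixel using ITU-R BT.601 weights (assumed)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def binarize(image, threshold):
    """Return a 0/1 grid: 1 where the pixel is darker than the threshold."""
    return [[1 if luminance(px) < threshold else 0 for px in row] for row in image]


def component_boxes(binary, min_area):
    """Bounding boxes (x, y, width, height) of 4-connected components of 1s.

    Components with fewer than min_area pixels are ignored, so that only
    large blobs (candidate background regions) are reported.
    """
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if not binary[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            min_x = max_x = x0
            min_y = max_y = y0
            area = 0
            while queue:
                x, y = queue.popleft()
                area += 1
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if 0 <= nx < w and 0 <= ny < h and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if area >= min_area:
                boxes.append((min_x, min_y, max_x - min_x + 1, max_y - min_y + 1))
    return boxes
```

In this sketch, calling `component_boxes(binarize(image, t), min_area)` once per threshold `t` yields one list of candidate regions per binary image. How the candidates from different thresholds are reconciled is left open here.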
41 Citations
7 Claims
1. A document processing apparatus comprising:
determining means for determining a plurality of threshold values on the basis of a luminance distribution of a color image;
binarization means for obtaining a plurality of binary image data by binarizing the color image based on each of the plurality of threshold values determined by said determining means;
extracting means for extracting region information based on components included in the plurality of binary image data, the region information indicating a position and a size of regions having different background colors;
generation means for generating second binary image data based on the plurality of binary image data and the region information by executing binarization processes in each of the regions associated with the region information extracted by said extracting means, wherein said generation means comprises means for determining each binarization threshold value for each partial region of the color image corresponding to each of the regions associated with the region information extracted by said extracting means; and
wherein the second binary image data is generated by binarizing each of the partial regions using the set determined threshold value; and
segmentation processing means for performing region segmentation based on the second binary image data generated by said generation means and the region information extracted by said extracting means.
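The generation means of claim 1 determines a separate binarization threshold for each extracted region and binarizes each partial region with it. A minimal sketch of that idea follows, working on a grid of luminance values. The threshold rule (midpoint between the darkest and brightest value in the region), the fallback `global_threshold` outside the regions, and the names `region_threshold` and `second_binary` are all assumptions for illustration; the claim does not specify how each threshold is chosen.

```python
def region_threshold(gray, region):
    """Hypothetical rule: midpoint between the darkest and brightest value in the region."""
    x, y, w, h = region
    values = [gray[j][i] for j in range(y, y + h) for i in range(x, x + w)]
    return (min(values) + max(values)) / 2


def second_binary(gray, regions, global_threshold):
    """Binarize the page globally, then re-binarize each region with its own threshold.

    gray    -- grid (list of rows) of luminance values
    regions -- list of (x, y, width, height) tuples, i.e. position and size
    """
    out = [[1 if v < global_threshold else 0 for v in row] for row in gray]
    for region in regions:
        x, y, w, h = region
        t = region_threshold(gray, region)
        for j in range(y, y + h):
            for i in range(x, x + w):
                out[j][i] = 1 if gray[j][i] < t else 0
    return out
```

With a single global threshold, dark text on a mid-tone background region would merge into one solid blob. Re-thresholding inside the region separates the text from its local background, which is the effect the claim is after.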
3. A document processing apparatus comprising:
determination means for determining a plurality of threshold values based on a luminance distribution of a color image;
binarization means for obtaining a plurality of binary image data by binarizing the color image based on each of the plurality of threshold values determined by said determination means;
extraction means for extracting region information based on components included in the plurality of binary image data, the region information indicating a position and a size of regions having different background colors;
generation means for generating second binary image data based on the plurality of binary image data and the region information;
segmentation processing means for performing region segmentation based on the second binary image data generated by said generation means and the region information extracted by said extraction means;
forming means for forming a tree structure by extracting document elements from the second binary image data generated by said generation means; and
changing means for changing the tree structure by forming a partial tree structure, which has the document elements included in a region indicated by the region information to be children connected to a parent indicating that region,
wherein said segmentation processing means performs the region segmentation based on the tree structure obtained by said forming means and said changing means.
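The forming and changing means of claim 3 can be pictured as building a flat tree of document elements and then regrouping the elements that fall inside each region under a new parent node for that region. The sketch below is one possible reading under stated assumptions: elements and regions are axis-aligned boxes `(x, y, width, height)`, "included in a region" means full containment, regions do not nest, and the names `Node`, `form_tree`, and `regroup_by_regions` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str    # "page", "region", or "element"
    box: tuple   # (x, y, width, height)
    children: list = field(default_factory=list)


def contains(outer, inner):
    """True if box `inner` lies entirely inside box `outer`."""
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return ox <= ix and oy <= iy and ix + iw <= ox + ow and iy + ih <= oy + oh


def form_tree(page_box, element_boxes):
    """Forming step: a root node with every document element as a direct child."""
    root = Node("page", page_box)
    root.children = [Node("element", b) for b in element_boxes]
    return root


def regroup_by_regions(root, region_boxes):
    """Changing step: move elements inside each region under a parent node for it."""
    for rb in region_boxes:
        inside = [c for c in root.children if c.kind == "element" and contains(rb, c.box)]
        inside_ids = {id(c) for c in inside}
        region = Node("region", rb, inside)
        # Keep the remaining children in order and append the new region parent.
        root.children = [c for c in root.children if id(c) not in inside_ids] + [region]
    return root
```

For example, with one region box and three element boxes of which two lie inside the region, the root ends up with two children: the outside element and a region node holding the other two elements.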
4. A document processing method comprising:
a determination step of determining a plurality of threshold values based on a luminance distribution of a color image;
a binarization step of obtaining a plurality of binary image data by binarizing the color image based on each of the plurality of threshold values determined in said determination step;
an extraction step of extracting region information based on components included in the plurality of binary image data, the region information indicating a position and a size of regions having different background colors;
a generation step of generating second binary image data based on the plurality of binary image data and the region information, by executing binarization processes in each of the regions associated with the region information extracted in the extraction step, and wherein the generation step comprises determining a binarization threshold value for a partial region of the color image corresponding to each of the regions associated with the region information extracted in the extraction step; and
wherein the second binary image data is generated by binarization of the partial regions using the set determined threshold value; and
a segmentation processing step of performing region segmentation based on the second binary image data generated in said generation step and the region information extracted in said extraction step.
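The determination step of claim 4 derives several thresholds from the luminance distribution but does not say how. One simple possibility, shown purely as an assumed illustration, is to histogram the luminance values, treat each run of well-populated adjacent bins as one cluster (for example, one background color), and place a threshold midway between neighboring clusters. The function name `thresholds_from_histogram` and the parameters `bin_width` and `min_share` are hypothetical.

```python
def thresholds_from_histogram(values, bin_width=32, min_share=0.05):
    """Pick thresholds midway between well-populated luminance clusters.

    values    -- non-empty sequence of luminance values in the range 0..255
    bin_width -- histogram bin width (assumed parameter)
    min_share -- minimum fraction of all pixels a bin needs to count as populated
    """
    n_bins = 255 // bin_width + 1
    counts = [0] * n_bins
    for v in values:
        counts[int(v) // bin_width] += 1
    total = len(values)

    # Group runs of adjacent well-populated bins into clusters.
    clusters, current = [], []
    for b, c in enumerate(counts):
        if c / total >= min_share:
            current.append(b)
        elif current:
            clusters.append(current)
            current = []
    if current:
        clusters.append(current)

    # Cluster centre = middle of the luminance span the cluster covers.
    centres = [(run[0] * bin_width + (run[-1] + 1) * bin_width) / 2 for run in clusters]
    # One threshold between each pair of neighbouring clusters.
    return [(a + b) / 2 for a, b in zip(centres, centres[1:])]
```

A page with a light background, a mid-tone panel, and dark text would give three clusters and therefore two thresholds, each producing its own binary image in the binarization step.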
6. A document processing method comprising:
a determination step of determining a plurality of threshold values based on a luminance distribution of a color image;
a binarization step of obtaining a plurality of binary image data by binarizing the color image based on each of the plurality of threshold values determined in said determination step;
an extraction step of extracting region information based on components included in the plurality of binary image data, the region information indicating a position and a size of regions having different background colors;
a generation step of generating a second binary image data based on the plurality of binary image data and the region information;
a segmentation processing step of performing region segmentation based on the second binary image data generated in said generation step and the region information extracted in said extraction step;
a forming step of forming a tree structure by extracting document elements from the second binary image data generated in the generation step; and
a changing step of changing the tree structure by forming a partial tree structure, which has the document elements included in a region indicated by the region information to be children connected to a parent indicating that region,
wherein the segmentation processing step includes performing the region segmentation based on the tree structure obtained in the forming step and the changing step.
7. A computer readable medium storing a control program for making a computer execute a document processing method, said document method comprising:
a determination step of determining a plurality of threshold values based on a luminance distribution of a color image;
a binarization step of obtaining a plurality of binary image data by binarizing the color image based on each of the plurality of threshold values determined in said determination step;
an extraction step of extracting region information based on components included in the plurality of binary image data, the region information indicating a position and a size of regions having different background colors;
a generation step of generating second binary image data based on the plurality of binary image data and the region information, by executing binarization processes in each of the regions associated with the region information extracted in the extraction step, and wherein the generation step comprises determining a binarization threshold value for a partial region of the color image corresponding to each of the regions associated with the region information extracted in the extraction step, and wherein the second binary image data is generated by binarization of the partial regions using the determined threshold value; and
a segmentation processing step of performing region segmentation based on the second binary image data generated in said generation step and the region information extracted in said extraction step.
Specification