Apparatus and method for image compression
Abstract
A system for compressing the digitized images of documents so that they take up less memory space. The system is comprised of discrimination logic which scans the uncompressed image vertically in a swath through the middle of the raster lines and collects data regarding the raster lines which have a number of black pixels which exceed a user defined threshold. Interrupts are generated to cause a microprocessor to collect the data and write a vertical profile of the document. The microprocessor then causes the discrimination logic to scan the suspected text lines horizontally and collect data regarding the columns which have a number of black pixels which exceed a user defined threshold. Interrupts are generated to cause the microprocessor to collect the data and write a horizontal document profile. After the profiling stage, the microprocessor examines the profiles and determines the zones wherein the blank spaces are integer multiples of the smallest blank spaces and labels these zones as text. The microprocessor then identifies these text zones to a compression engine which deletes every other pixel on each line and deletes every other line in text zones.
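The final step of the abstract, deleting every other pixel on each line and every other line within a text zone, can be illustrated in software. The following is only a minimal sketch, not the hardware compression engine the abstract describes. The function name `decimate_text_zone`, the representation of a zone as a list of rows of 0/1 pixels, and the choice to keep the even-indexed lines and pixels are all assumptions made for illustration.

```python
def decimate_text_zone(zone):
    """Halve a text zone in both axes.

    `zone` is a list of scan lines, each a list of pixels (1 = black, 0 = white).
    Keeps every other line and, within each kept line, every other pixel.
    Which of the two alternating lines/pixels survives is an assumption here.
    """
    return [row[::2] for row in zone[::2]]


# Hypothetical 4x4 text zone used only to show the effect.
zone = [
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
]
half = decimate_text_zone(zone)
print(half)  # [[1, 1], [1, 0]]
```

Under these assumptions a text zone shrinks to roughly one quarter of its original pixel count, while zones labelled as graphics would be left untouched.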
52 Citations
3 Claims
1. An apparatus for compressing a bit mapped image stored at a first resolution to a compressed image which has zones stored at said first resolution and zones stored at a second resolution comprising:
- means for profiling digital data representing said bit mapped image to determine where probable zones of text are located by determining the smallest space of a selected color and examining other spaces of the same color to determine whether the sizes of these other spaces is an integer multiple of said smallest space of said selected color and for determining where probable zones of graphics are located by examining patterns of blank spaces in the image; and
- means coupled to said means for profiling for compressing said zones of text at said second resolution and for converting said compressed zones to text to compressed data and for storing said compressed data along with said bit mapped data representing said zones of graphics as said compressed image.
2. A method of compressing an image of a document into less storage space comprising the steps of:
- mapping the document image into zones of text and zones of graphics by examining the regularity of the sizes of blank spaces in two axes of the image to determine how many blank spaces along each axis of profile have sizes which are integer multiples of the size of the smallest blank space in the corresponding axis of profile and for labelling as zones of text those areas of the image where the number of blank spaces which have sizes which are integer multiples of the smallest blank space on the corresponding axis of profile exceeds a predetermined threshold;
- compressing the zones of text by eliminating predetermined data therefrom while not eliminating data from zones of graphics.
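The regularity test recited in claim 2 can be sketched for a single axis of profile as follows. This is an illustrative sketch under stated assumptions, not the claimed method itself: the function names `count_regular_spaces` and `looks_like_text` are hypothetical, blank-space sizes are assumed to be positive integers (for example, pixel counts), and exact divisibility is used with no tolerance for scanning noise.

```python
def count_regular_spaces(space_sizes):
    """Count blank spaces whose size is an integer multiple of the smallest one.

    The smallest space counts as a multiple of itself (multiple of 1).
    """
    if not space_sizes:
        return 0
    smallest = min(space_sizes)
    if smallest <= 0:
        raise ValueError("blank-space sizes must be positive")
    return sum(1 for size in space_sizes if size % smallest == 0)


def looks_like_text(space_sizes, threshold):
    """Label an area as probable text when the count of regular spaces exceeds the threshold."""
    return count_regular_spaces(space_sizes) > threshold


# Hypothetical blank-space sizes along one axis of profile.
regular = [4, 8, 4, 12, 4]   # every size is a multiple of 4
irregular = [3, 7, 10, 5]    # only the smallest (3) qualifies

print(count_regular_spaces(regular), looks_like_text(regular, threshold=3))      # 5 True
print(count_regular_spaces(irregular), looks_like_text(irregular, threshold=3))  # 1 False
```

Since the claim examines two axes, a caller would presumably run this check once per axis and combine the results. How the two results are combined is not specified in the claim and is left open here.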
3. A method for compressing the pixel data of a raster scanned image of a document into less storage space comprising the steps of:
- reading a plurality of pixels from each scan line along a path orthogonal to the long axis of text lines and comparing the number of black pixels in the pixels read from each scan line to a first user defined threshold;
- repeating the above step for each scan line in the image;
- recording the location of each color change between a scan line where the number of black pixels exceeds said first threshold and a scan line where the number of black pixels is less than said first user defined threshold;
- forming a first profile database consisting of information from which the location and size of each probable text line, as defined by the first scan line which has a number of black pixels which exceeds said first user defined threshold followed by the first scan line with a number of black pixels which is less than said user defined threshold, and from which the location and size of each blank space between probable text lines in said path can be derived where a blank space is defined as one or more contiguous scan lines where the number of black pixels is less than the user defined threshold;
- reading a plurality of pixels which define a column of pixels in one of said probable text lines defined in the step next above and comparing the number of black pixels to a second user defined threshold;
- repeating the step next above for each column in each text line identified in said first profile database;
- recording the size of each blank space in each said probable text line as defined by contiguous columns having a number of black pixels which does not exceed said second user defined threshold in a second profile database;
- determining the smallest blank space in each of said first and second profiles and comparing the number of other blank spaces in each profile which are not integer multiples of the smallest blank space in each profile to a third and fourth user defined threshold;
- mapping the image into zones of text where the third and fourth thresholds are not exceeded and zones of graphics where said third and fourth thresholds are exceeded; and
- compressing the zones which are determined by the above steps to be text by eliminating certain data therefrom but not eliminating data from zones determined to be graphics.
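The first profiling steps of claim 3 (counting black pixels in a swath of each scan line, comparing against the first threshold, and recording where probable text lines and blank spaces begin and how large they are) might be sketched as below. This is a software illustration only. The names `swath_black_counts` and `profile_runs`, the encoding of pixels as 1 for black and 0 for white, and the tuple layout of the profile are assumptions. The claim does not say how a count exactly equal to the threshold is classified; this sketch treats it as blank.

```python
def swath_black_counts(image, first_col, last_col):
    """Count black pixels (1s) in columns first_col..last_col-1 of every scan line."""
    return [sum(row[first_col:last_col]) for row in image]


def profile_runs(black_counts, threshold):
    """Group consecutive scan lines into runs of probable text and blank space.

    A scan line is 'text' when its black count exceeds the threshold and
    'blank' otherwise (equality treated as blank: an assumption).
    Returns a list of (kind, start_line, length) tuples.
    """
    runs = []
    for index, count in enumerate(black_counts):
        kind = "text" if count > threshold else "blank"
        if runs and runs[-1][0] == kind:
            _, start, length = runs[-1]
            runs[-1] = (kind, start, length + 1)  # extend the current run
        else:
            runs.append((kind, index, 1))         # a change starts a new run
    return runs


# Hypothetical 6-line image; the swath covers columns 1 through 4.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
]
counts = swath_black_counts(image, 1, 5)
profile = profile_runs(counts, threshold=1)
print(counts)   # [0, 3, 3, 0, 0, 3]
print(profile)  # [('blank', 0, 1), ('text', 1, 2), ('blank', 3, 2), ('text', 5, 1)]
```

The same run-grouping function could be reused for the second profile by feeding it per-column black counts taken within each probable text line, with the second threshold in place of the first.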
Specification