Apparatus and method for separating handwritten characters by line and word
Abstract
A computer system and a method for a mail sorting operation in which the computer system determines the location of the ZIP code within a digital image of an address block from a piece of mail. An interstroke distance is calculated for the image and the strokes of the image are thinned to enhance vertical separation between the lines of the address block. A medial axis for each line is determined and the medial axis is superimposed upon the digital image. A bleeding operation is conducted on the digital image from the medial axis at which data bits that do not connect to the medial axis are notated as punctuation and interlinear connected strokes are then divided between the two lines. The last line which is determined to be large enough to contain a ZIP code based on bounding box size is then selected. Alternate splits of words are formed and the best split is selected in which the last formed group is detected to be the ZIP code.
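The abstract's last step, forming alternate word splits on the selected line, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `(start_col, end_col)` span representation, the function names, and the rule of splitting wherever a gap exceeds a multiple of the interstroke distance. The abstract does not say how the splits are produced or how the best one is scored.

```python
def split_into_words(spans, gap_threshold):
    """Group character spans into words, starting a new word at each wide gap.

    spans: list of (start_col, end_col) tuples sorted left to right (hypothetical input).
    gap_threshold: gaps with more background columns than this start a new word.
    """
    if not spans:
        return []
    words = []
    current = [spans[0]]
    for prev, span in zip(spans, spans[1:]):
        gap = span[0] - prev[1] - 1  # background columns between the two spans
        if gap > gap_threshold:
            words.append(current)
            current = [span]
        else:
            current.append(span)
    words.append(current)
    return words


def alternate_splits(spans, interstroke, multipliers=(1.0, 2.0)):
    """Return one candidate split per multiplier (the multipliers are assumed values)."""
    return {m: split_into_words(spans, m * interstroke) for m in multipliers}
```

Under these assumptions each candidate split ends in a different rightmost group, and a later stage would decide which rightmost group looks most like a ZIP code.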
16 Claims
1. A computer system for locating a predetermined group of foreground image pixels chosen from a digital pixel image consisting of foreground image pixels and background pixels set forth in an array of columns and rows, said foreground image pixels forming characters arranged in a plurality of lines, said computer system comprising:
means for computing horizontal distances between horizontally aligned foreground image pixels separated by at least one background pixel and determining the first peak distance in a histogram of occurrences of distances, said peak distance referred to as the interstroke distance;
means for horizontally dilating and then horizontally eroding said digital pixel image to enhance vertical separation of said characters in said plurality of lines;
means for grouping the characters together into blocks based on the interstroke distance and wider distances between said characters;
means for skeletonizing said blocks into lines extending the horizontal length of each block;
means for dilating the resulting skeletonized image in a vertical direction to create box areas of uniform vertical thickness;
means for dilating said resulting box areas horizontally such that box areas overlapping in the horizontal direction are merged together to form line images;
means for determining and labeling the medial axis of each respective line image;
means for simultaneously bleeding the foreground image pixels from each medial axis to identify foreground image pixels connected to a medial axis directly or via other foreground image pixels such that two characters that are connected to two different medial axes and are connected together are divided where the bleeding from the two medial axes meet;
means for identifying a desired line of said characters and associating possible wording groups from interstroke distance; and
means for selecting said predetermined group of foreground image pixels from said possible wording groups by using interstroke distances.
Dependent Claims (2, 3, 4, 5)
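The interstroke distance recited in the first element of claim 1 can be sketched as follows. This is a minimal illustration, not the patented implementation. It assumes the image is a list of rows of 0/1 values, that a "distance" is the number of background pixels in a gap, and that the "first peak" is the smallest distance whose count is at least that of its left neighbor and strictly greater than that of its right neighbor. The function names are hypothetical.

```python
from collections import Counter


def horizontal_gaps(image):
    """Collect the length of every background run that lies between two
    foreground pixels on the same row (1 = foreground, 0 = background)."""
    gaps = []
    for row in image:
        last_fg = None  # column of the most recent foreground pixel in this row
        for col, pixel in enumerate(row):
            if pixel:
                if last_fg is not None and col - last_fg > 1:
                    gaps.append(col - last_fg - 1)
                last_fg = col
    return gaps


def interstroke_distance(image):
    """Return the first local maximum of the gap-length histogram,
    or None when the image contains no horizontal gaps."""
    hist = Counter(horizontal_gaps(image))
    if not hist:
        return None
    for d in range(1, max(hist) + 1):
        count = hist.get(d, 0)
        # Assumed peak rule: not lower than the left bin, higher than the right bin.
        if count > 0 and count >= hist.get(d - 1, 0) and count > hist.get(d + 1, 0):
            return d
    return None
```

For a row such as `[1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1]` the gaps are 2, 2 and 5, so the first peak under this rule is 2. A real histogram would likely need smoothing before peak detection, which this sketch omits.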
6. A computer system for locating a desired group of characters chosen from a digital pixel image consisting of foreground image pixels and background pixels set forth in an array of columns and rows, said foreground image pixels forming characters arranged in a plurality of lines, said computer system comprising:
means for determining the medial axis of each respective line;
means for superimposing each respective medial axis onto said digital pixel image from each medial axis to identify foreground image pixels connected to the medial axis either directly or via other foreground image pixels such that two characters that are in two different horizontal lines and are connected together are divided where the bleeding from the two corresponding medial axes meet;
line selecting means for selecting a desired line;
group selecting means for selecting said desired group of characters within said desired line.
Dependent Claims (7, 8, 9, 10, 11, 12, 13)
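The bleeding operation can be read as a flood fill started from every medial axis at once. The sketch below is one possible interpretation using a multi-source breadth-first search. It assumes each medial axis is a single horizontal row index, that connectivity is 8-neighbor, and that foreground pixels never reached by any axis are the ones the abstract describes as punctuation. The names `bleed_from_axes`, `BACKGROUND`, and `UNREACHED` are hypothetical.

```python
from collections import deque

BACKGROUND = None  # label for background pixels
UNREACHED = -1     # label for foreground pixels not connected to any axis


def bleed_from_axes(image, axis_rows):
    """Label each foreground pixel with the index of the medial axis whose
    flood front reaches it first.

    image: list of rows of 0/1 values.
    axis_rows: one row index per text line (assumed horizontal axes).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    labels = [[UNREACHED if px else BACKGROUND for px in row] for row in image]

    # Seed the queue with every foreground pixel lying on an axis.
    queue = deque()
    for axis_id, r in enumerate(axis_rows):
        for c in range(width):
            if image[r][c]:
                labels[r][c] = axis_id
                queue.append((r, c))

    # A single FIFO queue advances all fronts one step at a time, so a stroke
    # joining two lines is divided where the two fronts meet.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if 0 <= nr < height and 0 <= nc < width and labels[nr][nc] == UNREACHED:
                    labels[nr][nc] = labels[r][c]
                    queue.append((nr, nc))
    return labels
```

With this approach a descender that touches the line below is cut roughly midway between the two axes, because both fronts travel at the same speed. When the two fronts reach a pixel in the same step, this sketch assigns it to the axis listed first, which is an arbitrary tie-break.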
14. A method for locating a desired group of characters chosen from a digital pixel image consisting of foreground image pixels and background pixels set forth in an array of columns and rows, said foreground image pixels forming characters arranged in a plurality of lines, said method comprising the steps of:
computing horizontal distances between horizontally aligned foreground image pixels separated by at least one background pixel and determining the first peak distance in a histogram of occurrences of distances, said peak distance referred to as the interstroke distance;
horizontally dilating and then horizontally eroding said characters to enhance vertical separation of said characters in said plurality of lines;
determining the medial axis of each of said plurality of lines;
simultaneously bleeding foreground image pixels from each axis to identify all foreground image pixels connected to a medial axis directly or via other foreground image pixels such that two characters that are in two different lines and are connected together are divided where the bleeding from the two corresponding medial axes meet; and
selecting said desired group of characters by using said interstroke distance.
Dependent Claims (15, 16)
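Horizontal dilation followed by horizontal erosion is a morphological closing with a one-row structuring element. The sketch below shows that operation on a list-of-rows binary image. The `radius` parameter and the function names are assumptions, and the claim does not state how the element width is chosen or how it relates to the interstroke distance.

```python
def dilate_horizontal(image, radius):
    """Set a pixel if any pixel within `radius` columns on the same row is set."""
    return [
        [1 if any(row[max(0, c - radius):c + radius + 1]) else 0
         for c in range(len(row))]
        for row in image
    ]


def erode_horizontal(image, radius):
    """Keep a pixel only if every in-bounds pixel within `radius` columns is set.
    Columns outside the image are ignored, so the borders are not eaten away."""
    return [
        [1 if all(row[max(0, c - radius):c + radius + 1]) else 0
         for c in range(len(row))]
        for row in image
    ]


def close_horizontal(image, radius):
    """Dilate then erode: fills horizontal gaps of up to 2 * radius pixels
    while leaving wider gaps and the rows above and below untouched."""
    return erode_horizontal(dilate_horizontal(image, radius), radius)
```

With `radius=1`, the row `[1, 0, 0, 1, 0, 0, 0, 0, 1]` becomes `[1, 1, 1, 1, 0, 0, 0, 0, 1]`: the two-pixel gap is filled and the four-pixel gap survives. Because the element has no vertical extent, strokes are merged along a line without being joined to the lines above or below.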
Specification