Method and apparatus for isolating image data for character recognition
Abstract
A banking system in which this invention is used is disclosed. The system includes an imaging device which produces digitized image data of one side of documents, like checks, as they are moved along a document transporter. The system is designed to read, automatically, the "courtesy" or monetary amount of the documents from the digitized image data. Handwritten courtesy amounts which have touching or overlapping characters or numbers must be separated or segmented prior to being subjected to character recognition. According to this invention, the digitized data is processed to "single out" image data which ostensibly contains data for more than one character. Special window "masks" are used to examine the image data (which is binary) to search for a potential joint between the "two characters". When the potential joint is found, potential segmentation vectors at different angles are examined to determine the preferred segmentation vector. The image data for more than one character is divided along the preferred segmentation vector and the divided image data is later subjected to character recognition.
9 Claims
1. A process of segmenting data bits derived in response to scanning characters, comprising the steps of:
(a) isolating a discrete group of said data bits which may be associated with more than one character;
(b) examining said discrete group of data bits according to predetermined criteria in search for a possible joint between first and second groups of data bits within said group;
(c) selecting a segmentation vector according to second predetermined criteria to separate said first and second groups of data bits; and
(d) separating said first and second groups of data bits at said joint along a selected said segmentation vector;
said data bits consisting of binary ones and zeros and said selecting step including the step of;
(c-1) finding the length of the shortest segmentation vector consisting of binary ones which lies within a predetermined range and which also provides a path of binary zero from the outward end of said segmentation vector to a predetermined side of the associated object;
said predetermined range being given by the equation;
MP - MP/2 ≤ R ≤ (MP + MP/2);
wherein MP is equal to the midpoint of said discrete group of data bits and R is equal to said predetermined range.
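As a rough illustration of the range equation in claim 1, the sketch below lists the column positions R that satisfy MP - MP/2 ≤ R ≤ MP + MP/2. It assumes, beyond what the claim states, that MP is a column index equal to half the object's width and that integer division is acceptable. The function name `joint_search_range` and the `inclusive` flag (which switches to the strict form used in claims 4 and 8) are hypothetical.

```python
def joint_search_range(width: int, inclusive: bool = True) -> range:
    """Return the column indices R with MP - MP/2 <= R <= MP + MP/2.

    Assumption: MP is the midpoint column of an object `width` columns
    wide, computed with integer division.
    """
    mp = width // 2          # midpoint of the group of data bits
    half = mp // 2           # MP/2, rounded down
    lo, hi = mp - half, mp + half
    if not inclusive:        # strict inequalities: drop both end columns
        lo, hi = lo + 1, hi - 1
    return range(lo, hi + 1)
```

For an object 40 columns wide this gives columns 10 through 30, so under these assumptions the search for a joint is confined to the middle half of the object.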
4. A process of facilitating character recognition associated with image data derived in response to scanning characters, said image data being in the form of a plurality of records comprised of binary ones and zeroes corresponding to at least portions of images of said characters, said process comprising the steps of:
(a) grouping said records into objects, with each said object ostensibly corresponding to at least one character;
(b) isolating those of said objects which ostensibly contain at least two characters;
(c) examining a said object from step b according to predetermined criteria in search for a possible joint between first and second groups of image data within said object;
(d) selecting a segmentation vector according to second predetermined criteria to separate said first and second groups of image data at said joint;
(e) separating said first and second groups of image data at said joint along a segmentation vector selected from step d to form discrete objects; and
(f) forwarding said objects from step a and from step e to character recognition circuitry to effect character recognition;
said examining step including the step of;
(c-1) searching for said possible joint in a predetermined range which is determined from the physical size of the image data associated with said object;
said predetermined range being given by the equation;
MP - MP/2 < R < (MP + MP/2);
wherein MP is equal to the midpoint of said discrete group of data bits and R is equal to said predetermined range;
said searching step including the step of;
(c-2) using masks to search for particular patterns of binary ones and zeros which are indicative of a possible joint;
said searching step being effected along columns of said image data, with said columns of image data having the same orientation as said characters which are scanned; and
said selecting step including the step of;
(d-1) finding the length of the shortest segmentation vector consisting of binary ones which lies within said predetermined range.
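The mask search of step (c-2) can be pictured with a minimal sketch such as the one below. The claim does not give the mask patterns, so `V_MASK` is a hypothetical example (a background bit flanked by ink, with ink directly below it), and `mask_matches` and `find_possible_joints` are illustrative names only. The image is assumed to be a list of rows of 0/1 values, and `None` in a mask means "don't care".

```python
from typing import List, Optional, Tuple

Mask = List[List[Optional[int]]]   # None marks a "don't care" position

# Hypothetical mask: a notch where two strokes meet.
V_MASK: Mask = [
    [1, 0, 1],
    [None, 1, None],
]

def mask_matches(image: List[List[int]], mask: Mask, top: int, left: int) -> bool:
    """True if every non-wildcard mask cell equals the image bit under it."""
    for dr, row in enumerate(mask):
        for dc, want in enumerate(row):
            if want is not None and image[top + dr][left + dc] != want:
                return False
    return True

def find_possible_joints(image: List[List[int]], mask: Mask,
                         col_lo: int, col_hi: int) -> List[Tuple[int, int]]:
    """Slide the mask column by column over columns col_lo..col_hi.

    Returns the (row, column) of the mask's top-left corner for each match.
    """
    rows, cols = len(image), len(image[0])
    mh, mw = len(mask), len(mask[0])
    hits = []
    for left in range(col_lo, min(col_hi, cols - mw) + 1):   # along columns
        for top in range(rows - mh + 1):
            if mask_matches(image, mask, top, left):
                hits.append((top, left))
    return hits
```

Scanning column by column mirrors the claim's requirement that the search run along columns oriented like the scanned characters; the column bounds would come from the predetermined range.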
8. A character recognition process comprising the steps of:
(a) scanning a document having characters thereon to generate binary image data corresponding to said characters;
(b) presenting said binary image data associated with a document in the form of discrete objects, with each said object being comprised of a matrix of data bits and with each said object ostensibly corresponding to at least one of said characters;
(c) isolating those of said objects which ostensibly contain at least two characters;
(d) moving an examining window relating to one of said objects from step c to search according to predetermined criteria for a possible joint between first and second sub-groups of data bits within a predetermined range of said object;
(e) separating said first and second sub-groups of data bits at a said joint when found to form discrete objects; and
(f) forwarding said objects from steps b and e to character recognition apparatus to effect character recognition;
said predetermined range being given by the equation;
MP - MP/2 < R < (MP + MP/2);
wherein MP is equal to the midpoint of said discrete group of data bits and R is equal to said predetermined range;
said separating step including the step of;
(e-1) selecting a segmentation vector according to second predetermined criteria to separate said first and second sub-groups of data bits within said object;
said data bits consisting of binary ones and zeros and said selecting step including the step of;
(e-2) finding the length of the shortest segmentation vector consisting of binary ones which lies within said predetermined range and which also provides a path of binary zeros from the outward end of said segmentation vector to a predetermined side of the associated object.
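One way to read step (e-2) is sketched below, under several assumptions not stated in the claim: candidate vectors start at the joint and run in three hypothetical directions (`DIRECTIONS`), the "predetermined side" is the bottom edge of the object, and the zero path runs straight down from the first position past the run of ones. The names `trace_vector` and `select_segmentation_vector` are illustrative.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int]

# Hypothetical candidate angles as (row step, column step):
# down-left, straight down, down-right.
DIRECTIONS = [(1, -1), (1, 0), (1, 1)]

def trace_vector(image: List[List[int]], start: Pixel, step: Pixel,
                 col_lo: int, col_hi: int) -> Optional[List[Pixel]]:
    """Follow the run of ones from `start` along `step`.

    Returns the pixels of the run, or None if the run is empty, leaves
    columns col_lo..col_hi, or is not followed by zeros down to the bottom.
    """
    rows, cols = len(image), len(image[0])
    (r, c), (dr, dc) = start, step
    pixels = []
    while 0 <= r < rows and 0 <= c < cols and image[r][c] == 1:
        if not col_lo <= c <= col_hi:
            return None                      # vector leaves the allowed range
        pixels.append((r, c))
        r, c = r + dr, c + dc
    if not pixels or not 0 <= c < cols:
        return None
    # Path of zeros from the outward end to the bottom side (assumed side).
    if any(image[rr][c] != 0 for rr in range(r, rows)):
        return None
    return pixels

def select_segmentation_vector(image: List[List[int]], joint: Pixel,
                               col_lo: int, col_hi: int) -> Optional[List[Pixel]]:
    """Pick the shortest valid candidate; ties go to the earlier direction."""
    candidates = (trace_vector(image, joint, s, col_lo, col_hi) for s in DIRECTIONS)
    valid = [v for v in candidates if v is not None]
    return min(valid, key=len) if valid else None
```

Separating the two sub-groups would then amount to clearing the bits along the returned vector, so that the remaining ones fall into two disconnected objects.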
9. An apparatus for facilitating character recognition associated with image data derived in response to scanning characters, said image data being in the form of a plurality of records comprised of binary ones and zeroes corresponding to at least portions of images of said characters, said apparatus comprising:
means for grouping said records into objects, with each said object ostensibly corresponding to at least one character;
means for isolating those of said objects which ostensibly contain at least two characters;
means for examining a said object from said isolating means according to predetermined criteria in search for a possible joint between first and second groups of data within said object;
means for selecting a segmentation vector according to second predetermined criteria to separate said first and second groups of data at said joint;
means for separating said first and second groups of data at said joint along a segmentation vector selected from said selecting means to form discrete objects; and
means for forwarding said objects from said grouping means and from said separating means to a character recognition processor to effect character recognition;
said isolating means including means for testing predetermined physical parameters of the image data associated with said objects against derived values;
said testing means including masks to search for particular patterns of binary ones and zeros which are indicative of a possible joint; and
said selecting means including means for finding the length of the shortest segmentation vector consisting of binary ones which lies within a predetermined range and which also provides a path of binary zeros from the outward end of said segmentation vector to a predetermined side of the associated object;
said predetermined range being given by the equation;
MP - MP/2 ≤ R ≤ (MP + MP/2);
wherein MP is equal to the midpoint of said discrete group of data bits and R is equal to said predetermined range.
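The claim leaves open which physical parameters the isolating means tests and how the "derived values" are obtained. Purely as an illustration, the sketch below assumes the parameter is object width and the derived value is a multiple of the median width of all objects on the document; `isolate_wide_objects` and `factor` are hypothetical names and values.

```python
from statistics import median
from typing import List

def isolate_wide_objects(widths: List[int], factor: float = 1.5) -> List[int]:
    """Return indices of objects that ostensibly hold two or more characters.

    Assumption: an object is suspect when its width exceeds `factor`
    times the median object width.
    """
    typical = median(widths)                 # derived value for this document
    return [i for i, w in enumerate(widths) if w > factor * typical]
```

With widths of 10, 12, 11, 25 and 9 columns, the median is 11 and only the 25-column object exceeds the 16.5-column threshold, so it alone would be passed on to the joint search.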
Specification