System for recognizing handwritten character strings containing overlapping and/or broken characters
First Claim
1. A method of recognizing characters in a string of handwritten text, comprising the steps of:
(a) providing an array of pixels corresponding to said string;
(b) estimating the number of characters contained in said string;
(c) partitioning said array into a plurality of groups of rows of said pixels;
(d) vertically partitioning each of said groups of rows in order to separate occupied columns within each of said groups of rows from unoccupied columns, said columns being occupied if they contain a foreground pixel and being unoccupied if they contain no foreground pixel;
(e) designating as a component a plurality of said columns having a foreground pixel if said plurality is contiguous, said component thereby forming a discrete pattern;
(f) removing non-character components from each of said groups of rows;
(g) recognizing said discrete patterns as characters;
(h) removing from said array said recognized discrete patterns having a predetermined confidence level of corresponding to one of said characters; and
(i) reestimating said number of characters contained in said string to obtain a reestimated number of characters.
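Steps (d) and (e) of the claim can be illustrated with a short sketch. It assumes one group of rows is held as a list of equal-length rows of 0/1 values, with 1 marking a foreground pixel; that encoding and the function names `occupied_columns` and `components` are illustrative assumptions, not details given in the claim. Each maximal run of contiguous occupied columns is reported as a half-open column span. A run one column wide is also returned here; since the claim speaks of a plurality of columns, a stricter reading would drop spans narrower than two columns.

```python
from typing import List, Tuple

Pixels = List[List[int]]  # rows of 0 (background) / 1 (foreground); assumed encoding


def occupied_columns(group: Pixels) -> List[bool]:
    """Step (d): a column is occupied if any row of the group has a foreground pixel in it."""
    if not group:
        return []
    width = len(group[0])
    return [any(row[x] for row in group) for x in range(width)]


def components(group: Pixels) -> List[Tuple[int, int]]:
    """Step (e): each maximal run of contiguous occupied columns forms one component.

    Returns half-open column spans (start, end).
    """
    spans: List[Tuple[int, int]] = []
    start = None
    for x, occupied in enumerate(occupied_columns(group)):
        if occupied and start is None:
            start = x                      # a new run begins
        elif not occupied and start is not None:
            spans.append((start, x))       # the run ended at the previous column
            start = None
    if start is not None:                  # run reaches the right edge of the array
        spans.append((start, len(group[0])))
    return spans


group = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 1],
]
print(components(group))  # [(1, 3), (5, 7)]
```

Two characters that touch or overlap horizontally would come out as a single span under this scheme.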
Abstract
A system (method and apparatus) for handwritten character recognition recognizes characters within strings of text despite the presence of overlapping or disjointed characters. A string is segmented into discrete characters by removing noise and punctuation, joining the disconnected components of characters and splitting probable overlapping characters. After joining and splitting (as required), character recognition is performed to recognize characters in the string. Recognized characters are removed and the string is split in order that more characters can be recognized. If unrecognized characters remain in the string, the process is repeated. The joining and splitting operations are repeated until all subsequent characters in the string have been either recognized or determined to be unrecognizable.
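The repeat-until-done control flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: candidate patterns are represented as half-open column spans, `recognize` is a hypothetical classifier returning a label and a confidence, `regroup` is a hypothetical stand-in for the joining and splitting operations, and the threshold of 0.8 and the pass limit are arbitrary. Re-estimation of the character count is not modeled.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # half-open column span of one candidate pattern (assumed representation)


def recognize_string(
    spans: List[Span],
    recognize: Callable[[Span], Tuple[str, float]],  # hypothetical classifier
    regroup: Callable[[List[Span]], List[Span]],     # hypothetical join/split step
    threshold: float = 0.8,                          # assumed confidence cutoff
    max_passes: int = 5,                             # assumed safety limit
) -> Tuple[str, List[Span]]:
    """Return the recognized text and the spans left unrecognized."""
    recognized: List[Tuple[Span, str]] = []
    pending = list(spans)
    for _ in range(max_passes):
        if not pending:
            break
        remaining = []
        for span in pending:
            label, confidence = recognize(span)
            if confidence >= threshold:
                recognized.append((span, label))  # confident: remove from the string
            else:
                remaining.append(span)
        regrouped = regroup(remaining)
        if len(remaining) == len(pending) and regrouped == remaining:
            break  # nothing recognized and nothing regrouped: treat the rest as unrecognizable
        pending = regrouped
    recognized.sort(key=lambda item: item[0][0])  # restore left-to-right order
    return "".join(label for _, label in recognized), pending


# Toy stand-ins, purely for demonstration.
KNOWN = {(0, 4): ("7", 0.95), (5, 9): ("4", 0.85), (9, 14): ("2", 0.90), (15, 19): ("3", 0.90)}


def toy_recognize(span: Span) -> Tuple[str, float]:
    return KNOWN.get(span, ("?", 0.1))


def toy_regroup(spans: List[Span]) -> List[Span]:
    out: List[Span] = []
    for start, end in spans:
        if end - start > 6:                # wide span: assume two overlapping characters
            mid = (start + end) // 2
            out += [(start, mid), (mid, end)]
        else:
            out.append((start, end))
    return out


print(recognize_string([(0, 4), (5, 14), (15, 19)], toy_recognize, toy_regroup))
# ('7423', [])
```

In the toy run, the wide middle span fails on the first pass, is split in half, and both halves are recognized on the second pass.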
29 Claims
1. A method of recognizing characters in a string of handwritten text, comprising the steps of:
(a) providing an array of pixels corresponding to said string;
(b) estimating the number of characters contained in said string;
(c) partitioning said array into a plurality of groups of rows of said pixels;
(d) vertically partitioning each of said groups of rows in order to separate occupied columns within each of said groups of rows from unoccupied columns, said columns being occupied if they contain a foreground pixel and being unoccupied if they contain no foreground pixel;
(e) designating as a component a plurality of said columns having a foreground pixel if said plurality is contiguous, said component thereby forming a discrete pattern;
(f) removing non-character components from each of said groups of rows;
(g) recognizing said discrete patterns as characters;
(h) removing from said array said recognized discrete patterns having a predetermined confidence level of corresponding to one of said characters; and
(i) reestimating said number of characters contained in said string to obtain a reestimated number of characters.
Dependent claims: 2 to 22.
23. Apparatus for recognizing characters in a string of handwritten text, comprising:
means for providing a patterned array of pixels corresponding to said string;
means for estimating the number of characters contained in said string;
means for partitioning said array into a plurality of groups of rows of said pixels;
means for vertically partitioning each of said groups of rows to separate occupied columns within each of said groups of rows from unoccupied columns, said columns being occupied if they contain a foreground pixel and being unoccupied if they do not contain a foreground pixel;
means for removing non-character components from each of said groups of rows;
means for designating as a component a plurality of said columns having a foreground pixel if said plurality is contiguous, said component thereby forming a discrete pattern;
means for recognizing said discrete patterns as characters;
means for removing from said array ones of said components having a predetermined confidence level of forming a character; and
means for reestimating said number of characters contained in said string to obtain a reestimated number of characters.
Dependent claims: 24 to 26.
27. Apparatus for recognizing characters in a string of handwritten text, comprising:
means for providing a patterned array of pixels corresponding to said string;
means for estimating the number of characters contained in said string;
means for partitioning said array into a plurality of groups of rows of said pixels;
means for vertically partitioning each of said groups of rows to separate occupied columns from unoccupied columns, said columns being occupied if they contain a foreground pixel and being unoccupied if they do not contain a foreground pixel;
means for removing non-character components from each of said groups of rows;
means for designating as a component a plurality of said columns having a foreground pixel if said plurality is contiguous, said component thereby forming a discrete pattern;
means for recognizing said discrete patterns as characters;
means for removing from said array ones of said components having a predetermined confidence level of forming a character;
means for reestimating said number of characters contained in said string to obtain a reestimated number of characters; and
means for recognizing a word comprising said recognized characters.
Dependent claims: 28 and 29.
Specification