Recognition of characters in cursive script
Abstract
A system and method for recognizing characters in cursive script are provided in which the script is scanned to detect word boundaries and words are then segmented into characters. This is accomplished by segmenting the script to form an initial portion, the segmentation being performed with reference to its position relative to a word boundary. This initial portion is then compared with a set of reference portions. Subsequent portions of the script are taken in sequence and compared with reference portions until a character is identified with an uncertainty less than a predetermined threshold value. A new initial portion is then segmented, its position chosen on the basis of the average width of the character just identified, and the comparison process is repeated to identify the next character.
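The loop described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `recognize_word` and `toy_score`, the multiplicative accumulation of evidence, and the definition of uncertainty as one minus the normalized score of the best candidate are all assumptions made for illustration. Portions are stand-in strings rather than real feature data.

```python
from typing import Callable, Dict, List, Sequence


def recognize_word(
    portions: Sequence[str],
    score: Callable[[str, str], float],
    avg_width: Dict[str, int],
    threshold: float = 0.05,
) -> List[str]:
    """Identify characters left to right from the portions of one word.

    `score(portion, char)` is a hypothetical similarity in (0, 1];
    `avg_width[char]` is the average character width, counted in portions.
    """
    chars = list(avg_width)
    result: List[str] = []
    start = 0  # index of the initial portion (0 = word boundary)
    while start < len(portions):
        # Cumulative evidence for each candidate character.
        evidence = {c: 1.0 for c in chars}
        identified = None
        for portion in portions[start:]:
            for c in chars:
                evidence[c] *= score(portion, c)
            total = sum(evidence.values())
            if total == 0:
                break
            best = max(chars, key=lambda c: evidence[c])
            uncertainty = 1.0 - evidence[best] / total
            if uncertainty < threshold:
                identified = best
                break
        if identified is None:
            break  # remaining portions could not be recognized
        result.append(identified)
        # Skip ahead by the average width of the identified character.
        start += max(1, avg_width[identified])
    return result


def toy_score(portion: str, char: str) -> float:
    """Toy comparison: a portion strongly matches the character it is named after."""
    return 0.9 if portion == char else 0.1


word = ["a", "a", "a", "b", "b"]  # 'a' spans three portions, 'b' spans two
print(recognize_word(word, toy_score, {"a": 3, "b": 2}))
```

In this toy run each character needs two portions before the uncertainty drops below the threshold, after which the loop jumps by the character's average width instead of examining every remaining portion.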
32 Citations
11 Claims
1. A method of recognizing characters in cursive script in which the script is scanned to detect word boundaries and words are then segmented into characters, characterized by the steps of:
(i) choosing and extracting (30) a word boundary from a cursive script comprised of characters;
(ii) starting at said word boundary, extracting (50) a portion of said word;
(iii) comparing (60) said extracted portion with a set of reference portions representing known characters, each of said known characters having an average width;
(iv) extracting a second portion, said second portion being successive to said first portion, and comparing of said second portion with said set of reference portions;
(v) repeating said extracting step (iv) and said comparing step (iii) with successive portions until said successive portions have been identified as one of said known characters, and
(vi) skipping a number of portions depending on said average width of said identified known character;
(vii) starting from the last skipped portion, extracting a portion of said word and then repeating the process from step (ii) for the identification of next and subsequent characters.
6. A system for recognizing characters in cursive script in which the script is scanned to detect word boundaries (40) and words are then segmented into characters, characterized by:
a sectioning means (50) for forming a series of portions representing features of cursive script at different positions in characters constituting the script; and
recognition and segmentation means (60) for comparing an initial portion, chosen with reference to a word boundary, with a known portion from a set of reference portions and to compare subsequent portions in said series similarly with known portions, until the cumulative results of the comparison identify a character with an uncertainty less than a predetermined threshold value, said character having an average width,
skipping means for skipping a number of portions determined by said average width of said identified character and determining the positions of said cursive script at which said sectioning means (50) and said recognition and segmentation means (60) shall be applied.
11. Apparatus for recognizing characters in cursive script comprising:
(a) means for image processing (40) to detect word boundaries and isolating patterns relating to words in the cursive script,
(b) means for feature extraction (50) for receiving the output of means (40) for image processing and forming a series of vectors (57) representing features of the characters at different positions in the cursive script,
(c) means for performing unsupervised learning (70) to generate a set of reference vectors representative of possible vectors relating to characters in the cursive script and storing them in a code book (90),
(d) means for supervised learning (80) to compute statistics necessary for identification of characters in the cursive script, including
(i) means for computing the a-priori conditional probability (100) of a vector relative to the nearest reference vector stored in the code book, and
(ii) means for computing a-posteriori probability (110) of characters in the cursive script,
(e) electron means for directing the output of said means for feature extraction (50) to the unsupervised learning means (70) or the supervised learning means (80),
(f) means for performing recognition/segmentation (60) to compare known characters with referenced portions of characters in the cursive script until a character has been identified as one of the known characters, each of said known characters having an average width,
(g) means for supplying input to said means for performing recognition/segmentation (60) from the code book (90); said means for computing a-posteriori probability (110) and said means for feature extraction (50),
(h) output means connected to said means for performing recognition/segmentation (60) for providing recognized text or providing unrecognized text as an input to said means for feature extraction, and
(i) skipping means for providing a number of portions to be skipped as an input to said means for feature extraction (50), said number based on input from said means for performing recognition/segmentation (110), said input based on said average width of said identified known character.
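The learning elements (c) and (d) of claim 11 can be read as a vector-quantization pipeline: an unsupervised step builds a code book of reference vectors, and a supervised step gathers the statistics needed to turn a nearest-reference-vector match into character probabilities. The Python sketch below is one possible interpretation, not the method disclosed in the specification. The use of k-means for the unsupervised step, add-one smoothing, Bayes' rule for the a-posteriori probability, and every function name are assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(code_book: Sequence[Vector], v: Vector) -> int:
    """Index of the reference vector in the code book closest to v."""
    return min(range(len(code_book)), key=lambda i: sq_dist(code_book[i], v))


def build_code_book(vectors: Sequence[Vector], k: int, iters: int = 10) -> List[Vector]:
    """Unsupervised step (assumed k-means), seeded with the first k vectors."""
    code_book = list(vectors[:k])
    for _ in range(iters):
        clusters: Dict[int, List[Vector]] = defaultdict(list)
        for v in vectors:
            clusters[nearest(code_book, v)].append(v)
        for i, members in clusters.items():
            # Move each reference vector to the mean of its cluster.
            code_book[i] = tuple(sum(col) / len(members) for col in zip(*members))
    return code_book


def train_statistics(code_book: Sequence[Vector],
                     labelled: Sequence[Tuple[Vector, str]]):
    """Supervised step: estimate P(code index | character) and P(character)."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for v, ch in labelled:
        counts[ch][nearest(code_book, v)] += 1
    total = sum(sum(c.values()) for c in counts.values())
    k = len(code_book)
    likelihood, prior = {}, {}
    for ch, c in counts.items():
        n = sum(c.values())
        prior[ch] = n / total
        # Add-one smoothing so unseen code indices keep a nonzero probability.
        likelihood[ch] = [(c[i] + 1) / (n + k) for i in range(k)]
    return likelihood, prior


def posterior(code_book, likelihood, prior, v: Vector) -> Dict[str, float]:
    """A-posteriori character probabilities for one vector, via Bayes' rule."""
    i = nearest(code_book, v)
    joint = {ch: likelihood[ch][i] * prior[ch] for ch in prior}
    z = sum(joint.values())
    return {ch: p / z for ch, p in joint.items()}


samples = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0)]
labels = ["a", "b", "a", "b"]
book = build_code_book(samples, k=2)
lik, pri = train_statistics(book, list(zip(samples, labels)))
print(posterior(book, lik, pri, (0.05, 0.1)))
```

A recognition stage along the lines of element (f) could multiply such posteriors over successive vectors until one character dominates, then hand the identified character's average width to the skipping means.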
Specification