Estimation of baseline, line spacing and character height for handwriting recognition
First Claim
1. A method of adjusting line space and baseline in a handwriting recognition system, where a baseline is the natural line upon which a user places characters which do not have descender strokes, and line space is the maximum positive distance from the baseline which can contain points pertaining to a character, said method comprising the computer implemented steps of:
sampling a serial electrical data stream representing (X,Y) coordinate pairs corresponding to a position of a stylus relative to an electronic tablet's coordinate system for each character written on said electronic tablet;
producing, in response to the sampling step, an electrical representation of each character written on the tablet;
computing from the electrical representations a minimum value of Y points (Ymin) and a maximum value of Y points (Ymax) sampled for each character written on said tablet;
computing a line space (LS) for a given character to be;
##EQU1## where;
LSold is a previously computed value of LS; and
Wold and Wcur are system constants;
computing a baseline (BL) for the given character to be;
##EQU2## where;
BLold is a previously computed value of BL;
converting the electrical representations of each character to an adjusted electrical representations reflecting the computed values of LS and BL.
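The equations referenced as EQU1 and EQU2 are not reproduced in this text. As a minimal sketch only, the following assumes they are weighted averages of the previous estimate and the current character's measurement, which is consistent with the "weighted average estimation" named in the abstract and with Wold and Wcur being system constants. The current-character terms (Ymax - Ymin for line space, Ymin for baseline), the default weights, the normalization formula, and all names are assumptions rather than the patent's text. The sketch also assumes Y increases upward and ignores descenders.

```python
from dataclasses import dataclass


@dataclass
class LineEstimate:
    baseline: float    # BL, in tablet Y units
    line_space: float  # LS, in tablet Y units


def update_estimate(prev, y_min, y_max, w_old=3.0, w_cur=1.0):
    """Blend the previous estimate with the current character's extent.

    Assumed form: a weighted average with weights w_old (Wold) and w_cur (Wcur).
    """
    total = w_old + w_cur
    line_space = (w_old * prev.line_space + w_cur * (y_max - y_min)) / total
    baseline = (w_old * prev.baseline + w_cur * y_min) / total
    return LineEstimate(baseline=baseline, line_space=line_space)


def normalize(points, est):
    """Map (x, y) samples so the baseline sits at y = 0 and the line space spans 1.

    This stands in for the 'converting' step; the exact mapping is an assumption.
    """
    return [(x, (y - est.baseline) / est.line_space) for x, y in points]
```

With a larger w_old the estimate changes slowly from character to character; with a larger w_cur it follows the most recent character more closely.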
Abstract
A line space baseline adjuster in a handwriting recognition system achieves improved recognition accuracy by normalizing the Cartesian coordinates of the writings captured by a digitizer to coincide with prototype character space. The normalization techniques include weighted average estimation, prototype extraction estimation, extreme point clustering estimation and a combination of prototype extraction estimation and extreme point clustering estimation.
42 Citations
6 Claims
3. A method of adjusting line space and baseline in a handwriting recognition system, where a baseline is the natural line upon which a user places characters which do not have descender strokes, and line space is the maximum positive distance from the baseline which can contain points pertaining to a character, said method comprising the computer implemented steps of:
storing electrical representations of a set of prototype characters, including a baseline (BLpro) value and a line space (LSpro) value for each such character;
sampling a serial electrical data stream representing (X,Y) coordinate pairs corresponding to a position of a stylus relative to an electronic tablet's coordinate system for each character written on said electronic tablet;
providing an electrical representation of a sampled character for each character written on said tablet;
providing an electrical representation of a recognized character in response to comparing the electrical representation of a given sampled character with the stored electrical representations of the prototype characters, with the electrical representation of the recognized character being chosen to be the electrical representation of a prototype character which the electrical representation of the sampled character most closely matches;
converting the electrical representation of the next sampled character such that the next sampled character's baseline and line space are adjusted by BLpro and LSpro, respectively, of the matched prototype of the last sampled character;
wherein the line space (LSsys) of a just sampled character is;
##EQU3##
and wherein the baseline of the just sampled character is;
##EQU4## where;
LSnew and BLnew are the previously computed values of LSsys and BLsys, respectively, and Wpro and Wcur are system constants.
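The prototype-based steps above can be sketched as follows. EQU3 and EQU4 are not reproduced in this text, so the blend below assumes a weighted average in which Wpro weights the matched prototype's values and Wcur weights the previously computed system values. The feature vector, the squared-distance matcher, the default weights, and the treatment of BLpro and LSpro as values already expressed in tablet coordinates are all hypothetical simplifications.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Prototype:
    label: str
    features: Sequence[float]  # hypothetical fixed-length shape descriptor
    bl_pro: float              # BLpro, assumed to be in tablet Y units
    ls_pro: float              # LSpro, assumed to be in tablet Y units


def closest_prototype(sample_features, prototypes):
    """Return the prototype with the smallest squared Euclidean distance."""
    def distance(proto):
        return sum((a - b) ** 2 for a, b in zip(sample_features, proto.features))
    return min(prototypes, key=distance)


def blend_with_prototype(bl_new, ls_new, proto, w_pro=1.0, w_cur=3.0):
    """Blend the previous system values with the matched prototype's values.

    Assumed form: weighted average with weights w_pro (Wpro) and w_cur (Wcur).
    Returns (BLsys, LSsys), to be applied to the next sampled character.
    """
    total = w_pro + w_cur
    bl_sys = (w_pro * proto.bl_pro + w_cur * bl_new) / total
    ls_sys = (w_pro * proto.ls_pro + w_cur * ls_new) / total
    return bl_sys, ls_sys
```

A real recognizer would use its own matching score in place of `closest_prototype`; only the returned prototype's baseline and line space matter for the blend.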
4. A method of adjusting line space and baseline in a handwriting recognition system, where a baseline is the natural line upon which a user places characters which do not have descender strokes, and line space is the maximum positive distance from the baseline which can contain points pertaining to a character, said method comprising the computer implemented steps of:
(a) sampling a serial electrical data stream representing (X,Y) coordinate pairs corresponding to a position of a stylus relative to an electronic tablet's coordinate system for each character written on said electronic tablet;
(b) producing in response to the sampling an electrical representation of each character written on said tablet;
(c) providing a default value of LS and BL;
(d) storing electrical representations of the Ymin and Ymax numbers of each character in a line of characters;
(e) taking a histogram of the stored representations of the Ymin and Ymax numbers for a line of characters to determine a set of four groups of numbers, with each such group being termed a cluster; with the first cluster being indicative of a descender bottom of a character, the second cluster being indicative of a baseline, the third cluster being indicative of the top of a lower case character, and the fourth cluster being indicative of the top of a tall lower case character or the top of an upper case character or number;
(f) determining the number of clusters found as a result of taking the histogram;
(g) using the default values of LS and BL as the line space and baseline, respectively, of a sampled character in the line if under two clusters are found;
(h) estimating the missing clusters if one of two and three clusters are found with the line space and baseline, respectively, of a sampled character in the line being computed as;
LS = fourth cluster - second cluster
BL = second cluster;
(i) if four clusters are found, computing line space and baseline, respectively, of a sampled character in the line as;
LS = fourth cluster - second cluster
BL = second cluster;
(j) converting, in response to step (f), the electrical representation of the sampled character to an adjusted electrical representation of the character which reflects an appropriate LS and BL from steps (g)-(i).
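Steps (d) through (i) can be illustrated with a short sketch. The claim does not say how the histogram is binned or how missing clusters are estimated, so everything beyond the LS and BL formulas is an assumption: clusters are taken as runs of adjacent non-empty bins, found clusters are assigned to the four roles by closeness to positions predicted from the defaults, and the proportions X_FRAC and DESC_FRAC are invented for illustration. Y is assumed to increase upward, and a count above four falls back to the defaults.

```python
from itertools import combinations

X_FRAC = 0.5     # assumed ratio of lower-case height to line space
DESC_FRAC = 0.3  # assumed ratio of descender depth to line space


def find_clusters(values, bin_width=4.0, min_count=2):
    """Histogram the values; return the mean of each run of adjacent bins, ascending."""
    bins = {}
    for v in values:
        bins.setdefault(int(v // bin_width), []).append(v)
    runs, run, prev = [], [], None
    for idx in sorted(bins):
        if prev is not None and idx != prev + 1:
            runs.append(run)  # a gap in the histogram closes the current cluster
            run = []
        run = run + bins[idx]
        prev = idx
    if run:
        runs.append(run)
    return [sum(r) / len(r) for r in runs if len(r) >= min_count]


def estimate_line(y_mins, y_maxs, bl_default, ls_default):
    """Return (BL, LS) for a line of characters from their Ymin and Ymax values."""
    clusters = find_clusters(list(y_mins) + list(y_maxs))
    k = len(clusters)
    if k < 2 or k > 4:
        return bl_default, ls_default
    # Expected positions of descender bottom, baseline, lower-case top, tall top.
    expected = [
        bl_default - DESC_FRAC * ls_default,
        bl_default,
        bl_default + X_FRAC * ls_default,
        bl_default + ls_default,
    ]
    # Hypothetical role assignment: the ordered choice of roles closest to expectation.
    roles = min(
        combinations(range(4), k),
        key=lambda r: sum(abs(c - expected[i]) for c, i in zip(clusters, r)),
    )
    found = dict(zip(roles, clusters))
    if 1 in found:
        bl = found[1]
    else:
        # Shift the default baseline by the mean displacement of the found clusters.
        bl = bl_default + sum(found[i] - expected[i] for i in found) / k
    if 3 in found:
        ls = found[3] - bl           # LS = fourth cluster - second cluster
    elif 2 in found:
        ls = (found[2] - bl) / X_FRAC
    else:
        ls = ls_default
    return bl, ls
```

When all four clusters are present the role assignment is trivial and the result reduces to the formulas stated in step (i).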
5. A method of adjusting line space and baseline in a handwriting recognition system, where a baseline is the natural line upon which a user places characters which do not have descender strokes, and line space is the maximum positive distance from the baseline which can contain points pertaining to a character, said method comprising the computer implemented steps of:
(a) storing electrical representations of a set of prototype characters, including a baseline (BLpro) value and a line space (LSpro) value for each such character;
(b) sampling a serial electrical data stream representing (X,Y) coordinate pairs corresponding to a position of a stylus relative to an electronic tablet's coordinate system for each character written on said electronic tablet;
(c) providing an electrical representation of a sampled character for each character written on said tablet;
(d) providing an electrical representation of a recognized character in response to comparing the electrical representation of a given sampled character with the stored electrical representations of the prototype characters, with the electrical representation of the recognized character being chosen to be the electrical representation of the prototype character which the electrical representation of the sampled character most closely matches;
(e) providing a first intermediate line space (LS1) of a just sampled character as;
##EQU5##
(f) providing a first intermediate baseline of the just sampled character as;
##EQU6## where;
LSnew and BLnew are the previously computed values of LSsys and BLsys, respectively, and Wpro and Wcur are system constants;
(g) providing a default value of LS and BL;
(h) storing electrical representations of the Ymin and Ymax numbers of each character in a line of characters;
(i) taking a histogram of the stored Ymin and Ymax numbers for a line of characters to determine a set of four groups of numbers, with each such group being termed a cluster; with the first cluster being indicative of a descender bottom of a character, the second cluster being indicative of a baseline, the third cluster being indicative of the top of a lower case character, and the fourth cluster being indicative of the top of a tall lower case character or the top of an upper case character or number;
(j) determining the number of clusters found as a result of taking the histogram;
(k) using the default values of LS and BL as a second intermediate line space LS2 and baseline BL2, respectively, of a sampled character in the line if under two clusters are found;
(l) estimating the missing clusters if one of two and three clusters are found with the second intermediate line space and baseline, respectively, of a sampled character in the line being computed as;
LS2 = fourth cluster - second cluster
BL2 = second cluster;
(m) if four clusters are found, computing the second intermediate line space and baseline, respectively, of a sampled character in the line as;
LS2 = fourth cluster - second cluster
BL2 = second cluster;
(n) computing a line space (LSsys) of a sampled character to be;
##EQU7##
(o) computing a baseline BLsys of the sampled character to be
##EQU8## where W1 and W2 are system constants;
(p) converting, in response to step (j), the electrical representation of the sampled character to an adjusted electrical representation which reflects an appropriate line space and base line from steps (k)-(o).
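Steps (n) and (o) merge the prototype-based intermediate values (LS1, BL1) with the cluster-based intermediate values (LS2, BL2). EQU7 and EQU8 are not reproduced in this text; the sketch below assumes a weighted average with W1 applied to the first pair and W2 to the second. The function name, argument layout, and default weights are hypothetical.

```python
def combine_estimates(first, second, w1=1.0, w2=1.0):
    """Merge two (BL, LS) pairs into (BLsys, LSsys).

    Assumed form: weighted average, w1 (W1) on the prototype-based pair
    and w2 (W2) on the cluster-based pair.
    """
    bl1, ls1 = first
    bl2, ls2 = second
    total = w1 + w2
    bl_sys = (w1 * bl1 + w2 * bl2) / total
    ls_sys = (w1 * ls1 + w2 * ls2) / total
    return bl_sys, ls_sys
```

Setting w1 to zero recovers the cluster-only estimate, and setting w2 to zero recovers the prototype-only estimate.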
Specification