Continuous parameter hidden Markov model approach to automatic handwriting recognition
First Claim
1. A program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform method steps for recognizing a handwritten character, said method steps comprising:
(1) receiving character signals from an input device, said character signals representing training observation sequences of sample characters;
(2) sorting said character signals according to lexemes which represent different writing styles for a given character, by mapping said character signals in lexographic space, said lexographic space containing one or more character-level feature vectors, to find high-level variations in said character signals;
(3) selecting one of said lexemes;
(4) generating sequences of feature vector signals representing feature vectors for said character signals associated with said selected lexeme by mapping in chirographic space, said chirographic space containing one or more frame-level feature vectors; and
(5) generating a Markov model signal representing a hidden Markov model for said selected lexeme, said hidden Markov model having model parameter signals and one or more states, each of said states having emission transitions and non-emission transitions, wherein said step (5) comprises the steps of:
(i) initializing said model parameter signals, comprising the steps of:
(a) setting a length for said hidden Markov model;
(b) initializing state transition probabilities of said hidden Markov model to be uniform;
(c) for each of said states, tying one or more output probability distributions for said emission transitions;
(d) for each of said states, assigning a Gaussian density distribution for each of one or more codebooks; and
(e) alternatively initializing one or more mixture coefficients to be values obtained from a statistical mixture model; and
(ii) updating said model parameter signals.
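The initialization in step (5)(i) can be illustrated with a minimal sketch. It is not the patented implementation: the arc names (`loop_emit`, `next_emit`, `next_null`), the `State` class, the `init_hmm` function, and the use of scalar features with one Gaussian per codebook estimated from pooled training values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import fmean, pvariance


@dataclass
class State:
    # Probabilities of the arcs leaving this state. The three-arc topology
    # (emitting self-loop, emitting forward arc, non-emitting forward arc)
    # is a hypothetical choice, not taken from the claim.
    transitions: dict
    # One Gaussian (mean, variance) per codebook. Every emission arc leaving
    # the state shares these, which is one way to "tie" the distributions.
    gaussians: dict
    # One mixture coefficient per codebook.
    mixture: dict


def init_hmm(length, codebook_samples, mixture_prior=None):
    """Build initial parameters for a left-to-right HMM with `length` states.

    codebook_samples: {codebook name: list of scalar training values}
    mixture_prior: optional {codebook name: weight}, for example taken from a
        previously fitted statistical mixture model; uniform when omitted.
    """
    if length < 1:
        raise ValueError("the model needs at least one state")
    arcs = ("loop_emit", "next_emit", "next_null")
    names = list(codebook_samples)
    if mixture_prior is None:
        mixture_prior = {name: 1.0 / len(names) for name in names}
    states = []
    for _ in range(length):
        states.append(State(
            # (b) uniform state transition probabilities
            transitions={arc: 1.0 / len(arcs) for arc in arcs},
            # (c), (d) one shared Gaussian per codebook for this state
            gaussians={name: (fmean(vals), pvariance(vals))
                       for name, vals in codebook_samples.items()},
            # (e) mixture coefficients, uniform or from a mixture model
            mixture=dict(mixture_prior),
        ))
    return states


hmm = init_hmm(3, {"shape": [0.0, 2.0], "position": [1.0, 1.0, 4.0]})
print(len(hmm), hmm[0].transitions)
```

Step (ii), updating the parameters, would then re-estimate these values from the training sequences; that step is not shown here.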
Abstract
A computer-based system and method for recognizing handwriting. The present invention includes a pre-processor, a front end, and a modeling component. The present invention operates as follows. First, the present invention identifies the lexemes for all characters of interest. Second, the present invention performs a training phase in order to generate a hidden Markov model for each of the lexemes. Third, the present invention performs a decoding phase to recognize handwritten text. Hidden Markov models for lexemes are produced during the training phase. The present invention performs the decoding phase as follows. The present invention receives test characters to be decoded (that is, to be recognized). The present invention generates sequences of feature vectors for the test characters by mapping in chirographic space. For each of the test characters, the present invention computes probabilities that the test character can be generated by the hidden Markov models. The present invention decodes the test character as the recognized character associated with the hidden Markov model having the greatest probability.
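The decoding phase described above (score the test character against every lexeme model, then pick the character whose model gives the greatest probability) can be sketched as follows. This is a simplified illustration under stated assumptions: features are scalars rather than chirographic-space vectors, each state has a single Gaussian, the topology is left-to-right without non-emission transitions, and the names `log_likelihood` and `decode` are hypothetical.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(frames, model):
    """Forward algorithm for a left-to-right HMM.

    model: list of (mean, var, stay_prob) tuples, one per state. A state
    repeats with probability stay_prob and otherwise advances to the next
    state. Paths start in the first state and must end in the last one.
    """
    if not frames:
        raise ValueError("need at least one frame")
    alpha = [-math.inf] * len(model)
    alpha[0] = log_gauss(frames[0], model[0][0], model[0][1])
    for x in frames[1:]:
        new_alpha = []
        for j, (mean, var, stay) in enumerate(model):
            paths = [alpha[j] + math.log(stay)]
            if j > 0:
                paths.append(alpha[j - 1] + math.log(1.0 - model[j - 1][2]))
            new_alpha.append(logsumexp(paths) + log_gauss(x, mean, var))
        alpha = new_alpha
    return alpha[-1]


def decode(frames, lexeme_models):
    """Return the character whose lexeme model best explains the frames.

    lexeme_models: {(character, lexeme_index): model}
    """
    best = max(lexeme_models,
               key=lambda key: log_likelihood(frames, lexeme_models[key]))
    return best[0]


models = {
    ("a", 0): [(0.0, 1.0, 0.5), (1.0, 1.0, 0.5)],
    ("b", 0): [(5.0, 1.0, 0.5), (6.0, 1.0, 0.5)],
}
print(decode([0.1, 0.9, 1.1], models))
```

Keying the models by (character, lexeme index) reflects the idea that one character may have several lexeme models while the decoder reports only the character.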
26 Claims
1. A program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform method steps for recognizing a handwritten character, said method steps comprising:
(1) receiving character signals from an input device, said character signals representing training observation sequences of sample characters;
(2) sorting said character signals according to lexemes which represent different writing styles for a given character, by mapping said character signals in lexographic space, said lexographic space containing one or more character-level feature vectors, to find high-level variations in said character signals;
(3) selecting one of said lexemes;
(4) generating sequences of feature vector signals representing feature vectors for said character signals associated with said selected lexeme by mapping in chirographic space, said chirographic space containing one or more frame-level feature vectors; and
(5) generating a Markov model signal representing a hidden Markov model for said selected lexeme, said hidden Markov model having model parameter signals and one or more states, each of said states having emission transitions and non-emission transitions, wherein said step (5) comprises the steps of:
(i) initializing said model parameter signals, comprising the steps of:
(a) setting a length for said hidden Markov model;
(b) initializing state transition probabilities of said hidden Markov model to be uniform;
(c) for each of said states, tying one or more output probability distributions for said emission transitions;
(d) for each of said states, assigning a Gaussian density distribution for each of one or more codebooks; and
(e) alternatively initializing one or more mixture coefficients to be values obtained from a statistical mixture model; and
(ii) updating said model parameter signals.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
17. A computer program product for recognizing handwriting, comprising:
(a) a computer usable medium having computer readable program code means embodied in said medium for causing a computer to recognize a handwritten character, said computer readable program code means comprising:
(b) computer readable program code means for causing a computer to receive character signals into a preprocessor unit from the input device representing training observation sequences of sample characters;
(c) computer readable program code means for causing a computer to sort said character signals in said preprocessor unit according to lexemes which represent different writing styles for a given character, by mapping said sample characters in lexographic space, said lexographic space containing one or more character-level feature vectors, to find high-level variations in said character signals;
(d) computer readable program code means for causing a computer to generate sequences of feature vector signals in a front end unit representing feature vectors for said character signals by mapping in chirographic space, said chirographic space containing one or more frame-level feature vectors;
(e) computer readable program code means for causing a computer to generate Markov model signals in the modeling component representing hidden Markov models for said lexemes, each of said hidden Markov models having model parameter signals and one or more states, each of said states having emission transitions and non-emission transitions, wherein said generating means comprises:
(i) computer readable program code means for causing a computer to initialize said model parameter signals in each of said hidden Markov models, said initializing means comprising:
computer readable program code means for causing a computer to set a length for said hidden Markov model;
computer readable program code means for causing a computer to initialize state transition probabilities of said hidden Markov model to be uniform;
computer readable program code means for causing a computer to tie one or more output probability distributions for said emission transitions for each of said states;
computer readable program code means for causing a computer to assign a Gaussian density distribution for each of one or more codebooks for each of said states; and
computer readable program code means for causing a computer to alternatively initialize one or more mixture coefficients to be values obtained from a statistical mixture model; and
(ii) computer readable program code means for causing a computer to update said model parameter signals in each of said hidden Markov models.
Dependent claims: 18, 19, 20, 21, 22, 23, 24, 25, 26
Specification