Character recognition system and method therefor accommodating on-line discrete and cursive handwritten
Abstract
Hidden Markov models (HMMs) are used in a system for recognizing on-line English characters. Input data from a tablet is represented as chain codes and HMMs are trained in character units for recognition. During HMM training, imaginary strokes are inserted into actual character strokes of input data and distances between adjacent points of the strokes are normalized. The input data is converted into chain codes, HMM-trained and then constructed into circular HMMs. Characters to be recognized are inserted with imaginary strokes, normalized, converted into chain codes, and then fed into the constructed circular HMMs, thereby enabling recognition.
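The chain-code conversion mentioned in the abstract can be illustrated with a minimal sketch. It assumes eight direction codes and a function named `chain_codes`; neither the number of directions nor the name comes from the source, so treat both as hypothetical.

```python
import math

def chain_codes(points, n_dirs=8):
    """Quantize the direction of each segment between adjacent points.

    n_dirs=8 is an assumption; the source does not state how many
    direction codes the system uses.
    """
    codes = []
    sector = 2 * math.pi / n_dirs
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Direction of the segment, mapped into [0, 2*pi).
        angle = math.atan2(y1 - y0, x1 - x0) % (2 * math.pi)
        # Round to the nearest sector centre; n_dirs wraps back to 0.
        codes.append(int(angle / sector + 0.5) % n_dirs)
    return codes
```

With this convention code 0 points along the positive x axis and codes increase counter-clockwise, so a unit square traced counter-clockwise yields the codes 0, 2, 4, 6.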
30 Claims
1. A method of sequentially receiving on-line input data as sequences of coordinates and recognizing a word represented by said input data in a data recognition system, said method comprising the steps of:
- training said system to recognize individual character units by entering said character units into said system, fitting said character units with first corresponding imaginary strokes, preprocessing said character units, converting said character units into corresponding sequences of first chain codes and representing said first chain codes in first hidden Markov models;
- training said system to recognize ligatures connecting said character units by entering said ligatures into said system, fitting said ligatures with second corresponding imaginary strokes, preprocessing said ligatures, converting said ligatures into corresponding sequences of second chain codes and representing said second chain codes in second hidden Markov models;
- interconnecting said first hidden Markov models with said second hidden Markov models to form a circularly connected hidden Markov model by connecting a first node of every one of said first hidden Markov models to a global initial state and a last node of every one of said first hidden Markov models to a global final state, and connecting said last node of each one of said first hidden Markov models to a first node of every one of said second hidden Markov models and a last node of each one of said second hidden Markov models to said first node of every one of said first hidden Markov models; and
- recognizing said word represented by said input data by entering said word into said system, fitting said word with third corresponding imaginary strokes, preprocessing said word, converting said word into corresponding sequences of third chain codes and passing said third chain codes through said circularly connected hidden Markov model.
Dependent claims: 2, 3, 4, 5.
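The interconnection recited in claim 1 can be pictured as a set of directed links between model endpoints. The sketch below is only an illustration: it reduces each model to its first and last node and ignores internal states and transition probabilities. The function name and node labels are hypothetical.

```python
def build_circular_topology(char_names, lig_names):
    """Return the directed links between model endpoints.

    Each model is reduced to a ("char" | "lig", name, "first" | "last")
    node pair; internal states are left out of this sketch.
    """
    start, end = "global_initial", "global_final"
    edges = set()
    for c in char_names:
        # Every character model can begin or end a word.
        edges.add((start, ("char", c, "first")))
        edges.add((("char", c, "last"), end))
        for lig in lig_names:
            # Character -> ligature -> character closes the loop.
            edges.add((("char", c, "last"), ("lig", lig, "first")))
            edges.add((("lig", lig, "last"), ("char", c, "first")))
    return edges
```

Note that ligature models connect only to character models, never directly to the global initial or final state, which matches the claim wording.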
6. A method of sequentially receiving on-line input data representative of actual strokes of characters and ligatures connecting said characters and recognizing a word represented by said characters wherein said input data is represented as sequences of coordinates, said method comprising the steps of:
- receiving and storing said input data representative of said actual strokes;
- identifying any severed portions existing between said actual strokes;
- inserting an imaginary stroke at each one of said severed portions;
- merging, as one stroke, said actual strokes and each said imaginary stroke to form a merged stroke;
- normalizing distances between adjacent coordinates in said merged stroke to generate a normalized stroke;
- converting data representative of said normalized stroke into a plurality of chain codes;
- finding an optimal path in a circularly formed hidden Markov model comprised of interconnected character and ligature models, said circularly formed hidden Markov model constructed in a training mode by connecting a first node of every one of said character models to a global initial state and a last node of every one of said character models to a global final state, and connecting said last node of each one of said character models to a first node of every one of said ligature models and a last node of each one of said ligature models to said first node of every one of said character models; and
- recognizing said word represented by said characters, including both discrete style and cursive style characters, in dependence upon said optimal path.
Dependent claims: 7, 8.
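The identify, insert and merge steps of claim 6 might look like the following sketch. It assumes that an imaginary stroke is the straight segment from the last point of one pen-down stroke to the first point of the next; the source does not specify its shape. The function name is hypothetical.

```python
def merge_with_imaginary_strokes(strokes):
    """Join pen-down strokes into one point sequence.

    strokes: list of strokes, each a list of (x, y) points.
    Returns (points, is_imaginary), where is_imaginary[i] tells whether
    the segment from points[i] to points[i + 1] bridges a pen-up gap.
    """
    points, is_imaginary = [], []
    for stroke in strokes:
        if not stroke:
            continue
        if points:
            # Severed portion: bridge it with an imaginary segment
            # (assumed straight in this sketch).
            is_imaginary.append(True)
        points.append(stroke[0])
        for p in stroke[1:]:
            is_imaginary.append(False)
            points.append(p)
    return points, is_imaginary
```

Keeping the per-segment flag lets a later stage distinguish imaginary from actual segments if it needs to, for example by giving them separate chain codes; whether the patented system does so is not stated in this section.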
9. A method of sequentially receiving on-line input data, processing said input data and recognizing a word represented by said input data in a data recognition system, said method comprising the steps of:
- receiving said input data representative of said word to be recognized as a sequence of original coordinate points, said word being formed by cursive strokes;
- selecting a unit length in dependence upon physical dimensions of said cursive strokes;
- calculating normalized coordinate points wherein adjacent ones of said normalized coordinate points are separated by a distance equal to said unit length;
- generating chain codes in dependence upon angles between said adjacent ones of said normalized coordinate points;
- inputting said chain codes into a circularly formed hidden Markov model comprised of interconnected character Markov models and ligature Markov models, said circularly formed hidden Markov model having been constructed by connecting a first node of every one of said character Markov models to a global initial state and a last node of every one of said character Markov models to a global final state and connecting said last node of each one of said character Markov models to a first node of every one of said ligature Markov models and a last node of each one of said ligature Markov models to said first node of every one of said character Markov models; and
- recognizing said word represented by said input data by finding an optimal path in said circularly formed hidden Markov model.
Dependent claims: 10, 11, 12, 13.
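The distance normalization of claim 9 can be approximated by resampling the pen path at a fixed step. In the sketch below the step is measured along the path, which equals the straight-line distance between samples only on straight runs; this is a simplification of the claim wording. How the unit length is derived from the stroke dimensions is not given here, so `unit` is simply passed in. The function name is hypothetical.

```python
import math

def resample(points, unit):
    """Resample a polyline so consecutive samples are `unit` apart along the path."""
    if unit <= 0:
        raise ValueError("unit must be positive")
    if not points:
        return []
    out = [points[0]]
    carried = 0.0  # path length travelled since the last emitted sample
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0.0:
            continue  # skip repeated points
        pos = unit - carried  # offset of the next sample inside this segment
        while pos <= seg + 1e-12:
            t = pos / seg
            out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            pos += unit
        carried = seg - (pos - unit)
    return out
```

For example, `resample([(0, 0), (3, 0)], 1.0)` places samples at x = 0, 1, 2 and 3. A caller might pick `unit` as a fraction of the writing height, but that choice is an assumption of this note rather than something the claim specifies.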
14. A data recognition method, comprising the steps of:
- determining whether one of a training process and a recognition process for recognition of data is activated, said training process being a preparatory step for data recognition;
- determining whether a character training mode or a ligature training mode is activated when said training process has been activated;
- receiving first coordinate data representative of actual strokes of cursively written characters when said character training mode has been activated and receiving second coordinate data representative of actual strokes of cursively written ligatures when said ligature training mode has been activated;
- fitting said actual strokes of said cursively written characters with first imaginary strokes during said character training mode and fitting said actual strokes of said cursively written ligatures with second imaginary strokes during said ligature training mode;
- generating first merged strokes comprised of said actual strokes of said cursively written characters and said first imaginary strokes during said character training mode, and generating second merged strokes comprised of said actual strokes of said cursively written ligatures and said second imaginary strokes during said ligature training mode;
- normalizing distances between adjacent points of said first merged strokes during said character training mode to generate first normalized strokes, and normalizing distances between adjacent points of said second merged strokes during said ligature training mode to generate second normalized strokes;
- converting directional angles of vectors connecting adjacent points of said first normalized strokes into a first plurality of chain codes during said character training mode, and converting directional angles of vectors connecting adjacent points of said second normalized strokes into a second plurality of chain codes during said ligature training mode;
- feeding said first plurality of chain codes into character hidden Markov models during said character training mode, and feeding said second plurality of chain codes into ligature hidden Markov models during said ligature training mode;
- checking whether said training process is complete and connecting said character hidden Markov models to said ligature hidden Markov models to construct a circularly formed hidden Markov model when said training process is complete;
- activating said recognition process;
- receiving, for recognition, third coordinate data representative of selected ones of said actual strokes of said cursively written characters and ligatures, said selected ones of said actual strokes of said cursively written characters and ligatures comprising a word to be recognized;
- fitting said selected ones of said actual strokes of said cursively written characters and ligatures with third imaginary strokes;
- generating a third merged stroke comprised of said selected ones of said actual strokes of said cursively written characters and ligatures and said third imaginary strokes;
- normalizing distances between adjacent points of said third merged stroke to generate a third normalized stroke;
- converting said third normalized stroke into a third plurality of chain codes;
- feeding said third plurality of chain codes into said circularly formed hidden Markov model; and
- performing word recognition on a basis of an optimal path in said circularly formed hidden Markov model.
Dependent claims: 15, 16, 17.
18. A data recognition system, comprising:
- first means for determining whether one of a training process that is a preparatory step for data recognition and a recognition process for recognizing data has been activated;
- second means for providing a character training mode and a ligature training mode during said training process;
- third means for receiving first coordinate data representative of actual strokes of cursively written characters during said character training mode and for receiving second coordinate data representative of cursively written ligatures during said ligature training mode;
- fourth means for fitting said actual strokes of said cursively written characters with first imaginary strokes during said character training mode and for fitting said actual strokes of said cursively written ligatures with second imaginary strokes during said ligature training mode;
- fifth means for generating first merged strokes comprised of said actual strokes of said cursively written characters and said first imaginary strokes during said character training mode, and generating second merged strokes comprised of said actual strokes of said cursively written ligatures and said second imaginary strokes during said ligature training mode;
- sixth means for normalizing distances between adjacent points of said first merged strokes during said character training mode to generate first normalized strokes, and normalizing distances between adjacent points of said second merged strokes during said ligature training mode to generate second normalized strokes;
- seventh means for converting directional angles of vectors connecting adjacent points of said first normalized strokes into a first plurality of chain codes during said character training mode, and for converting directional angles of vectors connecting adjacent points of said second normalized strokes into a second plurality of chain codes during said ligature training mode;
- eighth means for recording training results by feeding said first plurality of chain codes into character hidden Markov models during said character training mode, and feeding said second plurality of chain codes into ligature hidden Markov models during said ligature training mode;
- ninth means for checking whether said training process is complete, and interconnecting said character hidden Markov models to said ligature hidden Markov models to construct a circularly formed hidden Markov model when said training process is complete;
- tenth means for activating said recognition process and receiving, for recognition, data representative of selected ones of said actual strokes of said cursively written characters and ligatures when construction of said circularly formed hidden Markov model is complete, fitting said selected ones of said actual strokes of said cursively written characters and ligatures with third imaginary strokes, generating a third merged stroke comprised of said selected ones of said actual strokes of said cursively written characters and ligatures and said third imaginary strokes, normalizing distances between adjacent points of said third merged stroke to generate a third normalized stroke and converting said third normalized stroke into a third plurality of chain codes;
- eleventh means for feeding said third plurality of chain codes converted by said tenth means into said circularly formed hidden Markov model; and
- twelfth means for providing a word represented by said selected ones of said actual strokes of said cursively written characters and ligatures, said word corresponding to an optimal path found in said circularly formed hidden Markov model.
Dependent claims: 19.
20. A method for training a data recognition system to recognize words comprised of characters and ligatures connecting said characters, said method comprising the steps of:
- inputting coordinates of actual strokes representative of said characters and ligatures;
- fitting imaginary strokes between said actual strokes to generate merged strokes comprised of said actual strokes and said imaginary strokes;
- normalizing distances between adjacent points of said merged strokes to generate normalized strokes;
- converting directional angles of vectors connecting adjacent points of said normalized strokes into a plurality of chain codes;
- recording training results by incorporating said plurality of chain codes into character Markov models and ligature Markov models; and
- interconnecting said character Markov models and said ligature Markov models to generate a circularly formed hidden Markov model for enabling recognition of said words comprised of said characters and ligatures, said circularly formed hidden Markov model being generated by connecting a first node of every one of said character Markov models to a global initial state and a last node of every one of said character Markov models to a global final state and connecting said last node of each one of said character Markov models to a first node of every one of said ligature Markov models and a last node of each one of said ligature Markov models to said first node of every one of said character Markov models.
Dependent claims: 21, 22, 23.
24. A method for recognizing words in a character recognition system, comprising the steps of:
- training said system to recognize individual character units and corresponding ligatures connecting said individual character units by entering handwritten representations of said individual character units and said corresponding ligatures into said system, and constructing a circularly formed hidden Markov model comprised of character Markov models interconnected with ligature Markov models in dependence upon said handwritten representations of said individual character units and said corresponding ligatures, said circularly formed hidden Markov model being constructed by connecting a first node of every one of said character Markov models to a global initial state and a last node of every one of said character Markov models to a global final state and connecting said last node of each one of said character Markov models to a first node of every one of said ligature Markov models and a last node of each one of said ligature Markov models to said first node of every one of said character Markov models;
- inputting a sequence of coordinates of actual strokes representative of a word to be recognized, said word comprised of selected ones of said individual character units and said corresponding ligatures;
- fitting imaginary strokes between said actual strokes to generate a merged stroke comprised of said actual strokes and said imaginary strokes;
- normalizing distances between adjacent points of said merged stroke to generate a normalized stroke;
- converting directional angles of vectors connecting adjacent points of said normalized stroke into a plurality of chain codes; and
- locating an optimal path in said circularly formed hidden Markov model and outputting said word represented by said actual strokes.
Dependent claims: 25, 26, 27, 28.
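Locating an optimal path in a hidden Markov model is commonly done with the Viterbi algorithm; the claims do not name the search procedure, so its use here is an assumption. The sketch below works on a generic discrete-observation HMM in the log domain. The function name and the dictionary layout of the parameters are hypothetical.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Most likely state sequence for a discrete-observation HMM.

    log_start[s], log_trans[s][t] and log_emit[s][o] hold log-probabilities;
    missing entries are treated as impossible (-inf).
    """
    neg = -math.inf
    score = {s: log_start.get(s, neg) + log_emit[s].get(obs[0], neg)
             for s in states}
    back = []  # back[k][t] = best predecessor of state t at step k + 1
    for o in obs[1:]:
        prev, score, ptr = score, {}, {}
        for t in states:
            best = max(states, key=lambda s: prev[s] + log_trans[s].get(t, neg))
            score[t] = (prev[best] + log_trans[best].get(t, neg)
                        + log_emit[t].get(o, neg))
            ptr[t] = best
        back.append(ptr)
    last = max(states, key=lambda s: score[s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1], score[last]
```

In a toy model with one state per character, where state "a" mostly emits chain code 0 and state "b" mostly emits chain code 2, the observation sequence 0, 0, 2, 2 decodes to the path a, a, b, b. In the circular model of the claims, the sequence of character models visited along the path would spell the recognized word.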
29. A character training system for enabling recognition of a word comprised of characters and ligatures connecting said characters, said system comprising:
- means for inputting coordinates of actual strokes representative of said characters and ligatures;
- means for fitting imaginary strokes between said actual strokes to generate merged strokes comprised of said actual strokes and said imaginary strokes;
- means for normalizing distances between adjacent points of said merged strokes to generate normalized strokes;
- means for converting directional angles of vectors connecting adjacent points of said normalized strokes into a plurality of chain codes;
- means for incorporating said plurality of chain codes into character Markov models and ligature Markov models; and
- means for interconnecting said character Markov models and said ligature Markov models to form a circularly formed hidden Markov model for enabling recognition of said word, said interconnecting means connecting a first node of every one of said character Markov models to a global initial state and a last node of every one of said character Markov models to a final state, and for connecting said last node of each one of said character Markov models to a first node of every one of said ligature Markov models and a last node of each one of said ligature Markov models to said first node of every one of said character Markov models.
Dependent claims: 30.
Specification