Universal symbolic handwriting recognition system
First Claim
1. A system for the symbolic-based recognition of a handwriting input to an input surface having an output provided as a stream of coordinate data points, comprising:
a data base memory storing a compilation of text-pattern pairs, said pattern of each pair having been derived as sample features of said handwriting and including an associated sample index derived as aspects of said handwriting sample features, said text of each said pair representing at least one predetermined character glyph;
a processor, responsive to said output to extract pattern test features and a corresponding test index therefrom, responsive to access said memory stored compilation and identify a said sample index corresponding with said test index, responsive to carry out a comparison of said pattern test features with said pattern sample features associated with said identified sample index, responsive in the presence of a correspondence between said pattern test and sample features to derive output signals corresponding with said text associated with said pattern sample features;
said pattern sample and test features including index based features derived from the stroke defining sequences of said data points representing a word and comprising a first index component provided as a value corresponding with the total number of strokes comprising a word, a second index component provided as a value corresponding with the number of inflection points of a stroke, wherein such inflection points comprise a significant bend in a stroke, a top location, a bottom location and a baseline crossing point in the stroke, and a third index component provided as a value corresponding with the termination of inflection point index components for a given stroke; and
display means responsive to said derived output signals for effecting the publication of a said predetermined character glyph of said text associated with said pattern sample features.
Abstract
A universal symbolic handwriting recognition system for converting user entered time ordered stroke sequences into computer readable text is described. The system operates on two levels: (1) a word-level recognizer, which recognizes the entire group of strokes as a unit, and (2) a parser-level recognizer, which breaks the strokes into segments and recognizes groups of stroke segments within a word, thus recognizing separate characters or character sequences within a word to build a complete recognition string. In both recognition levels, the system trains on actual user samples, either on an entire word, or on a character or character sequence within a word. It does so by building a user specific sample recognition data-base file of text/pattern pairs, where the text is specified by the user in a word confirmation process and the pattern, composed of an index and a feature vector, is created from the actual user input strokes. Thus, as the user continues to use the recognition system and augments his/her user specific sample recognition data-base file, the correct recognition rate climbs approaching 100 percent in normal usage. The word-level recognizer can also be used to train on abbreviations, custom shorthands, and pictographic characters, such as the Japanese Kanji, or Chinese. An abbreviated Japanese Kanji or Chinese handwritten entry can even be trained for recognition. The text in the user specific sample data-base file is maintained in the Unicode format, and the user can specify the recognized return string format as either Unicode, ANSI, or JIS.
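The index-then-compare lookup described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class name `SampleDatabase`, the Euclidean distance measure, and the `max_distance` threshold are all assumptions. The abstract states only that each pattern consists of an index and a feature vector and is paired with user-confirmed text.

```python
from collections import defaultdict
from math import dist


class SampleDatabase:
    """Hypothetical user-specific store of text/pattern pairs, bucketed by index."""

    def __init__(self):
        # index -> list of (feature_vector, text) samples
        self._buckets = defaultdict(list)

    def train(self, index, features, text):
        """Record a confirmed sample: the pattern (index + features) and its text."""
        self._buckets[index].append((tuple(features), text))

    def recognize(self, index, features, max_distance=1.0):
        """Return the text of the closest sample sharing this index, or None.

        Euclidean distance and the threshold are assumptions for illustration.
        """
        best_text, best_d = None, max_distance
        for sample_features, text in self._buckets.get(index, []):
            if len(sample_features) != len(features):
                continue  # skip samples whose vector length does not match
            d = dist(sample_features, features)
            if d <= best_d:
                best_text, best_d = text, d
        return best_text


db = SampleDatabase()
db.train((2, 3), [0.1, 0.5], "hi")
db.train((2, 3), [0.9, 0.4], "ht")
print(db.recognize((2, 3), [0.15, 0.5]))  # the nearer stored sample is "hi"
```

Bucketing by index means only samples with a matching index are compared feature by feature, which mirrors the two-stage access recited in the claims.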
195 Citations
41 Claims
13. The method for generating and displaying text corresponding with a stream of coordinate data points generated as the output of a stylus responsive user handwriting input device, comprising the steps of:
storing a compilation of text-pattern pairs in memory, each said pair having been derived as sample features of said handwriting and including an associated sample index derived as aspects of said handwriting sample features, said text of each said pair representing at least one predetermined character glyph;
extracting pattern test features from a stream of said coordinate data points and forming a corresponding test index including the steps of: determining bounding box parameters for a word defining sequence of said coordinate data points, scanning said word defining coordinate data points and determining word crossings corresponding to each scan line therewith, determining a base line for said word defining sequence of data coordinate points, determining bounding box parameter for each stroke of said word defining sequence of said coordinate data points, determining the aspect ratio of each said stroke wherein such stroke aspect ratio generally is determined to be $(K/\pi)\,\arctan(\text{stroke-height}/\text{stroke-width})$ where K is a constant, determining the aspect ratio of said word, wherein such word aspect ratio generally is derived as $K\,(\text{height}/\text{diagonal})$ where K is a constant, height is word height and diagonal is the square root of the height squared plus the width squared, determining the relative bottom location of said word, and determining the relative height of said word;
accessing said compilation and identifying a said sample index corresponding with said test index;
accessing said pattern sample features corresponding with said identified sample index and comparing said accessed pattern sample features with said pattern test features;
accessing the said text corresponding with said accessed pattern sample features in the event said comparison derives an acceptable correspondence between said accessed pattern sample features and said pattern test features; and
displaying said accessed text.
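The two aspect-ratio formulas recited in claim 13 can be written out as a short sketch. The function names and the default `k=1.0` are assumptions; the claim says only that K is a constant and does not give its value.

```python
import math


def stroke_aspect_ratio(height, width, k=1.0):
    """K/pi * arctan(stroke-height / stroke-width); k=1.0 is an assumed value."""
    # For non-negative height and positive width, atan2(height, width) equals
    # atan(height / width); it also handles a zero-width (vertical) stroke
    # without dividing by zero.
    return k / math.pi * math.atan2(height, width)


def word_aspect_ratio(height, width, k=1.0):
    """K * height / diagonal, where diagonal = sqrt(height^2 + width^2)."""
    diagonal = math.hypot(height, width)
    # Returning 0.0 for a degenerate (zero-size) word is an assumption.
    return k * height / diagonal if diagonal else 0.0
```

Both expressions map an unbounded height-to-width ratio into a bounded range (0 to K/2 for the stroke form, 0 to K for the word form), which presumably makes them convenient as fixed-range feature values.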
16. The method for generating and displaying text corresponding with a stream of coordinate data points generated as the output of a stylus responsive user handwriting input device, comprising the steps of:
storing a compilation of text-pattern pairs in memory, each said pair having been derived as sample features of said handwriting and including an associated sample index derived as aspects of said handwriting sample features, said text of each said pair representing at least one predetermined character glyph;
extracting pattern test features from a stream of said coordinate data points and a corresponding test index, including the steps of: carrying out an extrapolation between a sequence of said coordinate data points defining a stroke and deriving a representation of a sequence of data points which are uniformly spaced along the locus thereof defining said strokes, and determining worm moments of said stroke evaluating a select, progressive sequence of a predetermined number of positions of said data points;
accessing said compilation and identifying a said sample index corresponding with said test index;
accessing said pattern sample features corresponding with said identified sample index and comparing said accessed pattern sample features with said pattern test features;
accessing the said text corresponding with said accessed pattern sample features in the event said comparison derives an acceptable correspondence between said accessed pattern sample features and said pattern test features; and
displaying said accessed text.
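The step in claim 16 of deriving data points uniformly spaced along a stroke could look something like the sketch below. It assumes linear interpolation between consecutive input points and a caller-chosen output count `n`; the claim does not specify either, and the function name is hypothetical. The worm-moment computation is not shown because the claims do not define it.

```python
import math


def resample_uniform(points, n):
    """Return n points evenly spaced by arc length along the polyline `points`.

    Assumes straight-line (linear) interpolation between input points.
    """
    if n < 2 or len(points) < 2:
        raise ValueError("need at least two input points and n >= 2")
    # Cumulative arc length at each input point.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0.0:
        return [points[0]] * n  # all input points coincide
    out = []
    seg = 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the input segment that contains the target length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = (target - cum[seg]) / span if span else 0.0
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out
```

Resampling in this way removes the dependence on pen speed: a digitizer reports points at fixed time intervals, so slow pen movement yields densely packed points and fast movement yields sparse ones.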
20. A system for the symbolic-based recognition of a handwriting input to an input surface having an output provided as a stream of coordinate data points, comprising:
a data base memory storing a compilation of text-pattern pairs, said pattern of each said pair having been derived as sample features of said handwriting and including an associated sample index derived as aspects of said handwriting sample features, said text of each said pair representing at least one predetermined character glyph;
a processor, responsive to said output to extract pattern test features and a corresponding test index therefrom, responsive to access said memory stored compilation and identify a said sample index corresponding with said test index, responsive to carry out a comparison of said pattern test features with said pattern sample features associated with said identified sample index, responsive in the presence of a correspondence between said pattern test and sample features to derive output signals corresponding with said text associated with said pattern sample features,
said pattern symbols and test features including: a first word feature provided as a value corresponding with the word bottom location relative to the word base line of a said output representing a word, said relative word bottom being determined to be $K\,((y_{\max} + \text{line-space})/(2 \cdot \text{line-space}))$ where $y_{\max}$ is the visually lowest part of such word measured from the baseline, line-space is a selected spacing between writing lines and K is a constant, a second word feature provided as a value corresponding with the relative height of a said output representing a word, and a third word feature provided as a value corresponding with the aspect ratio of a said output representing a word, said word aspect ratio generally being determined to be $K\,(\text{height}/\text{diagonal})$ where K is a constant, height is word height, and diagonal is the square root of the height squared plus the width squared; and
display means responsive to said derived output signals for effecting the publication of a said predetermined character glyph sequence of said text associated with said pattern sample features.
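As a small worked illustration of the relative word bottom formula in claim 20, consider the sketch below. The function name, the default `k=1.0`, and the sign convention for `ymax` are assumptions; the claim fixes only the form of the expression.

```python
def relative_word_bottom(ymax, line_space, k=1.0):
    """K * (ymax + line_space) / (2 * line_space); k=1.0 is an assumed value.

    ymax is the lowest part of the word measured from the baseline, in the
    same units as line_space.
    """
    if line_space <= 0:
        raise ValueError("line_space must be positive")
    return k * (ymax + line_space) / (2 * line_space)
```

With K = 1, a word whose lowest point sits exactly on the baseline (`ymax = 0`) gives 0.5, and offsets of one full line space to either side of the baseline give 0.0 and 1.0.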
27. A system for the symbolic-based recognition of a handwriting input to an input surface having an output provided as a stream of coordinate data points, comprising:
a data base memory storing a compilation of text-pattern pairs, said pattern of each said pair having been derived as sample features of said handwriting and including an associated sample index derived as aspects of said handwriting sample features, said text of each said pair representing at least one predetermined character glyph;
a processor, responsive to said output to extract pattern test features and a corresponding test index therefrom, responsive to access said memory stored compilation and identify a said sample index corresponding with said test index, responsive to carry out a comparison of said pattern test features with said pattern sample features associated with said identified sample index, responsive in the presence of a correspondence between said pattern test and sample features to derive output signals corresponding with said text associated with said pattern sample features,
wherein said pattern sample and test features include stroke based features derived from each stroke defining sequence of said data points and comprising: a first stroke feature provided as a value corresponding with the average worm moment based moment of stroke, a second stroke feature provided as a value corresponding with the aspect ratio of a stroke, said stroke aspect ratio generally being determined to be $(K/\pi)\,\arctan(\text{stroke-height}/\text{stroke-width})$ where K is a constant, and a third stroke feature provided as a value corresponding with the beginning and ending locations of a stroke relative to a dimension of a word within which such stroke is incorporated, said third stroke feature for the general case being expressed as: $x_0 = K\,(x_b - x_{\min})/(x_{\max} - x_{\min})$ and $x_1 = K\,(x_e - x_{\min})/(x_{\max} - x_{\min})$, $y_0 = K\,(y_b - y_{\min})/(y_{\max} - y_{\min})$ and $y_1 = K\,(y_e - y_{\min})/(y_{\max} - y_{\min})$, where $x_0$, $y_0$ are the stroke's respective beginning x- and y- coordinate features, $x_1$, $y_1$ are the stroke's respective ending x- and y- coordinate features, $x_b$, $y_b$ are the stroke's beginning respective x- and y- coordinate values, $x_{\min}$, $y_{\min}$ are the word's minimum respective x- and y- coordinates, $x_{\max}$, $y_{\max}$ are the word's maximum respective x- and y- coordinates, and $x_e$, $y_e$ are the stroke's respective x- and y- ending coordinate values; and
display means responsive to said derived output signals for effecting the publication of a said predetermined character glyph sequence of said text associated with said pattern sample features.
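The third stroke feature of claim 27 scales a stroke's first and last points by the bounding box of the word that contains it. A minimal sketch follows, in which the function name, the tuple layout of `word_box`, the default `k=1.0`, and the handling of a zero-extent box are all assumptions.

```python
def stroke_endpoint_features(stroke, word_box, k=1.0):
    """Return (x0, y0, x1, y1) for a stroke given the word's bounding box.

    stroke   -- sequence of (x, y) points; first is (xb, yb), last is (xe, ye)
    word_box -- (xmin, ymin, xmax, ymax) of the enclosing word (assumed layout)
    """
    xmin, ymin, xmax, ymax = word_box
    (xb, yb), (xe, ye) = stroke[0], stroke[-1]

    def scale(v, lo, hi):
        # K * (v - lo) / (hi - lo); a zero-extent box maps to 0.0, which is
        # an assumption since the claim does not cover that case.
        return k * (v - lo) / (hi - lo) if hi != lo else 0.0

    return (scale(xb, xmin, xmax), scale(yb, ymin, ymax),
            scale(xe, xmin, xmax), scale(ye, ymin, ymax))
```

Because each value is taken relative to the word's extent, the same handwriting written larger, smaller, or elsewhere on the input surface yields the same four features.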
33. A system for the symbolic-based recognition of a handwriting input to an input surface having an output provided as a stream of coordinate data points, comprising:
a data base memory storing a compilation of text-pattern pairs, said pattern of each pair having been derived as sample features of said handwriting and including an associated sample index derived as aspects of said handwriting sample features, said text of each said pair representing at least one predetermined character glyph;
a processor, responsive to said output to extract pattern test features and a corresponding test index therefrom, responsive to access said memory stored compilation and identify a said sample index corresponding with said test index, responsive to carry out a comparison of said pattern test features with said pattern sample features associated with said identified sample index, responsive in the presence of a correspondence between said pattern test and sample features to derive output signals corresponding with said text associated with said pattern sample features;
said pattern sample and test features including segment based features derived from each segment of each stroke defining sequence of said data points and comprising: a first segment feature provided as a value corresponding with the accumulated distances between sequential pairs of said data points defining a stroke from the beginning of a stroke to the end of a segment of a stroke relative to the total distance from the beginning of the stroke to the end of the stroke, a second segment feature provided as a value corresponding with the x-coordinate overlap between two adjacent segments of a stroke, and a third segment feature provided as a value corresponding with the relative x-coordinate based distance between adjacent inflection points of a stroke, wherein such inflection points comprise a significant bend in a stroke, a top location, a bottom location, and a baseline crossing point in the stroke; and
display means responsive to said derived output signals for effecting the publication of a said predetermined character glyph sequence of said text associated with said pattern sample features.
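The first segment feature of claim 33, accumulated point-to-point distance up to a segment's end relative to the whole stroke length, can be sketched as below. How segment ends are chosen is not shown: the `segment_ends` argument (indices into the stroke's point list) is a hypothetical input standing in for the inflection-point detection, and the function name is likewise an assumption.

```python
import math


def relative_arc_lengths(stroke, segment_ends):
    """For each segment end (an index into `stroke`), return the path length
    from the stroke start to that point divided by the full stroke length."""
    # Cumulative distance between sequential pairs of data points.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    # A zero-length stroke maps to 0.0 (an assumption for the degenerate case).
    return [cum[i] / total if total else 0.0 for i in segment_ends]
```

For a stroke through (0, 0), (3, 4), (3, 10), (11, 10), the successive point-to-point distances are 5, 6, and 8, so segment ends at indices 1, 2, and 3 give 5/19, 11/19, and 1.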
38. A system for the symbolic-based recognition of a handwriting input to an input surface having an output provided as a stream of coordinate data points, comprising:
a data base memory storing a compilation of text-pattern pairs, said pattern of each said pair having been derived as sample features of said handwriting and including an associated sample index derived as aspects of said handwriting sample features, said text of each said pair representing at least one predetermined character glyph;
a processor responsive to said output to extract pattern test features and a corresponding test index therefrom, responsive to access said memory stored compilation and identify a said sample index corresponding with said test index, responsive to carry out a comparison of said pattern test features with said pattern sample features associated with said identified sample index, responsive in the presence of a correspondence between said pattern test and sample features to derive output signals corresponding with said text associated with said pattern sample features;
said pattern sample and test features including segment based features derived from each segment of each stroke defining sequence of said data points and comprising: a first segment feature provided as a value corresponding with the absolute inflection point x-coordinate, said absolute inflection point x-coordinate being a function of the difference of the x-coordinate distance of the inflection point from the minimum x-coordinate value of a stroke to the x-coordinate span of such stroke, a second segment feature provided as a value corresponding with the absolute inflection point y-coordinate, said absolute inflection point y-coordinate being a function of the difference of the y-coordinate distance of the inflection point from the minimum y-coordinate value of a stroke to the y-coordinate span of such stroke, a third segment feature provided as a value corresponding with the x curvature of a segment of a said stroke relative to the stroke's bounding box, and a fourth segment feature provided as a value corresponding with the y curvature of a segment of a said stroke relative to the stroke's bounding box; and
display means responsive to said derived output signals for effecting the publication of a said predetermined character glyph of said text associated with said pattern sample features.
Specification