Stroke-number-free and stroke-order-free on-line Chinese character recognition method
First Claim
1. A stroke-order-free and stroke-number-free method for the on-line recognition of Chinese characters comprising the steps of:
(a) obtaining a database of template characters, each template character being represented by a set of stroke correspondence rules for describing its constituent basic strokes;
(b) obtaining a database of spatial relationships between strokes of characters and a database of character patterns;
(c) inputting a handwritten input script on an on-line basis;
(d) preprocessing said input script to select candidate template characters for matching against said input script;
(e) performing a basic stroke recognition procedure to identify all basic strokes contained in said input script using a database of basic strokes;
(f) classifying said strokes into fore strokes, back strokes, and points, wherein said fore strokes are strokes that actually appear in the character, said back strokes are fictitious strokes to allow for stroke connections that do not appear in said database of template characters, and said points are provided to allow for truncated back strokes in said input script;
(g) for each candidate template character, performing a stroke correspondence for each stroke correspondence rule contained therein until all the stroke correspondence rules contained in said template character are exhausted, so as to identify all fore stroke→fore stroke correspondence;
(h) performing stroke matchings in accordance with an input stroke order to find other strokes correspondences including; back→point, back→fore, fore→null, null→fore, back→null, and null→back; and
(i) performing computation of discrimination functions to find a template character with a minimum distance.
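Step (f) can be illustrated with a minimal sketch. It is not the patented procedure: it assumes that a back stroke is modelled as the pen-up movement between two consecutive pen-down strokes, and that such a movement is demoted to a point when it is shorter than a threshold. The names `Segment`, `classify_segments`, and `min_back_length` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Segment:
    kind: str     # "fore", "back", or "point"
    start: Point
    end: Point


def classify_segments(pen_strokes: List[List[Point]],
                      min_back_length: float = 5.0) -> List[Segment]:
    """Turn pen-down strokes (in writing order) into fore/back/point segments.

    Assumption: each pen-down stroke is a fore stroke, and each pen-up jump
    between consecutive strokes is a back stroke, or a point if the jump is
    shorter than min_back_length (a "truncated" back stroke).
    """
    segments: List[Segment] = []
    for i, stroke in enumerate(pen_strokes):
        # A pen-down stroke actually appears in the character: fore stroke.
        segments.append(Segment("fore", stroke[0], stroke[-1]))
        if i + 1 < len(pen_strokes):
            a = stroke[-1]                # where the pen was lifted
            b = pen_strokes[i + 1][0]     # where the pen touched down again
            length = hypot(b[0] - a[0], b[1] - a[1])
            kind = "back" if length >= min_back_length else "point"
            segments.append(Segment(kind, a, b))
    return segments


if __name__ == "__main__":
    strokes = [[(0, 0), (10, 0)], [(10, 1), (10, 10)], [(0, 10), (0, 20)]]
    for seg in classify_segments(strokes):
        print(seg.kind, seg.start, seg.end)
```

Keeping the pen-up movements as explicit segments is what lets the later matching steps pair them with template strokes (for example back→fore when the writer joins two strokes); the threshold value here is arbitrary.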
Abstract
A stroke-order-free and stroke-number-free method for the on-line recognition of Chinese characters comprising the steps of: (a) inputting a handwritten input script on an on-line basis; (b) preprocessing the input script to reduce the number of possible matching template characters; (c) performing basic stroke recognition using a database of basic strokes to identify all possible basic strokes contained in the handwritten input script; (d) performing stroke correspondence using a database of stroke correspondence rules to find matching strokes in template characters for the strokes contained in the handwritten input script; and (e) performing computation of discrimination functions using a database of character patterns and a database of spatial relationships between strokes of characters to find one or more template characters with minimum error. Some of the key features of the method include: (a) the strokes recognized during the stroke correspondence are expanded to include fore strokes, back strokes, and points, in which the fore strokes are strokes that actually appear in a character, the back strokes are fictitious strokes provided to allow for stroke connections that should not appear, and the points are provided to allow for truncated back strokes in the input script; and (b) each stroke correspondence rule contains a specific set of stroke information including (i) allowed stroke types, (ii) at least one geometric feature measure, and (iii) a criterion for applying the geometric feature measure; the geometric feature measure is a geometrically related characteristic measure, which can be x or y coordinates, length, or distance, associated with a particular stroke, to facilitate stroke recognition. In a preferred embodiment, eight types of stroke correspondences are allowed: (1) fore→fore, (2) back→back, (3) back→fore, (4) back→point, (5) back→null, (6) null→back, (7) fore→null, and (8) null→fore.
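The rule contents and correspondence types listed in the abstract can be pictured with a small data-structure sketch. It is illustrative only: the class and field names (`Stroke`, `StrokeRule`, `apply_rule`), the stroke-type labels `"horizontal"` and `"vertical"`, and the `"min"`/`"max"` criterion encoding are assumptions, as is the reading of each arrow as "template side → input side".

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, FrozenSet, List, Optional, Tuple


class Correspondence(Enum):
    """The eight correspondence types named in the abstract.

    Assumption: the left side is the template stroke and the right side is
    the input stroke; "null" means no stroke on that side.
    """
    FORE_FORE = ("fore", "fore")
    BACK_BACK = ("back", "back")
    BACK_FORE = ("back", "fore")
    BACK_POINT = ("back", "point")
    BACK_NULL = ("back", "null")
    NULL_BACK = ("null", "back")
    FORE_NULL = ("fore", "null")
    NULL_FORE = ("null", "fore")


@dataclass(frozen=True)
class Stroke:
    stroke_type: str                 # hypothetical basic-stroke label
    start: Tuple[float, float]
    end: Tuple[float, float]


@dataclass(frozen=True)
class StrokeRule:
    allowed_types: FrozenSet[str]        # (i) allowed stroke types
    feature: Callable[[Stroke], float]   # (ii) geometric feature measure
    criterion: str                       # (iii) "min" or "max" of the measure


def apply_rule(rule: StrokeRule, strokes: List[Stroke]) -> Optional[Stroke]:
    """Pick the input stroke that satisfies the rule, or None if none qualifies."""
    candidates = [s for s in strokes if s.stroke_type in rule.allowed_types]
    if not candidates:
        return None
    pick = min if rule.criterion == "min" else max
    return pick(candidates, key=rule.feature)


if __name__ == "__main__":
    # Hypothetical rule: "the horizontal stroke with the smallest y coordinate".
    top_horizontal = StrokeRule(frozenset({"horizontal"}),
                                lambda s: s.start[1], "min")
    strokes = [Stroke("horizontal", (0, 8), (10, 8)),
               Stroke("vertical", (5, 0), (5, 10)),
               Stroke("horizontal", (0, 2), (10, 2))]
    print(apply_rule(top_horizontal, strokes))
```

Because such a rule selects a stroke by type and geometry rather than by position in the writing sequence, applying it gives the same result whatever order the strokes were written in.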
18 Claims
7. An apparatus for on-line recognition of Chinese characters which is constrained by neither the stroke order nor the stroke number of said Chinese character, said apparatus comprising:
(a) memory means for storing a database of template characters, each template character being represented by a set of stroke correspondence rules for describing its constituent basic strokes;
(b) memory means for storing a database of spatial relationships between strokes of characters and memory means for storing a database of character patterns;
(c) means for inputting a handwritten input script on an on-line basis;
(d) means for preprocessing said input script to select candidate template characters for matching against said input script;
(e) means for performing a basic stroke recognition procedure to identify all basic strokes contained in said input script using a database of basic strokes;
(f) means for classifying strokes into fore strokes, back strokes, and points, wherein said fore strokes are strokes that actually appear in the character, said back strokes are fictitious strokes to allow for stroke connections that do not appear in said database of template characters, and said points are provided to allow for truncated back strokes in said input script;
(g) means for performing a stroke correspondence for each and every stroke correspondence rule contained in each candidate template character until all the stroke correspondence rules contained in said template character are exhausted, so as to identify all fore stroke→fore stroke correspondence;
(h) means for performing stroke matchings in accordance with an input stroke order to find other strokes correspondences including; back→point, back→fore, fore→null, null→fore, back→null, and null→back; and
(i) means for performing computation of discrimination functions to find a template character with a minimum distance.
13. A computer readable medium having a program for on-line recognition of Chinese characters which is constrained by neither the stroke order nor the stroke number of said Chinese character, said computer program comprising:
(a) code means for storing and retrieving data from a database of template characters, each template character being represented by a set of stroke correspondence rules for describing its constituent basic strokes;
(b) code means for storing and retrieving data from a database of spatial relationships between strokes of characters and a database of character patterns;
(c) code means for inputting a handwritten input script on an on-line basis;
(d) code means for preprocessing said input script to select candidate template characters for matching against said input script;
(e) code means for performing a basic stroke recognition procedure to identify all basic strokes contained in said input script using a database of basic strokes;
(f) code means for classifying said strokes into fore strokes, back strokes, and points, wherein said fore strokes are strokes that actually appear in the character, said back strokes are fictitious strokes to allow for stroke connections that do not appear in said database of template characters, and said points are provided to allow for truncated back strokes in said input script;
(g) code means for performing a stroke correspondence for each and every stroke correspondence rule contained in each candidate template character until all the stroke correspondence rules contained in said template character are exhausted, so as to identify all fore stroke→fore stroke correspondence;
(h) code means for performing stroke matchings in accordance with an input stroke order to find other strokes correspondences including; back→point, back→fore, fore→null, null→fore, back→null, and null→back; and
(i) code means for performing computation of discrimination functions to find a template character with a minimum distance.
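Step (i) of each independent claim picks the template character with minimum distance. The claims do not give the form of the discrimination functions, so the sketch below substitutes a simple stand-in: the sum of absolute feature differences over matched stroke pairs, plus a fixed penalty for every correspondence with a null side. The functions `discrimination` and `best_template`, the `null_penalty` parameter, and the sample data are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# One entry per stroke correspondence: (template feature, input feature).
# None on either side marks a null correspondence (e.g. fore→null).
Pair = Tuple[Optional[float], Optional[float]]


def discrimination(pairs: List[Pair], null_penalty: float = 1.0) -> float:
    """Stand-in distance: feature differences plus a penalty per null match."""
    total = 0.0
    for template_value, input_value in pairs:
        if template_value is None or input_value is None:
            total += null_penalty          # a stroke with no counterpart
        else:
            total += abs(template_value - input_value)
    return total


def best_template(candidates: Dict[str, List[Pair]],
                  null_penalty: float = 1.0) -> Tuple[str, float]:
    """Return (template name, distance) for the minimum-distance candidate."""
    scored = ((name, discrimination(pairs, null_penalty))
              for name, pairs in candidates.items())
    return min(scored, key=lambda item: item[1])


if __name__ == "__main__":
    candidates = {
        "template_A": [(1.0, 1.5), (2.0, None)],   # one unmatched stroke
        "template_B": [(1.0, 1.1), (2.0, 2.2)],    # all strokes matched
    }
    name, distance = best_template(candidates)
    print(name, round(distance, 3))
```

Charging a penalty for null correspondences, rather than rejecting the candidate outright, is one way a matcher can tolerate extra or missing strokes while still preferring templates that account for every stroke.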
Specification