Method and mechanism to reduce handwriting recognizer errors using multiple decision trees
First Claim
1. A method of recognizing chirographs input into a computer system, comprising the steps of:
providing a primary recognizer for converting chirographs to code points;
training a plurality of secondary recognizers to differentiate chirographs which produce selected code points when provided to the primary recognizer by providing a first training set comprising a plurality of chirographs and actual code points, receiving chirographs from the first training set, providing each received chirograph to the primary recognizer and receiving a recognized code point therefrom, and grouping each chirograph and its actual code point into one of a plurality of sets, the set determined by the recognized code point returned by the primary recognizer, and associating a secondary recognizer with each selected code point;
receiving a chirograph;
providing the chirograph to the primary recognizer and receiving a code point corresponding thereto;
determining if the code point corresponds to a selected code point having a secondary recognizer associated therewith, and if so, passing the chirograph to the secondary recognizer and returning a code point from the secondary recognizer;
determining which secondary recognizers improve the recognition accuracy of the primary recognizer by individually providing chirographs from another training set to the primary recognizer and to the secondary recognizer and for each chirograph, receiving a recognized code point from each recognizer, comparing the code point recognized by the primary recognizer to the actual code point of the chirograph, and if the code points are equal, incrementing a primary match counter associated with the actual code point, comparing the code point recognized by the secondary recognizer to the actual code point of the chirograph, and if the code points are equal, incrementing a secondary match counter associated with the actual code point, comparing the primary match counter for each code point against the secondary match counter, and if the secondary match counter is less than or equal to the primary match counter, discarding the secondary recognizer for that code point; and
selecting as the selected code points those code points which correspond to the secondary recognizers that improve the recognition accuracy.
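The training and selection steps of the claim can be illustrated with a minimal sketch. It is not the patented implementation: recognizers are modelled as plain callables, chirographs as short strings, and every name (`group_by_primary_output`, `prune_secondaries`, `toy_primary`, `toy_secondaries`) is hypothetical. One interpretive assumption is made loudly: the claim associates both match counters with the actual code point, whereas this sketch files them under the code point returned by the primary recognizer, since that is the key a secondary recognizer is attached to. For the primary counter the two readings coincide, because it is only incremented when the recognized and actual code points are equal.

```python
from collections import Counter, defaultdict


def group_by_primary_output(primary, training_set):
    """Bucket (chirograph, actual_code_point) pairs by the primary's answer."""
    groups = defaultdict(list)
    for chirograph, actual in training_set:
        groups[primary(chirograph)].append((chirograph, actual))
    return dict(groups)


def prune_secondaries(primary, secondaries, validation_set):
    """Keep only secondaries that score strictly more matches than the primary."""
    primary_matches = Counter()
    secondary_matches = Counter()
    for chirograph, actual in validation_set:
        recognized = primary(chirograph)
        secondary = secondaries.get(recognized)
        if secondary is None:
            continue  # no secondary recognizer attached to this code point
        if recognized == actual:
            primary_matches[recognized] += 1
        # Assumption: counted under the primary's output, not the actual code point.
        if secondary(chirograph) == actual:
            secondary_matches[recognized] += 1
    # Discard when the secondary count is less than or equal to the primary count.
    return {
        code_point: secondary
        for code_point, secondary in secondaries.items()
        if secondary_matches[code_point] > primary_matches[code_point]
    }


# Toy data (hypothetical): the first character of each string is its true shape.
def toy_primary(chirograph):
    # Confuses "a" with "o"; everything else is read as "l".
    return "o" if chirograph[0] in "ao" else "l"


toy_secondaries = {
    "o": lambda chirograph: chirograph[0],  # separates "a" from "o"
    "l": lambda chirograph: "l",            # no better than the primary
}

first_training_set = [("a1", "a"), ("o1", "o"), ("l1", "l")]
second_training_set = [("a2", "a"), ("o2", "o"), ("l2", "l"), ("l3", "l")]

groups = group_by_primary_output(toy_primary, first_training_set)
kept = prune_secondaries(toy_primary, toy_secondaries, second_training_set)
print(sorted(kept))  # ['o']
```

In this toy run the secondary recognizer for "o" is kept because it scores two matches against the primary's one, while the secondary for "l" ties with the primary and is discarded, matching the "less than or equal to" test in the claim.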
Abstract
An improved method and mechanism for recognizing chirographs (handwritten characters) input into a computer system. A primary recognizer is provided for converting chirographs to code points, and secondary recognizers such as binary CART trees are developed and trained to differentiate chirographs which produce certain code points at the primary recognizer. One such secondary recognizer is associated with each selected code point. When a chirograph is received, the chirograph is provided to the primary recognizer whereby a code point corresponding thereto is received. If the code point corresponds to one of the secondary recognizers, the chirograph is passed to the secondary recognizer, and a code point is returned from the secondary recognizer. If not, the code point provided by the primary recognizer is returned. The invention sets forth an automated process for training the CART trees and for optimizing the recognition mechanism by discarding CART trees which do not improve on the recognition accuracy of the primary recognizer.
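The run-time flow described in the abstract (primary recognizer first, then a secondary recognizer only when one is associated with the returned code point) can be sketched as follows. This is an illustration under stated assumptions, not the patented mechanism: recognizers are plain callables rather than CART trees, chirographs are short strings, and the names `recognize`, `toy_primary`, and `toy_secondaries` are hypothetical.

```python
def recognize(chirograph, primary, secondaries):
    """Return a code point, deferring to a secondary recognizer when one exists."""
    code_point = primary(chirograph)
    secondary = secondaries.get(code_point)
    if secondary is None:
        # No secondary recognizer for this code point: the primary's answer stands.
        return code_point
    return secondary(chirograph)


# Toy recognizers (hypothetical): the first character encodes the true shape.
def toy_primary(chirograph):
    # Confuses "a" with "o"; everything else is read as "l".
    return "o" if chirograph[0] in "ao" else "l"


# Only the "o" code point has a secondary recognizer attached.
toy_secondaries = {"o": lambda chirograph: chirograph[0]}

results = [recognize(c, toy_primary, toy_secondaries) for c in ("a7", "o2", "l3")]
print(results)  # ['a', 'o', 'l']
```

Keeping the secondaries in a mapping keyed by code point makes the fallback to the primary recognizer a single lookup, and discarding a secondary recognizer amounts to deleting its key.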
38 Citations
1 Claim
Specification