Character-recognition systems and methods with means to measure endpoint features in character bit-maps
First Claim
1. A character recognition system that identifies an input character as being a unique member of a defined character set, said system comprising:
a bit-map means for generating a character bit-map of an input character;
a character recognition means for processing said character bit-map and generating a set of (M) finite confidence measures one for each of (M) members of said character set, said confidence measures representing the degree of confidence that said input character corresponds to each of said (M) members of said character set;
a decision means for deciding if the confidence measure with the highest degree of confidence is an acceptable confidence measure;
a first output means for reporting as an output character the member of said character set with said acceptable confidence measure;
an augment means for identifying (N) of said (M) members with the (N) highest confidence measures, where (M) is greater than (N), and processing said bit-map if said decision means decides that there is no acceptable confidence measure, said augment means having a measuring means for measuring stroke endpoint locations and orientations of said bit-map and a second output means for reporting one of said (N) members as an output character based on said stroke endpoint locations and orientations; and
wherein said augment means further comprises a database means having a database of character strings, each said character string including a subset of said character set and wherein the members of each said subset represent different characters in said character set which have common stroke endpoint locations and orientations.
Abstract
Character recognition method and system that identifies an input character as being a unique member of a defined character set. Specifically, a character bit-map of an input character is first generated. Thereafter, a character recognition procedure processes the character bit-map to generate a set of confidence measures one for each of the members of the character set. The confidence measures represent the degree of confidence that the input character corresponds to the members of the character set. A test is then made to determine if the confidence measure with the highest degree of confidence is acceptable. If there is an acceptable confidence measure, the member of the character set with the acceptable confidence measure is reported as the output character. If there is no acceptable confidence measure, a number of characters with the highest confidence measures are identified as candidates. Also, the character bit-map is analyzed further to obtain stroke-endpoint information which is then compared to a learned endpoint database having a number of character string-signature pairs. If there is a match between a database string and a candidate character, the match is used to report an output character. Endpoint location and orientation are obtained by modeling the bit-map as a charge distribution. A potential profile is constructed and thresholded and the results clustered into regions to obtain endpoint location information. The gradient of the potential profile is used to obtain endpoint orientation information.
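The closing sentences of the abstract say that endpoint orientation comes from the gradient of the potential profile. The following is a minimal sketch of that idea, not the patented implementation. It assumes each set pixel carries a unit charge, that the potential falls off as 1/distance, and that the gradient is estimated by central differences; the function names `potential` and `gradient_direction` are hypothetical.

```python
import math


def potential(bitmap, r, c):
    """Potential at (r, c) from unit charges on every other set pixel.

    Assumption: a 1/distance kernel. The source does not state the kernel.
    """
    total = 0.0
    for i, row in enumerate(bitmap):
        for j, value in enumerate(row):
            if value and (i, j) != (r, c):
                total += 1.0 / math.hypot(i - r, j - c)
    return total


def gradient_direction(bitmap, r, c):
    """Unit vector (d_row, d_col) along the potential gradient at (r, c).

    Central differences are an assumed discretisation. The sample points
    may lie outside the bitmap; the potential is still defined there.
    """
    d_row = (potential(bitmap, r + 1, c) - potential(bitmap, r - 1, c)) / 2.0
    d_col = (potential(bitmap, r, c + 1) - potential(bitmap, r, c - 1)) / 2.0
    norm = math.hypot(d_row, d_col)
    if norm == 0.0:
        return (0.0, 0.0)
    return (d_row / norm, d_col / norm)


# A single horizontal stroke on row 2, columns 1 to 5.
stroke = [[0] * 7 for _ in range(5)]
for col in range(1, 6):
    stroke[2][col] = 1

left_end = gradient_direction(stroke, 2, 1)
right_end = gradient_direction(stroke, 2, 5)
```

Under these assumptions the gradient at each end of the stroke points back into the body of the stroke (towards increasing column at the left end, towards decreasing column at the right end), so the negated gradient could serve as the direction in which the stroke end faces.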
22 Claims
1. A character recognition system that identifies an input character as being a unique member of a defined character set (recited in full under First Claim above).
Dependent claims: 2, 3, 4, 5, 6
7. A stroke endpoint detector that identifies which of (N) candidate characters represents a unique member of a defined character set comprising:
a bit-map means for reading an input bit-map;
a database means having a database of character strings, each said character string comprises a subset of said character set and wherein the members of each said subset representing different characters in said character set which have common stroke endpoint features;
measuring means for measuring the location and orientation of said stroke endpoints of said character bit-map, said measuring means including charge model means for constructing a potential profile based on a charge distribution model of said bit-map and location means for thresholding said potential profile and clustering said thresholded profile to determine the locations of regions of said stroke endpoints in said bit-map;
search means, responsive to said measuring means, for searching said database to identify an output character string, and locate matches between said output string and said (N) candidate characters; and
output means responsive to said search means for reporting a matching candidate character as an output character.
Dependent claims: 8, 9, 10, 11
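The measuring means of claim 7 builds a potential profile from a charge model, thresholds it, and clusters the result into endpoint regions. A minimal sketch of that sequence follows, for illustration only. It assumes unit charges on set pixels, a 1/distance potential, a cutoff expressed as a fraction of the peak potential, and 8-connected clustering; none of these details, nor the names `potential_profile` and `endpoint_regions`, come from the source.

```python
import math


def potential_profile(bitmap):
    """Map each set pixel to the summed 1/d potential from all other set pixels."""
    pixels = [(r, c) for r, row in enumerate(bitmap)
              for c, value in enumerate(row) if value]
    profile = {}
    for p in pixels:
        profile[p] = sum(1.0 / math.hypot(p[0] - q[0], p[1] - q[1])
                         for q in pixels if q != p)
    return profile


def endpoint_regions(bitmap, fraction=0.75):
    """Threshold the profile, then cluster low-potential pixels into regions.

    Assumption: stroke ends have fewer nearby charged pixels than stroke
    interiors, so pixels below `fraction` of the peak potential are kept.
    """
    profile = potential_profile(bitmap)
    if not profile:
        return []
    cutoff = fraction * max(profile.values())
    low = {p for p, v in profile.items() if v < cutoff}

    regions = []
    while low:
        stack = [low.pop()]
        region = []
        while stack:
            r, c = stack.pop()
            region.append((r, c))
            # Visit the 8 neighbours and pull in any that are also below the cutoff.
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    neighbour = (r + dr, c + dc)
                    if neighbour in low:
                        low.remove(neighbour)
                        stack.append(neighbour)
        regions.append(sorted(region))
    return sorted(regions)


# A single horizontal stroke on row 2, columns 1 to 5.
stroke = [[0] * 7 for _ in range(5)]
for col in range(1, 6):
    stroke[2][col] = 1

print(endpoint_regions(stroke))  # [[(2, 1)], [(2, 5)]]
```

For this one-pixel-wide stroke the two end pixels have a potential of about 2.08 against a peak of 3.0, so only they fall below the 0.75 cutoff. Thicker strokes or junctions would need a cutoff tuned to the data.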
12. A character recognition method for identifying an input character as being a unique member of a defined character set comprising the steps of:
generating a character bit-map of an input character;
processing said character bit-map with a character recognition procedure to generate a set of (M) finite confidence measures one for each of (M) members of said character set, said confidence measures representing the degree of confidence that said input character corresponds to each of said (M) members of said character set;
determining if the confidence measure with the highest degree of confidence is an acceptable confidence measure;
if there is an acceptable confidence measure, reporting as an output character the member of said character set with said acceptable confidence measure;
if there is no acceptable confidence measure, identifying (N) of said (M) members with the (N) highest confidence measures, where (M) is greater than (N), and analyzing said character bit-map to measure stroke endpoint locations and orientations of said bit-map;
reporting one of said (N) members as an output character based on the measure of said stroke endpoint locations and orientations; and further comprising the step of;
constructing a database of character strings, each said character string comprising a subset of said character set and wherein the members of said subset represent different characters in said character set which have common stroke endpoint locations and orientations; and
wherein said analyzing step comprises the step of searching said database.
Dependent claims: 13, 14, 15, 16, 17
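The control flow of claim 12 (accept the top-ranked character if its confidence is acceptable, otherwise pass the N best of the M members to an endpoint-based stage) can be sketched as below. This is an illustration under stated assumptions: "acceptable" is modelled as a simple numeric threshold, which the claim does not specify, and `recognize`, `accept_threshold` and `resolve_by_endpoints` are hypothetical names.

```python
def recognize(confidences, accept_threshold, n_candidates, resolve_by_endpoints):
    """Two-stage decision: direct acceptance, else endpoint-based fallback.

    confidences          -- dict mapping each of the M characters to its confidence
    accept_threshold     -- assumed acceptance test: best confidence >= threshold
    n_candidates         -- N, which must be smaller than M
    resolve_by_endpoints -- callable that picks one of the N candidates
                            (stands in for the endpoint analysis stage)
    """
    if not 0 < n_candidates < len(confidences):
        raise ValueError("N must satisfy 0 < N < M")

    # Rank all M members from most to least confident.
    ranked = sorted(confidences.items(), key=lambda item: item[1], reverse=True)
    best_char, best_conf = ranked[0]

    if best_conf >= accept_threshold:
        return best_char

    # No acceptable measure: keep the N highest-ranked members as candidates.
    candidates = [char for char, _ in ranked[:n_candidates]]
    return resolve_by_endpoints(candidates)


# Confident case: the top-ranked character is reported directly.
print(recognize({"A": 0.97, "H": 0.02, "R": 0.01}, 0.9, 2, lambda c: c[0]))  # A

# Ambiguous case: the fallback receives ["O", "C"]; this toy resolver
# picks the second candidate purely to show the hand-off.
print(recognize({"O": 0.48, "C": 0.45, "X": 0.05}, 0.9, 2, lambda c: c[1]))  # C
```

Passing the endpoint stage in as a callable keeps the sketch independent of how endpoint locations and orientations are measured.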
18. A stroke endpoint detection method that identifies which of (N) candidate characters represents a unique member of a defined character set comprising:
reading an input bit-map;
constructing a learned database of character strings, each said character string comprising a subset of said character set and wherein the members of each said subset representing different characters in said character set which have common stroke endpoint locations and orientations;
measuring stroke-endpoint location and orientation of said character bit-map by using a charge model means for constructing a potential profile based on a charge distribution model of said bit-map and thresholding said potential profile and clustering said thresholded profile to determine the locations of regions of said stroke endpoints in said bit-map;
searching said database to identify an output character string, and locate matches between said output string and said (N) candidate characters; and
reporting one of said matches as an output character.
Dependent claims: 19, 20, 21, 22
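The searching step of claim 18 looks up a character string by the measured endpoint features and then intersects that string with the N candidates. A minimal sketch follows. It assumes the signature is a sorted tuple of coarse (location, orientation) labels and that the database is an exact-match dictionary; the label vocabulary, the entries in `TOY_DB`, and the function names are all invented for illustration and are not taken from the patent.

```python
def make_signature(endpoints):
    """Canonical key: endpoint (location, orientation) labels in sorted order."""
    return tuple(sorted(endpoints))


def match_candidates(database, endpoints, candidates):
    """Return the candidates that appear in the string stored for this signature.

    database   -- dict mapping a signature to a string of characters sharing it
    endpoints  -- measured (location, orientation) label pairs
    candidates -- the N candidate characters, in ranked order
    """
    output_string = database.get(make_signature(endpoints), "")
    return [char for char in candidates if char in output_string]


# Toy database (hypothetical entries): characters grouped by shared endpoints.
TOY_DB = {
    (): "OD0",  # closed shapes with no stroke endpoints
    (("bottom-right", "E"), ("top-right", "E")): "CG",
}

measured = [("top-right", "E"), ("bottom-right", "E")]
print(match_candidates(TOY_DB, measured, ["O", "C"]))  # ['C']
print(match_candidates(TOY_DB, [], ["O", "C"]))        # ['O']
```

The sketch returns every matching candidate in ranked order. How to choose among several matches, or what to report when none match, is left open here because the claim text does not settle it.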
Specification