Character recognition apparatus
Abstract
This invention enables fast and simple selection of candidate character categories for an input character image by searching a tree-structured dictionary. Each ordinary node of the tree has three branches, leading to two ordinary nodes and one reject node, and compares the multivalued feature assigned to that node, as extracted from the input character image, with a threshold value, an upper limit, and a lower limit. If the feature value falls outside the range bounded by the limits, the search descends to the reject node at the next level. Otherwise, the search descends to the left or right node at the next level according to the comparison of the feature value with the threshold value. Reject nodes have no branches. The tree holds character category information in the leaves connected to the bottom ordinary nodes, and separate character category information for all the reject nodes. Searching the dictionary yields character category information identifying the candidates, which are then tested by detailed matching to produce the recognition results. In particular, the invention can use white pel run lengths as the multivalued features. A white pel run length is the run length of white pels from a point on an edge of the segmentation window, in a given direction, to the nearest black pel or to an edge of the window. The dictionary is easy to construct and optimize, because many different white run lengths can be generated from a small amount of sample data.
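The three-way branching described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Node`, `Leaf`, and `search`, the "less than goes left" convention, the use of a single shared reject leaf, and the sample categories are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Union


@dataclass
class Leaf:
    """Terminal node holding candidate character categories."""
    candidates: List[str]


@dataclass
class Node:
    """Ordinary (nonterminal) node: one feature, one threshold, optional limits."""
    feature: int                   # index of the feature tested at this node
    threshold: float
    left: Union["Node", Leaf]      # taken when value < threshold (assumed convention)
    right: Union["Node", Leaf]     # taken when value >= threshold
    lower: Optional[float] = None  # lower limit, if one is assigned
    upper: Optional[float] = None  # upper limit, if one is assigned


def search(root: Union[Node, Leaf], features: Sequence[float], reject: Leaf) -> List[str]:
    """Walk the tree and return the candidate categories of the leaf reached."""
    node = root
    while isinstance(node, Node):
        value = features[node.feature]
        # The limit check comes first: an out-of-range value ends the search.
        if node.lower is not None and value < node.lower:
            return reject.candidates
        if node.upper is not None and value > node.upper:
            return reject.candidates
        # Otherwise branch on the threshold comparison.
        node = node.left if value < node.threshold else node.right
    return node.candidates


# Hypothetical two-level dictionary; categories and numbers are made up.
reject = Leaf(["?"])
root = Node(
    feature=0, threshold=5, lower=1, upper=9,
    left=Leaf(["1", "l"]),
    right=Node(feature=1, threshold=3, left=Leaf(["0", "O"]), right=Leaf(["8", "B"])),
)

print(search(root, [2, 0], reject))   # within limits, below threshold
print(search(root, [12, 0], reject))  # above the upper limit: rejected
```

The candidates returned by such a search would then be passed to the detailed matching stage, which is not modeled here.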
58 Citations
14 Claims
1. A character recognition apparatus wherein one or more character category candidates for an unknown character pattern are automatically determined through a search in a dictionary depending on multivalued features extracted from said pattern, comprising:
  a tree-structured dictionary further comprising;
    nonterminal nodes, each having an assigned multivalued feature and a corresponding threshold value, at least one of said nonterminal nodes having a corresponding limit value;
    character category terminal nodes corresponding to at least one character category; and
    at least one reject terminal node which causes termination of said search; and
  means, at each nonterminal node responsive to the value of the assigned multivalued feature extracted from said unknown character pattern and corresponding limit and threshold values, for branching to the corresponding successor node, so that the branch will be made to the successor reject node if a limit value has been assigned to the nonterminal node and the value of the assigned multivalued feature is outside of the limit, and otherwise the branch will be made to the successor node based on the comparison of the value and the threshold value.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A character recognition apparatus wherein a subset of character category candidates for an unknown character pattern are automatically determined through searching a dictionary based on a plurality of white run lengths extracted from said pattern, comprising:
  a tree-structured dictionary comprising a plurality of branch nodes, each having an assigned position and a threshold value for the white run length for that position, and each branching into two additional nodes;
  rejection means for aborting the search when the white run length at an assigned position is outside of a predetermined limit; and
  decision means for assigning at each branch node said unknown character pattern to one of the branches based on a comparison of the white run length at the assigned position of said unknown character pattern with the threshold value.
Dependent claims: 9, 10
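As an illustration of the white run length feature used in claim 8 and defined in the abstract, the following Python sketch measures the run of white pels from a starting point on the window edge in a given direction. The function name `white_run_length`, the `DIRECTIONS` table, the 0 = white / 1 = black encoding, and the choice to count the starting pel itself are assumptions, not details taken from the patent.

```python
from typing import Dict, List, Sequence, Tuple

# Direction names mapped to (row step, column step); the names are illustrative.
DIRECTIONS: Dict[str, Tuple[int, int]] = {
    "right": (0, 1),
    "left": (0, -1),
    "down": (1, 0),
    "up": (-1, 0),
}


def white_run_length(window: Sequence[Sequence[int]], row: int, col: int, direction: str) -> int:
    """Count white pels (0) from (row, col) until a black pel (1) or the window edge."""
    d_row, d_col = DIRECTIONS[direction]
    height, width = len(window), len(window[0])
    run = 0
    while 0 <= row < height and 0 <= col < width and window[row][col] == 0:
        run += 1
        row += d_row
        col += d_col
    return run


# A made-up 4 x 5 segmentation window (1 = black pel).
WINDOW: List[List[int]] = [
    [0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
]

# Each assigned position is (row, column, direction); the choice is hypothetical.
POSITIONS = [(0, 0, "right"), (1, 0, "right"), (0, 0, "down"), (3, 4, "left")]
features = [white_run_length(WINDOW, r, c, d) for r, c, d in POSITIONS]
print(features)  # [2, 1, 4, 2]
```

Because every edge point and direction yields its own run length, a single sample image supplies many candidate features, which matches the abstract's remark that many varieties of white run length come from little sample data.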
11. In a method of character recognition, a method of automatically assigning an unknown character pattern to one of a plurality of character classes, comprising:
  extracting from said unknown character pattern a plurality of multivalued features;
  searching a tree-structured dictionary comprising a plurality of branch nodes and a plurality of character category terminal nodes and at least one reject terminal node, wherein each branch node has assigned to it a multivalued feature and a threshold, and each character category terminal node has assigned to it a character class, said searching further comprising, at each of said branch nodes,
    (a) responding to the existence of assigned limits for the multivalued feature associated with the branch node, and selecting the successor reject terminal node if the multivalued feature is outside of any limit, thereby ending the search,
    (b) comparing a multivalued feature value extracted from said unknown character pattern and the threshold assigned for that branch node,
    (c) selecting one of the branches from said branch node based on said comparison, and
    (d) continuing said comparing and selecting until a terminal node is selected, thereby ending the search.
Dependent claims: 12, 13, 14
Specification