Hybrid feature-based and template matching optical character recognition system
Abstract
A feature-based character recognition identification and confidence level are determined for an unknown symbol. If the confidence level is within an intermediate range, the feature-based identification is confirmed by matching the unknown character with a reference template corresponding to the feature-based identification. If the confidence level is below the intermediate range, template matching character recognition is substituted in place of the feature-based identification. If the template matching recognition identifies more than one symbol, corresponding templates from a second set of templates having thicker character strokes are employed to resolve the ambiguity.
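The three confidence tiers described in the abstract can be pictured with a minimal Python sketch. It is an illustration only, not the patented implementation: the names `Route`, `choose_route`, `needs_thick_templates`, `low` and `high` are hypothetical, the treatment of values exactly on a boundary is an assumption, and accepting the feature-based result above the intermediate range is a reading of the abstract rather than something it states outright.

```python
from enum import Enum
from typing import Sequence


class Route(Enum):
    """Hypothetical labels for the three outcomes described in the abstract."""
    ACCEPT_FEATURE_RESULT = "accept"        # confidence above the intermediate range
    CONFIRM_WITH_TEMPLATE = "confirm"       # confidence inside the intermediate range
    TEMPLATE_MATCH_ONLY = "template_match"  # confidence below the intermediate range


def choose_route(confidence: float, low: float, high: float) -> Route:
    """Pick a recognition path from the feature-based confidence level.

    Assumption: the intermediate range is the half-open interval [low, high).
    """
    if low > high:
        raise ValueError("low must not exceed high")
    if confidence < low:
        return Route.TEMPLATE_MATCH_ONLY
    if confidence < high:
        return Route.CONFIRM_WITH_TEMPLATE
    return Route.ACCEPT_FEATURE_RESULT


def needs_thick_templates(matched_symbols: Sequence[str]) -> bool:
    """True when template matching identified more than one symbol, which is
    the case the abstract resolves with the second, thicker-stroke template set."""
    return len(matched_symbols) > 1
```

The sketch only routes a character to a path; how a failed confirmation in the intermediate range is handled is not covered by the abstract and is left out here.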
16 Claims
1. An optical character recognition system which uses (a) a predetermined set of reference vectors representing in a vector basis set of features a set of symbols and (b) a first set of reference templates corresponding to said set of symbols, said system comprising:
character location means for locating a first region around an individual character image among a plurality of character images in a document image;
means for generating a character vector representing in said vector basis set of features said individual character;
means for determining which one of said reference vectors has a smallest distance from said character vector and which one of said reference vectors has a next-smallest distance from said character vector;
feature-based character recognition means for computing a confidence value based upon a comparison of said smallest distance and said next-smallest distance and for associating said character image with one of said symbols corresponding to the reference vector having said smallest distance from said character vector whenever said confidence value is at least a first predetermined threshold; and
template matching character recognition means responsive whenever said confidence value is below said first predetermined threshold for searching within said first region in said image around said one individual character image for a pattern of "ON" pixels matching to within at least a threshold number "ON" pixels of one of said first set of reference templates corresponding to said set of symbols.
Dependent Claims (2, 3, 4, 5, 6, 7, 8)
9. An optical character recognition method which uses (a) a predetermined set of reference vectors representing in a vector basis set of features a set of symbols and (b) a first set of reference templates corresponding to said set of symbols, comprising:
locating a first region around an individual character image among a plurality of character images in a document image;
generating a character vector representing in said vector basis set of features said individual character;
determining which one of said reference vectors has a smallest distance from said character vector and which one of said reference vectors has a next-smallest distance from said character vector;
computing a confidence value based upon a comparison of said smallest distance and said next-smallest distance and associating said character image with one of said symbols corresponding to the reference vector having said smallest distance from said character vector whenever said confidence value is at least a first predetermined threshold; and
wherever said confidence value is below said first predetermined threshold, first searching within said first region in said image around said one individual character image for a pattern of "ON" pixels matching to within at least a threshold number "ON" pixels of one of said first set of reference templates corresponding to said set of symbols.
Dependent Claims (10, 11, 12, 13, 14, 15, 16)
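The steps recited in claim 9 can be followed in a small Python sketch. It is a hedged illustration, not the claimed method itself. The claim does not fix the distance metric, the confidence formula, or the exact pixel-matching rule, so the sketch assumes Euclidean distance, a confidence of 1 minus the ratio of the smallest to the next-smallest distance, and a match whenever the count of coinciding "ON" pixels reaches a threshold. All function and parameter names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Bitmap = List[List[int]]  # rows of 0/1 pixels; 1 means "ON"


def two_nearest(char_vec: Sequence[float],
                refs: Dict[str, Sequence[float]]) -> Tuple[str, float, float]:
    """Return (closest symbol, smallest distance, next-smallest distance).

    Assumption: Euclidean distance; needs at least two reference vectors.
    """
    ranked = sorted((math.dist(char_vec, vec), sym) for sym, vec in refs.items())
    (d1, best), (d2, _) = ranked[0], ranked[1]
    return best, d1, d2


def confidence(d1: float, d2: float) -> float:
    """Assumed measure: 0 when the two distances tie, 1 when d1 is zero."""
    return 0.0 if d2 == 0 else 1.0 - d1 / d2


def on_overlap(region: Bitmap, template: Bitmap, top: int, left: int) -> int:
    """Count template ON pixels that are also ON in the region at this offset."""
    return sum(
        1
        for r, row in enumerate(template)
        for c, px in enumerate(row)
        if px and region[top + r][left + c]
    )


def template_search(region: Bitmap, templates: Dict[str, Bitmap],
                    min_on: int) -> List[str]:
    """Slide every template over the region; keep symbols reaching min_on."""
    hits = []
    for sym, tpl in templates.items():
        th, tw = len(tpl), len(tpl[0])
        offsets = [(t, l)
                   for t in range(len(region) - th + 1)
                   for l in range(len(region[0]) - tw + 1)]
        if any(on_overlap(region, tpl, t, l) >= min_on for t, l in offsets):
            hits.append(sym)
    return hits


def recognize(char_vec: Sequence[float], region: Bitmap,
              refs: Dict[str, Sequence[float]], templates: Dict[str, Bitmap],
              conf_threshold: float, min_on: int) -> List[str]:
    """Feature-based result if confident enough, else template search."""
    best, d1, d2 = two_nearest(char_vec, refs)
    if confidence(d1, d2) >= conf_threshold:
        return [best]
    return template_search(region, templates, min_on)


# Toy data: two symbols, a 5x5 region containing a ring-shaped character.
refs = {"I": [1.0, 0.0], "O": [0.0, 1.0]}
templates = {"I": [[1], [1], [1]],
             "O": [[1, 1, 1], [1, 0, 1], [1, 1, 1]]}
region = [[0, 0, 0, 0, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 0, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 0, 0, 0, 0]]
confident = recognize([0.9, 0.1], region, refs, templates, 0.5, 6)
fallback = recognize([0.5, 0.5], region, refs, templates, 0.5, 6)
```

In the toy data the first call is settled by the feature vectors alone, while the second has tied distances and falls through to the template search. A search that returns more than one symbol corresponds to the ambiguous case that the abstract resolves with a second set of templates having thicker strokes, which this sketch does not attempt.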
Specification