Character recognition system and method multi-bit curve vector processing
Abstract
An automatic character recognition system and method for identifying an unknown character that is one of a class of known characters. The system is set up using known specimen characters from a large character training set, which must first be selected according to the use to which the character recognition will be put. Using the selected set, features and shapes of the character vocabulary are obtained with selected feature scan parameters, processed as a plurality of representative, normalized pieces of curves, and stored as binary coded representations. System set-up also includes selecting and storing canonic shape parameters. The canonic shapes are separate pieces or segments of lines and curves, selected because their shapes occur as component parts of a significant number of the characters in the character set. Once the character training set has been selected, and the feature scan parameters and canonic shape parameters have been selected and stored, the system set-up procedure is complete.
The next procedure, referred to as "system training", processes the individual characters of the large character training set with prior knowledge of the identity of each character being processed. Training consists of following the curve of each character in the set and recording the path coordinates that result from the curve following operation. The path coordinates are matched against the stored canonic shape parameters, and the "best match" features are encoded. Statistical tables are then formed from the best match relationships between the known training characters and the canonic shapes. The character recognition system is then able to identify unknown characters belonging to the recognition space character set.
In the recognition procedure, the unknown character is examined using the feature scan parameters to extract its features, yielding complex vectors of the measured path coordinates of the extracted features. These vectors are matched against the stored canonic shape parameters by computing complex inner products, and a best match feature is determined.
Finally, a plurality of row vectors are extracted from the statistical tables and combined to form a product vector. The largest component of the product vector is selected, and the column index j of the maximum component is noted. The unknown character is then identified as being a member of the character membership class whose column index is j.
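The final recognition step described above can be illustrated with a minimal Python sketch. It assumes the statistical table is a list of rows (one per encoded feature value) with one column per character class, and that "combined to form a product vector" means an element-wise product of the selected rows; the abstract does not spell out the combination rule, so that reading, the function name `identify`, and the table layout are assumptions made for illustration only.

```python
from math import prod


def identify(table, row_indices):
    """Return (j, product_vector) for the rows selected by row_indices.

    table[r][j] is an assumed statistic relating feature code r to
    character column j. The rows picked out by the unknown character's
    features are multiplied element-wise (an assumed combination rule),
    and the column index j of the largest component is returned.
    """
    n_cols = len(table[0])
    product_vector = [
        prod(table[r][j] for r in row_indices) for j in range(n_cols)
    ]
    # Column index of the maximum component of the product vector.
    j = max(range(n_cols), key=product_vector.__getitem__)
    return j, product_vector
```

With a three-column table and two extracted features, the call `identify(table, [0, 1])` would return the column whose entries are jointly largest across rows 0 and 1.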
10 Claims
1. In an automatic character recognition system for identifying an unknown character wherein said unknown character is one of a class of a plurality of known characters, a scanning and training structure including:
means for storing in binary indicia a plurality of separate segments of character shapes, said separate segments being conformable to segments included within the shapes of said characters within said plurality of known characters, said binary encoded separate segments being referred to as canonic shapes and each being identified in said storage means by a binary encoded shape index number,
means for processing and manifesting a first known character from said character class in a digital matrix format of stored binary indicia,
means connected to said processing and manifesting means for digitally scanning said stored first known character indicia using a first set of selected scan parameters, said selected scan parameters designating a specific start position, scan length, scan binary indicia transition and direction for said scanning function, said digital scanning function producing a first binary encoded path vector in the form of a multi-bit binary word including the starting coordinates of said scan and a plurality of further binary coordinates of said known character along the selected length of said scan,
means connected to said digital scanning and encoding means and to said binary indicia storing means for performing the complex inner product function of said binary encoded path vector and each of said stored canonic shapes to produce a plurality of complex numbers encoded in binary form, one for each complex inner product performed with each canonic shape, each of said binary encoded complex inner product numbers including a magnitude value representative of the similarity between said binary encoded path vector and each of said canonic shapes, and an angle value representative of the angular offset of the encoded path vector relative to each of said canonic shapes,
means connected to said complex inner product function means for determining the magnitude of each of said encoded complex inner product numbers and selecting the largest magnitude number and recording the shape index of the canonic shape which produced said largest magnitude complex inner product number,
means connected to said magnitude determining means for encoding at least said recorded shape index and said angle value within an N bit vector,
at least a first storage plane means including a separate storage column for each different one of said characters in said class of a plurality of characters and 2^N storage rows for each different one of possible N bit vectors,
and means connected to said N bit vector encoding means and said first storage plane means for entering a first storage indicia into said at least first storage plane means at a discrete storage location corresponding to the column associated with said first known character and to the row associated with said encoded N bit vector associated with said recorded shape index and angle value.
Dependent claims: 2, 3, 4, 5, 6
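The complex inner product matching recited in claim 1 can be sketched in Python as follows. The sketch assumes each path coordinate (x, y) is represented as the complex number x + iy, that the path and every canonic shape have the same number of points, and that both are centered and scaled to unit norm before comparison. The helper names (`normalize`, `complex_inner_product`, `best_match`) and the normalization scheme are hypothetical and are not taken from the patent.

```python
import cmath


def normalize(points):
    """Center a list of complex points and scale it to unit norm.

    Assumes the points are not all identical (non-zero norm).
    """
    mean = sum(points) / len(points)
    centered = [p - mean for p in points]
    norm = sum(abs(p) ** 2 for p in centered) ** 0.5
    return [p / norm for p in centered]


def complex_inner_product(path, shape):
    """Sum of path[k] times the complex conjugate of shape[k]."""
    return sum(p * c.conjugate() for p, c in zip(path, shape))


def best_match(path, canonic_shapes):
    """Return (shape_index, magnitude, angle) of the best matching shape.

    canonic_shapes maps a shape index to a list of complex points.
    The magnitude measures similarity; the angle (in radians) is the
    rotation of the path relative to the selected canonic shape.
    """
    path = normalize(path)
    best_index, best_value = None, 0j
    for index, shape in canonic_shapes.items():
        value = complex_inner_product(path, normalize(shape))
        if best_index is None or abs(value) > abs(best_value):
            best_index, best_value = index, value
    return best_index, abs(best_value), cmath.phase(best_value)
```

Under these assumptions, a path that is a rotated and translated copy of a canonic shape yields a magnitude of 1 for that shape, and the angle value recovers the rotation.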
7. In a character recognition method for identifying an unknown character wherein said unknown character is one of a class of a plurality of known characters, a plurality of scanning and training steps including:
the step of storing in binary indicia a plurality of separate segments of character shapes, said separate segments being conformable to segments included within the shapes of said characters within said plurality of known characters, said binary encoded separate segments being referred to as canonic shapes and each being identified in said storage means by a binary encoded shape index number,
the steps of processing and manifesting a first known character from said character class in a digital matrix format of stored binary indicia,
the step of digitally scanning said stored first known character indicia using a first set of selected scan parameters, said selected scan parameters designating a specific start position, scan length, binary indicia transition and direction for said scanning step, said digital scanning step producing a first binary encoded path vector in the form of a multi-bit binary word including the starting coordinates of said scan and a plurality of further binary coordinates of said known character along the selected length of said scan,
the step of performing the complex inner product function of said binary encoded path vector and each of said stored canonic shapes to produce a plurality of complex numbers encoded in binary form, one for each complex inner product performed with each canonic shape, each of said binary encoded complex inner product numbers including a magnitude value representative of the similarity between said binary encoded path vector and each of said canonic shapes, and an angle value representative of the angular offset of the encoded path vector relative to each of said canonic shapes,
the steps of determining the magnitude of each of said encoded complex inner product numbers and selecting the largest magnitude number and recording the shape index of the canonic shape which produced said largest magnitude complex inner product number,
the step of encoding at least said recorded shape index and said angle value within an N bit vector,
the step of providing at least a first storage plane means including a separate storage column for each different one of said characters in said class of a plurality of characters and 2^N storage rows for each different one of possible N bit vectors,
and the step of entering a first storage indicia into said at least first storage plane means at a discrete storage location corresponding to the column associated with said first known character and to the row associated with said encoded N bit vector associated with said recorded shape index and angle value.
Dependent claims: 8, 9, 10
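The encoding and storage steps of claim 7 can be pictured with a short Python sketch. It assumes the N bit vector is formed by packing the shape index into the high bits and a quantized angle sector into the low bits, and that "entering a storage indicia" is modeled as incrementing a count in a table of 2^N rows by one column per character. The bit split (4 shape bits, 3 angle bits), the packing order, and the use of counts are illustrative assumptions; the claim does not fix any of them.

```python
import math


def encode_feature(shape_index, angle, shape_bits=4, angle_bits=3):
    """Pack a shape index and a quantized angle into one N-bit integer.

    N = shape_bits + angle_bits. The angle (radians) is reduced to
    [0, 2*pi) and quantized into 2**angle_bits equal sectors.
    """
    if not 0 <= shape_index < (1 << shape_bits):
        raise ValueError("shape_index does not fit in shape_bits")
    levels = 1 << angle_bits
    fraction = (angle % (2 * math.pi)) / (2 * math.pi)
    sector = int(fraction * levels) % levels  # guard against rounding up
    return (shape_index << angle_bits) | sector


def make_storage_plane(n_bits, n_characters):
    """Create a plane with 2**n_bits rows and one column per character."""
    return [[0] * n_characters for _ in range(1 << n_bits)]


def enter_indicia(plane, row, column):
    """Record one training observation at (row, column)."""
    plane[row][column] += 1
```

During training under these assumptions, the row would come from `encode_feature` applied to the best match result for a scan, and the column would be the index of the known character being processed.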
Specification