Neural network based character position detector for use in optical character recognition
Abstract
Apparatus, and an accompanying method, for use in an optical character recognition (OCR) system (5) for locating, e.g., center positions ("hearts") of all desired characters within a field (310; 510) of characters such that the desired characters can be subsequently recognized using an appropriate classification process. Specifically, a window (520) is slid in a step-wise convolutional-like fashion (520₁, 520₂, 520₃) across a field of preprocessed, specifically uniformly scaled, characters. Each pixel in the window is applied as an input to a positioning neural network (152) that has been trained to produce an output activation whenever a character "heart" is spatially coincident with a pixel position within an array (430) centrally located within the window. As the window is successively moved across the field, in a stepped fashion, the activation outputs of the neural network are averaged, on a weighted basis, for each different window position and separately for each horizontal pixel position in the field. The resulting averaged activation output values, typically in the form of a Gaussian distribution for each character, are then filtered, thresholded and then used, via a weighted average calculation with horizontal pixel positions being used as the weights, to determine the character "heart" position as being the center pixel position in the distribution.
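The window-stepping and per-position averaging described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the window width, step size, and array offset are all hypothetical, the positioning network is replaced by a toy stand-in function, and a plain mean is used where the abstract calls for a weighted average.

```python
from typing import Callable, Iterator, List, Tuple

Image = List[List[float]]  # image[row][col], one float per pixel


def extract_windows(field: Image, window_width: int, step: int) -> Iterator[Tuple[int, Image]]:
    """Yield (start_col, window) pairs, stepping horizontally across the field."""
    field_width = len(field[0])
    for start in range(0, field_width - window_width + 1, step):
        yield start, [row[start:start + window_width] for row in field]


def average_activations(
    field: Image,
    window_width: int,
    step: int,
    array_offset: int,
    network: Callable[[Image], List[float]],
) -> List[float]:
    """Average network outputs separately for each horizontal field position.

    `network(window)` is assumed to return one activation per column of the
    central array; output node k maps to window column `array_offset + k`.
    A plain mean is used here as a simplification of the weighted average.
    """
    field_width = len(field[0])
    totals = [0.0] * field_width
    counts = [0] * field_width
    for start, window in extract_windows(field, window_width, step):
        for k, activation in enumerate(network(window)):
            col = start + array_offset + k  # window column -> field column
            totals[col] += activation
            counts[col] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


# Toy stand-in for the trained positioning network (an assumption for
# illustration only): it echoes the top-row pixels of the central array.
ARRAY_OFFSET, ARRAY_WIDTH = 2, 4


def toy_network(window: Image) -> List[float]:
    return [window[0][ARRAY_OFFSET + k] for k in range(ARRAY_WIDTH)]


field = [[0.0] * 20]
field[0][10] = 1.0  # pretend the character "heart" sits at column 10
averaged = average_activations(
    field, window_width=8, step=2, array_offset=ARRAY_OFFSET, network=toy_network
)
print(averaged.index(max(averaged)))  # prints 10
```

Because successive windows overlap, several window positions contribute an activation for the same field column; accumulating per field column rather than per window is what lets those contributions be combined.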
23 Claims
1. An optical character recognition system for recognizing a character situated within an image field, wherein the image field contains image pixels and said system detects a position of a pre-defined element of said character within the image field, said system comprising:
means for successively extracting pre-defined array portions from said field so as to define a plurality of corresponding windows, all of said windows having a pre-defined size, wherein each of the windows has a starting position relative to the field that is offset, along a horizontal field direction, by a pre-defined pixel amount with respect to the starting position of an adjacent one of the windows relative to the field;
a first neural network, responsive to said extracting means and having a plurality of first input nodes and a plurality of first output nodes, wherein each output node is associated with a different horizontal pixel location within an array portion of each window, and the first neural network:
generates an activation output at one of the first output nodes whenever the position of the pre-defined element, of the character and appearing in any of said windows, coincides with a horizontal pixel location associated with said one first output node;
forms, in response to all of the windows, a plurality of output activations for each of the first output nodes, and wherein arrays of the image pixels, each situated in a corresponding one of said windows, are each successively applied to corresponding ones of said plurality of first input nodes of said first neural network in order to generate said output activations, and
means, responsive to plurality of activation outputs that occur for all of the windows, for ascertaining the position, with respect to the field, of the pre-defined element of the character wherein said ascertaining means has:
means for averaging over time the activation output generated by each of said first output nodes and produced for all of said windows to yield an average activation value for each of said first output nodes;
means for filtering the average activation value for each of the first output nodes to yield an index value for each of said first output nodes;
means for thresholding the index values, against a pre-defined threshold value, for said first nodes thereby yielding an activation group formed of thresholded index values; and
means responsive to thresholded index values within the activation group, for determining the position of the pre-defined element of the character.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
15. In an optical character recognition system for recognizing a character situated within an image field, wherein the image field contains image pixels and the system detects a position of a pre-defined element of said character with respect to the field, a method comprising the steps of:
successively extracting pre-defined array portions from said field so as to define a plurality of corresponding windows, all of said windows having a pre-defined size, wherein each of the windows has a starting position relative to the field that is offset, along a horizontal field direction, by a pre-defined pixel amount with respect to the starting position of an adjacent one of the windows relative to the field;
through a first neural network, responsive to said extracting step and having plurality of first input nodes and a plurality of first output nodes, wherein each output node is associated with a different horizontal pixel location within an array portion of each window, the steps of:
successively applying arrays of the image pixels, each array being situated in a corresponding one of said windows, to corresponding ones of first input nodes; and
generating an activation output at one of the first output nodes whenever the position of the pre-defined element, of the character and appearing in any of said windows, coincides with a horizontal pixel location associated with said one first output node so as to form, in response to all of the windows, a plurality of output activations for each of the first output nodes; and
ascertaining, in response to plurality of activation outputs that occur for all of the windows, the position, with respect to the field, of the pre-defined element of the character wherein said position ascertaining step further comprises the steps of:
averaging over time, the activation outputs generated by each of said first output nodes and produced for all of said windows to yield an average activation value for each of said first output nodes;
filtering the average activation value associated with each of the first output nodes to yield an index value for each of said first output nodes;
thresholding the index values, against a pre-defined threshold value, for said first nodes thereby yielding an activation group formed of thresholded index values; and
determining, in response to thresholded index values within the activation group, the position of the pre-defined element of the character.
Dependent claims: 16, 17, 18, 19, 20, 21, 22, 23
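The filtering, thresholding, and position-determining steps recited in claims 1 and 15 might be sketched as follows. This is one illustrative reading only: the smoothing kernel, the threshold value, and every function name are assumptions, and the claims do not specify a particular filter or a particular way of deriving the position from the activation group. Here the position is taken as the activation-weighted mean pixel position of each group.

```python
from typing import List, Sequence, Tuple

Group = List[Tuple[int, float]]  # (horizontal pixel position, index value)


def smooth(values: Sequence[float], kernel: Sequence[float] = (0.25, 0.5, 0.25)) -> List[float]:
    """Filter averaged activations with a small symmetric kernel (assumed)."""
    half = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < n:  # positions outside the field contribute nothing
                acc += weight * values[idx]
        out.append(acc)
    return out


def activation_groups(index_values: Sequence[float], threshold: float) -> List[Group]:
    """Collect contiguous runs of positions whose index value exceeds the threshold."""
    groups: List[Group] = []
    current: Group = []
    for pos, value in enumerate(index_values):
        if value > threshold:
            current.append((pos, value))
        elif current:
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups


def group_center(group: Group) -> float:
    """Activation-weighted mean position (center of mass) of one group."""
    total = sum(value for _, value in group)
    return sum(pos * value for pos, value in group) / total


def locate_elements(avg_activations: Sequence[float], threshold: float = 0.2) -> List[float]:
    """Filter, threshold, and reduce each activation group to a single position."""
    index_values = smooth(avg_activations)
    return [group_center(g) for g in activation_groups(index_values, threshold)]


# A bell-shaped bump of averaged activations centered on position 4.
centers = locate_elements([0.0, 0.0, 0.2, 0.8, 1.0, 0.8, 0.2, 0.0, 0.0, 0.0])
print(centers)  # a single estimate, close to 4.0
```

Splitting the thresholded values into contiguous groups lets a field containing several characters yield one position estimate per character.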
Specification