Neural network-based diacritical marker recognition system and method
First Claim
1. A diacritical marker recognition system which is connectable to receive a raster image and bounding box information, said bounding box information specifying locations and dimensions of a rectangle surrounding characters in said raster image, said diacritical marker recognition system comprising:
extraction means for extracting a plurality of character images from said raster image based on said bounding box information;
subsampling means for subsampling an upper and lower region of each of said extracted character images, the subsampling in the upper region occurring at a different rate than the subsampling in the lower region;
neural network means for determining a probability of whether a diacritical marker may exist in each of said subsampled character images; and
controller means for determining whether a diacritical marker exists in one of said character images based on said determining by said neural network means and heuristics.
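The extraction and subsampling elements of this claim can be illustrated with a minimal Python sketch. It assumes the raster image is a list of rows of 0/1 pixels and that each bounding box is an (x, y, width, height) tuple. The function names, the split at half the character height, and the strides 1 and 2 are illustrative assumptions, not values taken from the claim.

```python
def extract_characters(raster, boxes):
    """Crop one character image per bounding box (x, y, width, height)."""
    return [[row[x:x + w] for row in raster[y:y + h]] for (x, y, w, h) in boxes]


def subsample(region, stride):
    """Keep every `stride`-th pixel in both directions."""
    return [row[::stride] for row in region[::stride]]


def subsample_character(char_img, upper_stride=1, lower_stride=2):
    """Subsample the upper and lower regions at different rates.

    The split at half the height and the default strides are assumptions
    made for this sketch; the claim only requires that the two rates differ.
    """
    split = len(char_img) // 2
    upper = subsample(char_img[:split], upper_stride)
    lower = subsample(char_img[split:], lower_stride)
    return upper, lower


# Usage: a 4x4 raster holding a single character.
raster = [
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 0, 0, 1],
]
chars = extract_characters(raster, [(0, 0, 4, 4)])
upper, lower = subsample_character(chars[0])
```

In this sketch the upper region, where a diacritical marker would normally sit, keeps full resolution, while the lower region is thinned; which region is sampled more densely is a design choice the claim leaves open.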
Abstract
A diacritical marker recognition system and method recognizes diacritical markers in a character image based upon an analysis by a neural network of the portion of the character image most likely to contain a diacritical marker. Once the neural network determines that a diacritical marker most likely exists in the character image, the system determines by using heuristics whether a diacritical marker exists or whether the character image appears to contain a diacritical marker which is actually a regular character.
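The two-stage decision described in the abstract (a neural network estimate followed by a heuristic check) might be sketched as follows. The abstract does not say which heuristics are used, so the blank-row test below is purely hypothetical, as are the function names and the 0.5 threshold. The probability is passed in as a plain number standing in for the output of a trained network, which is not shown.

```python
def has_blank_row_gap(upper_region):
    """Hypothetical heuristic: ink, then a blank row, then ink again.

    A mark that is detached from the character body leaves at least one
    empty pixel row between itself and the body.
    """
    seen_ink = False
    seen_gap_after_ink = False
    for row in upper_region:
        if any(row):
            if seen_gap_after_ink:
                return True
            seen_ink = True
        elif seen_ink:
            seen_gap_after_ink = True
    return False


def decide_diacritic(probability, upper_region, threshold=0.5):
    """Stage 1: network probability. Stage 2: heuristic confirmation."""
    if probability < threshold:
        return False
    return has_blank_row_gap(upper_region)


# Usage with made-up probabilities standing in for the network output.
detached = [[0, 1, 0], [0, 0, 0], [1, 1, 1]]  # mark, gap, body
attached = [[0, 1, 0], [0, 1, 0], [1, 1, 1]]  # stroke joined to body
print(decide_diacritic(0.9, detached))  # True
print(decide_diacritic(0.9, attached))  # False
print(decide_diacritic(0.2, detached))  # False
```

The second call shows the case the abstract singles out: the network is confident, but the heuristic concludes that the apparent marker is part of a regular character.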
11 Claims
1. A diacritical marker recognition system which is connectable to receive a raster image and bounding box information, said bounding box information specifying locations and dimensions of a rectangle surrounding characters in said raster image, said diacritical marker recognition system comprising:
extraction means for extracting a plurality of character images from said raster image based on said bounding box information;
subsampling means for subsampling an upper and lower region of each of said extracted character images, the subsampling in the upper region occurring at a different rate than the subsampling in the lower region;
neural network means for determining a probability of whether a diacritical marker may exist in each of said subsampled character images; and
controller means for determining whether a diacritical marker exists in one of said character images based on said determining by said neural network means and heuristics.
4. A diacritical marker recognition system which is connectable to receive a raster image and bounding box information, said bounding box information specifying locations and dimensions of a rectangle surrounding characters in said raster image, said diacritical marker recognition system comprising:
extraction means for extracting a plurality of character images from said raster image based on said bounding box information;
first neural array means for subsampling an upper and lower region of each of said extracted character images, the subsampling in the upper region occurring at a different rate than the subsampling in the lower region;
second neural array means for determining a probability of whether a diacritical marker may exist in each of said subsampled character images; and
controller means for determining whether a diacritical marker exists in said character image based on said determining by said second neural array means and heuristics.
5. A diacritical marker recognition method executed on a computer as part of a computer program for identifying if any character in a raster image contains a diacritical marker, said computer connectable to receive a raster image and bounding box information, said bounding box information specifying locations and dimensions of a rectangle surrounding each character in said raster image, said method comprising the steps of:
(a) extracting a plurality of character images from said raster image based on said bounding box information;
(b) subsampling an upper and lower region of each of said extracted character images, the subsampling in the upper region occurring at a different rate than the subsampling in the lower region;
(c) identifying if any of said subsampled character images has a diacritical marker by using a neural network; and
(d) outputting said diacritical marker if said diacritical marker is identified in step (c).
11. A diacritical marker recognition method executed on a computer as part of a computer program for identifying if any character in a raster image contains a diacritical marker, said computer connectable to receive a raster image and bounding box information, said bounding box information specifying locations and dimensions of a rectangle surrounding each character in said raster image, said method comprising the steps of:
(a) extracting a plurality of character images from said raster image based on said bounding box information;
(b) normalizing each of said character images to a standard character image size;
(c) using a first neuron array to subsample an upper region of each of said normalized character images;
(d) using a second neuron array to subsample a lower region of each of said normalized character images, the subsampling in the upper region occurring at a different rate than the subsampling in the lower region;
(e) identifying if any of said subsampled character images from said first and second neuron arrays has a diacritical marker by using a neural network and heuristics; and
(f) outputting said diacritical marker if said diacritical marker is identified in step (e).
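Step (b) of claim 11, normalizing each character image to a standard size, could look like the following Python sketch. The claim does not name a resampling method or a target size; nearest-neighbour resampling, the function name, and the 4x4 output used here are assumptions chosen only to keep the example short.

```python
def normalize(char_img, out_h, out_w):
    """Nearest-neighbour resize of a 2-D pixel grid to out_h x out_w."""
    in_h, in_w = len(char_img), len(char_img[0])
    return [
        [char_img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# Usage: scale a 2x2 character image up to a hypothetical 4x4 standard size.
small = [
    [1, 0],
    [0, 1],
]
normalized = normalize(small, 4, 4)
```

Normalizing first means the later subsampling steps always see inputs of the same dimensions, so fixed-size neuron arrays can be applied to every character regardless of its size in the original raster.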
Specification