Character discrimination system employing height-to-width ratio and vertical extraction position information
Abstract
A character recognition system with improved accuracy in integrating discrete characters. The system discriminates a component of a discrete character according to the height-to-width ratio and the vertical extraction position of a rectangular area formed from a character row signal delivered to the system. The rectangular areas (character areas) to be integrated are determined from the average character pitch of square, or em, characters.
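The integration step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the average pitch is computed or how it gates integration, so everything here is an assumption made for illustration: the `Rect` type, the function names `estimate_em_pitch` and `integrate_by_pitch`, the use of the mean width of near-square rectangles as a crude stand-in for the em pitch, and the 20% tolerance.

```python
from statistics import mean
from typing import List, NamedTuple


class Rect(NamedTuple):
    """Hypothetical circumscribing rectangle in pixel coordinates."""
    left: int
    top: int
    right: int
    bottom: int

    @property
    def width(self) -> int:
        return self.right - self.left

    @property
    def height(self) -> int:
        return self.bottom - self.top


def estimate_em_pitch(rects: List[Rect]) -> float:
    """Assumed proxy: mean width of rectangles that are roughly square."""
    widths = [r.width for r in rects
              if r.width > 0 and 0.8 <= r.height / r.width <= 1.25]
    if not widths:
        raise ValueError("no near-square rectangles to estimate pitch from")
    return mean(widths)


def union(a: Rect, b: Rect) -> Rect:
    """Smallest rectangle covering both inputs."""
    return Rect(min(a.left, b.left), min(a.top, b.top),
                max(a.right, b.right), max(a.bottom, b.bottom))


def integrate_by_pitch(rects: List[Rect], pitch: float,
                       tolerance: float = 0.2) -> List[Rect]:
    """Greedy left-to-right pass: merge a rectangle into its left
    neighbour when the merged width still fits within one pitch."""
    merged: List[Rect] = []
    for r in sorted(rects, key=lambda r: r.left):
        if merged:
            candidate = union(merged[-1], r)
            if candidate.width <= pitch * (1 + tolerance):
                merged[-1] = candidate
                continue
        merged.append(r)
    return merged
```

In this sketch, two narrow neighbouring rectangles whose union is about one em wide are joined into a single character area, while a narrow rectangle next to a full-width character is left alone because the union would exceed the pitch.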
8 Claims
1. A process of recognizing a character comprising the steps of:
a. electronically scanning a document upon which character information appears in rows and electronically generating a character row signal, which signal represents character string image data;
b. from the character row signal, electronically extracting non-discrete, ordinary characters and special characters in the form of em characters, and further electronically discriminating between discrete characters and components of discrete characters by;
i. defining a series of rectangular areas which circumscribe a complete character or a component of a discrete character;
ii. judging whether or not a selected one of the rectangular areas is a component of a discrete character by electronically comparing the ratio of the height of the selected rectangular area to the width of the selected rectangular area with a predetermined height-to-width ratio, and
iii. electronically comparing the vertical extraction position of the selected rectangular area to a predetermined vertical extraction position to thereby determine if the character within the selected rectangular area is a component of a discrete character;
c. from the results of steps a and b, generating electronic extracted character data representative of the extracted characters; and
d. electronically comparing the extracted character data with a dictionary of stored electronic standard character data, selecting as a recognized character a standard character whose data has the greatest similarity to an extracted character, and outputting electronic data corresponding to the recognized character.
Dependent claims: 2, 3, 4
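The two comparisons in steps b.ii and b.iii of claim 1 can be sketched as a single predicate. The claim only states that the height-to-width ratio and the vertical extraction position are each compared with a predetermined value; it does not give the values or the direction of either comparison. The sketch below therefore assumes, purely for illustration, that a component is a tall, narrow rectangle that starts in the upper part of the row (so that low-sitting marks such as commas are excluded). The names `Rect` and `is_component` and both thresholds are hypothetical.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Hypothetical circumscribing rectangle; y grows downward."""
    left: int
    top: int
    right: int
    bottom: int


def is_component(rect: Rect, row_top: int, row_bottom: int,
                 ratio_threshold: float = 1.5,
                 max_top_fraction: float = 0.5) -> bool:
    """Return True when the rectangle looks like one part of a
    discrete (multi-part) character under the assumed rules."""
    width = rect.right - rect.left
    height = rect.bottom - rect.top
    row_height = row_bottom - row_top
    if width <= 0 or height <= 0 or row_height <= 0:
        return False

    # Assumed form of step b.ii: compare height/width with a preset ratio.
    tall_and_narrow = (height / width) >= ratio_threshold

    # Assumed form of step b.iii: compare where the rectangle starts
    # vertically, as a fraction of the row height, with a preset position.
    relative_top = (rect.top - row_top) / row_height
    starts_high_enough = relative_top <= max_top_fraction

    return tall_and_narrow and starts_high_enough
```

With these assumed thresholds, a 6 by 20 stroke spanning the full row is flagged as a component, a 20 by 20 em character is not, and a small mark that begins 70% of the way down the row is rejected by the position test even though its ratio passes.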
5. An apparatus for recognizing a character comprising:
a. document reader means for electronically scanning a document upon which character information appears in rows and electronically generating a character row signal, which signal represents character string image data;
b. character extracting means supplied with the character row signal, for electronically extracting non-discrete, ordinary characters and special characters in the form of em characters, and further electronically discriminating between discrete characters and components of discrete characters by defining a series of rectangular areas which circumscribe a complete character or a component of a discrete character, judging whether or not a selected one of the rectangular areas is a component of a discrete character by comparing the ratio of the height of the selected rectangular area to the width of the selected rectangular area with a predetermined height-to-width ratio, and comparing the vertical extraction position of the selected rectangular area to a predetermined vertical extraction position to thereby determine if the character within the selected rectangular area is a component of a discrete character, and thereafter generating electronic extracted character data representative of the extracted characters; and
c. character discriminator means for electronically comparing the extracted character data with a dictionary of stored electronic standard character data, selecting as a recognized character a standard character whose data has the greatest similarity to an extracted character, and outputting electronic data corresponding to the recognized character.
Dependent claims: 6, 7, 8
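The character discriminator of element c selects the dictionary entry with the greatest similarity to the extracted character data. The claims do not specify the data representation or the similarity measure, so the following is only a sketch under assumptions: extracted and standard characters are represented as equal-length numeric feature vectors, similarity is cosine similarity, and the names `cosine_similarity` and `recognize` are hypothetical.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recognize(features: Sequence[float],
              dictionary: Dict[str, Sequence[float]]) -> str:
    """Return the label of the stored standard character that is most
    similar to the extracted features."""
    if not dictionary:
        raise ValueError("dictionary of standard characters is empty")
    return max(dictionary,
               key=lambda label: cosine_similarity(features, dictionary[label]))


# Toy dictionary with made-up four-element feature vectors.
standard = {"A": [1.0, 0.0, 1.0, 0.0], "B": [0.0, 1.0, 0.0, 1.0]}
recognized = recognize([0.9, 0.1, 0.8, 0.0], standard)
```

Any measure that ranks dictionary entries could be substituted for cosine similarity here; the point of the sketch is only the select-the-maximum structure described in the claim.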
Specification