Method for optical recognition of a multi-language set of letters with diacritics
Abstract
A method and system for recognizing alphabetic characters that contain diacritics are described. An image analysis separates the character into its constituent components. The one or more diacritic components are then distinguished and isolated from the base portion of the character. Optical recognition is performed separately on the base portion. The diacritic is recognized through special image analysis and pattern recognition algorithms: the image analysis extracts geometric information from the one or more diacritic components, that information is used as input for the pattern recognition algorithms, and the output is a code that corresponds to a particular diacritic. The recognized base portion and diacritic are then combined, and a check is performed for acceptable combinations in a chosen language. By separately recognizing the base portion and diacritic, the character sets used by the recognizer can be narrowed, resulting in greater recognition accuracy.
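The separation and geometric-extraction steps described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the components are separated, so the approach here is an assumption: connected-component labeling on a binary glyph, with the largest component treated as the base and the rest as diacritics. The function names, the `GLYPH` test image, and the chosen features (width, height, pixel count, position relative to the base) are all hypothetical.

```python
from collections import deque

def connected_components(grid):
    """Return ink components as lists of (row, col), using 8-connectivity."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != "#" or seen[r][c]:
                continue
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and grid[ny][nx] == "#" and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components

def bounding_box(pixels):
    """Return (top, left, bottom, right) of a component."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return min(ys), min(xs), max(ys), max(xs)

def split_base_and_diacritics(grid):
    """Assumption: the largest component is the base, the rest are diacritics."""
    comps = connected_components(grid)
    base = max(comps, key=len)
    diacritics = [comp for comp in comps if comp is not base]
    return base, diacritics

def geometric_features(diacritic, base):
    """Hypothetical feature set that a pattern recognizer could consume."""
    top, left, bottom, right = bounding_box(diacritic)
    base_top, _, base_bottom, _ = bounding_box(base)
    if bottom < base_top:
        position = "above"
    elif top > base_bottom:
        position = "below"
    else:
        position = "overlapping"
    return {
        "width": right - left + 1,
        "height": bottom - top + 1,
        "pixel_count": len(diacritic),
        "position": position,
    }

# Toy glyph resembling an "o" with two dots above it.
GLYPH = [
    ".#.#.",
    ".....",
    ".###.",
    "#...#",
    "#...#",
    ".###.",
]

base, marks = split_base_and_diacritics(GLYPH)
features = [geometric_features(mark, base) for mark in marks]
```

A real recognizer would need to handle touching or broken strokes, which simple component labeling does not; the sketch only shows how geometric information about each diacritic could be derived once the parts are separated.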
32 Claims
1. A computerized method of identifying characters in a character recognition system having a processor, the method comprising:
a) analyzing, in conjunction with the processor, a character image for separation and extraction of a base character and one or more diacritics;
b) applying, in conjunction with the processor, optical character recognition or intelligent character recognition algorithms to the base character;
c) processing, in conjunction with the processor, the diacritics with at least one of image analysis and pattern recognition algorithms; and
d) combining, in conjunction with the processor, the results of b) and c) so as to check for acceptable combinations of the base character and diacritics with respect to one or more specific languages.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A computerized method of identifying characters having a diacritic component in a character recognition system having a processor, the method comprising:
segmenting, in conjunction with the processor, a base component and a diacritic component from a character image;
recognizing, in conjunction with the processor, the base component;
recognizing, in conjunction with the processor, the diacritic component;
performing, in conjunction with the processor, a match analysis to determine whether the base component and the diacritic component is an acceptable combination for one or more particular languages; and
recognizing, in conjunction with the processor, the combination of the base component and the diacritic component in response to the match analysis.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26
27. A system for recognizing characters, the system comprising:
a processor;
a computer readable medium storing algorithms operable to cause the processor to perform;
a) an analysis process configured to analyze a character image so as to segment a base component and one or more diacritic components;
b) a character recognition algorithm configured to recognize the base component;
c) a diacritic recognition algorithm configured to process the diacritic components of the character image; and
d) a diacritic matching algorithm configured to combine the results of b) and c) to check for acceptable combinations of the base component and diacritic components for specific languages.
Dependent claims: 28
29. A system of identifying characters having a diacritic component, the system comprising:
a processor;
a computer readable medium storing algorithms operable to cause the processor to perform;
a character parts segmentation process configured to segment a character image so as to extract a base component and a diacritic component;
a character recognition algorithm configured to recognize the base component;
a diacritic recognition algorithm configured to recognize the diacritic component of the character image; and
a diacritic matching algorithm configured to determine whether the base component and the diacritic component are an acceptable combination for a particular language and to recognize the combination.
Dependent claims: 30, 31
32. A computer readable medium having computer readable program code embodied therein for identifying characters having a diacritic component, the computer readable code comprising:
a character parts segmentation process configured to segment a character image so as to extract a base component and a diacritic component;
a character recognition algorithm configured to recognize the base component;
a diacritic recognition algorithm configured to recognize the diacritic component of the character image; and
a diacritic matching algorithm configured to determine whether the base component and the diacritic component are an acceptable combination for a particular language and to recognize the combination.
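The diacritic matching step recited in the independent claims can be sketched as a lookup against per-language tables of acceptable base and diacritic pairs. This is only an illustration under stated assumptions: the `ACCEPTABLE` table, the diacritic code names, and the `match_combination` function are hypothetical, the language tables are deliberately incomplete, and composing the result through Unicode NFC normalization is a choice made here, not something the claims specify.

```python
import unicodedata

# Illustrative, incomplete tables of acceptable (base, diacritic) pairs.
ACCEPTABLE = {
    "french": {
        ("e", "acute"), ("e", "grave"), ("e", "circumflex"),
        ("a", "grave"), ("c", "cedilla"),
    },
    "german": {
        ("a", "diaeresis"), ("o", "diaeresis"), ("u", "diaeresis"),
    },
}

# Hypothetical diacritic codes mapped to Unicode combining marks.
COMBINING = {
    "acute": "\u0301",
    "grave": "\u0300",
    "circumflex": "\u0302",
    "diaeresis": "\u0308",
    "cedilla": "\u0327",
}

def match_combination(base, diacritic, languages):
    """Return the composed character if any selected language accepts
    the (base, diacritic) pair, otherwise None."""
    accepted = any(
        (base, diacritic) in ACCEPTABLE.get(lang, set())
        for lang in languages
    )
    if not accepted:
        return None
    # NFC folds base + combining mark into one precomposed code point.
    return unicodedata.normalize("NFC", base + COMBINING[diacritic])

german_o = match_combination("o", "diaeresis", ["german"])
french_o = match_combination("o", "diaeresis", ["french"])
either_e = match_combination("e", "acute", ["german", "french"])
```

With these toy tables, an "o" with a diaeresis is accepted when German is selected and rejected when only French is, which shows how the language check can veto a combination that the two recognizers proposed independently.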
Specification