Character string recognition apparatus, character string recognizing method, and storage medium therefor
Abstract
A key word is first automatically extracted from a character string group to be recognized, and entered. Then, a character is recognized by segmenting an individual character from a character string image to be recognized, and a character string corresponding to the extracted/entered key word is extracted. Then, a word area delimited by a key word is extracted from the character string image, and a word is recognized. Furthermore, a word recognition result is verified, and a final character string recognition result is output.
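The delimiting step described in the abstract can be illustrated with a minimal sketch. It assumes the individual characters have already been segmented and recognized into a plain string, so no image processing is shown. The function names `find_key_words` and `partial_areas`, the sample string, and the key words "TO" and "KU" are hypothetical and are not taken from the patent.

```python
from typing import List, Tuple

Hit = Tuple[int, int, str]  # (start index, end index, key word)


def find_key_words(chars: str, key_words: List[str]) -> List[Hit]:
    """Scan left to right and return non-overlapping key word hits."""
    hits: List[Hit] = []
    # Assumption: at a given position, prefer the longest matching key word.
    ordered = sorted(key_words, key=len, reverse=True)
    i = 0
    while i < len(chars):
        for kw in ordered:
            if kw and chars.startswith(kw, i):
                hits.append((i, i + len(kw), kw))
                i += len(kw)
                break
        else:
            i += 1  # no key word starts here; move to the next character
    return hits


def partial_areas(chars: str, hits: List[Hit]) -> List[str]:
    """Return the non-empty spans delimited by the key word hits."""
    areas: List[str] = []
    prev_end = 0
    for start, end, _ in hits:
        if start > prev_end:
            areas.append(chars[prev_end:start])
        prev_end = end
    if prev_end < len(chars):
        areas.append(chars[prev_end:])
    return areas


# Hypothetical recognized string with two key words, "TO" and "KU".
recognized = "tokyoTOchiyodaKUkasumi"
hits = find_key_words(recognized, ["TO", "KU"])
areas = partial_areas(recognized, hits)
```

In this sketch each entry of `areas` stands in for a word area that would then be passed to a word recognizer; the verification step mentioned in the abstract is not modeled.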
18 Claims
1. A character string recognition apparatus, comprising:
a key character code extraction unit automatically extracting a code string of a key word which is a node of a character string from a character string category to be recognized and expressed as at least one character code, by extracting at least one of a first character having a first predetermined number of occurrences among a first set of character strings to be recognized, a second character having a second predetermined number of occurrences in a character string unit, and a second set of closely associated characters as the key word;
a key word extraction unit separating an image of the character string into images of individual characters, recognizing the individual character images and extracting as key word characters, a string of characters corresponding to the code string of the key word;
a partial area extraction unit extracting a partial area falling between extracted key words from the image of the character string; and
a recognition unit holistically recognizing a character string in the partial area extracted by said partial area extraction unit.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13.
14. A character string recognition apparatus, comprising:
key character code extraction means for automatically extracting a code string of a key word, which is a node of a character string from a character string category to be recognized and expressed as a character code, by extracting at least one of a first character having a first predetermined number of occurrences among a first set of character strings to be recognized, a second character having a second predetermined number of occurrences in a character string unit, and a second set of closely associated characters as the key word;
key word extraction means for separating an image of the character string into images of individual characters, recognizing the individual character images and extracting as key word characters, a string of characters corresponding to the code string of the key word;
a partial area extraction means extracting a partial area falling between extracted key words from the image of the character string; and
recognition means for holistically recognizing a character string in the partial area extracted by said partial area extraction unit.
15. A character string recognition apparatus, comprising:
a recognition target character string group storage unit storing a list of character strings in a category to be recognized;
a key word determination unit searching said recognition target character string group storage unit for each character to obtain a number of occurrences of each character, defining a character having a large number of occurrences as a key character, and defining a character string having a large number of occurrences as a key word; and
a key word extraction unit separating an image of a character string into images of individual characters, recognizing the individual character images in the character string and extracting as key word characters, a string of characters corresponding to the code string of the key word, by extracting at least one of a first character having a first predetermined number of occurrences among a first set of character strings to be recognized, a second character having a second predetermined number of occurrences in a character string unit, and a second set of closely associated characters as the key word.
16. A character string recognition apparatus, comprising:
a key character/word storage unit storing a determined key character or key word; and
a key character/word extraction unit separating an image of the character string into images of individual characters, recognizing the individual character images and extracting a character string as a key word if a part of the character string in the key word is extracted when a key character or a key word stored in said key character/word storage unit is extracted from the image of the character string to be recognized by extracting at least one of a first character having a first predetermined number of occurrences among a first set of character strings to be recognized, a second character having a second predetermined number of occurrences in a character string unit, and a second set of closely associated characters as the key word.
17. A character string recognizing method, comprising:
obtaining a number of occurrences of each character in a list stored in advance based on the list of character strings in a category to be recognized, defining a character having a large number of occurrences as a key character, and defining a character string having a large number of occurrences as a key word;
extracting the key character or the key word from a character string image to be recognized by extracting at least one of a first character having a first predetermined number of occurrences among a first set of character strings to be recognized, a second character having a second predetermined number of occurrences in a character string unit, and a second set of closely associated characters as the key word; and
recognizing individual character images in an image of a character string to identify a word for each area delimited by each key character or key word in the character string image to be recognized.
18. A computer readable medium encoded with a computer program recognizing a character string image, said computer program controlling a processor to perform a method comprising:
automatically extracting a code string of a key word which is a node of a character string from a character string category to be recognized and expressed as a character code, by extracting at least one of a first character having a first predetermined number of occurrences among a first set of character strings to be recognized, a second character having a second predetermined number of occurrences in a character string unit and a second set of closely associated characters as the key word;
separating an image of the character string into images of individual characters;
recognizing the individual character images;
extracting the extracted key word or a part of the key word from a character string image;
holistically recognizing character strings in partial areas determined by the extracted key and words.
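The key word determination recited in claims 15 and 17 (counting occurrences over a stored list of target strings and treating frequent characters or character strings as key characters or key words) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the function names, the threshold parameter `min_count`, the fixed n-gram length `n`, and the sample list are all hypothetical, and a "large number of occurrences" is modeled simply as a count at or above `min_count`.

```python
from collections import Counter
from typing import Iterable, List


def key_characters(strings: Iterable[str], min_count: int) -> List[str]:
    """Characters whose total occurrence count reaches min_count (assumed threshold)."""
    counts: Counter = Counter()
    for s in strings:
        counts.update(s)  # adds one count per character occurrence
    return sorted(ch for ch, n in counts.items() if n >= min_count)


def key_words(strings: Iterable[str], n: int, min_count: int) -> List[str]:
    """Substrings of length n whose total occurrence count reaches min_count."""
    counts: Counter = Counter()
    for s in strings:
        # Count every contiguous run of n characters in the string.
        counts.update(s[i:i + n] for i in range(len(s) - n + 1))
    return sorted(w for w, c in counts.items() if c >= min_count)


# Hypothetical stored list of character strings in the category to be recognized.
targets = ["abKUc", "dKUe", "fgKU"]
frequent_chars = key_characters(targets, min_count=3)
frequent_words = key_words(targets, n=2, min_count=3)
```

The "closely associated characters" alternative in the claims is not modeled here; the fixed-length n-gram count is only one simple way to find frequently occurring character strings.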
Specification