Method and apparatus for segmenting a character and for extracting a character string based on a histogram
First Claim
1. A character segmenting apparatus segmenting a character based on connection data imparted to each segment pattern, and in which a character string pattern is formed by arranging a plurality of character segment patterns, each segment pattern comprising one of a pattern formed by one character and a small segment pattern formed by a part of one character, said character segmenting apparatus comprising:
extracting means for extracting the character segment pattern on the basis of the connection data;
character size calculating means for calculating a first histogram of one of a lengthwise character size and a crosswise character size of rectangles circumscribing the character segment pattern extracted by said extracting means and, concurrently, calculating an average character size and a first variance value of the average character size based on the first histogram;
character pitch calculating means for calculating a second histogram of a pitch between the rectangles in said character size calculating means and, concurrently, calculating an average character pitch and a second variance value based on the second histogram;
integrating means for integrating together the character segment patterns forming the one character while changing character integrating conditions in accordance with the average character size and the first variance value and the average character pitch and the second variance value; and
segment integrating means for integrating the small segment pattern by distinguishing the small segment pattern in the character segment pattern based on the average character size obtained by said character size calculating means.
Abstract
In a character segmenting apparatus, the extracting section extracts the character segment pattern on the basis of the connection data imparted to the segment pattern. The character size calculating section calculates a histogram of the lengthwise or crosswise character size of the rectangle circumscribing each extracted character segment pattern, and also calculates an average character size and its variance value from that histogram. The character pitch calculating section calculates a histogram of the pitch between the circumscribing rectangles, and also calculates an average character pitch and its variance value from that histogram. The integrating section integrates the segment patterns into characters while changing the character integrating conditions in accordance with the average character size, the size variance value, the average character pitch and the pitch variance value. The segment integrating section integrates the small segment patterns by distinguishing them within the character segment pattern on the basis of the average character size.
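The size statistics described in the abstract can be illustrated with a minimal sketch. It assumes each circumscribing rectangle is a tuple `(x0, y0, x1, y1)` with exclusive upper bounds and uses the population variance; the function names and the sample boxes are hypothetical and do not come from the patent.

```python
from collections import Counter
from math import isclose

def size_histogram(boxes, axis="height"):
    """Count how many circumscribing rectangles have each size.

    boxes are (x0, y0, x1, y1) tuples (an assumed layout);
    axis selects the lengthwise ("height") or crosswise ("width") size.
    """
    lo, hi = (1, 3) if axis == "height" else (0, 2)
    return Counter(b[hi] - b[lo] for b in boxes)

def mean_and_variance(hist):
    """Mean and population variance computed from a {size: count} histogram."""
    n = sum(hist.values())
    if n == 0:
        raise ValueError("empty histogram")
    mean = sum(v * c for v, c in hist.items()) / n
    var = sum(c * (v - mean) ** 2 for v, c in hist.items()) / n
    return mean, var

# Hypothetical data: three full characters and one small segment (a dot).
boxes = [(0, 0, 10, 20), (12, 0, 22, 20), (24, 0, 34, 22), (36, 8, 40, 12)]
hist = size_histogram(boxes, "height")
mean, var = mean_and_variance(hist)
```

Note how the single small segment pulls the average down and inflates the variance, which is one reason a refined average is useful.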
50 Citations
33 Claims
1. A character segmenting apparatus segmenting a character based on connection data imparted to each segment pattern, and in which a character string pattern is formed by arranging a plurality of character segment patterns, each segment pattern comprising one of a pattern formed by one character and a small segment pattern formed by a part of one character, said character segmenting apparatus comprising:
extracting means for extracting the character segment pattern on the basis of the connection data;
character size calculating means for calculating a first histogram of one of a lengthwise character size and a crosswise character size of rectangles circumscribing the character segment pattern extracted by said extracting means and, concurrently, calculating an average character size and a first variance value of the average character size based on the first histogram;
character pitch calculating means for calculating a second histogram of a pitch between the rectangles in said character size calculating means and, concurrently, calculating an average character pitch and a second variance value based on the second histogram;
integrating means for integrating together the character segment patterns forming the one character while changing character integrating conditions in accordance with the average character size and the first variance value and the average character pitch and the second variance value; and
segment integrating means for integrating the small segment pattern by distinguishing the small segment pattern in the character segment pattern based on the average character size obtained by said character size calculating means.
Dependent claims: 2, 3, 4
5. A character segmenting apparatus segmenting a character based on connection data imparted to each segment pattern, and in which a character string pattern is formed by arranging a plurality of character segment patterns, each segment pattern comprising one of a pattern formed by one character and a small segment pattern formed by a part of one character, said character segmenting apparatus comprising:
extracting means for extracting the character segment pattern on the basis of the connection data;
character size calculating means for calculating a first histogram of one of a lengthwise character size and a crosswise character size of rectangles circumscribing the character segment pattern extracted by said extracting means and, concurrently, calculating an average character size and a first variance value of the average character size based on the first histogram;
character pitch calculating means for calculating a second histogram of a pitch between the rectangles in said character size calculating means and, concurrently, calculating an average character pitch and a second variance value based on the second histogram, wherein said character size calculating means comprises:
size histogram means for calculating the first histograms of one of the lengthwise character size and the crosswise character size of the rectangle circumscribing the character segment pattern in the character string pattern,
first average size means for calculating a tentative average character size based on the first histograms in the character string calculated by said size histogram means,
size area determining means for determining a character size calculating area based on the tentative average character size calculated by said first average size means, and
second average size means for calculating an average character size in the character size area determined by said size area determining means;
integrating means for integrating together the character segment patterns forming the one character while changing character integrating conditions in accordance with the average character size and the first variance value and the average character pitch and the second variance value; and
segment integrating means for integrating the small segment pattern by distinguishing the small segment pattern in the character segment pattern based on the average character size obtained by said character size calculating means.
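The two-pass size calculation recited in claim 5 (tentative average, calculating area, second average) might look like the following sketch. The claim does not say how the calculating area is derived from the tentative average; the window of 0.5 to 1.5 times the tentative average used here, and the name `refined_average`, are assumptions made only for illustration.

```python
from math import isclose

def refined_average(hist, low=0.5, high=1.5):
    """Two-pass average over a {size: count} histogram.

    Pass 1 computes a tentative average over all entries.
    Pass 2 re-averages only the entries inside the window
    [low * tentative, high * tentative] (window bounds are assumed values).
    """
    n = sum(hist.values())
    if n == 0:
        raise ValueError("empty histogram")
    tentative = sum(v * c for v, c in hist.items()) / n
    lo, hi = low * tentative, high * tentative
    window = {v: c for v, c in hist.items() if lo <= v <= hi}
    m = sum(window.values())
    if m == 0:
        # Nothing falls inside the window; fall back to the tentative value.
        return tentative
    return sum(v * c for v, c in window.items()) / m

# Hypothetical histogram: two sizes of 20, one of 22, one small segment of 4.
sizes = {20: 2, 22: 1, 4: 1}
average = refined_average(sizes)
```

With these numbers the tentative average is 16.5, the small segment of size 4 falls outside the window, and the second pass averages only the three full-size entries.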
6. A character segmenting apparatus segmenting a character based on connection data imparted to each segment pattern, and in which a character string pattern is formed by arranging a plurality of character segment patterns, each segment pattern comprising one of a pattern formed by one character and a small segment pattern formed by a part of one character, said character segmenting apparatus comprising:
extracting means for extracting the character segment pattern on the basis of the connection data;
character size calculating means for calculating a first histogram of one of a lengthwise character size and a crosswise character size of rectangles circumscribing the character segment pattern extracted by said extracting means and, concurrently, calculating an average character size and a first variance value of the average character size based on the first histogram;
character pitch calculating means for calculating a second histogram of a pitch between the rectangles in said character size calculating means and, concurrently, calculating an average character pitch and a second variance value based on the second histogram, wherein said character pitch calculating means comprises:
pitch histogram means for calculating, as a pitch, a distance between the rectangles with respect to the segment pattern other than the small separation stroke in calculating a pitch between the characters and, concurrently, calculating a histogram of the pitch,
first average pitch means for calculating a tentative average character pitch based on the histogram obtained by said pitch histogram means,
pitch area determining means for determining a character calculating area based on the tentative average character pitch obtained by said first average pitch means, and
second average pitch means for calculating an average character pitch in the character pitch area determined by said pitch area determining means;
integrating means for integrating together the character segment patterns forming the one character while changing character integrating conditions in accordance with the average character size and the first variance value and the average character pitch and the second variance value;
segment integrating means for integrating the small segment pattern by distinguishing the small segment pattern in the character segment pattern based on the average character size obtained by said character size calculating means; and
stroke extracting means for extracting a small separation stroke by which a pattern being integrated to one character is separated in the small segment pattern from within the segment pattern in the character string pattern based on a result of obtaining one of an area ratio and a height ratio of the average character size to the character size of the rectangle by use of the average character size calculated by said character size calculating means.
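As a rough illustration of claim 6, the sketch below first sets aside small separation strokes using a height ratio against the average character size, then measures the pitch only between the remaining rectangles. The threshold of 0.5, the centre-to-centre definition of pitch, and both function names are assumptions; the claim itself only speaks of "a distance between the rectangles" and of an area ratio or a height ratio.

```python
def split_small_strokes(boxes, avg_height, ratio=0.5):
    """Separate boxes whose height is below ratio * avg_height.

    boxes are (x0, y0, x1, y1) tuples (assumed layout).
    Returns (main_boxes, stroke_boxes).
    """
    main, strokes = [], []
    for b in boxes:
        if (b[3] - b[1]) < ratio * avg_height:
            strokes.append(b)
        else:
            main.append(b)
    return main, strokes

def pitches(boxes):
    """Centre-to-centre horizontal distances between neighbouring boxes."""
    centres = sorted((b[0] + b[2]) / 2 for b in boxes)
    return [b - a for a, b in zip(centres, centres[1:])]

# Hypothetical data: three characters followed by one small stroke.
boxes = [(0, 0, 10, 20), (12, 0, 22, 20), (24, 0, 34, 22), (36, 8, 40, 12)]
main, strokes = split_small_strokes(boxes, avg_height=20)
pitch_values = pitches(main)
```

Excluding the stroke keeps the short distance between the last character and the stroke out of the pitch histogram, so it does not distort the average character pitch.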
7. A character segmenting apparatus segmenting a character based on connection data imparted to each segment pattern, and in which a character string pattern is formed by arranging a plurality of character segment patterns, each segment pattern comprising one of a pattern formed by one character and a small segment pattern formed by a part of one character, said character segmenting apparatus comprising:
extracting means for extracting the character segment pattern on the basis of the connection data;
character size calculating means for calculating a first histogram of one of a lengthwise character size and a crosswise character size of rectangles circumscribing the character segment pattern extracted by said extracting means and, concurrently, calculating an average character size and a first variance value of the average character size based on the first histogram;
character pitch calculating means for calculating a second histogram of a pitch between the rectangles in said character size calculating means and, concurrently, calculating an average character pitch and a second variance value based on the second histogram;
integrating means for integrating together the character segment patterns forming the one character while changing character integrating conditions in accordance with the average character size and the first variance value and the average character pitch and the second variance value;
segment integrating means for integrating the small segment pattern by distinguishing the small segment pattern in the character segment pattern based on the average character size obtained by said character size calculating means; and
stroke extracting means for extracting a small separation stroke by which a pattern being integrated to one character is separated in the small segment pattern from within the segment pattern in the character string pattern based on a result of obtaining one of an area ratio and a height ratio of the average character size to the character size of the rectangle by use of the average character size calculated by said character size calculating means, wherein said segment integrating means comprises:
line density means for calculating line densities with respect to the small separation stroke, the segment patterns located on the right and left sides thereof and also, in integrating them, a segment pattern;
inclination calculating means for calculating an inclination of the small separation stroke; and
distinguishing means for distinguishing which one of the segment patterns located on the right and left sides the small separation stroke should be integrated with based on the line densities obtained by said line density means and the inclination of the small separation stroke obtained by said inclination calculating means.
Dependent claims: 8, 9, 10, 11, 12
13. A character string extracting apparatus extracting a character string based on connection data imparted to each segment pattern, and in which a character string pattern is formed by arranging a plurality of character segment patterns, said apparatus comprising:
extracting means for extracting the character segment pattern based on the connection data;
weighting projection means for obtaining a projection histogram by performing a weighting projection on one of a lengthwise line segment and a crosswise line segment of a rectangle circumscribing the segment pattern extracted by said extracting means;
axis determining means for determining a character string axis based on the projection histogram obtained by said weighting projection means;
character string extracting means for extracting a character string based on the character string axis determined by said axis determining means;
rectangle integrating means for integrating, if the respective rectangles are overlapped with each other, overlapped rectangles;
calculating means for calculating an average character size with respect to a result of integration by said rectangle integrating means; and
eliminating means for eliminating, as a group of contact characters between upper-lower character strings, one of the rectangles that is not less than a predetermined-value-times as large as the average character size obtained by said calculating means and rectangles which spread over a plurality of character strings.
Dependent claims: 14, 15, 16, 17
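One possible reading of the weighting projection and axis determination in claim 13 is sketched below for horizontally written text. Each rectangle's vertical extent is projected onto the y axis, weighted by the rectangle's width, and the string axis is taken as the centre of the rows with the highest accumulated weight. The choice of width as the weight, the peak-based axis rule, and the function names are all assumptions; the claim does not fix any of them.

```python
def weighted_projection(boxes, image_height):
    """Accumulate, for every row y, the widths of the boxes covering that row.

    boxes are (x0, y0, x1, y1) tuples with exclusive upper bounds (assumed).
    """
    hist = [0] * image_height
    for x0, y0, x1, y1 in boxes:
        weight = x1 - x0  # assumed weighting: wider boxes count more
        for y in range(max(y0, 0), min(y1, image_height)):
            hist[y] += weight
    return hist

def string_axis(hist):
    """Return the mean row index of the histogram's peak rows."""
    peak = max(hist)
    rows = [y for y, v in enumerate(hist) if v == peak]
    return sum(rows) / len(rows)

# Hypothetical data: three characters and one small segment on a 30-row image.
boxes = [(0, 0, 10, 20), (12, 0, 22, 20), (24, 0, 34, 22), (36, 8, 40, 12)]
projection = weighted_projection(boxes, image_height=30)
axis_y = string_axis(projection)
```

Rectangles whose vertical extent contains `axis_y` could then be collected as one character string, which is the role the claim assigns to the character string extracting means.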
18. A character segmenting method of segmenting a character based on connection data imparted to each segment pattern, and in which a character string pattern is formed by arranging a plurality of character segment patterns, and each segment pattern comprises one of a pattern formed by one character and a small segment pattern formed by a part of one character, said method comprising the steps of:
an extracting step of extracting the character segment pattern on the basis of the connection data;
a character size calculating step of calculating a first histogram of one of a lengthwise character size and a crosswise character size of a rectangle circumscribing the extracted character segment pattern and, concurrently, calculating an average character size and a first variance value based on the first histogram;
a character pitch calculating step of calculating a second histogram of a pitch between the rectangles and, concurrently, calculating an average character pitch and a second variance value based on the second histogram;
an integrating step of integrating together the character segment patterns forming the character while changing character integrating conditions in accordance with the average character size, the first variance value, the average character pitch and the second variance value; and
a segment integrating step of integrating the small segment pattern by distinguishing the small segment pattern in the character segment pattern based on the average character size.
Dependent claims: 19, 20, 21
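The integrating step of claim 18 changes its conditions according to the size and pitch statistics, but the claim does not spell out the conditions themselves. The sketch below shows one hypothetical rule: two neighbouring segments are merged only if their union is no wider than the average size plus `k` standard deviations and no wider than the average pitch plus `k` standard deviations. The rule, the parameter `k` and the function name are assumptions for illustration only.

```python
from math import sqrt

def integrate_segments(boxes, avg_w, var_w, avg_pitch, var_pitch, k=1.0):
    """Greedily merge left-to-right neighbours whose union still fits one character.

    boxes are (x0, y0, x1, y1) tuples (assumed layout). The width limits
    loosen automatically when the measured variances are large.
    """
    max_width = avg_w + k * sqrt(var_w)
    max_span = avg_pitch + k * sqrt(var_pitch)
    merged = []
    for b in sorted(boxes):
        if merged:
            p = merged[-1]
            union_w = max(p[2], b[2]) - p[0]
            if union_w <= max_width and union_w <= max_span:
                # Replace the previous box with the union of both boxes.
                merged[-1] = (p[0], min(p[1], b[1]), max(p[2], b[2]), max(p[3], b[3]))
                continue
        merged.append(tuple(b))
    return merged

# Hypothetical data: the first character is split into two narrow segments.
boxes = [(0, 0, 4, 20), (5, 0, 10, 20), (12, 0, 22, 20), (24, 0, 34, 20)]
chars = integrate_segments(boxes, avg_w=10, var_w=1, avg_pitch=12, var_pitch=1)
```

Here the two narrow segments are joined into one box of width 10, while the full-width characters are left alone because any union with a neighbour would exceed both limits.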
22. A character segmenting method of segmenting a character based on connection data imparted to each segment pattern, and in which a character string pattern is formed by arranging a plurality of character segment patterns, and each segment pattern comprises one of a pattern formed by one character and a small segment pattern formed by a part of one character, said method comprising the steps of:
an extracting step of extracting the character segment pattern on the basis of the connection data;
a character size calculating step of calculating a first histogram of one of a lengthwise character size and a crosswise character size of a rectangle circumscribing the extracted character segment pattern and, concurrently, calculating an average character size and a first variance value based on the first histogram, wherein said character size calculating step comprises:
a size histogram step of calculating the first histogram of one of the lengthwise character size and the crosswise character size of the rectangle,
a first average size step of calculating a tentative average character size based on the calculated histograms in the character string,
a size area determining step of determining a character size calculating area based on the calculated tentative average character size, and
a second average size step of calculating an average character size in the determined character size area;
a character pitch calculating step of calculating a second histogram of a pitch between the rectangles and, concurrently, calculating an average character pitch and a second variance value based on the second histogram;
an integrating step of integrating together the character segment patterns forming the character while changing character integrating conditions in accordance with the average character size, the first variance value, the average character pitch and the second variance value; and
a segment integrating step of integrating the small segment pattern by distinguishing the small segment pattern in the character segment pattern based on the average character size.
23. A character segmenting method of segmenting a character based on connection data imparted to each segment pattern, and in which a character string pattern is formed by arranging a plurality of character segment patterns, and each segment pattern comprises one of a pattern formed by one character and a small segment pattern formed by a part of one character, said method comprising the steps of:
an extracting step of extracting the character segment pattern on the basis of the connection data;
a character size calculating step of calculating a first histogram of one of a lengthwise character size and a crosswise character size of a rectangle circumscribing the extracted character segment pattern and, concurrently, calculating an average character size and a first variance value based on the first histogram;
a character pitch calculating step of calculating a second histogram of a pitch between the rectangles and, concurrently, calculating an average character pitch and a second variance value based on the second histogram;
an integrating step of integrating together the character segment patterns forming the character while changing character integrating conditions in accordance with the average character size, the first variance value, the average character pitch and the second variance value;
a segment integrating step of integrating the small segment pattern by distinguishing the small segment pattern in the character segment pattern based on the average character size; and
a stroke extracting step of extracting a small separation stroke by which a pattern being integrated to one character is separated in the small segment pattern from within the segment pattern in the character string pattern on the basis of a result of obtaining an area ratio or a height ratio of the average character size to the character size of the rectangle by use of the average character size, wherein said character pitch calculating step comprises:
a pitch histogram step of calculating, as a pitch, a distance between the rectangles with respect to the segment pattern other than the small separation stroke when calculating a pitch between the characters and calculating a histogram of the pitch,
a first average pitch step of calculating a tentative average character pitch on the basis of the histogram of the pitch,
a pitch area determining step of determining a character calculating area on the basis of the tentative average character pitch, and
a second average pitch step of calculating an average character pitch in the determined character pitch area.
24. A character segmenting method segmenting a character based on connection data imparted to each segment pattern, and in which a character string pattern is formed by arranging a plurality of character segment patterns, and each segment pattern comprises one of a pattern formed by one character and a small segment pattern formed by a part of one character, said method comprising the steps of:
an extracting step of extracting the character segment pattern on the basis of the connection data;
a character size calculating step of calculating a first histogram of one of a lengthwise character size and a crosswise character size of a rectangle circumscribing the extracted character segment pattern and, concurrently, calculating an average character size and a first variance value based on the first histogram;
a character pitch calculating step of calculating a second histogram of a pitch between the rectangles and, concurrently, calculating an average character pitch and a second variance value based on the second histogram;
an integrating step of integrating together the character segment patterns forming the character while changing character integrating conditions in accordance with the average character size, the first variance value, the average character pitch and the second variance value;
a segment integrating step of integrating the small segment pattern by distinguishing the small segment pattern in the character segment pattern based on the average character size; and
a stroke extracting step of extracting a small separation stroke by which a pattern being integrated to one character is separated in the small segment pattern from within the segment pattern in the character string pattern on the basis of a result of obtaining an area ratio or a height ratio of the average character size to the character size of the rectangle by use of the average character size, wherein said segment integrating step comprises:
a line density step of calculating line densities with respect to the extracted small separation stroke, the segment patterns located on the right and left sides thereof and also, in integrating them, a segment pattern;
an inclination calculating step of calculating an inclination of the small separation stroke; and
a distinguishing step of distinguishing which one of the segment patterns located on the right and left sides the small separation stroke should be integrated with on the basis of the line densities and an inclination of the small separation stroke.
Dependent claims: 25, 26, 27
28. A character string extracting method of extracting a character string based on connection data imparted to each segment pattern, and in which a character string pattern is formed by arranging a plurality of character segment patterns, said method comprising the steps of:
an extracting step of extracting the character segment pattern based on the connection data;
a weighting projection step of obtaining a projection histogram by performing a weighting projection on one of a lengthwise line segment and a crosswise line segment of a rectangle circumscribing the extracted segment pattern;
an axis determining step of determining a character string axis based on the projection histogram;
a character string extracting step of extracting a character string based on the character string axis determined in said axis determining step;
a rectangle integrating step of integrating, if the respective rectangles are overlapped with each other, overlapped rectangles;
a calculating step of calculating an average character size with respect to a result of integration by said rectangle integrating step; and
an eliminating step of eliminating, as a group of contact characters between upper-lower character strings, one of the rectangles that is not less than a predetermined-value-times as large as the average character size obtained by said calculating step and rectangles which spread over a plurality of character strings.
Dependent claims: 29, 30, 31, 32, 33
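The rectangle integrating, calculating and eliminating steps of claim 28 can be sketched as follows. Overlapping rectangles are merged into their bounding union, the average height of the merged result is computed, and any rectangle at least `factor` times that average is dropped as a likely group of touching characters. The use of height as the character size, the default `factor` of 2.0 and the function names are assumptions; the claim only refers to "a predetermined value". The check for rectangles spreading over several character strings is not shown.

```python
def overlaps(a, b):
    """True if two (x0, y0, x1, y1) rectangles share any interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def merge_overlapping(boxes):
    """Merge overlapping rectangles into their bounding unions."""
    merged = []
    for box in boxes:
        box = tuple(box)
        absorbed = True
        while absorbed:
            absorbed = False
            for other in merged:
                if overlaps(box, other):
                    # Absorb the overlapping box, then rescan with the larger union.
                    merged.remove(other)
                    box = (min(box[0], other[0]), min(box[1], other[1]),
                           max(box[2], other[2]), max(box[3], other[3]))
                    absorbed = True
                    break
        merged.append(box)
    return merged

def drop_oversized(boxes, factor=2.0):
    """Remove rectangles whose height is not less than factor * average height."""
    if not boxes:
        return []
    avg = sum(b[3] - b[1] for b in boxes) / len(boxes)
    return [b for b in boxes if (b[3] - b[1]) < factor * avg]

# Hypothetical data: two overlapping pieces, two clean characters, one tall blob.
boxes = [(0, 0, 10, 10), (8, 2, 14, 12), (20, 0, 30, 10), (40, 0, 50, 10), (60, 0, 70, 45)]
merged = merge_overlapping(boxes)
kept = drop_oversized(merged)
```

In this example the first two rectangles merge into one, and the 45-row rectangle is eliminated because it exceeds twice the average height of 19.25.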
Specification