Method of segmenting characters in lines which may be skewed, for allowing improved optical character recognition
First Claim
1. A method of segmenting characters of a document image which has lines of characters thereon, each of said lines extending generally in a first direction, said method comprising the steps of:
dividing the document image in the first direction into a plurality of divided regions and setting a check width with respect to each of the divided regions taken along the first direction, each of the check widths being greater than or equal to a width of a corresponding one of the divided regions taken along the first direction so that the check widths of two mutually adjacent divided regions partially overlap each other;
reading image data amounting to one line of the document image;
obtaining from the image data horizontal projections of each line data within each of the check widths, each horizontal projection being a number of black picture elements in a corresponding data line within a check width, each data line being made up of a plurality of picture elements arranged in the first direction;
segmenting a line based on the horizontal projections;
obtaining from the image data vertical projections, each vertical projection being a number of black picture elements in a second direction which is perpendicular to said first direction;
determining a character segmentation range based on the vertical projections; and
segmenting each character of the line within the character segmentation range.
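The region-wise line segmentation recited above can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the `overlap` parameter, the zero threshold, and the 0/1 list-of-rows image representation are all assumptions made for illustration. It shows how a horizontal projection taken per check width lets a skewed line be found at different row ranges in different divided regions.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of picture elements: 0 = white, 1 = black


def check_windows(width: int, n_regions: int, overlap: int) -> List[Tuple[int, int]]:
    """Column range (start, end_exclusive) of the check width for each region.

    Assumption: each check width is the divided region widened by `overlap`
    columns on both sides and clipped to the image, so that adjacent check
    widths partially overlap when overlap > 0.
    """
    windows = []
    for i in range(n_regions):
        region_start = i * width // n_regions
        region_end = (i + 1) * width // n_regions
        windows.append((max(0, region_start - overlap),
                        min(width, region_end + overlap)))
    return windows


def horizontal_projection(image: Image, start: int, end: int) -> List[int]:
    """Count of black picture elements in each data line within one check width."""
    return [sum(row[start:end]) for row in image]


def runs(profile: List[int], threshold: int = 0) -> List[Tuple[int, int]]:
    """Index ranges (first, last_exclusive) where the profile exceeds threshold."""
    found, begin = [], None
    for i, value in enumerate(profile):
        if value > threshold and begin is None:
            begin = i
        elif value <= threshold and begin is not None:
            found.append((begin, i))
            begin = None
    if begin is not None:
        found.append((begin, len(profile)))
    return found


def segment_lines_per_region(image: Image, n_regions: int,
                             overlap: int) -> List[List[Tuple[int, int]]]:
    """Row ranges of text lines, computed separately for every check width."""
    width = len(image[0])
    return [runs(horizontal_projection(image, start, end))
            for start, end in check_windows(width, n_regions, overlap)]


# A tiny skewed "line": rows 1-2 on the left half, rows 2-3 on the right half.
page = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
print(segment_lines_per_region(page, n_regions=2, overlap=0))
# [[(1, 3)], [(2, 4)]]
```

A single projection over the full image width would report rows 1 to 3 as one thick band, whereas the per-region projections follow the skew. Widening the windows with `overlap` greater than zero makes neighbouring check widths share columns, which is the partial overlap the claim requires; how wide that overlap should be is not stated in this section.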
Abstract
A method of segmenting characters of a document image comprises the steps of dividing the document image into a plurality of divided regions and setting a check width with respect to each of the divided regions, where each check width is greater than or equal to a width of a corresponding one of the divided regions so that the check widths of two mutually adjacent divided regions partially overlap each other, reading image data amounting to one line of the document image, obtaining from the image data horizontal projections of each line data within each of the check widths, where each horizontal projection is a number of black picture elements in a corresponding data line within a check width and each data line is made up of a plurality of picture elements arranged horizontally, segmenting a line based on the horizontal projections, obtaining from the image data vertical projections, where each vertical projection is a number of black picture elements in a vertical direction, determining a character segmentation range based on the vertical projections, and segmenting each character of the line within the character segmentation range.
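The final steps of the abstract (vertical projections, a character segmentation range, and per-character segmentation) can be sketched as follows. This is a hedged illustration only: the section does not define how the character segmentation range is derived, so the sketch assumes it is the span from the first to the last column containing any black picture element, and it assumes characters are separated by fully blank columns. All function names are hypothetical.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # rows of picture elements: 0 = white, 1 = black


def vertical_projection(image: Image, top: int, bottom: int) -> List[int]:
    """Count of black picture elements in each column of rows top..bottom-1."""
    width = len(image[0])
    return [sum(image[r][c] for r in range(top, bottom)) for c in range(width)]


def character_range(profile: List[int]) -> Optional[Tuple[int, int]]:
    """Assumed range: first to last non-empty column (end exclusive)."""
    inked = [c for c, value in enumerate(profile) if value > 0]
    if not inked:
        return None  # the line band contains no black picture elements
    return (inked[0], inked[-1] + 1)


def segment_characters(profile: List[int],
                       char_range: Tuple[int, int]) -> List[Tuple[int, int]]:
    """Column spans of characters inside char_range, split at blank columns."""
    start, end = char_range
    chars, begin = [], None
    for c in range(start, end):
        if profile[c] > 0 and begin is None:
            begin = c
        elif profile[c] == 0 and begin is not None:
            chars.append((begin, c))
            begin = None
    if begin is not None:
        chars.append((begin, end))
    return chars


# One segmented text line (two data lines tall) holding three "characters".
line_band = [
    [0, 1, 1, 0, 1, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 0, 1, 0],
]
profile = vertical_projection(line_band, top=0, bottom=2)
print(profile)                                # [0, 2, 1, 0, 2, 0, 1, 2, 0]
span = character_range(profile)
print(span)                                   # (1, 8)
print(segment_characters(profile, span))      # [(1, 3), (4, 5), (6, 8)]
```

Touching or broken characters would defeat the blank-column rule used here; handling them is outside what this section describes.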
Specification