Techniques in optical character recognition
Abstract
An image deskew system and techniques are used in the context of optical character recognition. An image is obtained of an original set of characters in an original linear (horizontal) orientation. An acquired set of characters, which is skewed relative to the original linear orientation by a rotation angle, is represented by pixels of the image. The rotation angle is estimated, and a confidence value may be associated with the estimation, to determine whether to deskew the image. In connection with rotation angle estimation, an edge detection filter is applied to the acquired set of characters to produce an edge map, which is input to a linear Hough transform filter to produce a set of output lines in parametric form. The output lines are assigned scores, and based on the scores, at least one output line is determined to be a dominant line with a slope approximating the rotation angle.
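As an illustration of the parametric form mentioned in the abstract, the following is a minimal Python sketch, not the patented implementation, of how a linear Hough transform can vote edge pixels into a matrix whose rows correspond to Θ and whose columns correspond to r. The function name `hough_accumulator`, the 1-degree angular step, and the column offset used to hold negative r values are assumptions made for this example.

```python
import math


def hough_accumulator(edge_pixels, width, height, theta_step_deg=1.0):
    """Vote (x, y) edge pixels into a matrix: rows index theta, columns index r.

    Hypothetical helper for illustration; the step size and the column
    offset are assumptions, not details taken from the patent text.
    """
    # No line through the image can have |r| larger than the image diagonal.
    r_max = int(math.ceil(math.hypot(width, height)))
    n_theta = int(round(180.0 / theta_step_deg))
    thetas = [math.radians(i * theta_step_deg) for i in range(n_theta)]
    # Column index = r + r_max, so negative r values still fit in the matrix.
    acc = [[0] * (2 * r_max + 1) for _ in range(n_theta)]
    for x, y in edge_pixels:
        for row, theta in enumerate(thetas):
            # Normal form of a line: r = x*cos(theta) + y*sin(theta).
            r = int(round(x * math.cos(theta) + y * math.sin(theta)))
            acc[row][r + r_max] += 1
    return acc, thetas, r_max
```

Every edge pixel casts exactly one vote in each Θ row, so rows differ only in how those votes are spread across the r columns. A horizontal line of pixels piles all of its votes into a single column of the row for Θ = 90 degrees, because the normal of a horizontal line is vertical.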
20 Claims
1. A computer-readable storage medium, not comprising a propagated data signal, encoded with computer-executable instructions which, when executed by a processor, perform a method for recognizing characters of an original set of characters displayed on a surface in an original linear orientation, the method comprising:
within an image of the original set of characters, identifying an acquired set of characters represented by pixels of the image, the acquired set of characters having an acquired linear orientation skewed relative to the original linear orientation by a rotation angle;
applying an edge detection filter to the acquired set of characters to produce an edge map, the edge map identifying edge pixels comprising pixels of the image lying along a plurality of lines associated with the acquired set of characters;
inputting the edge map to a linear Hough transform filter to produce a set of output lines in (r, Θ) parametric form, where for each output line, r is a length of a normal line drawn perpendicular to the output line between a point of origin and a particular edge pixel through which the output line passes, and Θ is an angle the normal line forms to a horizontal axis;
forming a matrix having rows and columns, each of the edge pixels passed through by a particular output line (r, Θ) being represented by an element of the matrix located at a particular row corresponding to Θ of the particular output line and a particular column corresponding to r of the particular output line;
assigning a score to each output line, the score based on a dispersion of edge pixels within the particular row corresponding to Θ of the particular output line;
based on the scores, within the set of output lines, identifying at least two dominant output lines, (r_dom1, Θ_dom1) and (r_dom2, Θ_dom2);
calculating a first confidence value corresponding to a likelihood that Θ_dom1 estimates the rotation angle;
calculating a second confidence value corresponding to a likelihood that Θ_dom2 estimates the rotation angle; and
based on the first and second confidence values, determining whether Θ_dom1 or Θ_dom2 estimates the rotation angle.
Dependent claims: 2, 3
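The scoring and confidence steps of claim 1 can be sketched roughly as follows. The claim text here does not give a dispersion formula or a confidence formula, so this example assumes both: the score is the variance of vote counts across the r columns of a Θ row, and the confidence is the fraction of a row's votes that fall in its strongest column. The names `row_score`, `two_dominant_lines`, `choose_angle_row`, and the `min_row_gap` parameter are hypothetical.

```python
def row_score(row):
    """Assumed dispersion score: variance of vote counts across the r columns.

    Votes bunched into a few columns (pixels lying on a few straight lines)
    give a high variance; evenly spread votes give a low one.
    """
    mean = sum(row) / len(row)
    return sum((v - mean) ** 2 for v in row) / len(row)


def two_dominant_lines(acc, min_row_gap=5):
    """Return up to two (theta_row, r_column, confidence) candidates.

    The second candidate must be at least min_row_gap rows from the first,
    so near-duplicate angles are not reported twice. Wrap-around between
    the first and last theta rows is ignored in this sketch.
    """
    ranked = sorted(range(len(acc)), key=lambda i: row_score(acc[i]), reverse=True)
    picked = [ranked[0]]
    for i in ranked[1:]:
        if abs(i - picked[0]) >= min_row_gap:
            picked.append(i)
            break
    lines = []
    for i in picked:
        peak = max(acc[i])
        total = sum(acc[i])
        # Assumed confidence: share of the row's votes on its strongest line.
        confidence = peak / total if total else 0.0
        lines.append((i, acc[i].index(peak), confidence))
    return lines


def choose_angle_row(lines):
    """Pick the theta row of the candidate with the highest confidence."""
    return max(lines, key=lambda item: item[2])[0]
```

Keeping the ranking score and the confidence value as separate quantities means the second candidate can still win when its strongest line accounts for a larger share of the edge pixels.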
4. A computer-readable storage medium, not comprising a propagated data signal, encoded with computer-executable instructions which, when executed by a processor, perform a method for recognizing characters of an original set of characters displayed on a surface in an original linear orientation, the method comprising:
identifying an image, acquired by an image capture device, of the original set of characters, the image having an acquired set of characters corresponding to the original set of characters, each character of the acquired set of characters represented by a group of pixels in the image, each pixel of the group having a grayscale value, the acquired set of characters having an acquired linear orientation skewed relative to the original linear orientation in an amount able to be expressed by a rotation angle;
within the image, identifying a portion of the acquired set of characters;
applying an edge detection filter to the portion of the acquired set of characters to produce an edge map, the edge map identifying, based on the grayscale values of pixels from groups of pixels representing characters within the portion, a set of edge pixels comprising pixels lying along a mean line or a base line of the characters within the portion;
inputting the edge map to a linear Hough transform filter, the linear Hough transform filter configured to produce a set of output lines in (r, Θ) parametric form, where for each output line, r is a length of a normal line drawn perpendicular to the output line between a point of origin and a particular edge pixel through which the output line passes, and Θ is an angle the normal line forms to a horizontal axis;
forming a matrix having a plurality of rows identified by row indices and a plurality of columns identified by column indices, each of the edge pixels passed through by a particular output line (r, Θ) being represented by an element of the matrix located at a particular row index corresponding to Θ of the particular output line and a particular column index corresponding to r of the particular output line;
based on a distribution of the elements within the matrix, assigning a score to at least some of the output lines;
based on the scores of the at least some of the output lines, identifying a dominant output line;
identifying the row index of the matrix associated with the dominant output line;
based on the identified row index, identifying the angle Θ of the dominant output line; and
estimating the rotation angle based on the identified angle Θ,
the original set of characters able to be decoded based at least in part on the estimated rotation angle.
Dependent claims: 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
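To make the grayscale-based edge detection step of claim 4 more concrete, here is a small Python sketch of one possible filter. It is an assumption, not the filter the patent specifies: it presumes dark characters on a light background with grayscale values from 0 to 255, and it marks a pixel as an edge pixel when the pixel directly below it is lighter by at least a threshold. Such bottom edges of strokes tend to cluster along the base line. The name `baseline_edge_pixels` and the default threshold are hypothetical.

```python
def baseline_edge_pixels(gray, threshold=128):
    """Return (x, y) coordinates of dark pixels sitting directly above light ones.

    gray is a list of rows of grayscale values (0 = black, 255 = white).
    Assumes dark text on a light background; the threshold is illustrative.
    """
    edges = []
    for y in range(len(gray) - 1):
        for x in range(len(gray[y])):
            # A large dark-to-light step going downward marks the bottom of a stroke.
            if gray[y + 1][x] - gray[y][x] >= threshold:
                edges.append((x, y))
    return edges
```

Reversing the comparison (the pixel above is lighter than the pixel itself) would instead pick out the top edges of strokes, which for lowercase text cluster along the mean line.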
19. A system for recognizing characters of an original set of characters displayed on a surface in an original linear orientation, the system comprising:
a computer-readable storage medium, not comprising a propagated data signal; and
a processor responsive to the computer-readable storage medium and to one or more computer programs stored in the computer-readable storage medium, the one or more computer programs, when loaded into the processor and executed, operable to perform a method comprising:
identifying an image, acquired by an image capture device, of the original set of characters, the image having an acquired set of characters corresponding to the original set of characters, each character of the acquired set of characters represented by a group of pixels in the image, each pixel of the group having a grayscale value, the acquired set of characters having an acquired linear orientation skewed relative to the original linear orientation in an amount able to be expressed by a rotation angle,
within the image, identifying a portion of the acquired set of characters;
applying an edge detection filter to the portion of the acquired set of characters to produce an edge map, the edge map identifying, based on the grayscale values of pixels from groups of pixels representing characters within the portion, a set of edge pixels comprising pixels lying along a mean line or a base line of the characters within the portion,
inputting the edge map to a linear Hough transform filter, the linear Hough transform filter configured to produce a set of output lines in (r, Θ) parametric form, where for each output line, r is a length of a normal line drawn perpendicular to the output line between a point of origin and a particular edge pixel through which the output line passes, and Θ is an angle the normal line forms to a horizontal axis,
forming a matrix having a plurality of rows identified by row indices and a plurality of columns identified by column indices, each of the edge pixels passed through by a particular output line (r, Θ) being represented by an element of the matrix located at a particular row index corresponding to Θ of the particular output line and a particular column index corresponding to r of the particular output line,
based on a distribution of the elements within the matrix, assigning a score to at least some of the output lines,
based on the assigned scores, identifying a dominant output line;
identifying the row index of the matrix associated with the dominant output line,
based on the identified row index, identifying the angle Θ of the dominant output line, and
estimating the rotation angle based on the identified angle Θ,
the original set of characters able to be decoded based at least in part on the estimated rotation angle.
Dependent claims: 20
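The last steps of claim 19 (dominant line, row index, angle Θ, rotation angle) can be illustrated with a short Python sketch. It rests on several assumptions that the claim text does not state: the dominant line is taken to be the matrix cell with the most votes, row i corresponds to Θ = i times a fixed step in degrees, and the rotation estimate is Θ minus 90 degrees because Θ measures the normal rather than the line itself. The function names and the sign convention are hypothetical.

```python
def dominant_row(acc):
    """Row index whose strongest cell holds the most votes (assumed scoring rule)."""
    return max(range(len(acc)), key=lambda i: max(acc[i]))


def rotation_angle_deg(row_index, theta_step_deg=1.0):
    """Convert a theta row index into a text-line rotation estimate in degrees.

    Assumes row i corresponds to theta = i * theta_step_deg, measured from
    the horizontal axis to the line's normal.
    """
    theta_deg = row_index * theta_step_deg
    # The line is perpendicular to its normal, so unrotated horizontal text
    # has theta = 90 degrees and a rotation estimate of 0.
    return theta_deg - 90.0
```

For example, with a 1-degree step a dominant line found in row 93 gives an estimate of 3 degrees. Whether that corresponds to a clockwise or counterclockwise skew depends on the direction of the image's y axis, which this sketch leaves open.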
Specification