CAMERA-BASED DOCUMENT IMAGING
Abstract
A process and system to transform a digital photograph of a text document into a scan-quality image is disclosed. By extracting the document text from the image, and analyzing visual clues from the text, a grid is constructed over the image representing the distortions in the image. Transforming the image to straighten this grid removes distortions introduced by the camera image-capture process. Variations in lighting, the extraction of text line information, and the modeling of curved lines in the image may be corrected.
20 Claims
1. A method for processing a photographed image containing text lines comprising text characters having vertical strokes comprising:
(a) binarization using pixel normalized thresholding to identify pixels in the image that make up the text;
(b) detecting typographical features indicative of the orientation of text;
(c) fitting one or more curves to a text line;
(d) building a grid of quadrilaterals using vectors that are parallel to the direction of the text lines and vectors parallel to the direction of the vertical stroke lines;
(e) dewarping the document by stretching the image so that vectors parallel to the text lines and vectors parallel to the direction of the vertical stroke lines become orthogonal; and
(f) processing the dewarped document with an optical character recognition software.
Dependent claims: 2, 3.
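Step (a) of claim 1 can be illustrated with a small sketch. The claim does not define "pixel normalized thresholding" in this section, so the code below assumes one plausible reading: each pixel is compared against the mean brightness of its own neighborhood, which makes the decision insensitive to uneven lighting. The function name `binarize_normalized` and the parameters `radius` and `ratio` are hypothetical and are not taken from the patent.

```python
def binarize_normalized(gray, radius=2, ratio=0.8):
    """Return a 0/1 mask; 1 marks a pixel darker than `ratio` times its local mean.

    `gray` is a list of rows of brightness values. Dividing by the local mean
    is an assumed interpretation of "pixel normalized thresholding".
    """
    h, w = len(gray), len(gray[0])
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clip the square window to the image borders.
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            window = [gray[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            local_mean = sum(window) / len(window)
            if gray[y][x] < ratio * local_mean:
                mask[y][x] = 1
    return mask
```

Because the threshold follows the local background, a dark stroke on the dim side of the page and one on the bright side are both classified as text, which a single global threshold may not manage.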
4. A method for processing a photographed image containing text lines, the text lines comprise text characters having vertical strokes and top and bottom tip points, the method comprising:
(a) detecting the top and bottom tip points of the text lines;
(b) fitting one curve to the top tip points and one curve to the bottom tip points for each of the text lines;
(c) determining the page orientation of the photographed image by distinguishing the top and bottom portions of text lines;
(d) computing approximate orientation for each text line and removing outliers among text lines;
(e) finding vertical paragraph boundaries by determining whether the start points or end points of text lines are lined up;
(f) detecting vertical strokes in text characters by scanning in local vertical direction to obtain vertical blocks of pixels at each of the intersection point of a centroid spline of a text line with the text pixels of text characters;
(g) building a grid of quadrilaterals using vectors that are parallel to the direction of the text lines and vectors parallel to the direction of the vertical stroke lines; and
(h) dewarping the document by stretching the image so that vectors parallel to the text lines and vectors parallel to the direction of the vertical stroke lines become orthogonal.
Dependent claims: 5.
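Step (b) of claim 4 fits one curve to the top tip points and one to the bottom tip points of each text line. The claim does not say which curve family is used, so the following sketch assumes a quadratic fitted by least squares; the function name `fit_quadratic` is hypothetical.

```python
def fit_quadratic(points):
    """Least-squares fit of y = a*x**2 + b*x + c to (x, y) points.

    Solves the 3x3 normal equations by Gauss-Jordan elimination.
    Needs at least three points with distinct x values.
    """
    # Power sums of x, and moments of y against powers of x.
    s = [sum(x ** k for x, _ in points) for k in range(5)]
    t = [sum(y * x ** k for x, y in points) for k in range(3)]
    # Augmented matrix; the unknowns are ordered (c, b, a).
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    for col in range(3):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    c, b, a = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c


def fit_text_line(top_tips, bottom_tips):
    """Fit one curve to the top tips and one to the bottom tips of a line."""
    return fit_quadratic(top_tips), fit_quadratic(bottom_tips)
```

A spline or higher-order polynomial could be substituted without changing the surrounding steps; the quadratic is only the simplest curve that can follow a page bending in one direction.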
6. A method for processing a photographed image containing text lines comprising text characters having vertical strokes comprising:
(a) detecting typographical features indicative of the orientation of text;
(b) fitting one or more curves to a text line;
(c) building a grid of quadrilaterals using vectors that are parallel to the direction of the text lines and vectors parallel to the direction of the vertical stroke lines; and
(d) dewarping the document by computing for each pixel location of the output image,
its corresponding location in the input image; and
its pixel color and/or intensity by using one or more pixels near the corresponding location in the input image.
Dependent claims: 7, 8, 9, 10, 11, 12.
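Step (d) of claim 6 describes inverse mapping: every output pixel looks up its source location in the input image and takes its value from nearby input pixels. A minimal sketch follows, assuming bilinear interpolation over the four nearest pixels as the way to combine "one or more pixels near the corresponding location". The names `bilinear_sample`, `dewarp`, and `inverse_map` are hypothetical; in a full pipeline `inverse_map` would be derived from the grid of quadrilaterals.

```python
def bilinear_sample(img, x, y):
    """Interpolate the value at fractional position (x, y) from the 4 nearest pixels."""
    h, w = len(img), len(img[0])
    # Clamp so that locations outside the image reuse the border pixels.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def dewarp(img, out_w, out_h, inverse_map):
    """Build the output image by sampling the input at inverse_map(u, v).

    `inverse_map` takes an output pixel (u, v) and returns the matching
    input location (x, y), which may be fractional.
    """
    return [[bilinear_sample(img, *inverse_map(u, v)) for u in range(out_w)]
            for v in range(out_h)]
```

Iterating over output pixels rather than input pixels guarantees that every output location receives a value, so the result has no holes even where the image is stretched.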
13. A method for processing a photographed image containing text lines comprising text characters having tip points and vertical strokes comprising:
(a) detecting text regions by finding a set of pixels in the photographed image that correspond to the text characters and creating a binary image containing only said set of pixels, the set of pixels are grouped into character regions, the character regions are grouped into text lines;
(b) detecting shape by identifying the tip points and vertical strokes of the text characters;
(c) detecting orientation of the text; and
(d) transforming based on a grid building process where the identified tip points and vertical strokes are used as a basis to identify the warping of the document.
Dependent claims: 14, 15, 16, 17.
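Step (a) of claim 13 groups text pixels into character regions and character regions into text lines. The claim does not specify the grouping rules, so this sketch assumes 8-connected components for the character regions and a greedy vertical-overlap test for the text lines. Both function names are hypothetical.

```python
from collections import deque


def character_regions(binary):
    """Return bounding boxes (x0, y0, x1, y1) of 8-connected groups of text pixels."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:  # breadth-first flood fill of one region
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


def group_into_lines(boxes):
    """Greedily chain boxes, left to right, whose vertical extents overlap."""
    lines = []
    for box in sorted(boxes, key=lambda b: b[0]):
        for line in lines:
            last = line[-1]
            if box[1] <= last[3] and last[1] <= box[3]:
                line.append(box)
                break
        else:
            lines.append([box])
    # Order the lines from top to bottom by their first box.
    return sorted(lines, key=lambda ln: ln[0][1])
```

Comparing each box only with the last box of a line, rather than with the whole line, lets the chain follow a text line that drifts up or down across a curved page.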
18. A computer system for processing a photographed image containing text lines comprising text characters having vertical strokes, the computer system carrying one or more sequences of one or more instructions which, when executed by one or more processors, cause the one or more processors to perform the computer-implemented steps of:
(a) binarization using pixel normalized thresholding to identify pixels in the image that make up the text;
(b) detecting typographical features indicative of the orientation of text;
(c) fitting one or more curves to a text line;
(d) building a grid of quadrilaterals using vectors that are parallel to the direction of the text lines and vectors parallel to the direction of the vertical stroke lines;
(e) dewarping the document by stretching the image so that vectors parallel to the text lines and vectors parallel to the direction of the vertical stroke lines become orthogonal; and
(f) processing the dewarped document with an optical character recognition software.
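Step (d) of claim 18 builds a grid of quadrilaterals from two sets of vectors: those parallel to the text lines and those parallel to the vertical strokes. The sketch below is a simplified illustration, assuming the two directions are available as functions of position (`text_dir` and `stroke_dir`, both hypothetical) that return unit vectors. It walks along the text direction to lay out each row of vertices and along the stroke direction to start the next row.

```python
def build_grid(origin, text_dir, stroke_dir, cols, rows, step):
    """Return (vertices, quads) for a grid that follows two direction fields.

    `text_dir(x, y)` and `stroke_dir(x, y)` return local unit vectors.
    Each quad lists its corners as top-left, top-right, bottom-right, bottom-left.
    """
    vertices = []
    row_start = origin
    for _ in range(rows + 1):
        row = [row_start]
        for _ in range(cols):
            x, y = row[-1]
            dx, dy = text_dir(x, y)
            row.append((x + step * dx, y + step * dy))
        vertices.append(row)
        # Move the start of the next row along the local stroke direction.
        sx, sy = stroke_dir(*row_start)
        row_start = (row_start[0] + step * sx, row_start[1] + step * sy)
    quads = [
        (vertices[r][c], vertices[r][c + 1], vertices[r + 1][c + 1], vertices[r + 1][c])
        for r in range(rows)
        for c in range(cols)
    ]
    return vertices, quads
```

On an undistorted page the two direction fields are perpendicular and every quadrilateral is a square. Dewarping as in step (e) then amounts to mapping each skewed quadrilateral back onto such a square.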
19. A computer system for processing a photographed image containing text lines comprising text characters having vertical strokes, the computer system carrying one or more sequences of one or more instructions which, when executed by one or more processors, cause the one or more processors to perform the computer-implemented steps of:
(a) detecting the top and bottom tip points of the text lines;
(b) fitting one curve to the top tip points and one curve to the bottom tip points for each of the text lines;
(c) determining the page orientation of the photographed image by distinguishing the top and bottom portions of text lines;
(d) computing approximate orientation for each text line and removing outliers among text lines;
(e) finding vertical paragraph boundaries by determining whether the start points or end points of text lines are lined up;
(f) detecting vertical strokes in text characters by scanning in local vertical direction to obtain vertical blocks of pixels at each of the intersection point of a centroid spline of a text line with the text pixels of text characters;
(g) building a grid of quadrilaterals using vectors that are parallel to the direction of the text lines and vectors parallel to the direction of the vertical stroke lines; and
(h) dewarping the document by stretching the image so that vectors parallel to the text lines and vectors parallel to the direction of the vertical stroke lines become orthogonal.
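Steps (d) and (e) of claim 19 can be sketched together. The claim does not state how outlying text lines are identified or how alignment of start points is tested, so this example assumes a median-based angle filter and a simple tolerance window on the start x-coordinates. The function names and the `max_dev_deg` and `tol` parameters are hypothetical, and the angle filter assumes roughly horizontal lines (no wrap-around handling).

```python
import math
from statistics import median


def remove_orientation_outliers(lines, max_dev_deg=5.0):
    """Keep lines whose angle lies within `max_dev_deg` of the median angle.

    Each line is ((x_start, y_start), (x_end, y_end)).
    """
    angles = [math.degrees(math.atan2(e[1] - s[1], e[0] - s[0])) for s, e in lines]
    mid = median(angles)
    return [ln for ln, a in zip(lines, angles) if abs(a - mid) <= max_dev_deg]


def left_boundary(lines, tol=3.0):
    """Find the largest group of start points lined up within `tol` pixels.

    Returns (mean x of the group, number of lines in the group).
    """
    starts = sorted(s[0] for s, _ in lines)
    best = []
    for i, x in enumerate(starts):
        group = [v for v in starts[i:] if v - x <= tol]
        if len(group) > len(best):
            best = group
    return sum(best) / len(best), len(best)
```

The same alignment test applied to the end points would give the right-hand paragraph boundary. An indented first line simply falls outside the largest group and does not shift the estimate.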
20. A computer system for processing a photographed image containing text lines comprising text characters having vertical strokes, the computer system carrying one or more sequences of one or more instructions which, when executed by one or more processors, cause the one or more processors to perform the computer-implemented steps of:
(a) detecting text regions by finding a set of pixels in the photographed image that correspond to the text characters and creating a binary image containing only said set of pixels, the set of pixels are grouped into character regions, the character regions are grouped into text lines;
(b) detecting shape by identifying the tip points and vertical strokes of the text characters;
(c) detecting orientation of the text; and
(d) transforming based on a grid building process where the identified tip points and vertical strokes are used as a basis to identify the warping of the document.
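Step (b) of claim 20 identifies the tip points and vertical strokes of the text characters. As a rough illustration only, the sketch below assumes that the local vertical direction coincides with the image columns, treats a vertical stroke as a sufficiently long run of text pixels in one column, and takes the tip points of a character region as its topmost and bottommost pixels. The function names and the `min_len` parameter are hypothetical.

```python
def vertical_runs(binary, min_len=3):
    """Return (x, y_top, y_bottom) for each column run of at least `min_len` text pixels."""
    h, w = len(binary), len(binary[0])
    runs = []
    for x in range(w):
        y = 0
        while y < h:
            if binary[y][x]:
                start = y
                while y < h and binary[y][x]:
                    y += 1
                if y - start >= min_len:
                    runs.append((x, start, y - 1))
            else:
                y += 1
    return runs


def tip_points(pixels):
    """Return the (top, bottom) tip of one character region.

    `pixels` is a list of (x, y) with y growing downward; ties go to the leftmost pixel.
    """
    top = min(pixels, key=lambda p: (p[1], p[0]))
    bottom = max(pixels, key=lambda p: (p[1], -p[0]))
    return top, bottom
```

On a warped page the strokes lean away from the column direction, so a fuller implementation would scan along an estimated local vertical instead of straight down each column.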
Specification