CAMERA-BASED DOCUMENT IMAGING

US 20100073735A1
Filed: 05/06/2009
Published: 03/25/2010
Est. Priority Date: 05/06/2008
Status: Abandoned Application

10 Assignments

0 Petitions

Abstract

A process and system to transform a digital photograph of a text document into a scan-quality image is disclosed. By extracting the document text from the image, and analyzing visual clues from the text, a grid is constructed over the image representing the distortions in the image. Transforming the image to straighten this grid removes distortions introduced by the camera image-capture process. Variations in lighting, the extraction of text line information, and the modeling of curved lines in the image may be corrected.

138 Citations

20 Claims

1. A method for processing a photographed image containing text lines comprising text characters having vertical strokes comprising:
- (a) binarization using pixel normalized thresholding to identify pixels in the image that make up the text;
- (b) detecting typographical features indicative of the orientation of text;
- (c) fitting one or more curves to a text line;
- (d) building a grid of quadrilaterals using vectors that are parallel to the direction of the text lines and vectors parallel to the direction of the vertical stroke lines;
- (e) dewarping the document by stretching the image so that vectors parallel to the text lines and vectors parallel to the direction of the vertical stroke lines become orthogonal; and
- (f) processing the dewarped document with an optical character recognition software.
- Dependent claims (2, 3):
  - 2. The method of claim 1, wherein the binarization process includes artifact removal that discards whole connected regions of black pixels if such a region exceeds a maximum area parameter.
  - 3. The method of claim 1, wherein the binarization process includes artifact removal that discards whole connected regions of black pixels if such a region is less than a minimum area parameter.
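
The area-based artifact removal of claims 2 and 3 could be sketched as follows. This is an illustration only, not the applicants' implementation: it assumes the binarized image is a list of rows of 0/1 values (1 = black), and it assumes 8-connectivity, which the claims do not specify. The names `remove_artifacts`, `min_area`, and `max_area` are hypothetical.

```python
from collections import deque


def remove_artifacts(binary, min_area, max_area):
    """Return a copy of `binary` with out-of-range black regions cleared.

    A region is a set of 8-connected black pixels (an assumption; the
    claims only say "connected"). Regions with fewer than `min_area` or
    more than `max_area` pixels are discarded whole.
    """
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    out = [row[:] for row in binary]
    for sy in range(h):
        for sx in range(w):
            if not binary[sy][sx] or seen[sy][sx]:
                continue
            # Breadth-first flood fill collects one connected region.
            queue = deque([(sy, sx)])
            seen[sy][sx] = True
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Discard the whole region if its area is out of range.
            if not (min_area <= len(region) <= max_area):
                for y, x in region:
                    out[y][x] = 0
    return out
```

With `min_area=2` and `max_area=10`, for example, an isolated speck of one pixel and a 12-pixel blob (such as a page-edge shadow) would both be cleared, while a 3-pixel stroke fragment would be kept.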

4. A method for processing a photographed image containing text lines, the text lines comprise text characters having vertical strokes and top and bottom tip points, the method comprising:
- (a) detecting the top and bottom tip points of the text lines;
- (b) fitting one curve to the top tip points and one curve to the bottom tip points for each of the text lines;
- (c) determining the page orientation of the photographed image by distinguishing the top and bottom portions of text lines;
- (d) computing approximate orientation for each text line and removing outliers among text lines;
- (e) finding vertical paragraph boundaries by determining whether the start points or end points of text lines are lined up;
- (f) detecting vertical strokes in text characters by scanning in local vertical direction to obtain vertical blocks of pixels at each of the intersection points of a centroid spline of a text line with the text pixels of text characters;
- (g) building a grid of quadrilaterals using vectors that are parallel to the direction of the text lines and vectors parallel to the direction of the vertical stroke lines; and
- (h) dewarping the document by stretching the image so that vectors parallel to the text lines and vectors parallel to the direction of the vertical stroke lines become orthogonal.
- Dependent claims (5):
  - 5. The method of claim 4 wherein the determining the page orientation of the photographed image by distinguishing the top and bottom portions of text lines step further includes choosing a representative sample of text lines whose length is close to the median length of all text lines and, for each text line in the sample, checking which side has more outliers.
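
The sampling-and-voting idea in claim 5 might look like the sketch below. Several things here are assumptions rather than claim language: each text line is represented as a dict holding its `length` and the distances of its tip points from the line's centre on two sides (`side_a`, `side_b`); an "outlier" is a tip point that lies more than 25% beyond the median distance on its side; and the side with more outliers is taken to be the top, on the reasoning that Latin text has more ascenders and capitals than descenders. All function and key names are hypothetical.

```python
from statistics import median


def count_outliers(offsets, rel_tol=0.25):
    """Count tip points lying well beyond the median offset of their side."""
    if not offsets:
        return 0
    m = median(offsets)
    return sum(1 for d in offsets if d > m * (1 + rel_tol))


def representative_lines(lines, length_tol=0.2):
    """Keep lines whose length is within `length_tol` of the median length."""
    m = median(line["length"] for line in lines)
    return [ln for ln in lines if abs(ln["length"] - m) <= length_tol * m]


def side_a_is_top(lines):
    """Majority vote over the sample; True if side A has more outliers.

    Returns False on a tie, so callers that need a third "undecided"
    state should inspect the vote count themselves.
    """
    votes = 0
    for ln in representative_lines(lines):
        a = count_outliers(ln["side_a"])
        b = count_outliers(ln["side_b"])
        votes += (a > b) - (a < b)
    return votes > 0
```

Restricting the vote to lines near the median length keeps headings, page numbers, and short last lines of paragraphs from dominating the decision.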

6. A method for processing a photographed image containing text lines comprising text characters having vertical strokes comprising:
- (a) detecting typographical features indicative of the orientation of text;
- (b) fitting one or more curves to a text line;
- (c) building a grid of quadrilaterals using vectors that are parallel to the direction of the text lines and vectors parallel to the direction of the vertical stroke lines; and
- (d) dewarping the document by computing for each pixel location of the output image, its corresponding location in the input image; and its pixel color and/or intensity by using one or more pixels near the corresponding location in the input image.
- Dependent claims (7, 8, 9, 10, 11, 12):
  - 7. The method of claim 6 wherein the corresponding location in the input image in step (d) is computed by modeling its x-coordinate with one mathematical function and its y-coordinate with another mathematical function.
  - 8. The method of claim 7 wherein the two mathematical functions are generated using a Thin Plate Splines technique.
  - 9. The method of claim 6 wherein the computation of correspondence for every pixel location is preceded by the generation of control points in which the correspondence is computed for a subset of pixel locations.
  - 10. The method of claim 9 wherein the subset of pixel locations consists of one or more points lying on one or more text lines.
  - 11. The method of claim 9 wherein the subset of pixel locations consists of the left and right endpoints of one or more text lines.
  - 12. The method of claim 6 wherein the output pixel color or intensity is computed from the four nearest pixels in the input image.
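
Step (d) of claim 6, together with claims 7 and 12, describes backward mapping: each output pixel looks up a location in the input image and blends the four nearest input pixels. A minimal sketch, assuming a grayscale image stored as a list of rows and assuming the two coordinate functions `map_x` and `map_y` are supplied by the caller (for instance from a thin plate spline fit, which is not shown here). The function names are hypothetical, and clamping at the image border is a choice made for this example.

```python
import math


def bilinear_sample(img, x, y):
    """Blend the four input pixels nearest to the real-valued point (x, y)."""
    h, w = len(img), len(img[0])
    # Clamp so that locations outside the image reuse the border pixels.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def dewarp(img, out_w, out_h, map_x, map_y):
    """Build the output image by backward mapping.

    map_x(u, v) and map_y(u, v) give the input-image location that
    corresponds to output pixel (u, v), one function per coordinate.
    """
    return [
        [bilinear_sample(img, map_x(u, v), map_y(u, v)) for u in range(out_w)]
        for v in range(out_h)
    ]
```

Iterating over output pixels rather than input pixels guarantees that every output location receives a value, which forward mapping does not.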

13. A method for processing a photographed image containing text lines comprising text characters having tip points and vertical strokes comprising:
- (a) detecting text regions by finding a set of pixels in the photographed image that correspond to the text characters and creating a binary image containing only said set of pixels, the set of pixels are grouped into character regions, the character regions are grouped into text lines;
- (b) detecting shape by identifying the tip points and vertical strokes of the text characters;
- (c) detecting orientation of the text; and
- (d) transforming based on a grid building process where the identified tip points and vertical strokes are used as a basis to identify the warping of the document.
- Dependent claims (14, 15, 16, 17):
  - 14. The method of claim 13 wherein the detecting shape step fits splines to the top and bottom of text lines to approximate the original document shape.
  - 15. The method of claim 13 wherein the detecting text regions step further comprises the following steps:
    - (a1) estimating the foreground text by a standard or naïve thresholding method;
    - (a2) removing these foreground pixels from the original image;
    - (a3) filling the holes left by the removal by interpolating from the remaining values that provides a new estimate for the background by removing the initial thresholding and interpolating over the holes;
    - (a4) thresholding based on the improved estimate of the background.
  - 16. The method of claim 13 wherein the transform step relies on a grid building process where the extracted features are used as a basis to identify the warping of the document.
  - 17. The method of claim 13 wherein the transform step relies on an optimization problem.
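
Steps (a1) to (a4) of claim 15 can be illustrated on a small grayscale image with uneven lighting. The sketch below is one possible reading, not the applicants' code: it assumes the naive first pass is a global threshold at the image mean, that holes are filled by linear interpolation along each row, and that a pixel counts as text when it is darker than a fixed fraction (`ratio`, here 0.8) of the estimated background. None of these specifics come from the claim, and all names are hypothetical.

```python
def naive_foreground(img):
    """(a1) Mark pixels darker than the global mean as tentative text."""
    flat = [p for row in img for p in row]
    threshold = sum(flat) / len(flat)
    return [[p < threshold for p in row] for row in img]


def fill_row(values, mask):
    """(a2, a3) Drop masked pixels and interpolate across the holes."""
    known = [i for i, m in enumerate(mask) if not m]
    if not known:
        return list(values)  # nothing to interpolate from in this row
    out = list(values)
    for i, masked in enumerate(mask):
        if not masked:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            t = (i - left) / (right - left)
            out[i] = values[left] * (1 - t) + values[right] * t
    return out


def binarize(img, ratio=0.8):
    """(a4) Threshold each pixel against its local background estimate."""
    fg = naive_foreground(img)
    background = [fill_row(row, mask) for row, mask in zip(img, fg)]
    return [
        [p < ratio * b for p, b in zip(row, bg_row)]
        for row, bg_row in zip(img, background)
    ]
```

On a row that darkens from left to right, the global threshold in (a1) tends to mislabel dim background pixels as text. Comparing each pixel with its own background estimate in (a4) is what lets the second pass recover from that.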

18. A computer system for processing a photographed image containing text lines comprising text characters having vertical strokes, the computer system carrying one or more sequences of one or more instructions which, when executed by one or more processors, cause the one or more processors to perform the computer-implemented steps of:
- (a) binarization using pixel normalized thresholding to identify pixels in the image that make up the text;
- (b) detecting typographical features indicative of the orientation of text;
- (c) fitting one or more curves to a text line;
- (d) building a grid of quadrilaterals using vectors that are parallel to the direction of the text lines and vectors parallel to the direction of the vertical stroke lines;
- (e) dewarping the document by stretching the image so that vectors parallel to the text lines and vectors parallel to the direction of the vertical stroke lines become orthogonal; and
- (f) processing the dewarped document with an optical character recognition software.

19. A computer system for processing a photographed image containing text lines comprising text characters having vertical strokes, the computer system carrying one or more sequences of one or more instructions which, when executed by one or more processors, cause the one or more processors to perform the computer-implemented steps of:
- (a) detecting the top and bottom tip points of the text lines;
- (b) fitting one curve to the top tip points and one curve to the bottom tip points for each of the text lines;
- (c) determining the page orientation of the photographed image by distinguishing the top and bottom portions of text lines;
- (d) computing approximate orientation for each text line and removing outliers among text lines;
- (e) finding vertical paragraph boundaries by determining whether the start points or end points of text lines are lined up;
- (f) detecting vertical strokes in text characters by scanning in local vertical direction to obtain vertical blocks of pixels at each of the intersection points of a centroid spline of a text line with the text pixels of text characters;
- (g) building a grid of quadrilaterals using vectors that are parallel to the direction of the text lines and vectors parallel to the direction of the vertical stroke lines; and
- (h) dewarping the document by stretching the image so that vectors parallel to the text lines and vectors parallel to the direction of the vertical stroke lines become orthogonal.

20. A computer system for processing a photographed image containing text lines comprising text characters having vertical strokes, the computer system carrying one or more sequences of one or more instructions which, when executed by one or more processors, cause the one or more processors to perform the computer-implemented steps of:
- (a) detecting text regions by finding a set of pixels in the photographed image that correspond to the text characters and creating a binary image containing only said set of pixels, the set of pixels are grouped into character regions, the character regions are grouped into text lines;
- (b) detecting shape by identifying the tip points and vertical strokes of the text characters;
- (c) detecting orientation of the text; and
- (d) transforming based on a grid building process where the identified tip points and vertical strokes are used as a basis to identify the warping of the document.

Current Assignee: Compulink Management Center Incorporated
Original Assignee: Compulink Management Center Incorporated
Inventors: Hunt, Martin G.; Gu, Weiqing; Pham, Trang T.; Tipton, William W.; Yong, Darryl H.; Egan, James O.; Gordon, Logan M.K.; Wong, Kin-Chung; Pavlovskaia, Maria A.; Wu, Liangnan
Application Number: US12/436,775
Publication Number: US 20100073735A1
US Class Current: 358/462
CPC Class Codes:
G06T 3/06   Topological mapping of high...
G06V 30/10   Character recognition
G06V 30/1463   Orientation detection or co...
G06V 30/1478   of characters or characters...
H04N 1/00251   with an apparatus for takin...