Positionally encoded document image analysis and labeling

US 7,581,171 B2
Filed: 01/06/2004
Issued: 08/25/2009
Est. Priority Date: 01/06/2004
Status: Expired due to Fees

Abstract

Disclosed embodiments of the invention relate to analyzing document images, which contain positionally encoded information, such as a maze-pattern watermark, and labeling the images based on a degree to which the document's content, such as text, occludes the position-encoding information. Depending on the degree of such occlusion, it may not be possible to extract enough position-encoding bits from a camera-captured image of the document to determine the camera-captured image's location within the document. An analysis-and-labeling module receives, as input, image data output by an image-generation-and-capturing module and off-line training data; performs analysis-and-labeling processing; and outputs image-label information. The results of document-analysis-and-labeling processing may be used for efficiently determining a location of a camera-captured image within a positionally encoded document.

39 Claims

1. A method of labeling a document image containing positionally encoded maze patterns for computationally efficient decoding, the method comprising:
- obtaining the document image;
- analyzing the document image to determine a number of position encoding bits that can be extracted from the document image, the analyzing including:
  - dividing the document image into blocks having substantially a same size as maze pattern cells;
  - determining whether the blocks are occluded by document content;
  - counting, for each pixel in the document image, a number of completely visible blocks in a neighboring window with the pixel being as a center of the window; and
  - labeling the pixel based on the number;
- performing a thresholding algorithm on the document image to determine if the document image is of a type selected from at least:
  - a first type containing sufficient amount of visible positionally encoded maze patterns for a computationally efficient algorithm to decode the document image, and
  - a second type containing document content that occludes at least a portion of the positionally encoded maze patterns, wherein the occlusion results in insufficient amount of maze patterns being visible for the computationally efficient algorithm to decode the document image;
- labeling the document image with the type based on the number of position encoding bits; and
- performing a search algorithm on the document image when an insufficient amount of maze patterns are visible.
- Dependent claims (2, 3, 4):
  - 2. The method of claim 1, wherein obtaining the document image further comprises:
    - rendering an electronic document to a bitmap representation corresponding to a printed document.
  - 3. The method of claim 1, wherein obtaining the document image further comprises:
    - processing a scanned paper document.
  - 4. The method of claim 1, wherein the second type is divided into a plurality of subtypes that represent distinct degrees of occlusion of the positionally encoded maze patterns by the document content.
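
The analyzing steps of claim 1 (block division, occlusion test, per-pixel counting, pixel labeling) can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the boolean `occluded` mask, the rule that a block counts only when it lies entirely inside the window, the `half_window` and `min_blocks` parameters, and the label strings are all hypothetical choices that the claim does not specify.

```python
from typing import List

Mask = List[List[bool]]


def block_visibility(occluded: Mask, cell: int) -> Mask:
    """Divide the image into cell x cell blocks (assumed maze-cell size).

    A block is treated as completely visible when none of its pixels is
    occluded by document content.
    """
    rows = len(occluded) // cell
    cols = len(occluded[0]) // cell
    return [
        [
            not any(
                occluded[r * cell + dy][c * cell + dx]
                for dy in range(cell)
                for dx in range(cell)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def count_visible_blocks(occluded: Mask, cell: int, half_window: int) -> List[List[int]]:
    """For each pixel, count visible blocks inside the window centred on it.

    Assumption: the window is a square of side 2 * half_window + 1 pixels and
    a block is counted only if it lies entirely inside that window.
    """
    visible = block_visibility(occluded, cell)
    height, width = len(occluded), len(occluded[0])
    counts = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            top, bottom = y - half_window, y + half_window
            left, right = x - half_window, x + half_window
            for r, row in enumerate(visible):
                for c, is_visible in enumerate(row):
                    inside = (
                        r * cell >= top
                        and (r + 1) * cell - 1 <= bottom
                        and c * cell >= left
                        and (c + 1) * cell - 1 <= right
                    )
                    if is_visible and inside:
                        counts[y][x] += 1
    return counts


def label_pixels(counts: List[List[int]], min_blocks: int) -> List[List[str]]:
    """Label each pixel from its count; min_blocks is a hypothetical cut-off."""
    return [
        ["sufficient" if n >= min_blocks else "insufficient" for n in row]
        for row in counts
    ]
```

The nested loops keep the sketch readable; for page-sized images a summed-area table over the block-visibility grid would probably be a more practical way to get the same counts.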

5. A method of labeling a camera-captured image containing positionally encoded maze patterns for computationally efficient decoding, the method comprising:
- obtaining the camera-captured image;
- analyzing the camera-captured image to determine a number of position encoding bits that can be extracted from the camera-captured image, the analyzing including:
  - dividing the camera-captured image into blocks having substantially the same size as maze pattern cells;
  - determining whether the blocks are occluded by document content;
  - counting, for each pixel in the camera-captured image, a number of completely visible blocks in a neighboring window with the pixel being as a center of the window; and
  - labeling the pixel based on the number;
- performing a thresholding algorithm on the camera-captured image to determine if the camera-captured image is of a type selected from at least:
  - a first type containing positionally encoded maze patterns for a computationally efficient algorithm to decode the camera-captured image, and
  - a second type containing document content that occludes at least a portion of the positionally encoded maze patterns, wherein the occlusion results in insufficient amount of maze patterns being visible for the computationally efficient algorithm to decode the camera-captured image;
- labeling the camera-captured image with the type based on the number of position encoding bits; and
- performing a search algorithm on the camera-captured image when an insufficient amount of maze patterns are visible.
- Dependent claims (6, 7, 8, 9, 10, 11, 12, 13):
  - 6. The method of claim 5, wherein a support interval of gradient image histogram is used for determining whether the camera-captured image is of the first type or of the second type.
  - 7. The method of claim 6, further comprising:
    - applying a gradient operator to the camera-captured image to obtain a gradient image.
  - 8. The method of claim 7, wherein the gradient operator is a Sobel edge operator.
  - 9. The method of claim 8, further comprising:
    - generating a histogram of the gradient image.
  - 10. The method of claim 9, further comprising:
    - using a largest number along the x-axis of the histogram that has a non-zero value as the support interval of gradient image histogram.
  - 11. The method of claim 6, wherein an offline training session and an online labeling session are used for determining whether the camera-captured image is of the first type or of the second type.
  - 12. The method of claim 11, wherein a threshold, for use in distinguishing between images of the first type and images of the second type, is selected based on results of the offline training session performed on training-data images.
  - 13. The method of claim 12, wherein, during the online labeling session, the threshold is compared to a support interval of gradient image histogram of the camera-captured image to determine whether the camera-captured image is of the first type or of the second type.
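
Claims 6 to 10 chain together a Sobel gradient, a histogram of the gradient image, and the largest histogram value with a non-zero count (the "support interval"). A minimal pure-Python sketch of that chain is shown below. Several details are assumptions the claims do not fix: the gradient magnitude is taken as |Gx| + |Gy|, border pixels are left at zero, and a support interval above the threshold is assumed to indicate the second (occluded) type.

```python
from collections import Counter
from typing import List

Image = List[List[int]]

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_gradient(image: Image) -> Image:
    """Apply the Sobel operator; returns |Gx| + |Gy| (borders stay 0)."""
    height, width = len(image), len(image[0])
    grad = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * pixel
                    gy += SOBEL_Y[dy][dx] * pixel
            grad[y][x] = abs(gx) + abs(gy)
    return grad


def gradient_histogram(grad: Image) -> Counter:
    """Histogram of the gradient image: gradient value -> pixel count."""
    return Counter(value for row in grad for value in row)


def support_interval(hist: Counter) -> int:
    """Largest value on the histogram's x-axis that has a non-zero count."""
    return max(value for value, count in hist.items() if count > 0)


def label_by_support(image: Image, threshold: int) -> str:
    """Compare the support interval with a threshold.

    Assumption: strong edges from text or other content push the support
    interval above the threshold, which marks the image as second type.
    """
    support = support_interval(gradient_histogram(sobel_gradient(image)))
    return "second type" if support > threshold else "first type"
```

For example, a uniformly gray image has a support interval of 0, while an image with a sharp black-to-white vertical edge reaches 1020 with these kernels, so any threshold between the two separates them.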

14. A system, implemented at least in part by a computing device, that labels a document image containing positionally encoded maze patterns for computationally efficient decoding, the system comprising:
- an image-generation-and-capturing module including an image capturing pen that obtains the document image; and
- an analysis-and-labeling module that analyzes the document image to determine a number of position encoding bits that can be extracted from the document image, the analyzing including:
  - dividing the document image into blocks having substantially the same size as maze pattern cells;
  - determining whether the blocks are occluded by document content;
  - counting, for each pixel in the document image, a number of completely visible blocks in a neighboring window with the pixel being as a center of the window; and
  - labeling the pixel based on the number;
- wherein the analysis-and-labeling module further performs a thresholding algorithm and labels the document image as being of a type selected from at least:
  - a first type containing sufficient amount of visible positionally encoded maze patterns for a computationally efficient algorithm to decode the document image, and
  - a second type containing document content that occludes at least a portion of the positionally encoded maze patterns, wherein the occlusion results in insufficient amount of maze patterns being visible for the computationally efficient algorithm to decode the document image,
- wherein the analysis-and-labeling module further performs a search algorithm on the document image when an insufficient amount of maze patterns are visible.
- Dependent claims (15, 16, 17):
  - 15. The system of claim 14, wherein the image-generation-and-capturing module renders an electronic document to a bitmap representation corresponding to a printed document.
  - 16. The system of claim 14, wherein the image-generation-and-capturing module processes a scanned paper document.
  - 17. The system of claim 14, wherein the second type is divided into a plurality of subtypes that represent distinct degrees of occlusion of the positionally encoded maze patterns by the document content.
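
Claim 17 (like claims 4 and 30) divides the second type into subtypes by degree of occlusion without saying how the degrees are measured. One possible reading, sketched below, grades an image by the fraction of required position-encoding bits it is missing. The function name, the `required_bits` parameter, and the subtype boundaries are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from typing import Sequence


def label_with_subtype(
    extractable_bits: int,
    required_bits: int,
    subtype_bounds: Sequence[float] = (0.25, 0.5),
) -> str:
    """Return the image type, with a subtype for occluded images.

    subtype_bounds are ascending shortfall fractions (assumed values); an
    image missing less than 25% of the required bits falls in subtype 1,
    less than 50% in subtype 2, and anything worse in subtype 3.
    """
    if extractable_bits >= required_bits:
        return "first type"
    shortfall = 1 - extractable_bits / required_bits
    subtype = bisect_right(subtype_bounds, shortfall) + 1
    return f"second type, subtype {subtype}"
```

A finer grading like this could let a decoder pick a cheaper search strategy for mildly occluded images, although the claims themselves leave that choice open.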

18. A system, implemented at least in part by a computing device, that labels a camera-captured image containing positionally encoded maze patterns for computationally efficient decoding, the system comprising:
- an image-generation-and-capturing module including an image capturing pen that obtains the camera-captured image; and
- an analysis-and-labeling module that analyzes the camera-captured image to determine a number of position encoding bits that can be extracted from the camera-captured image, the analyzing including:
  - dividing the camera-captured image into blocks having substantially the same size as maze pattern cells;
  - determining whether the blocks are occluded by document content;
  - counting, for each pixel in the camera-captured image, a number of completely visible blocks in a neighboring window with the pixel being as a center of the window; and
  - labeling the pixel based on the number;
- wherein the analysis-and-labeling module further performs a thresholding algorithm on the camera-captured image and labels the camera-captured image as being of a type selected from at least:
  - a first type containing sufficient amount of visible positionally encoded maze patterns for a computationally efficient algorithm to decode the camera-captured image, and
  - a second type containing document content that occludes at least a portion of the positionally encoded maze patterns, wherein the occlusion results in insufficient amount of maze patterns being visible for the computationally efficient algorithm to decode the camera-captured image,
- wherein the analysis-and-labeling module further performs a search algorithm on the camera-captured image when an insufficient amount of maze patterns are visible.
- Dependent claims (19, 20, 21, 22, 23, 24, 25, 26):
  - 19. The system of claim 18, wherein a support interval of gradient image histogram is used for determining whether the image is of the first type or of the second type.
  - 20. The system of claim 19, wherein the analysis-and-labeling module applies a gradient operator to the camera-captured image to obtain a gradient image.
  - 21. The system of claim 20, wherein the gradient operator is a Sobel edge operator.
  - 22. The system of claim 19, wherein the analysis-and-labeling module generates a histogram of the gradient image.
  - 23. The system of claim 22, wherein the analysis-and-labeling module uses a largest number along the x-axis of the histogram that has a non-zero value as the support interval of gradient image histogram.
  - 24. The system of claim 23, wherein the analysis-and-labeling module performs an offline training session and an online labeling session to determine whether the camera-captured image is of the first type or of the second type.
  - 25. The system of claim 24, wherein a threshold, for use in distinguishing between images of the first type and images of the second type, is selected based on results of the offline training session performed on training-data images.
  - 26. The system of claim 25, wherein, during the online session, the analysis-and-labeling module compares the threshold to a support interval of gradient image histogram of the camera-captured image to determine whether the camera-captured image is of the first type or of the second type.
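
Claims 24 to 26 describe an offline training session that yields a threshold and an online session that compares it with an image's support interval. The claims do not state how the threshold is selected, so the sketch below simply picks the candidate that misclassifies the fewest training images. That selection rule, the function names, and the convention that a larger support interval means the second type are all assumptions.

```python
from typing import Sequence


def select_threshold(
    first_type_supports: Sequence[int],
    second_type_supports: Sequence[int],
) -> int:
    """Offline session: choose a threshold from labeled training images.

    Each argument holds the support intervals measured on training images of
    one type. Every observed value is tried as a threshold and the one with
    the fewest misclassifications is kept (ties go to the smallest value).
    """
    candidates = sorted(set(first_type_supports) | set(second_type_supports))
    best_threshold, best_errors = candidates[0], None
    for threshold in candidates:
        errors = sum(s > threshold for s in first_type_supports)
        errors += sum(s <= threshold for s in second_type_supports)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def label_online(support: int, threshold: int) -> str:
    """Online session: compare one image's support interval to the threshold."""
    return "second type" if support > threshold else "first type"
```

With training supports of 10, 12 and 15 for unoccluded images and 40, 45 and 60 for occluded ones, this rule settles on 15, the largest value that still keeps every unoccluded sample at or below the threshold.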

27. A computer-readable medium containing computer-readable instructions for labeling a document image containing positionally encoded maze patterns for computationally efficient decoding, by performing steps comprising:
- obtaining the document image;
- analyzing the document image to determine a number of position encoding bits that can be extracted from the document image, the analyzing including:
  - dividing the document image into blocks having substantially the same size as maze pattern cells;
  - determining whether the blocks are occluded by document content;
  - counting, for each pixel in the document image, a number of completely visible blocks in a neighboring window with the pixel being as a center of the window; and
  - labeling the pixel based on the number;
- performing a thresholding algorithm on the document image to determine if the image is of a type selected from at least:
  - a first type containing sufficient amount of visible positionally encoded maze patterns for a computationally efficient algorithm to decode the document image, and
  - a second type containing document content that occludes at least a portion of the positionally encoded maze patterns, wherein the occlusion results in insufficient amount of maze patterns being visible for the computationally efficient algorithm to decode the document image;
- labeling the document image with the type based on the number of position encoding bits; and
- performing a search algorithm on the document image when an insufficient amount of maze patterns are visible.
- Dependent claims (28, 29, 30):
  - 28. The computer-readable medium of claim 27, wherein obtaining the document image further comprises:
    - rendering an electronic document to a bitmap representation corresponding to a printed document.
  - 29. The computer-readable medium of claim 27, wherein obtaining the document image further comprises:
    - processing a scanned paper document.
  - 30. The computer-readable medium of claim 27, wherein the second type is divided into a plurality of subtypes that represent distinct degrees of occlusion of the positionally encoded maze patterns by the document content.
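
Each independent claim ends by falling back to a search algorithm when too little of the maze pattern is visible. The control flow can be sketched as a small dispatcher. Neither algorithm is defined in the claims, so `fast_decode` and `search` are hypothetical callables supplied by the caller, and the `Position` type is an assumed (x, y) pair.

```python
from typing import Callable, Optional, Tuple

Position = Tuple[int, int]
Locator = Callable[[object], Optional[Position]]


def locate(image: object, label: str, fast_decode: Locator, search: Locator) -> Optional[Position]:
    """Route an image to a decoder according to its label.

    Images labeled "first type" go to the computationally efficient decoder;
    every other label is handed to the slower search algorithm.
    """
    if label == "first type":
        return fast_decode(image)
    return search(image)
```

Labeling ahead of time means the expensive search runs only on images known to be heavily occluded, which appears to be the efficiency gain the claims are aiming at.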

31. A computer-readable medium containing computer-readable instructions for labeling a camera-captured image containing positionally encoded maze patterns for computationally efficient decoding, by performing steps comprising:
- obtaining the camera-captured image;
- analyzing the camera-captured image to determine a number of position encoding bits that can be extracted from the camera-captured image, the analyzing including:
  - dividing the camera-captured image into blocks having substantially the same size as maze pattern cells;
  - determining whether the blocks are occluded by document content;
  - counting, for each pixel in the camera-captured image, a number of completely visible blocks in a neighboring window with the pixel being as a center of the window; and
  - labeling the pixel based on the number;
- performing a thresholding algorithm on the camera-captured image to determine if the camera-captured image is of a type selected from at least:
  - a first type containing sufficient amount of visible positionally encoded maze patterns for a computationally efficient algorithm to decode the camera-captured image, and
  - a second type containing document content that occludes at least a portion of the positionally encoded maze patterns, wherein the occlusion results in insufficient amount of maze patterns being visible for the computationally efficient algorithm to decode the camera-captured image;
- labeling the camera-captured image with the type based on the number of position encoding bits; and
- performing a search algorithm on the camera-captured image when an insufficient amount of maze patterns are visible.
- Dependent claims (32, 33, 34, 35, 36, 37, 38, 39):
  - 32. The computer-readable medium of claim 31, wherein a support interval of gradient image histogram is used for determining whether the image is of the first type or of the second type.
  - 33. The computer-readable medium of claim 32, containing further computer-executable instructions for performing steps comprising:
    - applying a gradient operator to the image to obtain a gradient image.
  - 34. The computer-readable medium of claim 33, wherein the gradient operator is a Sobel edge operator.
  - 35. The computer-readable medium of claim 34, containing further computer-executable instructions for performing steps comprising:
    - generating a histogram of the gradient image.
  - 36. The computer-readable medium of claim 35, containing further computer-executable instructions for performing steps comprising:
    - using a largest number along the x-axis of the histogram that has a non-zero value as the support interval of gradient image histogram.
  - 37. The computer-readable medium of claim 32, wherein an offline training session and an online labeling session are used for determining whether the camera-captured image is of the first type or of the second type.
  - 38. The computer-readable medium of claim 37, wherein a threshold, for use in distinguishing between images of the first type and images of the second type, is selected based on results of the offline training session performed on training-data images.
  - 39. The computer-readable medium of claim 38, wherein, during the online labeling session, the threshold is compared to a support interval of gradient image histogram of the camera-captured image to determine whether the camera-captured image is of the first type or of the second type.

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Dang, Yingnong; Wang, Jian; Chen, Liyong
Primary Examiner(s): Ries, Laurie
Assistant Examiner(s): Debrow, James J
Application Number: US10/753,176
Publication Number: US 20050149865A1
Time in Patent Office: 2,058 Days
Field of Search: 715/513, 715/512, 715/234, 705/50
US Class Current: 715/234
CPC Class Codes:
G06F 3/0321 by optically sensing the ab...
G06F 3/03542 Light pens for emitting or ...
