Two-dimensional document processing
First Claim
1. A computer implemented method, comprising:
performing optical character recognition on a document;
generating a character grid using character information obtained from the optical character recognition, wherein the character grid is a two-dimensional down-sampled version of the document;
applying a machine learning algorithm to the character grid;
in response to the applying, generating a segmentation mask depicting semantic data of the document; and
wherein generating the character grid further comprises:
identifying a character of the document;
determining a pixel area for the character;
assigning an index value to represent the pixel area in the character grid; and
down-sampling the document by a factor equal to the pixel area covering a character of the document.
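The grid-generation steps recited above can be illustrated with a minimal sketch. It is not the patented implementation: the `OcrChar` record, the `build_char_grid` function, the use of 0 as a background index, and the choice of the smallest character box as the down-sampling cell are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class OcrChar:
    """Hypothetical OCR result: one character and its pixel box."""
    char: str
    x: int  # left edge, in pixels
    y: int  # top edge, in pixels
    w: int  # box width, in pixels
    h: int  # box height, in pixels


def build_char_grid(chars, page_w, page_h, vocab):
    """Return (grid, (cell_w, cell_h)) for a page of OCR characters.

    Assumption: the down-sampling cell is the smallest character box
    on the page, so one cell covers roughly one character.
    Index 0 is reserved for background (no character).
    """
    if not chars:
        raise ValueError("no characters to place in the grid")
    cell_w = min(c.w for c in chars)
    cell_h = min(c.h for c in chars)
    grid_w = -(-page_w // cell_w)  # ceiling division
    grid_h = -(-page_h // cell_h)
    grid = [[0] * grid_w for _ in range(grid_h)]
    for c in chars:
        # Assign (or look up) the index value for this character.
        idx = vocab.setdefault(c.char, len(vocab) + 1)
        # Fill every grid cell that the character's pixel area covers.
        y_end = min(grid_h, -(-(c.y + c.h) // cell_h))
        x_end = min(grid_w, -(-(c.x + c.w) // cell_w))
        for gy in range(c.y // cell_h, y_end):
            for gx in range(c.x // cell_w, x_end):
                grid[gy][gx] = idx
    return grid, (cell_w, cell_h)


# Usage: a 40x20 pixel page holding three 10x10 pixel characters.
page = [OcrChar("A", 0, 0, 10, 10),
        OcrChar("B", 10, 0, 10, 10),
        OcrChar("A", 30, 10, 10, 10)]
vocab = {}
grid, cell = build_char_grid(page, 40, 20, vocab)
```

Each occurrence of the same character maps to the same index, so the resulting grid keeps the page's two-dimensional layout while being far smaller than the pixel image.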
Abstract
Disclosed herein are system, method, and computer program product embodiments for processing a document. In an embodiment, a document processing system may receive a document. The document processing system may perform optical character recognition to obtain character information and positioning information for the characters. The document processing system may generate a down-sampled two-dimensional character grid for the document. The document processing system may apply a convolutional neural network to the character grid to obtain semantic meaning for the document. The convolutional neural network may produce a segmentation mask and bounding boxes to correspond to the document.
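To make the two outputs named in the abstract concrete, the sketch below turns per-class scores for each grid cell into a segmentation mask and then reads a bounding box off that mask. The abstract attributes both outputs to the convolutional neural network itself; deriving the box from the mask here is a simplification for illustration only. The function names, the `scores[class][row][column]` layout, and the sample values are assumptions.

```python
def segmentation_mask(scores):
    """Pick the highest-scoring class for every grid cell.

    scores[c][y][x] is the (hypothetical) network score of class c
    at grid cell (x, y). Ties go to the lowest class index.
    """
    n_classes = len(scores)
    height = len(scores[0])
    width = len(scores[0][0])
    return [
        [max(range(n_classes), key=lambda c: scores[c][y][x])
         for x in range(width)]
        for y in range(height)
    ]


def bounding_box(mask, label):
    """Return (x_min, y_min, x_max, y_max) of cells with this label,
    in grid coordinates, or None if the label does not occur."""
    cells = [(x, y)
             for y, row in enumerate(mask)
             for x, value in enumerate(row)
             if value == label]
    if not cells:
        return None
    xs = [x for x, _ in cells]
    ys = [y for _, y in cells]
    return (min(xs), min(ys), max(xs), max(ys))


# Usage: two classes (0 = background, 1 = some field) on a 3x2 grid.
scores = [
    [[0.9, 0.2, 0.1], [0.8, 0.7, 0.6]],  # class 0
    [[0.1, 0.8, 0.9], [0.2, 0.3, 0.4]],  # class 1
]
mask = segmentation_mask(scores)
box = bounding_box(mask, 1)
```

Multiplying the grid coordinates of the box by the down-sampling cell size would map it back to pixel coordinates on the original document.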
20 Claims
1. A computer implemented method, comprising:
performing optical character recognition on a document;
generating a character grid using character information obtained from the optical character recognition, wherein the character grid is a two-dimensional down-sampled version of the document;
applying a machine learning algorithm to the character grid;
in response to the applying, generating a segmentation mask depicting semantic data of the document; and
wherein generating the character grid further comprises:
identifying a character of the document;
determining a pixel area for the character;
assigning an index value to represent the pixel area in the character grid; and
down-sampling the document by a factor equal to the pixel area covering a character of the document.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system, comprising:
a memory; and
at least one processor coupled to the memory and configured to:
perform optical character recognition on a document;
generate a character grid using character information obtained from the optical character recognition, wherein the character grid is a two-dimensional down-sampled version of the document;
apply a machine learning algorithm to the character grid;
in response to the applying, generate a segmentation mask depicting semantic data of the document; and
wherein to generate the character grid, the at least one processor is further configured to:
identify a character of the document;
determine a pixel area for the character;
assign an index value to represent the pixel area in the character grid; and
down-sample the document by a factor equal to the pixel area covering a character of the document.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A non-transitory computer-readable device having instructions stored thereon that, when executed by at least one computing device, cause the at least one computing device to perform operations comprising:
performing optical character recognition on a document;
generating a character grid using character information obtained from the optical character recognition, wherein the character grid is a two-dimensional down-sampled version of the document;
applying a machine learning algorithm to the character grid;
in response to the applying, generating a segmentation mask depicting semantic data of the document; and
wherein generating the character grid comprises:
identifying a character of the document;
determining a pixel area for the character;
assigning an index value to represent the pixel area in the character grid; and
down-sampling the document by a factor equal to the pixel area covering a character of the document.
Dependent claims: 16, 17, 18, 19, 20
Specification