Automatic identification of fields and labels in forms
First Claim
1. A computer-implemented method for generating symbolic information for a first set of field images associated with a first field, the method comprising:
receiving the first set of field images associated with the first field, the first set of field images associated with the first field cropped from a plurality of form images;
retrieving a first label image associated with the first field, the first label image associated with the first field cropped from one of the plurality of form images;
determining a match for the first label image from a classification dictionary;
associating, with one or more processors, symbolic information corresponding to the match for the first label image with the first label image;
determining a subject matter associated with the first label image using the symbolic information associated with the first label image;
identifying a subset of the classification dictionary using the subject matter associated with the first label image;
determining a match for each field image of the first set of field images from the subset of the classification dictionary; and
associating symbolic information corresponding to the match for each field image with each corresponding field image of the first set of field images.
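The steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: images are stood in for by small feature tuples, matching is a plain nearest-neighbour search, and the names `Entry`, `best_match`, `label_field_images`, and `subject_of_label` are all hypothetical, as is the sample data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One classification-dictionary entry (hypothetical layout)."""
    features: tuple  # feature vector standing in for an image
    symbol: str      # symbolic information, e.g. recognized text
    subject: str     # subject-matter tag used to subset the dictionary


def distance(a, b):
    # Squared Euclidean distance between two feature vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_match(features, entries):
    # Nearest entry by feature distance (an assumed matching rule).
    return min(entries, key=lambda e: distance(features, e.features))


def label_field_images(field_images, label_image, dictionary, subject_of_label):
    # 1. Match the label image against the full dictionary.
    label_symbol = best_match(label_image, dictionary).symbol
    # 2. Derive the subject matter from the label's symbolic information.
    subject = subject_of_label[label_symbol]
    # 3. Restrict the dictionary to entries for that subject matter.
    subset = [e for e in dictionary if e.subject == subject]
    # 4. Match every field image against the subset only.
    return label_symbol, [best_match(f, subset).symbol for f in field_images]


# Assumed sample data: two labels and two candidate values per subject.
dictionary = [
    Entry((0.0, 0.0), "Age", "label"),
    Entry((1.0, 1.0), "Name", "label"),
    Entry((0.1, 0.9), "7", "number"),
    Entry((0.9, 0.1), "42", "number"),
    Entry((0.1, 0.8), "Al", "name"),
    Entry((0.8, 0.2), "Bo", "name"),
]
subject_of_label = {"Age": "number", "Name": "name"}

result = label_field_images(
    [(0.12, 0.84), (0.88, 0.15)], (0.1, 0.1), dictionary, subject_of_label
)
```

In this sample, the first field image is slightly closer to the "Al" entry than to "7" when compared against the whole dictionary. Restricting the search to the "number" subset selected by the "Age" label resolves it to "7", which is the kind of disambiguation the subsetting step appears intended to provide.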
Abstract
A system and method for processing form images including strokes. A controller receives a plurality of form images including a plurality of strokes. A stroke identification module identifies the position of each stroke in each of the form images. A geometry engine generates an overlay of the plurality of form images and identifies a group of overlapping strokes from the overlay. The geometry engine generates a field bounding box encompassing the group of strokes, the field bounding box representing a field in the plurality of form images. The geometry engine crops a field image from each form image based on the size and position of the field bounding box. A label detector analyzes an area around the field image in the form image to determine a label and generates a label image.
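The overlay and cropping steps described in the abstract can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the geometry engine itself: each stroke is reduced to an axis-aligned box `(x0, y0, x1, y1)` with inclusive coordinates, strokes are grouped whenever their boxes intersect the growing group box, and a form image is a plain list of pixel rows. The function names are hypothetical.

```python
def overlaps(a, b):
    """True if two boxes (x0, y0, x1, y1) intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def union(a, b):
    """Smallest box that encompasses both boxes."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def field_boxes(forms):
    """forms: list of forms, each a list of stroke boxes.

    Overlays all forms by pooling their strokes, then merges overlapping
    boxes into field bounding boxes.
    """
    fields = []
    for box in (stroke for form in forms for stroke in form):
        changed = True
        while changed:  # re-check, since a grown box may reach other fields
            changed = False
            for f in fields:
                if overlaps(box, f):
                    fields.remove(f)
                    box = union(box, f)
                    changed = True
                    break
        fields.append(box)
    return fields


def crop(image, box):
    """Crop a field image (list of pixel rows) using inclusive box bounds."""
    x0, y0, x1, y1 = box
    return [row[x0:x1 + 1] for row in image[y0:y1 + 1]]


# Assumed sample: three forms whose strokes cluster into two fields.
forms = [
    [(10, 10, 40, 20), (100, 10, 130, 20)],
    [(35, 12, 60, 22)],
    [(105, 8, 125, 18)],
]
boxes = sorted(field_boxes(forms))
```

Applying `crop` with the same field box to every form image would yield the set of field images for that field, one per form.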
20 Claims
9. A system for generating symbolic information for a first set of field images associated with a first field, the system comprising:
a processor;
a table generator stored on a memory and executable by the processor, the table generator for receiving the first set of field images associated with the first field, the first set of field images associated with the first field cropped from a plurality of form images; and
a symbolic representation module coupled to the table generator, the symbolic representation module for retrieving a first label image associated with the first field, the first label image associated with the first field cropped from one of the plurality of form images, determining a match for the first label image from a classification dictionary, associating symbolic information corresponding to the match for the first label image with the first label image, determining a subject matter associated with the first label image using the symbolic information associated with the first label image, identifying a subset of the classification dictionary using the subject matter associated with the first label image, determining a match for each field image of the first set of field images from the subset of the classification dictionary and associating symbolic information corresponding to the match for each field image with each corresponding field image of the first set of field images.
17. A computer program product stored on a non-transitory computer readable medium including a computer readable program, wherein the computer readable program when executed on a computer causes the computer to:
receive a first set of field images associated with a first field, the first set of field images associated with the first field cropped from a plurality of form images;
retrieve a first label image associated with the first field, the first label image associated with the first field cropped from one of the plurality of form images;
determine a match for the first label image from a classification dictionary;
associate symbolic information corresponding to the match for the first label image with the first label image;
determine a subject matter associated with the first label image using the symbolic information associated with the first label image;
identify a subset of the classification dictionary using the subject matter associated with the first label image;
determine a match for each field image of the first set of field images from the subset of the classification dictionary; and
associate symbolic information corresponding to the match for each field image with each corresponding field image of the first set of field images.
Specification