Building classification and extraction models based on electronic forms
First Claim
1. A computer-implemented method for building a classification and/or data extraction knowledge base using an electronic form, the method comprising:
receiving an electronic form having associated therewith a plurality of metadata labels, each metadata label corresponding to at least one element of interest represented within the electronic form;
parsing the plurality of metadata labels to determine characteristic features of the element(s) of interest;
building a representation of the electronic form based on the plurality of metadata labels;
generating a plurality of permutations of the representation of the electronic form by applying a predetermined set of variations to the representation; and
training either a classification model, an extraction model, or both using:
the representation of the electronic form, and
the plurality of permutations of the representation of the electronic form.
Abstract
According to one embodiment, a computer-implemented method is configured for building a classification and/or data extraction knowledge base using an electronic form. The method includes: receiving an electronic form having associated therewith a plurality of metadata labels, each metadata label corresponding to at least one element of interest represented within the electronic form; parsing the plurality of metadata labels to determine characteristic features of the element(s) of interest; building a representation of the electronic form based on the plurality of metadata labels; generating a plurality of permutations of the representation of the electronic form by applying a predetermined set of variations to the representation; and training either a classification model, an extraction model, or both using: the representation of the electronic form, and the plurality of permutations of the representation of the electronic form. Corresponding systems and computer program products are also disclosed.
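The steps recited in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the label format (dicts with `name`, `type`, `x`, `y`, `w`, `h`), the `Field` record, the `VARIATIONS` table (small shifts and scalings), and every function name are hypothetical choices made for illustration, since the text above does not specify them. Model training itself is left out; the sketch stops at assembling the training examples from the representation and its permutations.

```python
from dataclasses import dataclass, replace
from itertools import product


@dataclass(frozen=True)
class Field:
    """One element of interest and its characteristic features (assumed layout)."""
    name: str
    kind: str
    x: float
    y: float
    width: float
    height: float


def parse_labels(labels):
    """Parse raw metadata labels (hypothetical dict format) into Field records."""
    return [
        Field(l["name"], l.get("type", "text"),
              float(l["x"]), float(l["y"]), float(l["w"]), float(l["h"]))
        for l in labels
    ]


def build_representation(fields):
    """Represent the form as its fields in reading order (top to bottom, left to right)."""
    return tuple(sorted(fields, key=lambda f: (f.y, f.x)))


# Assumed "predetermined set of variations": page shifts and uniform scaling.
VARIATIONS = {"dx": (-5.0, 0.0, 5.0), "dy": (-5.0, 0.0, 5.0), "scale": (0.9, 1.0, 1.1)}


def generate_permutations(representation, variations=VARIATIONS):
    """Apply every combination of variations, skipping the identity combination."""
    permutations = []
    for dx, dy, scale in product(variations["dx"], variations["dy"], variations["scale"]):
        if (dx, dy, scale) == (0.0, 0.0, 1.0):
            continue  # identical to the original representation
        permutations.append(tuple(
            replace(f, x=f.x * scale + dx, y=f.y * scale + dy,
                    width=f.width * scale, height=f.height * scale)
            for f in representation
        ))
    return permutations


def training_set(representation, permutations, form_class):
    """Pair the original representation and each permutation with the form's class."""
    return [(rep, form_class) for rep in (representation, *permutations)]


labels = [
    {"name": "total", "type": "currency", "x": 400, "y": 700, "w": 120, "h": 20},
    {"name": "invoice_number", "type": "text", "x": 50, "y": 40, "w": 150, "h": 20},
]
rep = build_representation(parse_labels(labels))
perms = generate_permutations(rep)
examples = training_set(rep, perms, "invoice")
print(len(perms), len(examples))  # 26 27
```

With three values for each of the three assumed variation parameters, the sketch yields 26 permutations plus the original representation, so a single labeled form produces 27 training examples that could then be passed to whatever classification or extraction learner is used.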
20 Claims
1. A computer-implemented method for building a classification and/or data extraction knowledge base using an electronic form, the method comprising:
receiving an electronic form having associated therewith a plurality of metadata labels, each metadata label corresponding to at least one element of interest represented within the electronic form;
parsing the plurality of metadata labels to determine characteristic features of the element(s) of interest;
building a representation of the electronic form based on the plurality of metadata labels;
generating a plurality of permutations of the representation of the electronic form by applying a predetermined set of variations to the representation; and
training either a classification model, an extraction model, or both using:
the representation of the electronic form, and
the plurality of permutations of the representation of the electronic form.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A system comprising a processor configured to execute logic, the logic being configured, upon execution thereof by the processor, to cause the processor to perform a computer-implemented method comprising:
receiving an electronic form having associated therewith a plurality of metadata labels, each metadata label corresponding to at least one element of interest represented within the electronic form;
parsing the plurality of metadata labels to determine characteristic features of the element(s) of interest;
building a representation of the electronic form based on the plurality of metadata labels;
generating a plurality of permutations of the representation of the electronic form by applying a predetermined set of variations to the representation; and
training either a classification model, an extraction model, or both using:
the representation of the electronic form, and
the plurality of permutations of the representation of the electronic form.
12. A computer program product comprising a computer readable storage medium having embodied thereon computer readable program instructions configured to cause a mobile device, upon execution of the computer readable program instructions, to perform operations comprising:
receiving an electronic form having associated therewith a plurality of metadata labels, each metadata label corresponding to at least one element of interest represented within the electronic form;
parsing the plurality of metadata labels to determine characteristic features of the element(s) of interest;
building a representation of the electronic form based on the plurality of metadata labels;
generating a plurality of permutations of the representation of the electronic form by applying a predetermined set of variations to the representation; and
training either a classification model, an extraction model, or both using:
the representation of the electronic form, and
the plurality of permutations of the representation of the electronic form.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20
Specification