System and method for processing and identifying content in form documents
First Claim
1. A process for processing and identifying content in a form, the process comprising the steps of:
receiving a form set comprising a plurality of data sets, wherein each of the plurality of data sets populate at least a first form in the form set, and wherein each form in the form set and each data set in the plurality of data sets comprises a plurality of elements;
processing at least a portion of the form set and a portion of the plurality of data sets that populate the portion of the form set through a first artificial entity; and
identifying any combination of one or more noise, background data, and content data without character recognition for each form in the portion of the form set and the portion of the plurality of data sets, wherein the identifying labels each element of the portion of the form set and the portion of the plurality of data sets as at least one of noise, background data, or content data.
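The three steps of the claim (receive a form set, process it, label every element without character recognition) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: each form is a small grid of binary pixels, an "element" is a single pixel, and the "first artificial entity" is replaced by a simple frequency rule standing in for whatever trained model an actual implementation might use. Ink that most forms in the set share is treated as background, ink unique to one form is content if it touches other unique ink, and isolated unique ink is noise. Un-inked pixels are labeled background here purely for convenience. The names `Label`, `label_form_set`, and `shared_fraction` are hypothetical.

```python
from enum import Enum


class Label(Enum):
    NOISE = "noise"
    BACKGROUND = "background"
    CONTENT = "content"


def label_form_set(forms, shared_fraction=0.5):
    """Label every pixel of every form without reading any characters.

    forms: list of equally sized 2D lists of 0/1 pixels (1 = ink).
    shared_fraction: hypothetical threshold; ink present in more than this
    fraction of the forms is assumed to belong to the blank form itself.
    """
    rows, cols = len(forms[0]), len(forms[0][0])
    threshold = shared_fraction * len(forms)
    # How many forms in the set have ink at each position.
    ink_counts = [[sum(f[r][c] for f in forms) for c in range(cols)]
                  for r in range(rows)]

    def is_unique_ink(form, r, c):
        # Ink that this form has but most of the set does not.
        return form[r][c] == 1 and ink_counts[r][c] <= threshold

    def has_unique_neighbor(form, r, c):
        # Check the 8 surrounding pixels for other form-specific ink.
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols:
                    if is_unique_ink(form, nr, nc):
                        return True
        return False

    labeled_set = []
    for form in forms:
        grid = []
        for r in range(rows):
            row = []
            for c in range(cols):
                if not is_unique_ink(form, r, c):
                    row.append(Label.BACKGROUND)  # shared ink or blank paper
                elif has_unique_neighbor(form, r, c):
                    row.append(Label.CONTENT)     # connected, form-specific ink
                else:
                    row.append(Label.NOISE)       # isolated speck
            grid.append(row)
        labeled_set.append(grid)
    return labeled_set


# Three populated copies of the same form: row 0 is a printed rule line.
form_a = [[1, 1, 1, 1, 1, 1],
          [0, 0, 0, 0, 0, 0],
          [0, 1, 1, 0, 0, 0],   # handwritten entry
          [0, 0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0, 1]]   # isolated scanner speck
form_b = [[1, 1, 1, 1, 1, 1],
          [0, 0, 0, 0, 0, 0],
          [0, 0, 0, 1, 1, 0],
          [0, 0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0, 0]]
form_c = [[1, 1, 1, 1, 1, 1],
          [0, 0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0, 0],
          [1, 1, 0, 0, 0, 0],
          [0, 0, 0, 0, 0, 0]]

labels_a, labels_b, labels_c = label_form_set([form_a, form_b, form_c])
```

In this toy set the printed rule line in row 0 comes out as background in all three forms, the two-pixel entries come out as content, and the lone pixel in the corner of `form_a` comes out as noise. A real system would likely operate on grayscale scans and a learned model rather than a fixed threshold.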
Abstract
The present disclosure generally provides a system and method for processing and identifying data in a form. The system and method may distinguish between content data and background data in a form. In some aspects, the content data or the background data may be removed so that the remaining data can be processed separately. Removing either the background data or the content data may allow for more effective or efficient character recognition of the remaining data. In some embodiments, data may be processed on an element basis, wherein each element of the form may be labeled as background data, content data, noise, or combinations thereof. This system and method may significantly increase the ability to capture and extract relevant information from a form.
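The removal step described in the abstract can be sketched as follows, assuming per-element labels are already available. This is an illustrative sketch only: the pixel grid, the string labels, and the function name `split_layers` are hypothetical and do not come from the patent. It separates a labeled form into a content-only layer and a background-only layer, discarding noise from both, so that each layer could be handed to a later stage such as character recognition.

```python
def split_layers(pixels, labels):
    """Split a labeled form into content-only and background-only layers.

    pixels: 2D list of 0/1 values (1 = ink).
    labels: 2D list of the same shape holding "content", "background",
            or "noise" for each element (hypothetical label strings).
    Elements labeled "noise" are dropped from both layers.
    """
    content = [[p if lab == "content" else 0
                for p, lab in zip(pixel_row, label_row)]
               for pixel_row, label_row in zip(pixels, labels)]
    background = [[p if lab == "background" else 0
                   for p, lab in zip(pixel_row, label_row)]
                  for pixel_row, label_row in zip(pixels, labels)]
    return content, background


pixels = [[1, 1, 1],
          [0, 1, 0],
          [0, 0, 1]]
labels = [["background", "background", "background"],
          ["background", "content", "background"],
          ["background", "background", "noise"]]

content_layer, background_layer = split_layers(pixels, labels)
```

Here `content_layer` keeps only the single content pixel in the middle, `background_layer` keeps only the printed top row, and the noise pixel in the corner appears in neither.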
17 Citations
19 Claims
1. Independent claim, recited in full under First Claim above.
Dependent claims: 2 through 19.
Specification