System and method of generating automated document analysis tools
Abstract
A method of generating an automated document analyst is disclosed and includes receiving a plurality of source documents including text strings and performing an automated computer executable build operation on the plurality of source documents with respect to at least one target field associated with data to be extracted from the plurality of source documents. Further, the method includes performing a linguistic analysis, a statistical analysis, and a document structure analysis on an output file produced as a result of performing the automated computer executable build operation.
49 Claims
1. A method of generating an automated document analyst, the method comprising:
receiving a plurality of source documents including text strings;
performing an automated computer executable build operation on the plurality of source documents with respect to at least one target field associated with data to be extracted from the plurality of source documents; and
performing a linguistic analysis on an output file produced as a result of performing the automated computer executable build operation.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 17, 18, 19, 20, 21
12. The method of claim 11, further comprising modifying the pre-production automated text-based document analyst after determining that the tested accuracy measure is below a threshold.
22. A system for generating at least one virtual analyst, the system comprising:
a data build module;
a data analysis module coupled to the data build module;
a development module coupled to the data analysis module; and
a test module, wherein the test module determines a performance metric associated with a test of a pre-production automated text-based document.
Dependent claims: 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38
39. A library system comprising:
at least a first automated text-based document analyst associated with a first document type; and
at least a second automated text-based document analyst associated with a second document type, wherein the first automated text-based document analyst and the second automated text-based analyst have a precision rate that is greater than 85 percent.
Dependent claims: 40, 41, 42, 43, 44, 45, 46, 47, 48, 49
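As an illustration only, the precision threshold recited in claim 39 can be checked with a few lines of Python. The claims listed here do not define how the precision rate is measured, so this sketch assumes the conventional information-retrieval definition (correct extractions divided by all extractions made). The function names, document identifiers, and field values are hypothetical and do not come from the patent.

```python
from typing import Dict


def precision(extracted: Dict[str, str], gold: Dict[str, str]) -> float:
    """Fraction of extracted values that match the reference value for the same document.

    Assumed definition: correct extractions / all extractions made.
    """
    if not extracted:
        return 0.0
    correct = sum(1 for doc_id, value in extracted.items() if gold.get(doc_id) == value)
    return correct / len(extracted)


def exceeds_precision_floor(p: float, floor: float = 0.85) -> bool:
    # The claim recites "greater than 85 percent", so the comparison is strict.
    return p > floor


# Hypothetical results for one target field (a date) across four source documents.
extracted = {
    "doc1": "2004-01-05",
    "doc2": "2004-02-11",
    "doc3": "2004-03-09",  # wrong: the reference value is 2004-03-19
    "doc4": "2004-04-01",
}
gold = {
    "doc1": "2004-01-05",
    "doc2": "2004-02-11",
    "doc3": "2004-03-19",
    "doc4": "2004-04-01",
    "doc5": "2004-05-30",  # never extracted; affects recall, not precision
}

p = precision(extracted, gold)
print(p, exceeds_precision_floor(p))
```

Under these assumptions, three of the four extractions are correct, so this hypothetical analyst would fall short of the recited threshold.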
Specification