Method and apparatus for template-based processing of electronic documents
First Claim
1. A computer implemented method of processing electronic documents, comprising:
analyzing text content of the electronic documents to identify whether each of the electronic documents matches any of a plurality of predefined document templates, wherein one or more of the electronic documents conforms to a structure of at least one of the plurality of predefined document templates, and wherein the step of analyzing comprises executing at least one machine learning algorithm, the at least one machine learning algorithm trained using at least one sample electronic document having a predefined template;
generating a template index that relates at least one of the electronic documents with at least one of the plurality of predefined document templates based at least in part upon an identified match between the at least one of the electronic documents and the at least one of the plurality of predefined document templates;
generating a search query using at least one of the plurality of predefined document templates as at least one search parameter;
searching an archive having the electronic documents using the template index to locate one or more of the electronic documents that match the at least one predefined document template of the search query; and
providing access to the one or more of the electronic documents that match the at least one predefined document template of the search query.
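The analyzing step recites a machine learning algorithm trained on sample documents that have a predefined template, but the claim does not name a particular algorithm. The following is a minimal sketch only, assuming a nearest-centroid classifier over word counts as a stand-in. The function names (`tokenize`, `train`, `cosine`, `match_template`), the sample texts, and the 0.5 similarity threshold are all hypothetical and do not come from the patent.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train(samples):
    """Build one word-count centroid per template.

    samples maps a template name to a list of sample document texts
    (hypothetical training data, not taken from the patent).
    """
    model = {}
    for name, texts in samples.items():
        centroid = Counter()
        for text in texts:
            centroid.update(tokenize(text))
        model[name] = centroid
    return model


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_template(model, text, threshold=0.5):
    """Return the best-matching template name, or None if no template
    reaches the (assumed) similarity threshold."""
    vector = Counter(tokenize(text))
    best, best_score = None, 0.0
    for name, centroid in model.items():
        score = cosine(vector, centroid)
        if score > best_score:
            best, best_score = name, score
    return best if best_score >= threshold else None


# Usage: train on one sample per template, then classify new documents.
model = train({
    "invoice": ["Invoice number date bill to amount due total"],
    "resume": ["Resume education work experience skills references"],
})
matched = match_template(
    model, "Invoice number 1042 bill to Acme amount due 300 total 300"
)
unmatched = match_template(model, "Meeting notes about lunch")
```

Returning `None` for documents below the threshold mirrors the claim language "whether each of the electronic documents matches any of" the templates: a document is allowed to match none of them.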
Abstract
A method and apparatus for template-based processing of electronic documents are described. In some examples, text content of the electronic documents is analyzed to identify whether each of the electronic documents matches any of a plurality of document templates. A template index is generated that relates at least one of the electronic documents with at least one of the plurality of document templates associated therewith. A search query is generated using at least one of the plurality of document templates as a respective at least one search parameter. An archive having the electronic documents is searched using the template index to locate any of the electronic documents that match the at least one document template of the search query.
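The indexing and search steps described in the abstract can be illustrated with a short sketch. This is not the patented implementation: the function names (`build_template_index`, `search_by_template`, `keyword_matcher`), the in-memory dictionary used as the archive, and the keyword-based matcher are all assumptions made for illustration. The matcher is a toy placeholder for whatever analysis step identifies a document's template.

```python
from collections import defaultdict


def keyword_matcher(text):
    """Toy stand-in for the analysis step: map a document to a template
    name, or None when no template matches. Purely hypothetical."""
    lowered = text.lower()
    if "invoice" in lowered:
        return "invoice"
    if "purchase order" in lowered:
        return "purchase_order"
    return None


def build_template_index(documents, matcher):
    """Relate each matched document to its template.

    documents maps a document id to its text; the result maps a
    template name to the set of document ids that matched it.
    """
    index = defaultdict(set)
    for doc_id, text in documents.items():
        template = matcher(text)
        if template is not None:
            index[template].add(doc_id)
    return index


def search_by_template(index, archive, templates):
    """Use the template names as search parameters and return the
    archived documents that the index relates to any of them."""
    hits = set()
    for template in templates:
        hits |= index.get(template, set())
    return {doc_id: archive[doc_id] for doc_id in sorted(hits)}


# Usage: the archive is a plain dict here (an assumption).
archive = {
    "d1": "Invoice 1042: amount due 300",
    "d2": "Purchase order 77 for 12 chairs",
    "d3": "Notes from the team lunch",
    "d4": "Invoice 1043: amount due 125",
}
index = build_template_index(archive, keyword_matcher)
results = search_by_template(index, archive, ["invoice"])
```

Because the search consults only the index, the archive contents are not re-analyzed at query time. Documents that matched no template, such as `d3` here, never appear in the index and so are never returned.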
9 Claims
1. A computer implemented method of processing electronic documents, comprising:
analyzing text content of the electronic documents to identify whether each of the electronic documents matches any of a plurality of predefined document templates, wherein one or more of the electronic documents conforms to a structure of at least one of the plurality of predefined document templates, and wherein the step of analyzing comprises executing at least one machine learning algorithm, the at least one machine learning algorithm trained using at least one sample electronic document having a predefined template;
generating a template index that relates at least one of the electronic documents with at least one of the plurality of predefined document templates based at least in part upon an identified match between the at least one of the electronic documents and the at least one of the plurality of predefined document templates;
generating a search query using at least one of the plurality of predefined document templates as at least one search parameter;
searching an archive having the electronic documents using the template index to locate one or more of the electronic documents that match the at least one predefined document template of the search query; and
providing access to the one or more of the electronic documents that match the at least one predefined document template of the search query.
Dependent claims: 2, 3, 4, 5, 6
7. An apparatus for processing electronic documents, comprising:
means for analyzing text content of the electronic documents to identify whether each of the electronic documents matches any of a plurality of predefined document templates, wherein one or more of the electronic documents conforms to a structure of at least one of the plurality of predefined document templates, and wherein the means for analyzing comprises means for executing at least one machine learning algorithm, the at least one machine learning algorithm trained using at least one sample electronic document having a predefined template;
means for generating a template index that relates at least one of the electronic documents with at least one of the plurality of predefined document templates based at least in part upon an identified match between the at least one of the electronic documents and the at least one of the plurality of predefined document templates;
means for generating a search query using at least one of the plurality of predefined document templates as at least one search parameter;
means for searching an archive having the electronic documents using the template index to locate one or more of the electronic documents that match the at least one predefined document template of the search query; and
means for providing access to the one or more of the electronic documents that match the at least one predefined document template of the search query.
Dependent claims: 8, 9
Specification