Processing of an electronic document, apparatus and system for processing the document, and storage medium containing computer executable instructions for processing the document
Abstract
In a method for processing an electronic document, a local database is used to extract information relating to the document, and a super ordinate database is used to extract information relating to the document if a predefined condition is met. An apparatus, a computer program product and a storage medium can execute the method.
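The two-tier lookup described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: both databases are modeled as plain dictionaries that map a search key to a record, and the names `lookup`, `extract`, `LOCAL_DB`, `SUPERORDINATE_DB` and `no_result` are hypothetical. The predefined condition is passed in as a function so that any rule can be substituted.

```python
from typing import Callable, Dict, Optional

Fields = Dict[str, str]
Database = Dict[str, Fields]


def lookup(database: Database, ocr_text: str) -> Optional[Fields]:
    """Return the record of the first database key found in the text, else None."""
    lowered = ocr_text.lower()
    for key, record in database.items():
        if key.lower() in lowered:
            return record
    return None


def extract(
    ocr_text: str,
    local_db: Database,
    superordinate_db: Database,
    condition_met: Callable[[Optional[Fields]], bool],
) -> Optional[Fields]:
    """Try the local database first; use the superordinate one if the condition is met."""
    local_result = lookup(local_db, ocr_text)
    if condition_met(local_result):
        return lookup(superordinate_db, ocr_text)
    return local_result


# Hypothetical example data (assumption, not from the source).
LOCAL_DB: Database = {"Acme GmbH": {"supplier_id": "L-001"}}
SUPERORDINATE_DB: Database = {
    "Acme GmbH": {"supplier_id": "S-001"},
    "Globex AG": {"supplier_id": "S-002"},
}


def no_result(result: Optional[Fields]) -> bool:
    """Example condition: the local extraction produced nothing."""
    return result is None
```

With this example data, a document that mentions a supplier known locally is resolved from `LOCAL_DB`, and any other document falls through to `SUPERORDINATE_DB`.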
19 Claims
1. A method for processing an electronic document, which comprises the steps of:
extracting information relating to the electronic document via a local database, wherein the electronic document is an electronically scanned optical character recognized preprocessed document;
extracting the information relating to the electronic document when a predefined condition is met via a super ordinate database, the predefined condition includes a fact that an extraction of the information via the local database does not provide any results or does not provide any good results, the extracting step including the following substeps:
providing the predefined condition with a fact that a quality is determined for an extraction of the information using the local database;
comparing the quality with a predefined threshold value; and
using the super ordinate database to extract the information relating to the electronic document when the quality does not reach the predefined threshold value;
using information relating to which fields are intended to be extracted to extract the information from the electronic document; and
determining the information relating to which fields are intended to be extracted using at least one training document.
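The quality comparison and the training-document step of claim 1 could look roughly like the sketch below. Everything in it is an assumption made for illustration: the claim does not say how quality is measured, so quality is taken here to be the fraction of wanted fields that were found; a training document is modeled as a dictionary of annotated field names; each database maps a field name to a list of known values; and the function names and the default threshold of 0.5 are hypothetical.

```python
from typing import Dict, List, Tuple

Database = Dict[str, List[str]]


def fields_from_training(training_docs: List[Dict[str, str]]) -> List[str]:
    """Collect, in first-seen order, the field names annotated in the training documents."""
    fields: List[str] = []
    for doc in training_docs:
        for name in doc:
            if name not in fields:
                fields.append(name)
    return fields


def extract_fields(
    ocr_text: str, fields: List[str], database: Database
) -> Tuple[Dict[str, str], float]:
    """Find a known value for each wanted field; quality is the fraction of fields found."""
    lowered = ocr_text.lower()
    found: Dict[str, str] = {}
    for field in fields:
        for value in database.get(field, []):
            if value.lower() in lowered:
                found[field] = value
                break
    quality = len(found) / len(fields) if fields else 0.0
    return found, quality


def process(
    ocr_text: str,
    training_docs: List[Dict[str, str]],
    local_db: Database,
    superordinate_db: Database,
    threshold: float = 0.5,  # assumed value; the claim only requires a predefined threshold
) -> Dict[str, str]:
    """Extract with the local database; fall back when quality does not reach the threshold."""
    fields = fields_from_training(training_docs)
    result, quality = extract_fields(ocr_text, fields, local_db)
    if quality < threshold:
        result, _ = extract_fields(ocr_text, fields, superordinate_db)
    return result
```

A single annotated document such as `{"supplier": "Acme GmbH", "currency": "EUR"}` is enough here to tell `process` that the fields `supplier` and `currency` are wanted.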
15. An apparatus for processing an electronic document, the apparatus comprising:
a computer set up such that a local database can be used to extract information relating to the electronic document, and a super ordinate database can be used to extract the information relating to the electronic document when a predefined condition is met, the predefined condition including a fact that an extraction of the information via the local database does not provide any results or does not provide any good results, wherein the electronic document is an electronically scanned optical character recognized preprocessed document, said computer being programmed to:
provide the predefined condition with a fact that a quality is determined for an extraction of the information using the local database;
compare the quality with a predefined threshold value;
use the super ordinate database to extract the information relating to the electronic document when the quality does not reach the predefined threshold value;
use information relating to which fields are intended to be extracted to extract the information from the electronic document; and
determine the information relating to which fields are intended to be extracted using at least one training document.
17. A system for processing an electronic document, comprising:
at least one apparatus having a computer set up such that a local database can be used to extract information relating to the electronic document, and a super ordinate database can be used to extract the information relating to the electronic document when a predefined condition is met, the predefined condition including a fact that an extraction of the information via the local database does not provide any results or does not provide any good results, wherein the electronic document is an electronically scanned optical character recognized preprocessed document, said computer being programmed to:
provide the predefined condition with a fact that a quality is determined for an extraction of the information using the local database;
compare the quality with a predefined threshold value;
use the super ordinate database to extract the information relating to the electronic document when the quality does not reach the predefined threshold value;
use information relating to which fields are intended to be extracted to extract the information from the electronic document; and
determine the information relating to which fields are intended to be extracted using at least one training document.
18. Computer executable instructions to be loaded into a non-transitory memory of a digital computer, for performing a method for processing an electronic document, which comprises the steps of:
extracting information relating to the electronic document via a local database, wherein the electronic document is an electronically scanned optical character recognized preprocessed document;
extracting the information relating to the electronic document when a predefined condition is met via a super ordinate database, the predefined condition including a fact that an extraction of the information via the local database does not provide any results or does not provide any good results, the extracting step including the following substeps:
providing the predefined condition with a fact that a quality is determined for an extraction of the information using the local database;
comparing the quality with a predefined threshold value; and
using the super ordinate database to extract the information relating to the electronic document when the quality does not reach the predefined threshold value;
using information relating to which fields are intended to be extracted to extract the information from the electronic document; and
determining the information relating to which fields are intended to be extracted using at least one training document.
19. A non-transitory computer-readable storage medium having computer executable instructions to be executed by a computer for performing a method for processing an electronic document, which comprises the steps of:
extracting information relating to the electronic document via a local database, wherein the electronic document is an electronically scanned optical character recognized preprocessed document;
extracting the information relating to the electronic document when a predefined condition is met via a super ordinate database, the predefined condition including a fact that an extraction of the information via the local database does not provide any results or does not provide any good results, the extracting step including the following substeps:
providing the predefined condition with a fact that a quality is determined for an extraction of the information using the local database;
comparing the quality with a predefined threshold value; and
using the super ordinate database to extract the information relating to the electronic document when the quality does not reach the predefined threshold value;
using information relating to which fields are intended to be extracted to extract the information from the electronic document; and
determining the information relating to which fields are intended to be extracted using at least one training document.
Specification