Method and apparatus for automatically extracting metadata from electronic documents using spatial rules
Abstract
A spatial knowledge base approach for the automatic extraction of metadata 116 from electronic documents 100. The electronic document 100 is converted to a substantially format invariant data file 104 by an intermediate language conversion element 102. Spatial layout facts 108 are extracted and combined with spatial layout rules 114 from a knowledge engineer 112 in a spatial metadata-reasoning element 110 to provide the metadata 116. The invention is based on mimicking the visual and spatial knowledge that humans make use of when reading a document.
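The combination of spatial layout facts and spatial layout rules described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the `TextBlock` fact structure, the US Letter page height, and the two rules (the title is the largest text in the top third of the first page; the author line is the first block below the title).

```python
from dataclasses import dataclass

PAGE_HEIGHT = 792.0  # assumed US Letter page height, in points


@dataclass(frozen=True)
class TextBlock:
    """One hypothetical spatial layout fact: text plus where and how it appears."""
    text: str
    page: int
    top: float        # distance from the top of the page, in points
    font_size: float  # in points


def find_title(facts):
    """Illustrative rule: largest font in the top third of page 1."""
    top_third = [f for f in facts if f.page == 1 and f.top < PAGE_HEIGHT / 3]
    return max(top_third, key=lambda f: f.font_size, default=None)


def find_author(facts, title):
    """Illustrative rule: the first block positioned below the title on page 1."""
    if title is None:
        return None
    below = [f for f in facts if f.page == 1 and f.top > title.top]
    return min(below, key=lambda f: f.top, default=None)


def extract_metadata(facts):
    """Apply the rules to the facts and return the metadata as a dict."""
    title = find_title(facts)
    author = find_author(facts, title)
    return {
        "title": title.text if title else None,
        "author": author.text if author else None,
    }


facts = [
    TextBlock("Journal of Examples, Vol. 1", page=1, top=30.0, font_size=9.0),
    TextBlock("Spatial Rules for Metadata", page=1, top=90.0, font_size=20.0),
    TextBlock("A. Author", page=1, top=130.0, font_size=12.0),
    TextBlock("Body text begins here.", page=1, top=300.0, font_size=10.0),
]
metadata = extract_metadata(facts)
```

The rules never inspect the wording of a block, only its position and size, which is the sense in which such rules imitate the visual cues a human reader relies on.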
16 Claims
1. An apparatus for automatically extracting metadata from electronic documents comprising a first processing element, a second processing element, a reasoning element, and a database, wherein,
i) said first processing element is further configured to convert electronic documents into files;
ii) said first processing element is configured to provide the files to a second processing element;
iii) said second processing element is configured to receive said files and extract predetermined information;
iv) said second processing element is further configured to provide said extracted predetermined information to said reasoning element;
v) said database is configured to also provide input to said reasoning element;
vi) said reasoning element is configured to use a set of rules to extract metadata from the files; and
vii) said reasoning element provides an output of metadata.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A method for automatically extracting metadata from electronic documents providing a first processing element, a second processing element, a reasoning element, and a database and comprising the steps of:
a) using said first processing element to convert electronic documents to files;
b) further using said first processing element to provide the files to said second processing element;
c) using said second processing element to receive said files and extract predetermined information;
d) further using said second processing element to provide extracted predetermined information to said reasoning element;
e) using said database to provide input to said reasoning element;
f) using a set of rules in said reasoning element to extract metadata from the files;
g) providing an output of metadata from said reasoning element.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
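Steps a) through g) of the method describe a data flow between four components, which can be sketched as a chain of function calls. This is only an illustration under stated assumptions: the function names, the use of plain text lines as the converted "file", and the choice to store the rules in the database are all hypothetical. The claim says only that the database provides input to the reasoning element.

```python
def convert(document):
    """Steps a-b: first processing element. Here 'conversion' is just line splitting."""
    return document.splitlines()


def extract_information(file_lines):
    """Steps c-d: second processing element. Records each non-empty line and its index."""
    return [
        {"line": index, "text": text.strip()}
        for index, text in enumerate(file_lines)
        if text.strip()
    ]


# Step e: a stand-in 'database' mapping each metadata field to a rule (a predicate).
# Holding the rules here is an assumption, not something the claim specifies.
RULE_DATABASE = {
    "title": lambda item: item["line"] == 0,
    "date": lambda item: item["text"].startswith("Date:"),
}


def reason(information, rules):
    """Steps f-g: reasoning element. The first item satisfying a rule fills that field."""
    metadata = {}
    for field, predicate in rules.items():
        match = next((item["text"] for item in information if predicate(item)), None)
        if match is not None:
            metadata[field] = match
    return metadata


def run_pipeline(document, rules=RULE_DATABASE):
    """Chain the elements in the order given by steps a) through g)."""
    return reason(extract_information(convert(document)), rules)


result = run_pipeline("Annual Report\nDate: 2001-05-04\nBody text")
```

Keeping the rules in a separate structure means new metadata fields can be supported by adding rules, without touching the conversion or extraction code.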
Specification