Method for automatically indexing documents
Abstract
Methods and systems for automatically indexing a set of base documents are described. Data defining all elements respectively contained within the base documents is indexed. The elements comprise an element to be checked and surrounding elements that surround the element to be checked. The received data is evaluated as to whether the element to be checked has the corresponding meaning, wherein training has been performed based on a training sample generated for base documents in which the element to be checked is surrounded by the surrounding elements and has the corresponding meaning. For those base documents where elements have been found to have the corresponding meaning, an index of the base documents is built, the index comprising the elements having the corresponding meaning with a corresponding reference to the document in which each element is contained.
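The index described in the abstract can be pictured as a mapping from each accepted element to the documents that contain it. The following is only a minimal sketch under stated assumptions: the function name `build_index`, the document identifiers, and the sample element values are hypothetical and do not come from the patent text.

```python
from collections import defaultdict


def build_index(accepted):
    """Build an element -> document-reference index.

    `accepted` is an iterable of (doc_id, element) pairs for elements that
    were found to have the corresponding meaning (hypothetical input shape).
    """
    index = defaultdict(set)
    for doc_id, element in accepted:
        index[element].add(doc_id)
    # Sort the references so the result is deterministic.
    return {element: sorted(doc_ids) for element, doc_ids in index.items()}


# Hypothetical example data: invoice numbers found in three documents.
accepted = [
    ("doc-001", "INV-2041"),
    ("doc-002", "INV-2042"),
    ("doc-003", "INV-2041"),
]
index = build_index(accepted)
```

Looking up `index["INV-2041"]` then returns the references of every document in which that element was accepted.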
9 Claims
1. A method for automatically indexing a set of base documents, said method comprising:
inputting, with a computer device, data defining all elements respectively contained within the base documents which meet predefined criteria and defining a corresponding meaning, the elements comprising an element to be checked and surrounding elements that surround the element to be checked;
evaluating, with a trainable classifying apparatus in communication with the computer device which has been trained to recognize whether inputted data belongs to a corresponding classification category or not, the received data as to whether the element to be checked has the corresponding meaning, wherein the training has been performed based on a training sample generated for base documents in which the element to be checked is surrounded by the surrounding elements and has the corresponding meaning, the received data comprising data coding absolute or relative positions of the surrounding elements by corresponding text strings; and
for those base documents where elements have been found to have the corresponding meaning, building, with the computer device, an index indexing the base documents, the index comprising the elements having the corresponding meaning with a corresponding reference to the document in which each element is contained.
(Dependent claims: 2, 3, 4, 5)
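Claim 1 recites received data "coding absolute or relative positions of the surrounding elements by corresponding text strings". One possible reading, shown here purely as an illustrative sketch, is to prefix each neighbouring token with its signed offset from the element to be checked. The function name `encode_context`, the `window` parameter, and the sample tokens are assumptions made for this example, not details taken from the claims.

```python
def encode_context(tokens, target, window=2):
    """Encode the neighbours of tokens[target] as position-tagged strings.

    Each surrounding element becomes "<signed offset>:<lowercased token>",
    so the relative position is carried inside the text string itself.
    """
    features = []
    for offset in range(-window, window + 1):
        pos = target + offset
        # Skip the element to be checked and positions outside the document.
        if offset == 0 or pos < 0 or pos >= len(tokens):
            continue
        features.append(f"{offset:+d}:{tokens[pos].lower()}")
    return features


# Hypothetical document fragment; the element to be checked is "4711".
tokens = ["Invoice", "No", "4711", "dated", "2001-03-05"]
features = encode_context(tokens, target=2)
```

Because the position is part of the string, a classifier that only sees a bag of strings can still distinguish "no" immediately before the candidate from "no" two places after it.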
6. A system for automatically indexing a set of base documents, said system comprising:
a trainable classifying apparatus configured for:
inputting, with a computer device, data defining all elements respectively contained within the base documents which meet predefined criteria and defining a corresponding meaning, the elements comprising an element to be checked and surrounding elements that surround the element to be checked;
evaluating, with the trainable classifying apparatus, which has been trained to recognize whether inputted data belongs to a corresponding classification category or not, the received data as to whether the element to be checked has the corresponding meaning, wherein the training has been performed based on a training sample generated for base documents in which the element to be checked is surrounded by the surrounding elements and has the corresponding meaning, the received data comprising data coding absolute or relative positions of the surrounding elements by corresponding text strings; and
for those base documents where elements have been found to have the corresponding meaning, building, with the computer device, an index indexing the base documents, the index comprising the elements having the corresponding meaning with a corresponding reference to the document in which each element is contained.
(Dependent claims: 7, 8, 9)
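The claims do not state which kind of trainable classifier is used. As a rough illustration of the "trained to recognize whether inputted data belongs to a corresponding classification category or not" step, the sketch below substitutes a simple perceptron over position-tagged context strings. The class name `ContextPerceptron`, the feature strings, and the training sample are all hypothetical stand-ins, not the patented apparatus.

```python
class ContextPerceptron:
    """Binary perceptron over string features (illustrative stand-in only)."""

    def __init__(self):
        self.weights = {}
        self.bias = 0.0

    def score(self, features):
        return self.bias + sum(self.weights.get(f, 0.0) for f in features)

    def predict(self, features):
        # True means "the element to be checked has the corresponding meaning".
        return self.score(features) > 0

    def train(self, samples, epochs=10):
        for _ in range(epochs):
            for features, label in samples:
                if self.predict(features) != label:
                    # Standard perceptron update on a misclassified sample.
                    step = 1.0 if label else -1.0
                    self.bias += step
                    for f in features:
                        self.weights[f] = self.weights.get(f, 0.0) + step


# Hypothetical training sample: context strings around candidate elements,
# labelled by whether the candidate really was an invoice number.
samples = [
    (["-2:invoice", "-1:no", "+1:dated"], True),
    (["-1:total", "+1:eur"], False),
    (["-1:page", "+1:of"], False),
]
classifier = ContextPerceptron()
classifier.train(samples)

# Candidates from new documents: (doc_id, element, context features).
candidates = [
    ("doc-010", "4712", ["-2:invoice", "-1:no", "+1:from"]),
    ("doc-011", "3", ["-1:page", "+1:of"]),
]
accepted = [(doc, elem) for doc, elem, feats in candidates
            if classifier.predict(feats)]
```

Only the candidates the classifier accepts would then be passed on to the index-building step, each paired with the reference of the document it came from.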
Specification