Method for Automatically Indexing Documents
First Claim
1. A method for retrieving based on a search term together with its corresponding meaning from a set of base documents those documents which contain said search term and in which said certain search term has said certain meaning to enable the building of an index on said retrieved documents, said method comprising:
searching for those base documents among said set of base documents which contain said certain search term;
evaluating the found base documents as to whether said search term contained in said found base documents, respectively, has a certain meaning, said evaluation comprising:
generating a text document to represent elements surrounding the search term and their corresponding absolute or relative position with respect to said search term, the elements of said text document coding said absolute or relative positions of said surrounding elements by corresponding text strings;
inputting said text document into a trainable classifying apparatus which has been trained to recognize whether an inputted text document belongs to a certain classification category or not, whereas said training has been performed based on a training sample of text documents which have been generated for documents in which the term surrounded by the surrounding elements has said meaning inputted by said user; and
classifying said inputted text document to judge whether said search term has said inputted meaning.
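The "generating a text document" step above can be illustrated with a minimal sketch. The claim only requires that each surrounding element's absolute or relative position be coded as a text string; the `L<n>_`/`R<n>_` prefix scheme, the `window` parameter, the word-level tokenization, and the function name `encode_context` are all assumptions made for this illustration, not details taken from the patent.

```python
import re

def encode_context(text, term, window=3):
    """Return one context "text document" per occurrence of `term` in `text`.

    Each surrounding token is written as a string that carries its relative
    position, e.g. "L1_the" (one token to the left) or "R2_loan" (two tokens
    to the right). Prefix scheme and window size are illustrative assumptions.
    """
    tokens = re.findall(r"\w+", text.lower())
    target = term.lower()
    docs = []
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        parts = []
        for offset in range(-window, window + 1):
            j = i + offset
            # Skip the search term itself and positions outside the text.
            if offset == 0 or j < 0 or j >= len(tokens):
                continue
            side = "L" if offset < 0 else "R"
            parts.append(f"{side}{abs(offset)}_{tokens[j]}")
        docs.append(" ".join(parts))
    return docs

example = encode_context("We sat on the bank of the river.", "bank", window=2)
print(example)  # ['L2_on L1_the R1_of R2_the']
```

Because the positions are folded into the token strings themselves, the resulting context document can be handed to any classifier that operates on plain text.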
Abstract
A method for retrieving, from a set of base documents and based on a search term together with its corresponding meaning, those documents which contain the search term and in which the search term has that meaning, so that an index can be built on the retrieved documents. The method includes searching the set of base documents for those which contain the search term and evaluating whether the search term in each found document has the specified meaning. The evaluation includes generating a text document that represents the elements surrounding the search term and their absolute or relative positions with respect to the search term; inputting the text document into a trainable classifying apparatus; and classifying the inputted text document to judge whether the search term has the inputted meaning.
11 Claims
6. A method of training a classifying apparatus to retrieve based on a search term together with its corresponding meaning from a set of base documents those documents which contain said search term and in which said certain search term has said certain meaning to enable the building of an index on said retrieved documents, said method of training comprising:
looking for base documents in which an element has a certain meaning;
selecting said element by the user;
repeating said steps of looking and selecting until a sufficient set of base documents has been selected to generate a training sample;
generating the text documents for the respective base documents; and
using said generated text documents as a training set for training said classifying apparatus by running said classifying apparatus in the training mode.
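The training mode described in claim 6 might look roughly like the sketch below. The claim does not say what kind of classifier the "classifying apparatus" is, so a small multinomial naive Bayes model is swapped in here purely as an assumption. The class name, the labeled pairs (which stand in for the user's selections), and the position-coded token format are likewise hypothetical.

```python
import math
from collections import Counter

class NaiveBayesContextClassifier:
    """Tiny multinomial naive Bayes over position-coded context tokens.

    A stand-in for the trainable classifying apparatus; the model choice
    is an assumption, not something the claim specifies.
    """

    def __init__(self):
        self.token_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}
        self.vocab = set()

    def train(self, samples):
        """samples: iterable of (context_document, has_meaning) pairs."""
        for doc, label in samples:
            tokens = doc.split()
            self.token_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)

    def _log_score(self, tokens, label):
        total_docs = sum(self.doc_counts.values())
        # Laplace-smoothed class prior and token likelihoods.
        score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        denom = sum(self.token_counts[label].values()) + len(self.vocab)
        for tok in tokens:
            if tok in self.vocab:  # tokens never seen in training are ignored
                score += math.log((self.token_counts[label][tok] + 1) / denom)
        return score

    def classify(self, doc):
        """Return True if the context suggests the trained meaning."""
        tokens = doc.split()
        return self._log_score(tokens, True) > self._log_score(tokens, False)

# Hypothetical training sample: "bank" in the financial sense (True) or not (False).
training_sample = [
    ("L2_at L1_the R1_to R2_deposit", True),
    ("L2_from L1_my R1_account R2_today", True),
    ("L2_on L1_the R1_of R2_the", False),
    ("L2_along L1_river R1_was R2_muddy", False),
]
clf = NaiveBayesContextClassifier()
clf.train(training_sample)
```

Calling `clf.classify("L1_the R1_account R2_deposit")` then judges a new context document against the trained meaning.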
8. A method for automatically indexing a set of base documents based on a set of training examples, said automatic indexing comprising:
evaluating said base documents by checking for some or all elements respectively contained therein whether they have a certain meaning, said evaluation comprising:
for those elements to be checked, generating a text document based on said element to be checked and its surrounding elements coding for their corresponding absolute or relative position with respect to said element to be checked;
inputting said text documents into a trainable classifying apparatus which has been trained to recognize whether an inputted text document belongs to a certain classification category or not, whereas said training has been performed based on a training sample of text documents which have been generated for documents in which the element surrounded by the surrounding elements has said certain meaning;
judging by said classifying apparatus whether said element has said certain meaning; and
for those base documents where elements have been found to have said certain meaning, using said elements and a corresponding reference to the document in which they are contained to build an index indexing said large volume of base documents.
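The index-building step of claim 8 can be sketched as follows, under stated assumptions. The names `build_index` and `context_documents`, the dictionary-based index layout, and the `has_meaning` callable are hypothetical. In particular, `has_meaning` is a hand-written rule that stands in for a trained classifier so that the example stays self-contained.

```python
import re
from collections import defaultdict

def context_documents(text, term, window=2):
    """Yield a position-coded context string for each occurrence of `term`."""
    tokens = re.findall(r"\w+", text.lower())
    for i, tok in enumerate(tokens):
        if tok == term:
            yield " ".join(
                f"{'L' if k < 0 else 'R'}{abs(k)}_{tokens[i + k]}"
                for k in range(-window, window + 1)
                if k != 0 and 0 <= i + k < len(tokens)
            )

def build_index(base_documents, term, has_meaning):
    """Map `term` to the ids of documents where it carries the wanted meaning.

    base_documents: dict of document id -> text
    has_meaning: callable standing in for the trained classifier (assumption)
    """
    index = defaultdict(list)
    for doc_id, text in base_documents.items():
        contexts = context_documents(text, term.lower())
        if any(has_meaning(ctx) for ctx in contexts):
            index[term].append(doc_id)
    return dict(index)

# Hypothetical base documents and a rule-based stand-in for the classifier.
docs = {
    "a": "The bank account was opened yesterday.",
    "b": "They fished from the bank of the river.",
}

def looks_financial(ctx):
    return any(t.endswith(("_account", "_deposit", "_loan")) for t in ctx.split())

index = build_index(docs, "bank", looks_financial)
print(index)  # {'bank': ['a']}
```

Each index entry pairs the element with references to the documents in which it was judged to have the wanted meaning, so documents that merely contain the string are left out.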
11. A computer program comprising computer program code for enabling a computer to retrieve based on a search term together with its corresponding meaning from a set of base documents those documents which contain said search term and in which said certain search term has said certain meaning to enable the building of an index on said retrieved documents, said computer program code comprising instructions for:
searching for those base documents among said set of base documents which contain said certain search term;
evaluating the found base documents as to whether said search term contained in said found base documents, respectively, has a certain meaning, said evaluation comprising:
generating a text document to represent elements surrounding the search term and their corresponding absolute or relative position with respect to said search term, the elements of said text document coding said absolute or relative positions of said surrounding elements by corresponding text strings;
inputting said text document into a trainable classifying apparatus which has been trained to recognize whether an inputted text document belongs to a certain classification category or not, whereas said training has been performed based on a training sample of text documents which have been generated for documents in which the term surrounded by the surrounding elements has said meaning inputted by said user; and
classifying said inputted text document to judge whether said search term has said inputted meaning.
Specification