Index term extraction device for document-to-be-surveyed
First Claim
1. An index term extraction device comprising:
input means for inputting a document-to-be-surveyed, documents-to-be-compared that are compared with the document-to-be-surveyed, and similar documents that are similar to the document-to-be-surveyed;
index term extraction means for extracting index terms from the document-to-be-surveyed;
first appearance frequency calculation means for calculating a function value of an appearance frequency of each of the extracted index terms in the documents-to-be-compared;
second appearance frequency calculation means for calculating a function value of an appearance frequency of each of the extracted index terms in the similar documents; and
output means for outputting each index term and its positioning data based on the combination of the function value of the appearance frequency in the documents-to-be-compared and the function value of the appearance frequency in the similar documents, respectively calculated for each index term,
wherein at least one of the function value of the appearance frequency in the documents-to-be-compared calculated by the first appearance frequency calculation means and the function value of the appearance frequency in the similar documents calculated by the second appearance frequency calculation means has a global frequency IDF as its variable.
Abstract
A device comprises input means (1) for inputting a document (d) to be examined, a group of documents (P) to be compared, and a group of similar documents (S); index word extracting means (120) for extracting index words from the document (d); first frequency calculating means (143) for calculating ln GFIDF(P) of each extracted index word in the document group (P); second frequency calculating means (171) for calculating ln GFIDF(S) of each extracted index word in the similar document group (S); and output means (4) for outputting the index words and their positioning data according to the combination of the calculated ln GFIDF(P) and ln GFIDF(S) in the document group to be compared and the similar document group. As a result, when a document to be examined is given, the assertion of the document can be easily grasped.
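The calculation described in the abstract can be illustrated with a minimal Python sketch. This excerpt does not define GFIDF, so the sketch assumes one reading: GF is the total number of occurrences of an index word across a document group, IDF is ln(N / DF) where N is the group size and DF is the number of documents containing the word, and GFIDF is their product. The names `tokenize`, `ln_gfidf`, and `position_terms` are hypothetical, and the whitespace tokenizer merely stands in for the index word extracting means.

```python
import math

def tokenize(text):
    # Hypothetical stand-in for index word extraction: lowercase whitespace split.
    return text.lower().split()

def ln_gfidf(term, docs):
    """Return ln(GF * IDF) of `term` in the document group `docs`.

    Assumed definitions (not given in this excerpt):
      GF  = total occurrences of the term across the group
      DF  = number of documents containing the term
      IDF = ln(N / DF), with N the number of documents in the group
    Returns None when the logarithm is undefined (DF == 0 or DF == N).
    """
    n = len(docs)
    token_lists = [tokenize(d) for d in docs]
    gf = sum(tokens.count(term) for tokens in token_lists)
    df = sum(1 for tokens in token_lists if term in tokens)
    if df == 0 or df == n:
        return None
    return math.log(gf * math.log(n / df))

def position_terms(surveyed, compared, similar):
    """Map each index word of the surveyed document to the pair
    (ln GFIDF(P), ln GFIDF(S)), which serves as its positioning data."""
    terms = sorted(set(tokenize(surveyed)))
    return {t: (ln_gfidf(t, compared), ln_gfidf(t, similar)) for t in terms}
```

Under these assumptions each index word is placed at a point whose coordinates are its values in the compared group P and the similar group S. Words for which a coordinate is undefined are reported as None here; how such cases should be handled is a design choice this excerpt does not address.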
8 Claims
1. An index term extraction device comprising:
input means for inputting a document-to-be-surveyed, documents-to-be-compared that are compared with the document-to-be-surveyed, and similar documents that are similar to the document-to-be-surveyed;
index term extraction means for extracting index terms from the document-to-be-surveyed;
first appearance frequency calculation means for calculating a function value of an appearance frequency of each of the extracted index terms in the documents-to-be-compared;
second appearance frequency calculation means for calculating a function value of an appearance frequency of each of the extracted index terms in the similar documents; and
output means for outputting each index term and its positioning data based on the combination of the function value of the appearance frequency in the documents-to-be-compared and the function value of the appearance frequency in the similar documents, respectively calculated for each index term,
wherein at least one of the function value of the appearance frequency in the documents-to-be-compared calculated by the first appearance frequency calculation means and the function value of the appearance frequency in the similar documents calculated by the second appearance frequency calculation means has a global frequency IDF as its variable.
Dependent claims: 2, 3, 4, 5, 6
7. An index term extraction method comprising:
an input step for inputting a document-to-be-surveyed, documents-to-be-compared that are compared with the document-to-be-surveyed, and similar documents that are similar to the document-to-be-surveyed;
an index term extraction step for extracting index terms from the document-to-be-surveyed;
a first appearance frequency calculation step for calculating a function value of an appearance frequency of each of the extracted index terms in the documents-to-be-compared;
a second appearance frequency calculation step for calculating a function value of an appearance frequency of each of the extracted index terms in the similar documents; and
an output step for outputting each index term and its positioning data based on the combination of the function value of the appearance frequency in the documents-to-be-compared and the function value of the appearance frequency in the similar documents, respectively calculated for each index term,
wherein at least one of the function value of the appearance frequency in the documents-to-be-compared calculated by the first appearance frequency calculation step and the function value of the appearance frequency in the similar documents calculated by the second appearance frequency calculation step has a global frequency IDF as its variable.
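The claimed method requires only that at least one of the two function values has a global frequency IDF as its variable, so the other may be a different function of appearance frequency. The following Python sketch illustrates that flexibility by making both frequency functions pluggable. It is an assumption-laden illustration, not the patented implementation: `gfidf` assumes the reading GF × ln(N / DF), plain document frequency `df` is chosen arbitrarily as the second function, documents are pre-tokenized lists, and `extract_and_position` is a hypothetical name.

```python
import math
from typing import Callable, Dict, List, Tuple

Docs = List[List[str]]               # each document is a list of tokens
FreqFn = Callable[[str, Docs], float]

def df(term: str, docs: Docs) -> float:
    """Document frequency: number of documents containing the term."""
    return float(sum(term in d for d in docs))

def gfidf(term: str, docs: Docs) -> float:
    """Assumed reading of global frequency IDF: GF * ln(N / DF).
    Returns 0.0 when the term never appears in the group."""
    hits = df(term, docs)
    if hits == 0:
        return 0.0
    gf = sum(d.count(term) for d in docs)
    return gf * math.log(len(docs) / hits)

def extract_and_position(
    surveyed: List[str],
    compared: Docs,
    similar: Docs,
    f_compared: FreqFn = gfidf,      # first appearance frequency calculation step
    f_similar: FreqFn = df,          # second appearance frequency calculation step
) -> Dict[str, Tuple[float, float]]:
    # Index term extraction step (hypothetical: unique tokens of the surveyed document).
    terms = sorted(set(surveyed))
    # Output step: each term with its positioning data, a pair of function values.
    return {t: (f_compared(t, compared), f_similar(t, similar)) for t in terms}
```

Swapping `f_similar` for `gfidf`, or wrapping either function in a logarithm, changes only the arguments passed in; the four steps of the method stay the same.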
8. An index term extraction program for causing a computer to execute:
an input step for inputting a document-to-be-surveyed, documents-to-be-compared that are compared with the document-to-be-surveyed, and similar documents that are similar to the document-to-be-surveyed;
an index term extraction step for extracting index terms from the document-to-be-surveyed;
a first appearance frequency calculation step for calculating a function value of an appearance frequency of each of the extracted index terms in the documents-to-be-compared;
a second appearance frequency calculation step for calculating a function value of an appearance frequency of each of the extracted index terms in the similar documents; and
an output step for outputting each index term and its positioning data based on the combination of the function value of the appearance frequency in the documents-to-be-compared and the function value of the appearance frequency in the similar documents, respectively calculated for each index term,
wherein at least one of the function value of the appearance frequency in the documents-to-be-compared calculated by the first appearance frequency calculation step and the function value of the appearance frequency in the similar documents calculated by the second appearance frequency calculation step has a global frequency IDF as its variable.
Specification