Document matching degree operating system, document matching degree operating method and document matching degree operating program
Abstract
The invention calculates a document matching degree, which indicates how well a target document matches one or more search terms, from information held in a plural documents information storing part. For each search term, it calculates a TF term reflecting the frequency of the input search term in the target document and an IDF term reflecting the importance of the input search term in the target document, and obtains the document matching degree from the TF term and the IDF term. In calculating the TF term, an expectation value of the number of appearances of a search term t in a target document d is calculated by approximating the document set σ(t) by an appearing document set κ(t), and the disagreement between this expectation value and the actual number of appearances of the search term t in the target document d is reflected in the TF term.
9 Claims
1. A document matching degree operating system for obtaining a document matching degree as an index value indicating a matching degree of a target document with one or more search terms from information on document set to which the one or more search terms are input and includes a plurality of documents including the target document to be a search target, the document matching degree operating system comprising:
a plural documents information storing part for storing the information on document set;
a TF term operating part for calculating a TF term reflecting a frequency of the input search term in the target document by retrieving a specific information from the plural documents information storing part;
an IDF term operating part for calculating an IDF term reflecting an importance of the input search term in the target document by retrieving a specific information from the plural documents information storing part; and
a document matching degree operating part for calculating the document matching degree from calculation results of the TF term operating part and the IDF term operating part, wherein the TF term operating part calculates an expectation value of a number of appearances of the search term in the target document in the case of including the target document in an appropriate document set for the search term, by approximating the document set by an appearing document set which is all documents in which the search term appears, and reflects, in the TF term, a difference of the expectation value with an actual number of appearances of the search term in the target document.
Dependent claims: 2, 3, 4.
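The expectation-value step in claim 1 can be illustrated with a small sketch. The claim does not give a formula, so everything below is an assumption made for illustration: documents are token lists, the expected count is taken as the term's overall rate within the appearing document set κ(t) multiplied by the target document's length, and the function name `tf_deviation` is hypothetical. It is not the patent's formula.

```python
def tf_deviation(term, target_id, docs):
    """Return (expected, actual, actual - expected) for `term` in one document.

    `docs` maps a document id to its list of tokens (an assumed layout).
    """
    # Appearing document set kappa(t): all documents in which the term appears.
    kappa = [tokens for tokens in docs.values() if term in tokens]

    # Assumed estimate: occurrences of the term per token across kappa(t).
    total_occurrences = sum(tokens.count(term) for tokens in kappa)
    total_length = sum(len(tokens) for tokens in kappa)
    rate = total_occurrences / total_length if total_length else 0.0

    # Expected count if the target behaved like an average member of kappa(t).
    target = docs[target_id]
    expected = rate * len(target)
    actual = target.count(term)
    return expected, actual, actual - expected


docs = {
    "d1": ["a", "a", "a", "b"],
    "d2": ["a", "c", "d", "e"],
    "d3": ["x", "y"],
}
print(tf_deviation("a", "d1", docs))
```

A positive difference means the term appears more often in the target than the appearing document set would suggest, and a negative one means less often; how that difference is folded into the TF term is left to the specification.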
5. A document matching degree operating system for obtaining a document matching degree as an index value indicating a matching degree of a target document with one or more search terms from information on document set to which the one or more search terms are input and includes a plurality of documents including the target document to be a search target, the document matching degree operating system comprising:
a plural documents information storing part for storing the information on document set;
a TF term operating part for calculating a TF term reflecting a frequency of the input search term in the target document by retrieving a specific information from the plural documents information storing part;
an IDF term operating part for calculating an IDF term reflecting an importance of the input search term in the target document by retrieving a specific information from the plural documents information storing part; and
a document matching degree operating part for calculating the document matching degree from calculation results of the TF term operating part and the IDF term operating part, wherein the IDF term operating part sets an average number of appearances of the search term per document in the document in which the search term appears as a repeatability of the search term in a document, and obtains the IDF term by the repeatability.
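The repeatability defined in claim 5 is the average number of appearances of the search term per document, taken over the documents in which the term appears. A minimal sketch follows, assuming documents are token lists; the function name `repeatability` is hypothetical. The claim does not say how the IDF term is derived from this quantity, so the sketch stops at the repeatability itself.

```python
def repeatability(term, docs):
    """Average occurrences of `term` per document, over documents containing it.

    `docs` maps a document id to its list of tokens (an assumed layout).
    """
    counts = [tokens.count(term) for tokens in docs.values()]
    appearing = [c for c in counts if c > 0]
    if not appearing:
        # The term appears nowhere; returning 0.0 is a choice made for this sketch.
        return 0.0
    return sum(appearing) / len(appearing)


docs = {
    "d1": ["a", "a", "a", "b"],
    "d2": ["a", "c", "d", "e"],
    "d3": ["x", "y"],
}
print(repeatability("a", docs))
```

Documents that do not contain the term are excluded from the average, so the value is at least 1 for any term that appears in the set.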
6. A document matching degree operating method for obtaining a document matching degree as an index value indicating a matching degree of a target document with one or more search terms from information on document set to which the one or more search terms are input and includes a plurality of documents including the target document to be a search target, the document matching degree operating method comprising:
a TF term operating step for calculating a TF term reflecting a frequency of the input search term in the target document by retrieving specific information from a plural documents information storing part for storing information on document set;
an IDF term operating step for calculating an IDF term reflecting an importance of the input search term in the target document by retrieving a specific information from the plural documents information storing part; and
a document matching degree operating step for calculating the document matching degree from calculation results of the TF term operating step and the IDF term operating step.
Dependent claims: 7, 8, 9.
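The three steps of the method in claim 6 can be sketched as a small pipeline. Claim 6 names the steps but not their formulas, so this sketch substitutes the conventional TF-IDF definitions (relative frequency for TF, log(N/df) for IDF, and a sum of products over the search terms). Those formulas and all function names are assumptions for illustration, not the patent's own calculation.

```python
import math


def tf_step(term, tokens):
    """TF term operating step (assumed form: relative frequency in the target)."""
    return tokens.count(term) / len(tokens) if tokens else 0.0


def idf_step(term, docs):
    """IDF term operating step (assumed form: log of N over document frequency)."""
    df = sum(1 for tokens in docs.values() if term in tokens)
    return math.log(len(docs) / df) if df else 0.0


def matching_degree(search_terms, target_id, docs):
    """Document matching degree operating step: combine TF and IDF per term."""
    target = docs[target_id]
    return sum(tf_step(t, target) * idf_step(t, docs) for t in search_terms)


docs = {
    "d1": ["a", "a", "a", "b"],
    "d2": ["a", "c", "d", "e"],
    "d3": ["x", "y"],
}
print(matching_degree(["a", "b"], "d1", docs))
```

Keeping the TF and IDF steps as separate functions mirrors the claim's structure, so either step could be swapped for a different calculation without touching the combining step.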
Specification