Method and device for relevant document search
Abstract
Character strings are extracted from a seed text which is inputted as a search condition for searching prestored object documents for a relevant document. Each object document is partitioned into a plurality of blocks, and character strings are extracted from each block. Similarity of each block to the seed text is calculated by comparing the character strings extracted from the block and the character strings extracted from the seed text. Whether or not each block is relevant to the seed text is judged by comparing the calculated similarity of the block with a preset threshold value. Based on the judgment, an “inclusion degree” of each object document (including the blocks) regarding the seed text is calculated, by which object documents relevant to the seed text are outputted.
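The flow described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: character trigrams stand in for the extracted "character strings", blocks are assumed to be blank-line-separated paragraphs, similarity is assumed to be the fraction of the seed's strings found in a block, and the inclusion degree is assumed to be the share of blocks judged relevant. The function names and the default threshold of 0.5 are hypothetical.

```python
from typing import List, Set


def extract_strings(text: str, n: int = 3) -> Set[str]:
    """Extract character n-grams (an assumed stand-in for 'character strings')."""
    normalized = " ".join(text.lower().split())
    if len(normalized) < n:
        return {normalized} if normalized else set()
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def partition(document: str) -> List[str]:
    """Split a document into blocks (assumption: one block per paragraph)."""
    return [block.strip() for block in document.split("\n\n") if block.strip()]


def similarity(block_strings: Set[str], seed_strings: Set[str]) -> float:
    """Fraction of the seed's strings that also occur in the block (assumed measure)."""
    if not seed_strings:
        return 0.0
    return len(block_strings & seed_strings) / len(seed_strings)


def inclusion_degree(document: str, seed_text: str, threshold: float = 0.5) -> float:
    """Share of blocks whose similarity exceeds the preset threshold (assumed ratio form)."""
    seed_strings = extract_strings(seed_text)
    blocks = partition(document)
    if not blocks:
        return 0.0
    relevant = sum(
        1 for block in blocks
        if similarity(extract_strings(block), seed_strings) > threshold
    )
    return relevant / len(blocks)


if __name__ == "__main__":
    doc = "solar panel efficiency\n\ncooking pasta recipes"
    print(inclusion_degree(doc, "solar panel efficiency"))
```

Judging each block separately means a long document that contains one closely matching passage still receives a nonzero inclusion degree, even if the document as a whole would score low on a single whole-document comparison.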
20 Claims
1. A document search method for finding a document relevant to a search condition from object documents as search objects, comprising the steps of:
acquiring a seed text which is inputted as the search condition;
partitioning the object document into a plurality of blocks;
calculating similarity of each block of the object document to the seed text;
comparing the calculated similarity with a preset threshold value and thereby judging whether or not each block is relevant to the seed text; and
calculating an inclusion degree of the object document including the blocks regarding the seed text based on the result of the judgment.
Dependent claims: 2, 12, 13, 14
3. A document search device for finding a relevant document from object documents as search objects, comprising:
a seed text acquisition module which acquires a seed text as a search condition;
a partitioning module which partitions the object document into a plurality of blocks;
a similarity calculation module which calculates similarity of each block of the object document to the seed text;
an inclusion degree calculation module which compares the calculated similarity of each block with a preset threshold value, thereby judges whether or not each block is relevant to the seed text, and calculates an inclusion degree of the object document including the blocks regarding the seed text based on the result of the judgment.
Dependent claims: 4, 5, 15
6. A computer-readable record medium storing a program for instructing a computer etc. to execute a relevant document search method for finding a relevant document from object documents as search objects, wherein the relevant document search method comprises the steps of:
acquiring a seed text as a search condition for searching the object documents;
partitioning the object document into a plurality of blocks;
calculating similarity of each block of the object document to the seed text;
comparing the calculated similarity with a preset threshold value;
judging whether or not each block is relevant to the seed text based on the comparison and thereby counting the number of blocks relevant to the seed text; and
calculating an inclusion degree of the object document regarding the seed text based on the counted number of the relevant blocks.
7. A document relevancy judgment method for judging relevancy of a previously stored object document to a seed text as a search condition, comprising the steps of:
partitioning the object document into a plurality of blocks;
calculating similarity of each block of the object document to the seed text;
comparing the calculated similarity with a preset threshold value and thereby judging whether or not each block is relevant to the seed text;
counting the number of blocks relevant to the seed text based on the judgment; and
calculating an inclusion degree of the object document including the blocks regarding the seed text based on the counted number of the relevant blocks.
Dependent claims: 8, 9
10. A relevant document search method for finding a document from object documents as search objects, comprising the steps of:
acquiring a full-text search condition which is inputted as a search condition;
partitioning the object document into a plurality of blocks;
calculating similarity of each block of the object document to the full-text search condition;
comparing the calculated similarity with a preset threshold value and thereby judging whether or not each block is relevant to the full-text search condition; and
calculating an inclusion degree of the object document including the blocks regarding the full-text search condition based on the result of the judgment.
Dependent claims: 11
16. A relevant document search device for finding a relevant document from object documents as previously registered search objects, comprising:
a partitioning module which partitions the object document into a plurality of blocks;
a characteristic string extraction module which extracts characteristic strings from each block of the object document;
a block characteristic string storage module which stores the extracted characteristic strings associating them with each block;
a seed text acquisition module which acquires a seed text as a search condition;
a similarity calculation module which calculates similarity of each block to the seed text by comparing the characteristic strings of the block stored in the block characteristic string storage module with characteristic strings extracted from the seed text by the characteristic string extraction module;
an inclusion degree calculation module which counts the number of blocks having the similarity higher than a preset value and calculates an inclusion degree of the object document regarding the seed text based on the counted number of blocks and the total number of blocks included in the object document.
Dependent claims: 17
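The module arrangement recited in claim 16 can be illustrated with a small sketch in which characteristic strings are extracted once at registration time, stored per block, and reused for every later seed text. Everything named here is hypothetical: `BlockStringStore`, the use of character bigrams as characteristic strings, one block per non-empty line, the seed-overlap similarity, the default preset value of 0.3, and the plain ratio of counted blocks to total blocks are all assumptions, since the claim fixes none of them.

```python
from typing import Dict, List, Set


def characteristic_strings(text: str, n: int = 2) -> Set[str]:
    """Character bigrams with whitespace removed (assumed 'characteristic strings')."""
    compact = "".join(text.lower().split())
    return {compact[i:i + n] for i in range(len(compact) - n + 1)}


class BlockStringStore:
    """Hypothetical store mapping a document id to its per-block string sets."""

    def __init__(self) -> None:
        self._blocks: Dict[str, List[Set[str]]] = {}

    def register(self, doc_id: str, text: str) -> None:
        # Partitioning step (assumption: one block per non-empty line).
        blocks = [line for line in text.splitlines() if line.strip()]
        # Extraction and storage steps: strings are kept per block.
        self._blocks[doc_id] = [characteristic_strings(b) for b in blocks]

    def inclusion_degrees(self, seed_text: str, preset: float = 0.3) -> Dict[str, float]:
        """Return counted relevant blocks divided by total blocks, per document."""
        seed = characteristic_strings(seed_text)
        degrees: Dict[str, float] = {}
        for doc_id, blocks in self._blocks.items():
            if not blocks or not seed:
                degrees[doc_id] = 0.0
                continue
            # Count blocks whose similarity is higher than the preset value.
            hits = sum(1 for b in blocks if len(b & seed) / len(seed) > preset)
            degrees[doc_id] = hits / len(blocks)
        return degrees


if __name__ == "__main__":
    store = BlockStringStore()
    store.register("d1", "fuel cell stack\nfuel cell membrane\ngarden hose")
    store.register("d2", "garden hose\nlawn mower")
    print(store.inclusion_degrees("fuel cell"))
```

Storing the strings per block at registration time means a search only has to process the seed text, at the cost of extra storage for every registered document.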
18. A program for letting a document search system execute a process for finding a document relevant to a search condition from object documents as search objects, wherein the process comprises the steps of:
acquiring a seed text as the search condition;
partitioning the object document into a plurality of blocks;
calculating similarity of each block of the object document to the acquired seed text;
calculating an inclusion degree of the object document regarding the seed text by judging whether or not the similarity of each block of the object document is higher than a preset value.
Dependent claims: 19, 20
Specification