Document search method and system, and document search result display system
Abstract
A system for classification is automatically determined in accordance with search results, and the search results are displayed in a list according to the classification system, thereby assisting an interactive search, such as one for refining the search results. A group of categories representing a group of documents retrieved is automatically extracted by clustering, the degree of belonging of each of the retrieved documents to each of the categories is calculated, and the proportions of the degrees of belonging are displayed by a bar graph. The search results can be rearranged according to the degree of belonging to a designated category.
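The display behaviour described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names (`proportions`, `render_bar`, `rearrange`), the text-based bar, and the sample degrees are all assumptions made for the example.

```python
def proportions(degrees):
    """Convert raw degrees of belonging into proportions that sum to 1."""
    total = sum(degrees)
    if total == 0:
        return [0.0] * len(degrees)
    return [d / total for d in degrees]


def render_bar(props, width=20, symbols="ABCDEFGH"):
    """Draw a text bar whose segments are sized by each category's proportion.

    Hypothetical rendering: one letter per category, repeated in
    proportion to the share of that category. Rounding may make the
    bar slightly shorter or longer than `width`.
    """
    return "".join(symbols[i] * round(p * width) for i, p in enumerate(props))


def rearrange(results, category):
    """Sort (doc_id, degrees) pairs by degree of belonging to one category."""
    return sorted(results, key=lambda item: item[1][category], reverse=True)


# Illustrative data only: three retrieved documents, three categories.
results = [
    ("doc1", [0.6, 0.2, 0.2]),
    ("doc2", [0.1, 0.8, 0.1]),
    ("doc3", [0.3, 0.3, 0.4]),
]

for doc_id, degrees in rearrange(results, category=1):
    print(doc_id, render_bar(proportions(degrees)))
```

Sorting on the raw degree rather than the proportion is a design choice made here for simplicity; the abstract only states that the results can be rearranged according to the degree of belonging to a designated category.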
295 Citations
19 Claims
1. A document retrieval method comprising the steps of:
searching a document database according to a search request;
representing each of a plurality of documents obtained by the search with a word vector having as elements words that appear;
classifying the multiple documents into a plurality of document groups by a clustering method using the word vectors;
representing each of the multiple document groups with a word vector having as elements words that appear;
calculating the degree of belonging of each document to each of the multiple document groups by using the word vector representing the document and the word vector representing the document group; and
outputting information identifying the multiple documents obtained by the search in association with the degree of belonging of each document to each of the multiple document groups.
(Dependent claims: 2, 3, 4)
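The word-vector and degree-of-belonging steps of claim 1 might be sketched as below. The claim does not state how the degree of belonging is computed from the two vectors, so cosine similarity is used here purely as an assumption; whitespace tokenization, summing member vectors to form the group vector, and all function names are likewise hypothetical. The grouping itself is taken as given input.

```python
import math
from collections import Counter


def word_vector(text):
    """Represent a document by the counts of the words that appear in it."""
    return Counter(text.lower().split())


def group_vector(member_vectors):
    """Represent a document group by summing its members' word vectors."""
    total = Counter()
    for vec in member_vectors:
        total.update(vec)
    return total


def cosine(a, b):
    """Cosine similarity between two sparse word vectors (assumed measure)."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def degrees_of_belonging(doc_vectors, groups):
    """Return one row per document holding its degree for every group.

    `groups` is a list of lists of document indices, e.g. the output
    of whatever clustering method was applied beforehand.
    """
    group_vecs = [group_vector([doc_vectors[i] for i in g]) for g in groups]
    return [[cosine(d, g) for g in group_vecs] for d in doc_vectors]


# Illustrative search results and an assumed grouping.
docs = ["apple banana", "apple apple", "car engine"]
vectors = [word_vector(d) for d in docs]
groups = [[0, 1], [2]]

for doc, row in zip(docs, degrees_of_belonging(vectors, groups)):
    print(doc, [round(x, 3) for x in row])
```

Each output row pairs a document with its degree of belonging to every group, which corresponds to the final outputting step of the claim.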
5. A document retrieval system comprising:
a document retrieval unit for searching a document database in accordance with a search request;
a classification means for classifying a plurality of documents obtained by the search into a predetermined number of document groups according to similarity among the documents; and
a belonging-degree calculating unit for calculating the degree of belonging of each of the documents obtained by the search to each of the document groups.
(Dependent claims: 6, 7, 8, 9, 10, 11)
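One way the classification means of claim 5 could be realized is sketched below. The claim names no particular algorithm, so this example assumes a simple agglomerative approach: start with one group per document and repeatedly merge the two most similar groups until the predetermined number `k` remains. Cosine similarity over summed word counts and the function names are assumptions for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse word vectors (assumed measure)."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster(doc_vectors, k):
    """Merge documents into k groups; returns lists of document indices."""
    if k < 1:
        raise ValueError("k must be at least 1")
    groups = [[i] for i in range(len(doc_vectors))]
    # Each group is summarized by the sum of its members' word vectors.
    summaries = [Counter(v) for v in doc_vectors]
    while len(groups) > k:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                sim = cosine(summaries[i], summaries[j])
                if best is None or sim > best[0]:
                    best = (sim, i, j)
        _, i, j = best
        # Fold group j into group i, then drop j.
        groups[i].extend(groups[j])
        summaries[i].update(summaries[j])
        del groups[j]
        del summaries[j]
    return groups


# Illustrative search results.
texts = ["apple banana", "apple pie", "car engine", "car wheel"]
vectors = [Counter(t.split()) for t in texts]
print(cluster(vectors, k=2))
```

The pairwise search is quadratic in the number of groups per merge, which is acceptable for a sketch over a modest result list but would need a more efficient method for large result sets.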
12. A document retrieval result display system for displaying information about a plurality of documents obtained by a search, wherein the degree of belonging of each of the documents obtained by the search to a plurality of categories that are dynamically calculated based on the degree of similarity among the multiple documents obtained by the search is displayed.
Specification