Document retrieval apparatus
Abstract
A document retrieval apparatus is connected to a network and includes a cluster database storing node information elements that are linked into a hierarchical tree structure, which clusters the documents according to the degree of similarity among them. Whenever one of the documents is updated, the apparatus follows the links of the cluster and posts the update to the end address in each node information element encountered along the way. The apparatus also selects a specific number of documents, clusters them, assigns each of the remaining non-selected documents to the leaf node holding the selected document it is similar to, and recursively repeats these operations toward the leaf nodes of the cluster.
13 Claims
1. A document retrieval apparatus connected through a network to a plurality of computers having documents, said apparatus comprising:
a contents database for storing index information of the respective documents;
a cluster database for storing a plurality of node information elements which are linked as nodes of a cluster with a hierarchical tree structure of the documents arranged based on similarity of the documents, each said node information element including end addresses where an update of the cluster generated by a modification or an addition of an individual document located at a lower side of the node is to be posted; and
a control means for posting the modification or addition of the individual document to the end address in the node information element which is encountered while following the link of the cluster when the cluster is updated.
Dependent claims: 2, 3, 4, 5
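The posting step recited in claim 1 can be illustrated with a minimal sketch. It assumes that "following the link of the cluster" means walking parent links from the node of the changed document up to the root, and that an end address is a plain string. The names `Node`, `post_update`, and `send`, as well as the example addresses, are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Node:
    """One node information element (hypothetical layout)."""
    name: str
    parent: Optional["Node"] = None
    # Addresses to notify when a document below this node changes.
    end_addresses: List[str] = field(default_factory=list)


def post_update(leaf: Node, doc_id: str, event: str,
                send: Callable[[str, str, str], None]) -> int:
    """Follow parent links from `leaf` to the root and post the event
    to every end address found in the nodes encountered on the way."""
    posted = 0
    node: Optional[Node] = leaf
    while node is not None:
        for address in node.end_addresses:
            send(address, doc_id, event)
            posted += 1
        node = node.parent
    return posted


# Usage: a three-level cluster with subscribers at two levels.
root = Node("root", end_addresses=["alice@example.org"])
topic = Node("topic", parent=root, end_addresses=["bob@example.org"])
leaf = Node("leaf", parent=topic)

outbox: List[Tuple[str, str, str]] = []
count = post_update(leaf, "doc-17", "modified",
                    lambda a, d, e: outbox.append((a, d, e)))
```

Storing the addresses in the nodes means that a subscriber registered at a higher node is notified of changes anywhere in the subtree below it, without a separate subscription table.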
6. A document retrieval apparatus including a database storing a plurality of documents, and a control means for constituting the documents in a cluster and retrieving the document, the control means comprising:
a leaf node selection means for randomly selecting a specific number of the plurality of documents as the documents assigned to leaf nodes, and clustering the selected documents;
a partial clustering means for assigning the remaining non-selected documents of the plurality respectively to the leaf node that is assigned the selected document to which each of the non-selected documents is similar; and
a recursively clustering means for recursively repeating the operations of the leaf node selection means and the clustering means toward a direction of the leaf node of the cluster.
Dependent claims: 7, 8, 9
10. A clustering method for constituting a plurality of documents as a cluster, and retrieving a particular document, the method comprising:
randomly selecting a specific number of documents as the documents assigned to leaf nodes from the plurality of documents, and clustering the selected documents;
partially clustering and assigning the remaining non-selected documents of the plurality respectively to the leaf node that is assigned the selected document to which each non-selected document is similar; and
recursively repeating the leaf node selection and the clustering toward a direction of the leaf node of the cluster.
Dependent claims: 11, 12, 13
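The recursive procedure of claims 6 and 10 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: documents are modelled as sets of terms, similarity is Jaccard similarity (the claims do not name a measure), and the step of clustering the selected documents among themselves is simplified to making them sibling groups under the current node. The names `jaccard`, `build_cluster`, and `leaves` and the parameter `k` are hypothetical.

```python
import random
from typing import Dict, FrozenSet, List

Doc = FrozenSet[str]


def jaccard(a: Doc, b: Doc) -> float:
    """Similarity of two term sets (assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def build_cluster(doc_ids: List[str], docs: Dict[str, Doc],
                  k: int, rng: random.Random) -> dict:
    """Recursively split `doc_ids` into a tree whose leaves hold at most k documents."""
    if k < 2:
        raise ValueError("k must be at least 2 so every split shrinks the group")
    if len(doc_ids) <= k:
        return {"docs": sorted(doc_ids), "children": []}
    # Step 1: randomly select k documents as group representatives.
    seeds = rng.sample(doc_ids, k)
    groups = {seed: [seed] for seed in seeds}
    # Step 2: assign every non-selected document to its most similar seed.
    for doc_id in doc_ids:
        if doc_id in groups:
            continue
        best = max(seeds, key=lambda s: jaccard(docs[doc_id], docs[s]))
        groups[best].append(doc_id)
    # Step 3: repeat the same two steps inside each group.
    children = [build_cluster(members, docs, k, rng) for members in groups.values()]
    return {"docs": sorted(doc_ids), "children": children}


def leaves(node: dict) -> List[List[str]]:
    """Collect the document lists stored at the leaves of the tree."""
    if not node["children"]:
        return [node["docs"]]
    return [leaf for child in node["children"] for leaf in leaves(child)]


docs = {
    "d1": frozenset({"tree", "cluster", "node"}),
    "d2": frozenset({"tree", "cluster", "leaf"}),
    "d3": frozenset({"network", "address", "post"}),
    "d4": frozenset({"network", "address", "update"}),
    "d5": frozenset({"index", "database", "contents"}),
    "d6": frozenset({"index", "database", "retrieval"}),
    "d7": frozenset({"cluster", "node", "update"}),
}
tree = build_cluster(list(docs), docs, k=2, rng=random.Random(0))
```

Because each non-selected document is compared only with the k selected documents at its level, no all-pairs similarity matrix is needed. The shape of the resulting tree depends on the random selection.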
Specification