Clustering method for multilingual documents
Abstract
The present invention relates to the technical field of information retrieval, and more particularly to a clustering method for multilingual documents, comprising the steps of: step 1: establishing a similar words bank comprising multilingual words; step 2: extracting eight eigenvalues; step 3: calculating a similarity of any two documents i and j; step 4: selecting accumulation points from a set of the documents to establish a cluster; step 5: adding residual documents which are not selected in the set to the cluster; and step 6: disposing the cluster in a circular ring structure. The method does not limit the categories of languages in the documents: the accumulation points are selected according to judgments of similarity to establish clusters, and the multilingual documents are classified into those clusters. The method is suitable for clustering multilingual documents.
14 Citations
31 Claims
1-11. (canceled)
12. A clustering method for multilingual documents, comprising the following steps of:
step 1: establishing a similar words bank comprising multilingual words;
step 2: extracting eight eigenvalues;
step 3: calculating a similarity of any two documents i and j according to the eight eigenvalues;
step 4: selecting accumulation points from a set of the documents to establish a cluster;
step 5: adding residual documents which are not selected in the set to the cluster; and
step 6: disposing the cluster in a circular ring structure.
Dependent claims: 13-31
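Steps 3 to 6 of claim 12 can be illustrated with a minimal sketch. It is not the claimed implementation, and nearly everything in it is an assumption made for illustration. The claim does not say here what the eight eigenvalues are, so each document is simply a list of eight numbers. Cosine similarity stands in for the similarity of step 3. A greedy threshold rule stands in for the selection of accumulation points in step 4. The circular ring of step 6 is modeled as a list whose neighbor lookup wraps around. The names cosine, select_accumulation_points, build_clusters and ring_neighbors are hypothetical.

```python
import math


def cosine(u, v):
    """Assumed stand-in for step 3: cosine similarity of two feature lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_accumulation_points(features, threshold):
    """Assumed rule for step 4: a document becomes an accumulation point
    if it is less similar than `threshold` to every point chosen so far."""
    points = []
    for i, f in enumerate(features):
        if all(cosine(f, features[p]) < threshold for p in points):
            points.append(i)
    return points


def build_clusters(features, threshold=0.8):
    """Steps 4 and 5: seed one cluster per accumulation point, then add each
    residual document to the cluster of its most similar accumulation point."""
    points = select_accumulation_points(features, threshold)
    clusters = {p: [p] for p in points}
    for i, f in enumerate(features):
        if i in clusters:
            continue  # already placed as an accumulation point
        best = max(points, key=lambda p: cosine(f, features[p]))
        clusters[best].append(i)
    return clusters


def ring_neighbors(ring, position):
    """Assumed model of step 6: members sit on a ring, so the neighbors of
    the first member include the last member (indices wrap around)."""
    n = len(ring)
    return ring[(position - 1) % n], ring[(position + 1) % n]
```

For example, with four documents whose eight-value feature lists fall into two clearly separated groups, build_clusters returns one cluster per group, and ring_neighbors gives the previous and next member of any cluster treated as a ring.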
Specification