Clustering based text classification
Abstract
Systems and methods for clustering-based text classification are described. In one aspect, text is clustered as a function of labeled data to generate cluster(s). The text includes the labeled data and unlabeled data. Expanded labeled data is then generated as a function of the cluster(s). The expanded label data includes the labeled data and at least a portion of unlabeled data. Discriminative classifier(s) are then trained based on the expanded labeled data and remaining ones of the unlabeled data.
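The three steps in the abstract (cluster in view of the labeled data, expand the labeled set, train a discriminative classifier) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the function names, the bag-of-words cosine similarity, the seeded centroid clustering, the `per_class` cutoff, and the perceptron with one round of pseudo-labeling, which stands in for whatever discriminative classifier is actually used.

```python
from collections import Counter
import math

def vectorize(text):
    """Unit-length bag-of-words vector stored as a sparse dict."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {w: c / norm for w, c in counts.items()}

def cosine(a, b):
    # Inputs are unit-length, so the dot product equals the cosine.
    return sum(x * b.get(w, 0.0) for w, x in a.items())

def centroid(vectors):
    """Normalized sum of a list of sparse vectors."""
    total = Counter()
    for v in vectors:
        for w, x in v.items():
            total[w] += x
    norm = math.sqrt(sum(x * x for x in total.values())) or 1.0
    return {w: x / norm for w, x in total.items()}

def seeded_cluster(labeled, unlabeled, iterations=5):
    """Step 1 (assumed form): one cluster per class, seeded by labeled data.
    Returns a (class, similarity) pair for every unlabeled vector."""
    classes = sorted({y for _, y in labeled})
    seeds = {c: [v for v, y in labeled if y == c] for c in classes}
    cents = {c: centroid(seeds[c]) for c in classes}
    assign = []
    for _ in range(iterations):
        assign = [max(((c, cosine(v, cents[c])) for c in classes),
                      key=lambda p: p[1]) for v in unlabeled]
        cents = {c: centroid(seeds[c] + [v for v, (a, _) in
                                         zip(unlabeled, assign) if a == c])
                 for c in classes}
    return assign

def expand_labeled(labeled, unlabeled, assign, per_class=1):
    """Step 2 (assumed form): label the unlabeled items nearest each centroid."""
    expanded, chosen = list(labeled), set()
    for c in sorted({y for _, y in labeled}):
        members = [i for i, (a, _) in enumerate(assign) if a == c]
        members.sort(key=lambda i: -assign[i][1])
        for i in members[:per_class]:
            expanded.append((unlabeled[i], c))
            chosen.add(i)
    remaining = [v for i, v in enumerate(unlabeled) if i not in chosen]
    return expanded, remaining

def predict(weights, v):
    return max(sorted(weights),
               key=lambda c: sum(weights[c].get(w, 0.0) * x for w, x in v.items()))

def train_perceptron(examples, epochs=10):
    """Multi-class perceptron: a simple stand-in discriminative classifier."""
    weights = {c: Counter() for c in sorted({y for _, y in examples})}
    for _ in range(epochs):
        for v, y in examples:
            guess = predict(weights, v)
            if guess != y:
                for w, x in v.items():
                    weights[y][w] += x
                    weights[guess][w] -= x
    return weights

def train_pipeline(labeled_texts, unlabeled_texts):
    labeled = [(vectorize(t), y) for t, y in labeled_texts]
    unlabeled = [vectorize(t) for t in unlabeled_texts]
    assign = seeded_cluster(labeled, unlabeled)
    expanded, remaining = expand_labeled(labeled, unlabeled, assign)
    # Step 3 (assumed form): train on the expanded set, pseudo-label the
    # remaining unlabeled data, then retrain on both.
    model = train_perceptron(expanded)
    pseudo = [(v, predict(model, v)) for v in remaining]
    return train_perceptron(expanded + pseudo)

LABELED = [("the team won the football match", "sports"),
           ("the bank raised interest rates", "finance")]
UNLABELED = ["football team scores in final match",
             "interest rates and bank loans rise",
             "the match ended with a late goal",
             "stock market and bank shares fall"]
model = train_pipeline(LABELED, UNLABELED)
print(predict(model, vectorize("bank interest rates")))
```

Seeding each cluster with the labeled examples of one class is one way to read "clustering in view of the labeled data"; other constraint-based clustering schemes would fit the same description.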
125 Citations
36 Claims
1. A method for text classification, the method comprising:
clustering text comprising labeled data and unlabeled data in view of the labeled data to generate cluster(s);
generating expanded labeled data as a function of the cluster(s), the expanded label data comprising the labeled data and at least a portion of unlabeled data; and
training discriminative classifier(s) based on the expanded labeled data and remaining ones of the unlabeled data.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10

11. A computer-readable medium having stored thereon computer-program instructions for text classification, the computer-program instructions being executable by a processor, the computer-program instructions comprising instructions for:
clustering text comprising labeled data and unlabeled data in view of the labeled data to generate cluster(s);
generating expanded labeled data as a function of the cluster(s), the expanded label data comprising the labeled data and at least a portion of unlabeled data; and
training discriminative classifier(s) based on the expanded labeled data and remaining ones of the unlabeled data.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20

21. A computing device comprising:
a processor; and
a memory coupled to the processor, the memory comprising computer-program instructions executable by the processor for text classification, the computer-program instructions comprising instructions for:
clustering text comprising labeled data and unlabeled data in view of the labeled data to generate cluster(s);
generating expanded labeled data as a function of the cluster(s), the expanded label data comprising the labeled data and at least a portion of unlabeled data; and
training discriminative classifier(s) based on the expanded labeled data and remaining ones of the unlabeled data.
Dependent claims: 22, 23, 24, 25, 26, 27, 28, 29, 30

31. A computing device comprising:
clustering means to cluster text comprising labeled data and unlabeled data in view of the labeled data to generate cluster(s);
generating means to generate expanded labeled data as a function of the cluster(s), the expanded label data comprising the labeled data and at least a portion of unlabeled data; and
training means to train discriminative classifier(s) based on the expanded labeled data and remaining ones of the unlabeled data.
Dependent claims: 32, 33, 34, 35, 36

Specification