Reducing human overhead in text categorization
First Claim
1. A multi-stage classification system that facilitates minimizing human effort in text classification while obtaining a desired level of accuracy comprising:
a pattern-based classifier that initially processes input to classify and assign the input a label; and
a learning-based classifier that processes the input for classification purposes when no label is assigned to it by the pattern-based classifier.
Abstract
A unique multi-stage classification system and method that facilitates reducing human resources or costs associated with text classification while still obtaining a desired level of accuracy is provided. The multi-stage classification system and method involve a pattern-based classifier and a machine learning classifier. The pattern-based classifier is trained on discriminative patterns identified by humans rather than by machines, which allows a smaller training set to be employed. Given humans' superior abilities to reason over text, discriminative patterns can be more accurately and more readily identified by them. Unlabeled items can be initially processed by the pattern-based classifier and, if no pattern match exists, the unlabeled data can then be processed by the machine learning classifier. By employing the classifiers in this manner, less human involvement is required in the classification process. Moreover, classification accuracy is maintained and/or improved.
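The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `PATTERNS` list, the example labels, and the `stub_learned` placeholder standing in for a trained machine learning classifier.

```python
import re
from typing import Callable, Optional, Tuple

# Hypothetical human-written discriminative patterns: (regex, label) pairs.
PATTERNS = [
    (re.compile(r"\b(refund|money back)\b", re.IGNORECASE), "billing"),
    (re.compile(r"\b(password|log ?in)\b", re.IGNORECASE), "account"),
]


def pattern_classify(text: str) -> Optional[str]:
    """Return the label of the first matching pattern, or None if none match."""
    for regex, label in PATTERNS:
        if regex.search(text):
            return label
    return None


def classify(text: str, learned: Callable[[str], str]) -> Tuple[str, str]:
    """Run the pattern stage first; use the learned stage only when no label was assigned."""
    label = pattern_classify(text)
    if label is not None:
        return label, "pattern"
    return learned(text), "learned"


def stub_learned(text: str) -> str:
    """Placeholder for a trained machine learning classifier (assumption)."""
    return "other"


if __name__ == "__main__":
    for item in ["I want a refund", "Cannot log in", "Where is my parcel?"]:
        print(item, "->", classify(item, stub_learned))
```

Returning the stage name alongside the label is a design choice of this sketch; it makes it easy to see how many items the hand-written patterns resolved without the learned classifier.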
46 Citations
20 Claims
1. A multi-stage classification system that facilitates minimizing human effort in text classification while obtaining a desired level of accuracy comprising:
a pattern-based classifier that initially processes input to classify and assign the input a label; and
a learning-based classifier that processes the input for classification purposes when no label is assigned to it by the pattern-based classifier.
Dependent claims: 2, 3, 4, 5, 6, 7, 8.
9. A classification method that facilitates minimizing human effort in text classification while obtaining a desired level of accuracy comprising:
receiving unlabeled input; and
classifying the unlabeled input using a multi-stage classifier, the multi-stage classifier comprising at least a pattern-based classifier.
Dependent claims: 10, 11, 12, 13, 14, 15, 16.
17. A classification method that facilitates minimizing human effort in text classification while obtaining a desired level of accuracy comprising:
running one or more unlabeled text items through a pattern-based classifier to determine whether at least a portion of text in the item matches a discriminative pattern;
assigning a label to the one or more text items corresponding to the pattern matched when a pattern match is found; and
subsequently running the one or more unlabeled text items through a machine learning classifier when no pattern match is found.
Dependent claims: 18, 19, 20.
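The three steps of claim 17 can be sketched for a batch of text items as follows. This is an illustration under stated assumptions, not the patented implementation: the tiny `NaiveBayes` class is a swapped-in stand-in for "a machine learning classifier" (the claim does not name a learning algorithm), and the training sentences, the pattern, and the labels are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Tiny multinomial naive Bayes with add-one smoothing (stand-in learner)."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())

        def score(label):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            s = math.log(self.label_counts[label] / total)
            for w in tokenize(text):
                s += math.log((counts[w] + 1) / denom)
            return s

        return max(self.label_counts, key=score)


def classify_batch(items, patterns, learner):
    """Label items by pattern match first, then send unmatched items to the learner."""
    labels, unmatched = {}, []
    for i, text in enumerate(items):
        for regex, label in patterns:
            if regex.search(text):      # steps 1 and 2: match found, assign its label
                labels[i] = label
                break
        else:
            unmatched.append(i)         # no pattern matched this item
    for i in unmatched:                 # step 3: machine learning classifier
        labels[i] = learner.predict(items[i])
    return [labels[i] for i in range(len(items))], unmatched


# Hypothetical training data and pattern, for illustration only.
learner = NaiveBayes().fit(
    ["the game ended in a draw", "the team won the match",
     "stocks fell on the market", "the bank raised interest rates"],
    ["sports", "sports", "finance", "finance"],
)
patterns = [(re.compile(r"\bworld cup\b", re.IGNORECASE), "sports")]
result, fallback_indices = classify_batch(
    ["World Cup final tonight", "interest rates and the market"], patterns, learner
)
```

Here `fallback_indices` records which items reached the second stage, which gives a rough view of how much of the batch the human-identified patterns handled on their own.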
Specification