METHODS AND SYSTEMS FOR TRANSDUCTIVE DATA CLASSIFICATION
Abstract
A system, method, data processing apparatus, and article of manufacture are provided for classifying data. Labeled data points are received, each having at least one label indicating whether the data point is a training example for data points to be included in a designated category or a training example for data points to be excluded from a designated category. Unlabeled data points are received, along with at least one predetermined cost factor for the labeled data points and unlabeled data points. A transductive classifier is trained using Maximum Entropy Discrimination (MED) through iterative calculation, with the at least one cost factor, the labeled data points, and the unlabeled data points serving as training examples. The trained classifier is applied to classify at least one of the unlabeled data points, the labeled data points, and input data points, and a classification of the classified data points, or a derivative thereof, is output.
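The training loop summarized above, together with the per-iteration adjustments recited in the claims, can be sketched roughly as follows. This is an illustrative sketch only, not the patented method: a cost-weighted logistic regression stands in for the MED solver, labels are assumed to be -1 or +1, and every name and default value here (`fit_weighted`, `train_transductive`, `classify`, `cost_labeled`, `cost_unlabeled`, `iterations`) is hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(w, b, x):
    """Linear decision value w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def fit_weighted(points, targets, costs, epochs=300, lr=0.5, l2=0.01):
    """Stand-in base learner (an assumption, NOT MED): cost-weighted logistic
    regression with soft targets in [0, 1], fitted by gradient descent."""
    dim = len(points[0])
    w, b = [0.0] * dim, 0.0
    n = len(points)
    for _ in range(epochs):
        grad_w = [l2 * wi for wi in w]
        grad_b = 0.0
        for x, t, c in zip(points, targets, costs):
            err = c * (sigmoid(score(w, b, x)) - t) / n
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def train_transductive(labeled, labels, unlabeled,
                       cost_labeled=1.0, cost_unlabeled=0.5, iterations=5):
    """Iteratively retrain on labeled plus unlabeled points."""
    targets_l = [1.0 if y > 0 else 0.0 for y in labels]
    costs_l = [cost_labeled] * len(labeled)
    # Start from a classifier fitted on the labeled points alone.
    w, b = fit_weighted(labeled, targets_l, costs_l)
    for _ in range(iterations):
        # Label prior of each unlabeled point is reset to the current
        # estimate of its class membership probability.
        prior = [sigmoid(score(w, b, x)) for x in unlabeled]
        # Expected label in [-1, 1]; the unlabeled cost factor is scaled by
        # its absolute value (an assumed, simple choice of function).
        expected = [2.0 * p - 1.0 for p in prior]
        costs_u = [cost_unlabeled * abs(e) for e in expected]
        w, b = fit_weighted(labeled + unlabeled,
                            targets_l + prior,
                            costs_l + costs_u)
    return w, b


def classify(w, b, x):
    """Return +1 (included in the category) or -1 (excluded)."""
    return 1 if score(w, b, x) >= 0 else -1


# Toy usage with made-up two-dimensional points.
w, b = train_transductive([[-2.0, 0.0], [2.0, 0.0]], [-1, 1],
                          [[-1.5, 0.1], [1.5, -0.1]])
print(classify(w, b, [1.0, 0.0]))
```

In this sketch, unlabeled points near the decision boundary have an expected label close to zero and therefore contribute almost nothing to the next fit, while confidently classified points contribute up to the full unlabeled cost factor.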
30 Claims
1. In a computer-based system, a method for classification of data comprising:
receiving labeled data points, each of said labeled data points having at least one label indicating whether the data point is a training example for data points for being included in a designated category or a training example for data points being excluded from a designated category;
receiving unlabeled data points;
receiving at least one predetermined cost factor of the labeled data points and unlabeled data points;
training a transductive classifier using Maximum Entropy Discrimination (MED) through iterative calculation using said at least one cost factor and the labeled data points and the unlabeled data points as training examples, wherein for each iteration of the calculations the unlabeled data point cost factor is adjusted as a function of an expected label value and a data point label prior probability is adjusted according to an estimate of a data point class membership probability;
applying the trained classifier to classify at least one of the unlabeled data points, the labeled data points, and input data points; and
outputting a classification of the classified data points, or derivative thereof, to at least one of a user, another system, and another process.
Dependent Claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. An article of manufacture comprising a program storage medium readable by a computer, the medium tangibly embodying one or more programs of instructions executable by a computer to perform a method of data classification comprising:
receiving labeled data points, each of said labeled data points having at least one label indicating whether the data point is a training example for data points for being included in a designated category or a training example for data points being excluded from a designated category;
receiving unlabeled data points;
receiving at least one predetermined cost factor of the labeled data points and unlabeled data points;
training a transductive classifier with iterative Maximum Entropy Discrimination (MED) calculation using said at least one stored cost factor and stored labeled data points and stored unlabeled data points as training examples wherein at each iteration of the MED calculation the unlabeled data point cost factor is adjusted as a function of an expected label value and a data point prior probability is adjusted according to an estimate of a data point class membership probability;
applying the trained classifier to classify at least one of the unlabeled data points, the labeled data points, and input data points; and
outputting a classification of the classified data points, or derivative thereof, to at least one of a user, another system, and another process.
Dependent Claims: 14, 15, 16, 17, 18, 19, 20, 21, 22
23. In a computer-based system, a method for classification of data comprising:
receiving labeled data points;
receiving unlabeled data points;
receiving at least one predetermined cost factor of the labeled data points and unlabeled data points;
training a transductive classifier using Maximum Entropy Discrimination (MED) through iterative calculation using said at least one cost factor and using the labeled data points and the unlabeled data points as training examples, wherein for each iteration of the calculations the unlabeled data point cost factor is adjusted as a function of an absolute value of an expected label value of a data point;
applying the trained classifier to classify at least one of the unlabeled data points, the labeled data points, and input data points; and
outputting a classification of the classified data points, or derivative thereof, to at least one of a user, another system, and another process.
Dependent Claims: 24, 25, 26, 27, 28, 29, 30
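Claim 23 makes the unlabeled data point cost factor a function of the absolute value of the expected label value. The claim does not fix that function, so the short sketch below assumes the simplest one, a linear scaling, and assumes labels of -1 and +1, for which the expected label is 2p - 1 when p is the estimated probability of the positive class. The helper names `expected_label` and `unlabeled_cost` are hypothetical.

```python
def expected_label(p_positive):
    """Expected label for labels in {-1, +1}, given P(label = +1)."""
    return 2.0 * p_positive - 1.0


def unlabeled_cost(base_cost, p_positive):
    """Assumed schedule: base cost scaled linearly by |expected label|."""
    return base_cost * abs(expected_label(p_positive))


# A point the classifier is unsure about (p = 0.5) gets zero cost;
# a point it is certain about (p = 0.0 or 1.0) gets the full base cost.
for p in (0.0, 0.25, 0.5, 1.0):
    print(p, unlabeled_cost(1.0, p))
```

Under this assumed schedule the cost is symmetric in the two classes: a point judged 25% likely to be positive is weighted the same as one judged 75% likely.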
Specification