Accelerating the boosting approach to training classifiers
First Claim
1. A method for training a classifier, the method comprising:
receiving a training set that includes data samples that correspond to an object of interest (positive samples) and data samples that do not correspond to an object of interest (negative samples);
receiving a restricted set of linear operators; and
using a boosting process to train a classifier to discriminate between the positive and negative samples in the training set, the classifier being an aggregate of multiple individual classifiers, the boosting process being an iterative process, the iterations including:
a first iteration where an individual classifier in the aggregate is trained by:
(1) testing some, but not all, linear operators in the restricted set against a weighted version of the training set, wherein testing is performed by a computer;
(2) selecting for use by the individual classifier the linear operator with the lowest error rate (the error-minimizing operator); and
(3) generating a re-weighted version of the training set that is weighted such that data samples that were misclassified by the error-minimizing operator are weighted more than data samples that were classified correctly by the error-minimizing operator; and
subsequent iterations during which another individual classifier in the aggregate is trained by repeating steps (1), (2), and (3), but using in step (1) the re-weighted version of the training set generated during a previous iteration.
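The claimed training loop is essentially AdaBoost with a subsampled weak-learner search: only some of the restricted operator set is tested per round. A minimal sketch in Python, assuming each linear operator is a (weight vector, threshold) pair acting as a decision stump; the operator representation, the random choice of which operators to test, and the exponential re-weighting rule are illustrative assumptions, not details recited in the claim:

```python
import numpy as np

def train_boosted_classifier(X, y, operators, n_rounds, n_test, rng=None):
    """Sketch of the claimed boosting loop.

    X: (n_samples, d) training data; y: labels in {-1, +1}.
    operators: list of (vec, thresh) pairs; each classifies a sample x
      as sign(vec @ x - thresh). Hypothetical representation.
    n_test: how many operators to test per round ("some, but not all").
    """
    if rng is None:
        rng = np.random.default_rng(0)
    n = len(X)
    w = np.full(n, 1.0 / n)            # uniform initial sample weights
    aggregate = []                     # (alpha, operator) pairs
    for _ in range(n_rounds):
        # (1) test some, but not all, operators against the weighted set
        candidates = rng.choice(len(operators), size=n_test, replace=False)
        best_op, best_err, best_pred = None, np.inf, None
        for i in candidates:
            vec, thresh = operators[i]
            pred = np.sign(X @ vec - thresh)
            err = w[pred != y].sum()   # weighted error rate
            if err < best_err:
                best_op, best_err, best_pred = operators[i], err, pred
        # (2) the error-minimizing operator joins the aggregate
        best_err = min(max(best_err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * np.log((1 - best_err) / best_err)
        aggregate.append((alpha, best_op))
        # (3) re-weight so misclassified samples count for more next round
        w *= np.exp(-alpha * y * best_pred)
        w /= w.sum()
    return aggregate
```

Testing `n_test` of the operators instead of all of them is where the claimed acceleration comes from: each round costs O(n_test · n) instead of O(|operators| · n).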
Abstract
Systems, methods, and computer program products implementing techniques for training classifiers. The techniques include receiving a training set that includes positive samples and negative samples, receiving a restricted set of linear operators, and using a boosting process to train a classifier to discriminate between the positive and negative samples. The boosting process is an iterative process. The iterations include a first iteration where a classifier is trained by (1) testing some, but not all linear operators in the restricted set against a weighted version of the training set, (2) selecting for use by the classifier the linear operator with the lowest error rate, and (3) generating a re-weighted version of the training set. The iterations also include subsequent iterations during which another classifier is trained by repeating steps (1), (2), and (3), but using in step (1) the re-weighted version of the training set generated during a previous iteration.
21 Claims
1. A method for training a classifier, the method comprising: (set out in full above under First Claim.) Dependent claims: 2, 3, 4, 5, 6, 7.
8. A computer program product, tangibly embodied in a computer-readable storage medium, for training a classifier, the product being operable to cause data processing apparatus to perform operations comprising:
receiving a training set that includes data samples that correspond to an object of interest (positive samples) and data samples that do not correspond to an object of interest (negative samples);
receiving a restricted set of linear operators; and
using a boosting process to train a classifier to discriminate between the positive and negative samples in the training set, the classifier being an aggregate of multiple individual classifiers, the boosting process being an iterative process, the iterations including:
a first iteration where an individual classifier in the aggregate is trained by:
(1) testing some, but not all, linear operators in the restricted set against a weighted version of the training set;
(2) selecting for use by the individual classifier the linear operator with the lowest error rate (the error-minimizing operator); and
(3) generating a re-weighted version of the training set that is weighted such that data samples that were misclassified by the error-minimizing operator are weighted more than data samples that were classified correctly by the error-minimizing operator; and
subsequent iterations during which another individual classifier in the aggregate is trained by repeating steps (1), (2), and (3), but using in step (1) the re-weighted version of the training set generated during a previous iteration.
Dependent claims: 9, 10, 11, 12, 13, 14.
15. A system for training a classifier, comprising:
one or more computers operable to perform instructions to:
receive a training set that includes data samples that correspond to an object of interest (positive samples) and data samples that do not correspond to an object of interest (negative samples);
receive a restricted set of linear operators; and
use a boosting process to train a classifier to discriminate between the positive and negative samples in the training set, the classifier being an aggregate of multiple individual classifiers, the boosting process being an iterative process, the iterations including:
a first iteration where an individual classifier in the aggregate is trained by:
(1) testing some, but not all, linear operators in the restricted set against a weighted version of the training set;
(2) selecting for use by the individual classifier the linear operator with the lowest error rate (the error-minimizing operator); and
(3) generating a re-weighted version of the training set that is weighted such that data samples that were misclassified by the error-minimizing operator are weighted more than data samples that were classified correctly by the error-minimizing operator; and
subsequent iterations during which another individual classifier in the aggregate is trained by repeating steps (1), (2), and (3), but using in step (1) the re-weighted version of the training set generated during a previous iteration.
Dependent claims: 16, 17, 18, 19, 20, 21.
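The claims describe the classifier as an aggregate of the individually trained operators but leave the combination rule open. A common assumption is an AdaBoost-style weighted vote, sketched here with the same hypothetical (vec, thresh) operator representation used above; the vote is an illustrative assumption, not a recited limitation:

```python
import numpy as np

def classify(x, aggregate):
    """Classify a sample with the trained aggregate.

    aggregate: list of (alpha, (vec, thresh)) pairs, where alpha is the
    vote weight assigned to each individual classifier during training.
    Assumes an AdaBoost-style weighted majority vote.
    """
    score = sum(alpha * np.sign(x @ vec - thresh)
                for alpha, (vec, thresh) in aggregate)
    return 1 if score >= 0 else -1
```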
Specification