Reliability measure for a classifier
Abstract
In one aspect, a data item is input into a scoring classifier such that the scoring classifier indicates that the data item belongs to a first class. A determination is made as to the amount of retraining of the scoring classifier, based on the data item, that is required to cause the scoring classifier to indicate that the data item belongs to a second class. A reliability measure is determined based on the required amount of retraining and a class of the data item is determined based, at least in part, on the reliability measure.
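The "amount of retraining" idea in the abstract can be made concrete with a small sketch. The example below is only an illustration under stated assumptions, not the patented method: it assumes a logistic scoring classifier, a 0.5 decision boundary, and that retraining is measured as the number of gradient updates on the data item (labeled with the opposite class) needed to flip the prediction. The names `retraining_steps_to_flip` and `reliability`, the learning rate, and the step cap are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, x):
    """Classification score in (0, 1) from an assumed logistic model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def retraining_steps_to_flip(weights, bias, x, lr=0.1, max_steps=1000):
    """Count gradient updates on (x, opposite label) until the class flips.

    Assumption: "amount of retraining" is the number of such updates.
    The original model is not modified; a copy is retrained.
    """
    w, b = list(weights), bias
    first_class = score(w, b, x) >= 0.5          # first classification
    target = 0.0 if first_class else 1.0         # second, different class
    for step in range(1, max_steps + 1):
        err = score(w, b, x) - target            # log-loss gradient w.r.t. the logit
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
        b -= lr * err
        if (score(w, b, x) >= 0.5) != first_class:
            return step
    return max_steps                             # never flipped within the cap


def reliability(steps, max_steps=1000):
    """Assumed reliability measure: fraction of the step cap that was needed."""
    return steps / max_steps


# An item close to the decision boundary versus one far from it.
near = retraining_steps_to_flip([1.0], 0.0, [0.1])
far = retraining_steps_to_flip([1.0], 0.0, [3.0])
print(near, far, reliability(near), reliability(far))
```

Under these assumptions, an item scored close to the boundary flips after fewer updates than one scored far from it, so it receives a lower reliability measure.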
21 Claims
1. A method for computing a classification threshold for a classifier, the method comprising:
- applying a classification model to a data item stored in a memory and having a known classification to produce a first classification score for the data item;
- determining a first classification of the data item based on the first classification score;
- determining an amount of retraining needed to change the first classification of the data item to a second, different classification;
- determining a reliability measure based on the amount of retraining needed to change the first classification of the data item to a second, different classification;
- modifying the first classification score based on the reliability measure to compute a second classification score;
- computing, using a processor, a threshold value that minimizes misclassification costs based, at least in part, on the second classification score and the known classification; and
- setting the classification threshold to the determined threshold value.
Dependent claims: 2, 3, 4, 5, 6, 7
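The last three steps of claim 1 (modifying the score, computing a cost-minimizing threshold, and setting it) can be sketched as follows. This is a minimal illustration, not the claimed implementation: the claim does not say how the score is modified, so the sketch assumes the first score is pulled toward a neutral value of 0.5 in proportion to the reliability measure. The functions `adjust_score` and `best_threshold`, the cost parameters, and the sample data are hypothetical.

```python
def adjust_score(first_score, reliability, neutral=0.5):
    """Second classification score (assumed rule): shrink toward `neutral`
    when the reliability measure is low."""
    return neutral + reliability * (first_score - neutral)


def best_threshold(scores, labels, fp_cost=1.0, fn_cost=1.0):
    """Return (threshold, cost) minimizing total misclassification cost.

    An item is classified positive when its score is >= threshold.
    Candidate thresholds are the observed scores plus +inf (classify
    nothing as positive).
    """
    candidates = sorted(set(scores)) + [float("inf")]
    best_t, best_cost = None, float("inf")
    for t in candidates:
        cost = 0.0
        for s, y in zip(scores, labels):
            predicted = s >= t
            if predicted and not y:
                cost += fp_cost      # false positive
            elif not predicted and y:
                cost += fn_cost      # false negative
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost


# Illustrative data: first scores, reliability measures, known classifications.
first_scores = [0.9, 0.8, 0.6, 0.4, 0.2]
reliabilities = [1.0, 0.5, 1.0, 0.5, 1.0]
known = [True, True, False, False, False]

second_scores = [adjust_score(s, r) for s, r in zip(first_scores, reliabilities)]
threshold, cost = best_threshold(second_scores, known)
print(threshold, cost)
```

The exhaustive search over candidate thresholds is quadratic in the number of items; it is used here only because it is easy to check by hand. Unequal values for `fp_cost` and `fn_cost` shift the chosen threshold toward the cheaper kind of error.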
8. A non-transitory computer-readable storage medium storing a program for determining a classification threshold for a classifier, the program comprising code for causing a processing device to perform the following operations:
- apply a classification model to a data item with a known classification to produce a first classification score for the data item;
- determine a first classification of the data item based on the first classification score;
- determine an amount of retraining needed to change the first classification of the data item to a second, different classification;
- determine a reliability measure based on the amount of retraining needed to change the first classification of the data item to a second, different classification;
- modify the first classification score based on the reliability measure to compute a second classification score;
- compute a threshold value that minimizes misclassification costs based, at least in part, on the second classification score and the known classification; and
- set the classification threshold to the determined threshold value.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A method for computing a classification threshold for a classifier, the method comprising:
- determining a first classification of a data item stored in a memory, wherein the first classification is based on a first classification score;
- determining an amount of retraining needed to change the first classification of the data item to a second, different classification;
- determining a reliability measure based on the amount of retraining needed to change the first classification of the data item to a second, different classification;
- modifying the first classification score based on the reliability measure to compute a second classification score;
- computing, using a processor, a threshold value that minimizes misclassification costs based, at least in part, on the second classification score and a known classification; and
- setting the classification threshold to the determined threshold value.
Dependent claims: 16, 17, 18, 19, 20, 21
Specification