Data Analysis Computer System and Method for Organizing, Presenting, and Optimizing Predictive Modeling
First Claim
1. A computer-implemented method and system for visually-assisted thresholding of predictive models:
a) incorporating a memory device that contains objects (“examples”) and one or more predictive models;
b) incorporating a processing device which applies the model to the examples to generate a score for each;
c) the processing device outputting graphical displays on a display device;
d) the graphical displays containing a series of icons, each denoting an individual object to be classified;
e) the graphical displays including strips and vertical bars denoting the score of each object and the value of the currently chosen threshold while random vertical jittering is used to separate the presented examples;
f) the graphical displays containing tables showing expected predictive performance metrics at each threshold and for the currently chosen threshold value, calculated from performance statistics of the predictive model;
g) the tables being dynamically updated once a user changes values of score threshold;
h) the tables being configurable so that they show only metrics of interest to the system user;
i) the graphical displays allowing simultaneous depiction of one or more thresholds, one or more classes and one or more predictive models;
j) the system allowing a user to input preferences for trade-offs among false positives, false negatives, true positives and true negatives;
k) the system allowing a user to input preferences for the maximum number of objects classified as positive that can be examined by the user; and
l) outputting a final threshold that satisfies user preferences and is consistent with data and model operating characteristics.
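Elements j) through l) can be illustrated with a small sketch. It is not the claimed implementation: the function name `choose_threshold`, the per-error cost parameters, and the exhaustive search over candidate thresholds are all assumptions made for illustration, and the trade-off is simplified to costs on false positives and false negatives only.

```python
from typing import Sequence, Tuple


def choose_threshold(
    scores: Sequence[float],
    labels: Sequence[int],
    cost_fp: float,
    cost_fn: float,
    max_positives: int,
) -> Tuple[float, float]:
    """Return (threshold, cost) minimizing cost_fp*FP + cost_fn*FN.

    Hypothetical helper: an object is classified positive when its
    score is >= threshold, and at most `max_positives` objects may be
    classified positive (the user's review budget).
    """
    # Candidate thresholds: every distinct score, plus +inf so that
    # "classify nothing as positive" is always a feasible option.
    candidates = sorted(set(scores)) + [float("inf")]
    total_positive = sum(1 for y in labels if y)

    best_cost = None
    best_threshold = None
    for t in candidates:
        flagged = [y for s, y in zip(scores, labels) if s >= t]
        if len(flagged) > max_positives:
            continue  # violates the review budget
        tp = sum(1 for y in flagged if y)
        fp = len(flagged) - tp
        fn = total_positive - tp
        cost = cost_fp * fp + cost_fn * fn
        # Strict "<" keeps the lowest threshold among equal-cost ties.
        if best_cost is None or cost < best_cost:
            best_cost, best_threshold = cost, t
    return best_threshold, best_cost


# Toy data (invented for illustration): six scored objects with known labels.
scores = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0]

# A missed positive is treated as five times as costly as a false alarm.
print(choose_threshold(scores, labels, cost_fp=1, cost_fn=5, max_positives=4))
print(choose_threshold(scores, labels, cost_fp=1, cost_fn=5, max_positives=2))
```

With a budget of four reviewed objects the sketch selects the threshold 0.40; tightening the budget to two forces the higher threshold 0.80 at a higher expected cost, which is the kind of trade-off the user preferences in j) and k) are meant to express.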
Abstract
Predictive modeling is an important class of data analytics with applications in numerous fields. Once a predictive model is built, validated, and applied to a set of objects by a data analytics system (or even by manual modeling), consumers of the model information need assistance navigating the results. This is because both regression and classification models that output continuous values (e.g., the probability of belonging to a class) are often used to rank objects, and a threshold must then be applied to the ranked scores to separate objects into a “positive” and a “negative” class. The choice of threshold greatly affects the true positive, false positive, true negative, and false negative results of the model's application. An ideal data analytics system should allow the user to understand the trade-offs among different threshold values. The user interface should convey this information in an intuitive manner and provide the ability to vary the threshold interactively while simultaneously presenting the effects of thresholding on predictivity. This is precisely the function of the present invention. In addition to manual thresholding, the invention also allows for the thresholding to be performed by fully automated means (via standard statistical optimization methods) once a user has identified the desired balance of false positives and false negatives (or other predictivity metrics of interest). The invention can be applied to any application field of predictive modeling.
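The dependence of true positive, false positive, true negative, and false negative counts on the threshold can be made concrete with a minimal sketch. The names `Row`, `metrics_at`, and `metrics_table`, the choice of precision and recall as displayed metrics, and the toy data are assumptions for illustration only; the source does not specify how its tables are computed.

```python
from typing import List, NamedTuple, Sequence


class Row(NamedTuple):
    threshold: float
    tp: int
    fp: int
    tn: int
    fn: int
    precision: float
    recall: float


def metrics_at(scores: Sequence[float], labels: Sequence[int], threshold: float) -> Row:
    """Confusion counts when objects with score >= threshold are called positive."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        predicted_positive = s >= threshold
        if predicted_positive and y:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif y:
            fn += 1
        else:
            tn += 1
    # Report 0.0 when a ratio is undefined (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return Row(threshold, tp, fp, tn, fn, precision, recall)


def metrics_table(
    scores: Sequence[float], labels: Sequence[int], thresholds: Sequence[float]
) -> List[Row]:
    """One row of metrics per candidate threshold."""
    return [metrics_at(scores, labels, t) for t in thresholds]


# Toy data (invented for illustration).
scores = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0]

table = metrics_table(scores, labels, [0.25, 0.50, 0.75])
for row in table:
    print(row)
```

Raising the threshold from 0.25 to 0.75 on this toy data removes both false positives but introduces one false negative, so precision rises while recall falls.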
4 Claims
Claim 1 is the independent claim given under First Claim above; claims 2, 3, and 4 depend on it.
Specification