Data analysis computer system and method for organizing, presenting, and optimizing predictive modeling
First Claim
1. A computer aided technique for visually analyzing the results of the application of predictive models to data sets, including big data, comprising the following steps:
a) storing data and predictive models;
b) selecting a predictive model;
c) applying the predictive model to the data to generate a score for each data object;
d) selecting an analytical threshold to employ with the scores;
e) outputting a graphical display that shows the data objects arranged along an axis according to their predictive model score along with an indication of the value of a currently chosen threshold;
f) outputting a table with a row associated with the threshold and columns showing expected predictive performance metrics calculated from performance statistics of the predictive model; and
g) repeating steps d) through f) for a different threshold and dynamically updating the display and tables with the data for each threshold wherein the density of values in the graphical display and the tabular output informs the user choice in selecting an appropriate threshold.
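Steps c) through g) can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes a small validation set with known binary labels, and the names `threshold_row` and `threshold_table`, the sample scores, and the choice of metrics (sensitivity, specificity, precision) are hypothetical. Each call to `threshold_row` produces one table row for one threshold, and repeating it over several thresholds mirrors step g).

```python
from typing import Dict, List, Sequence


def threshold_row(scores: Sequence[float], labels: Sequence[int],
                  threshold: float) -> Dict[str, float]:
    """Build one table row of performance metrics for a single threshold.

    Assumption: objects scoring at or above the threshold are called positive.
    """
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    # Guard against empty denominators so degenerate thresholds still yield a row.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return {
        "threshold": threshold,
        "TP": tp, "FP": fp, "TN": tn, "FN": fn,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
    }


def threshold_table(scores: Sequence[float], labels: Sequence[int],
                    thresholds: Sequence[float]) -> List[Dict[str, float]]:
    """Repeat the row computation for each candidate threshold (step g)."""
    return [threshold_row(scores, labels, t) for t in thresholds]


# Hypothetical model scores and true classes for eight data objects.
scores = [0.05, 0.20, 0.35, 0.55, 0.60, 0.80, 0.90, 0.95]
labels = [0, 0, 1, 0, 1, 1, 1, 1]

table = threshold_table(scores, labels, [0.3, 0.5, 0.7])
for row in table:
    print(row)
```

Raising the threshold from 0.3 to 0.7 in this toy data trades false positives for false negatives, which is the kind of tradeoff the table in step f) is meant to expose.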
Abstract
Predictive modeling is an important class of data analytics with applications in numerous fields. Once a predictive model has been built, validated, and applied to a set of objects by a data analytics system (or even by manual modeling), consumers of the model information need assistance navigating the results. This is because both regression and classification models that output continuous values (e.g., the probability of belonging to a class) are often used to rank objects, and a threshold must then be applied to the ranked scores to separate the objects into a “positive” and a “negative” class. The choice of threshold greatly affects the true positive, false positive, true negative, and false negative results of the model's application. An ideal data analytics system should allow the user to understand the tradeoffs among different threshold values. The user interface should convey this information intuitively and let the user vary the threshold interactively while simultaneously presenting the effects of thresholding on predictivity. This is precisely the function of the present invention. In addition to manual thresholding, the invention also allows thresholding to be performed by fully automated means (via standard statistical optimization methods) once a user has identified the desired balance of false positives and false negatives (or other predictivity metrics of interest). The invention can be applied to any application field of predictive modeling.
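The automated thresholding mentioned in the abstract could be sketched as follows. The abstract does not say which optimization method is used, so this example substitutes a simple exhaustive search: it tries every observed score as a threshold and keeps the one with the lowest weighted error cost. The function name `pick_threshold`, the parameters `fp_cost` and `fn_cost` (standing in for the user's desired balance of false positives and false negatives), and the sample data are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def pick_threshold(scores: Sequence[float], labels: Sequence[int],
                   fp_cost: float = 1.0, fn_cost: float = 1.0) -> Tuple[float, float]:
    """Return (threshold, cost) minimizing fp_cost * FP + fn_cost * FN.

    Assumption: objects scoring at or above the threshold are called positive.
    Ties are resolved in favor of the lowest threshold.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    candidates = sorted(set(scores))  # every observed score is a candidate cut
    best_threshold, best_cost = candidates[0], float("inf")
    for t in candidates:
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        cost = fp_cost * fp + fn_cost * fn
        if cost < best_cost:
            best_threshold, best_cost = t, cost
    return best_threshold, best_cost


# Hypothetical model scores and true classes for eight data objects.
example_scores = [0.05, 0.20, 0.35, 0.55, 0.60, 0.80, 0.90, 0.95]
example_labels = [0, 0, 1, 0, 1, 1, 1, 1]

balanced = pick_threshold(example_scores, example_labels)
fp_averse = pick_threshold(example_scores, example_labels, fp_cost=5.0)
print(balanced, fp_averse)
```

Penalizing false positives more heavily moves the selected threshold upward in this toy data, showing how a stated error balance can drive the choice without manual adjustment.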
23 Citations
12 Claims
Specification