Adaptive Bayes Network data mining modeling
Abstract
A method, system, and computer program product for generating an Adaptive Bayes Network data mining model includes receiving a data table having a plurality of predictor columns and a target column, constructing a plurality of single-predictor models, ranking each single-predictor model using minimum description length and selecting a best single predictor model, performing feature selection, constructing a Naïve Bayes model, comparing a description length of the Naive Bayes model with a description length of a baseline model, replacing the baseline model with the Naïve Bayes model, if the description length of the Naive Bayes model is less than the description length of the baseline model, extending a plurality of single-predictor models in rank order, stepwise, to multi-predictor features, and testing whether each new feature should be included in or should replace a current model state using minimum description length.
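As a minimal sketch of what "a description length of a baseline model" could look like, the Python below encodes the target column using only its unconditional value frequencies and reports the total code length in bits. The function name `baseline_description_length` is hypothetical, and the Shannon code length used here is an assumption. The specification may define description length differently, for example by adding a cost for encoding the model itself.

```python
import math
from collections import Counter


def baseline_description_length(targets):
    """Bits needed to encode the target column from unconditional frequencies.

    Assumption: description length is taken as the Shannon code length
    -sum(log2 p(t)) over all rows, with no separate model-cost term.
    """
    n = len(targets)
    counts = Counter(targets)
    # Each of the c rows holding a given value costs -log2(c / n) bits.
    return -sum(c * math.log2(c / n) for c in counts.values())


if __name__ == "__main__":
    print(baseline_description_length(["yes", "no", "yes", "no"]))
```

Any candidate model that uses predictors would have to encode the same target column in fewer total bits than this baseline to be preferred under a minimum description length criterion.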
21 Claims
1. A method of generating an Adaptive Bayes Network data mining model comprising the steps of:
receiving a data table having a plurality of predictor columns and a target column and comprising a plurality of rows of data;
constructing a plurality of single-predictor models, comprising the steps of:
  computing a description length of a baseline model based on unconditional target probabilities among the plurality of rows;
  determining which predictor columns are correlated to the target column based on minimum description length;
  computing probabilities of at least two target values of the target column conditioned on at least two predictor values of at least one correlated predictor column; and
  computing a probability of at least one correlated predictor column conditioned on the at least two target values;
ranking each predictor column by ranking each single-predictor model using minimum description length and selecting a best single predictor model;
performing feature selection based on a minimum of a specified number of predictors and as a function of a reduction in entropy attributable to the best single predictor model;
constructing a Naïve Bayes model using a top-ranked portion of the plurality of predictor columns;
comparing a description length of the Naive Bayes model with a description length of a baseline model;
replacing the baseline model with the Naïve Bayes model, if the description length of the Naive Bayes model is less than the description length of the baseline model;
extending a plurality of single-predictor models in rank order, stepwise, to multi-predictor features; and
testing whether each new feature should be included in or should replace a current model state using minimum description length.
Dependent claims: 2, 3, 4, 5, 6, 7
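The single-predictor steps of claim 1 can be illustrated with a small Python sketch. It builds a conditional probability table of target values given predictor values, scores each single-predictor model in bits, keeps only predictors that beat the baseline, and ranks them. The function names are hypothetical, and the two-part score (data bits plus 0.5 * log2(n) bits per free parameter) is an assumed stand-in for whatever description length the specification actually defines.

```python
import math
from collections import Counter, defaultdict


def conditional_table(column, targets):
    """Estimate P(target value | predictor value) from co-occurrence counts."""
    counts = defaultdict(Counter)
    for x, t in zip(column, targets):
        counts[x][t] += 1
    return {
        x: {t: k / sum(c.values()) for t, k in c.items()}
        for x, c in counts.items()
    }


def baseline_bits(targets):
    """Assumed description length of the baseline (unconditional) model."""
    n = len(targets)
    prior = {t: k / n for t, k in Counter(targets).items()}
    data = -sum(math.log2(prior[t]) for t in targets)
    return data + 0.5 * (len(prior) - 1) * math.log2(n)


def single_predictor_bits(column, targets):
    """Assumed description length of one single-predictor model."""
    table = conditional_table(column, targets)
    data = -sum(math.log2(table[x][t]) for x, t in zip(column, targets))
    n_params = len(table) * (len(set(targets)) - 1)
    return data + 0.5 * n_params * math.log2(len(targets))


def rank_predictors(columns, targets):
    """Return predictor names that beat the baseline, shortest code first."""
    baseline = baseline_bits(targets)
    scored = sorted(
        (single_predictor_bits(col, targets), name)
        for name, col in columns.items()
    )
    return [name for bits, name in scored if bits < baseline]


if __name__ == "__main__":
    columns = {
        "a": [0, 0, 1, 1, 0, 0, 1, 1],  # tracks the target exactly
        "b": [0, 1, 0, 1, 0, 1, 0, 1],  # carries no information
    }
    targets = ["n", "n", "p", "p", "n", "n", "p", "p"]
    print(rank_predictors(columns, targets))
```

In this toy table, predictor `a` is treated as correlated with the target because its model shortens the encoding, while `b` is dropped because its parameter cost is not repaid by any reduction in data bits.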
8. A system for generating an Adaptive Bayes Network data mining model comprising:
a processor operable to execute computer program instructions;
a memory operable to store computer program instructions executable by the processor; and
computer program instructions stored in the memory and executable to perform the steps of:
  receiving a data table having a plurality of predictor columns and a target column and comprising a plurality of rows of data;
  constructing a plurality of single-predictor models, comprising the steps of:
    computing a description length of a baseline model based on unconditional target probabilities among the plurality of rows;
    determining which predictor columns are correlated to the target column based on minimum description length;
    computing probabilities of at least two target values of the target column conditioned on at least two predictor values of at least one correlated predictor column; and
    computing a probability of at least one correlated predictor column conditioned on the at least two target values;
  ranking each predictor column by ranking each single-predictor model using minimum description length and selecting a best single predictor model;
  performing feature selection based on a minimum of a specified number of predictors and as a function of a reduction in entropy attributable to the best single predictor model;
  constructing a Naïve Bayes model using a top-ranked portion of the plurality of predictor columns;
  comparing a description length of the Naive Bayes model with a description length of a baseline model;
  replacing the baseline model with the Naïve Bayes model, if the description length of the Naive Bayes model is less than the description length of the baseline model;
  extending a plurality of single-predictor models in rank order, stepwise, to multi-predictor features; and
  testing whether each new feature should be included in or should replace a current model state using minimum description length.
Dependent claims: 9, 10, 11, 12, 13, 14
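The comparing and replacing steps recited above can be sketched as follows. The code fits a Naïve Bayes model on the supplied predictor columns, scores both it and the unconditional baseline in bits, and keeps whichever is shorter. Everything here is an illustrative assumption rather than the patented procedure: the names `fit_prior`, `fit_naive_bayes`, `description_length`, and `choose_model` are hypothetical, Laplace smoothing (`alpha`) is added for numerical safety, and the score is an assumed two-part code (conditional log-loss of the targets plus 0.5 * log2(n) bits per free parameter).

```python
import math
from collections import Counter


def fit_prior(targets):
    """Baseline model: predicts the unconditional target distribution."""
    n = len(targets)
    prior = {c: k / n for c, k in Counter(targets).items()}
    return (lambda row: prior), len(prior) - 1


def fit_naive_bayes(rows, targets, alpha=1.0):
    """Naive Bayes over categorical rows; returns (predict, n_params)."""
    n = len(targets)
    class_counts = Counter(targets)
    n_cols = len(rows[0])
    n_values = [len({r[j] for r in rows}) for j in range(n_cols)]
    # Count how often value v appears in column j together with class t.
    joint = Counter(
        (j, v, t) for r, t in zip(rows, targets) for j, v in enumerate(r)
    )

    def predict(row):
        log_p = {}
        for c, k in class_counts.items():
            lp = math.log(k / n)
            for j, v in enumerate(row):
                lp += math.log(
                    (joint[(j, v, c)] + alpha) / (k + alpha * n_values[j])
                )
            log_p[c] = lp
        top = max(log_p.values())
        z = sum(math.exp(lp - top) for lp in log_p.values())
        return {c: math.exp(lp - top) / z for c, lp in log_p.items()}

    n_params = (len(class_counts) - 1) + sum(
        len(class_counts) * (m - 1) for m in n_values
    )
    return predict, n_params


def description_length(predict, n_params, rows, targets):
    """Assumed two-part code: target log-loss in bits plus parameter cost."""
    data = -sum(math.log2(predict(r)[t]) for r, t in zip(rows, targets))
    return data + 0.5 * n_params * math.log2(len(targets))


def choose_model(rows, targets):
    """Keep Naive Bayes only if it encodes the targets in fewer bits."""
    base_dl = description_length(*fit_prior(targets), rows, targets)
    nb_dl = description_length(*fit_naive_bayes(rows, targets), rows, targets)
    name = "naive_bayes" if nb_dl < base_dl else "baseline"
    return name, base_dl, nb_dl


if __name__ == "__main__":
    rows = [(0,)] * 8 + [(1,)] * 8
    targets = ["n"] * 8 + ["p"] * 8
    print(choose_model(rows, targets))
```

With an informative predictor the Naïve Bayes model wins despite its larger parameter cost. With an uninformative predictor the extra parameters buy no reduction in data bits, so the baseline is retained.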
15. A computer program product for generating an Adaptive Bayes Network data mining model, comprising:
a computer readable medium;
computer program instructions, recorded on the computer readable medium, executable by a processor, for performing the steps of:
  receiving a data table having a plurality of predictor columns and a target column and comprising a plurality of rows of data;
  constructing a plurality of single-predictor models, comprising the steps of:
    computing a description length of a baseline model based on unconditional target probabilities among the plurality of rows;
    determining which predictor columns are correlated to the target column based on minimum description length;
    computing probabilities of at least two target values of the target column conditioned on at least two predictor values of at least one correlated predictor column; and
    computing a probability of at least one correlated predictor column conditioned on the at least two target values;
  ranking each predictor column by ranking each single-predictor model using minimum description length and selecting a best single predictor model;
  performing feature selection based on a minimum of a specified number of predictors and as a function of a reduction in entropy attributable to the best single predictor model;
  constructing a Naïve Bayes model using a top-ranked portion of the plurality of predictor columns;
  comparing a description length of the Naive Bayes model with a description length of a baseline model;
  replacing the baseline model with the Naïve Bayes model, if the description length of the Naive Bayes model is less than the description length of the baseline model;
  extending a plurality of single-predictor models in rank order, stepwise, to multi-predictor features; and
  testing whether each new feature should be included in or should replace a current model state using minimum description length.
Dependent claims: 16, 17, 18, 19, 20, 21
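The extending and testing steps can be illustrated with a simplified greedy sketch. It starts from the top-ranked predictor, tries joining each further predictor in rank order into a multi-predictor feature, and keeps the extension only when the description length drops. This is a rough reading of the claim language, not the patented algorithm: the names `feature_bits` and `extend_feature` are hypothetical, a feature is modeled as a plain tuple of column names, the ranked list is assumed to be non-empty, and the score is an assumed two-part code.

```python
import math
from collections import Counter, defaultdict


def feature_bits(table, feature, target):
    """Assumed description length of the target given a joint feature.

    `feature` is a tuple of column names; rows sharing the same joint
    value of those columns fall into one cell with its own target
    distribution.
    """
    targets = table[target]
    n = len(targets)
    cells = defaultdict(Counter)
    for i, t in enumerate(targets):
        cells[tuple(table[col][i] for col in feature)][t] += 1
    data = 0.0
    for counts in cells.values():
        size = sum(counts.values())
        data -= sum(k * math.log2(k / size) for k in counts.values())
    n_params = len(cells) * (len(set(targets)) - 1)
    return data + 0.5 * n_params * math.log2(n)


def extend_feature(table, ranked, target):
    """Greedily grow a feature in rank order while the code gets shorter."""
    feature = (ranked[0],)
    best = feature_bits(table, feature, target)
    for col in ranked[1:]:
        candidate = feature + (col,)
        bits = feature_bits(table, candidate, target)
        if bits < best:  # the new feature replaces the current state
            feature, best = candidate, bits
    return feature, best


if __name__ == "__main__":
    table = {
        "a": [0, 0, 1, 1] * 4,
        "b": [0, 1, 0, 1] * 4,
        "c": [0] * 8 + [1] * 8,  # unrelated to the target
        "y": [0, 1, 1, 0] * 4,   # y = a XOR b
    }
    print(extend_feature(table, ["a", "b", "c"], "y"))
```

The XOR example shows why joint features matter: neither `a` nor `b` alone shortens the encoding, but the pair determines the target exactly, while adding `c` only doubles the parameter cost and is rejected.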
Specification