Process database entries to provide predictions of future data values
Abstract
The present invention relates to a method and apparatus for processing entries stored in an electronic database where each entry comprises a succession of data values of correlated variables. The entries are processed in order to provide predictions of future data values of the correlated variables. The invention has particular application to processing data entries that relate to customers of a bank or retail business so as to predict future data values of attributes of the customers. A prior distribution over the parameters of a first of the variables is accessed from the database and data values of the first variable are used to derive a posterior distribution of the parameters of the first variable. A prior distribution over the parameters of a second of the variables is accessed from the database, the second of the variables being correlated with the first. Data values of the second variable are accessed to derive a posterior distribution of the parameters of the second variable. A statistical sampler samples the parameters of each posterior distribution to provide estimates of the parameters of the posterior distributions. Predictive distributions of the first and second variables are computed from the estimates. Predicted data values for the first and second correlated variables are then formulated from the predictive distributions and known values for the variables.
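The sequence described in the abstract (prior, posterior, sampling, predictive distribution, prediction) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the patent: the two variables (monthly spend and account balance), the Normal model with known noise standard deviations, the Normal priors on the means, the fixed correlation `RHO`, and all function names and data values.

```python
import random
import statistics


def posterior_normal_mean(prior_mean, prior_var, data, noise_var):
    """Conjugate update for the mean of a Normal with known noise variance."""
    precision = 1.0 / prior_var + len(data) / noise_var
    post_var = 1.0 / precision
    post_mean = post_var * (prior_mean / prior_var + sum(data) / noise_var)
    return post_mean, post_var


def sample_posterior(post_mean, post_var, n_samples, rng):
    """Draw parameter samples (here: the mean) from a Normal posterior."""
    sd = post_var ** 0.5
    return [rng.gauss(post_mean, sd) for _ in range(n_samples)]


def predictive_given_other(mu_a, mu_b, sd_a, sd_b, rho, known_b):
    """Predictive mean and variance of variable A given a known value of B.

    Uses the sampled parameters of BOTH variables: each pair of samples
    gives a conditional mean of A, and the results are averaged.
    """
    cond_means = [a + rho * sd_a / sd_b * (known_b - b)
                  for a, b in zip(mu_a, mu_b)]
    cond_var = (1.0 - rho ** 2) * sd_a ** 2
    mean = statistics.fmean(cond_means)
    # Law of total variance: conditional noise plus parameter uncertainty.
    var = cond_var + statistics.pvariance(cond_means)
    return mean, var


# Hypothetical customer history (arbitrary units), assumed for illustration.
SPEND = [4.1, 3.8, 4.4, 4.0, 3.9]
BALANCE = [10.2, 9.7, 10.9, 10.1, 9.6]
SD_SPEND, SD_BALANCE, RHO = 0.25, 0.5, 0.8  # assumed known

rng = random.Random(0)
spend_post = posterior_normal_mean(0.0, 100.0, SPEND, SD_SPEND ** 2)
balance_post = posterior_normal_mean(0.0, 100.0, BALANCE, SD_BALANCE ** 2)
spend_mu = sample_posterior(*spend_post, 5000, rng)
balance_mu = sample_posterior(*balance_post, 5000, rng)

# Predict each variable from the known value of the other.
spend_pred = predictive_given_other(
    spend_mu, balance_mu, SD_SPEND, SD_BALANCE, RHO, known_b=11.0)
balance_pred = predictive_given_other(
    balance_mu, spend_mu, SD_BALANCE, SD_SPEND, RHO, known_b=4.5)
print("spend (mean, var):", spend_pred)
print("balance (mean, var):", balance_pred)
```

Because the assumed correlation is positive, a known balance above its estimated mean pulls the predicted spend above its own estimated mean, which is the sense in which both sets of parameter estimates contribute to each predictive distribution.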
18 Claims
1. A method of operating a data processing apparatus to process variables stored in an electronic database so as to provide prediction of future values of the variables, the method comprising the steps of:
accessing a prior distribution over the parameters of a first of the variables;
accessing data values of the first variable to derive a posterior distribution of the parameters of the first variable;
accessing a prior distribution over the parameters of the second of the variables, the second of the variables being correlated with the first;
accessing data values of the second variable to derive a posterior distribution of the parameters of the second variable;
taking statistical samples of the parameters of each posterior distribution to provide estimates of the parameters of the posterior distributions of each of the first and the second variables;
computing a predictive distribution of the first variable from said estimates, both the estimate of the parameters of the posterior distribution of the first variable and the estimate of the parameters of the posterior distribution of the second variable being used in computing the predictive distribution of the first variable;
computing a predictive distribution of the second variable from the said estimates, both the estimate of the parameters of the posterior distribution of the first variable and the estimate of the parameters of the posterior distribution of the second variable being used in computing the predictive distribution of the second variable; and
formulating predicted data values for the first and second correlated variables from said predictive distributions and known values for the variables.
Dependent claims: 2, 3, 4, 11, 12, 13, 14
5. A method of operating a data processing apparatus to process variables stored in an electronic database so as to provide prediction of future values of the variables, the method comprising the steps of:
accessing a prior distribution over the parameters of a first plurality of the variables, the first variables being continuous variables;
accessing data values of the first variables to derive a posterior distribution of the parameters of the first variables;
accessing a prior distribution over the parameters of a second plurality of the variables, the second variables being discrete variables correlated with the first variables;
accessing data values of the second variables to derive a posterior distribution of the parameters of the second variables;
taking statistical samples of the parameters of each posterior distribution to provide estimates of the parameters of the posterior distributions of each of the first and the second plurality of variables;
computing a predictive distribution of the first variables from said estimates, both the estimate of the parameters of the posterior distribution of the first variables and the estimate of the parameters of the posterior distribution of the second variables being used in computing the predictive distribution of the first variables;
computing a predictive distribution of the second variables from said estimates, both the estimate of the parameters of the posterior distribution of the first variables and the estimate of the parameters of the posterior distribution of the second variables being used in computing the predictive distribution of the second variables; and
formulating predicted data values for the first and second correlated variables from said predictive distributions and known values for the variables.
Dependent claims: 15, 16, 17, 18
6. Data processing apparatus to process variables stored in an electronic database so as to provide prediction of future values of the variables, the apparatus comprising:
means for accessing a prior distribution over the parameters of a first of the variables;
means for accessing data values of the first variable to derive a posterior distribution of the parameters of the first variable;
means for accessing a prior distribution over the parameters of a second of the variables, the second of the variables being correlated with the first;
means for accessing data values of the second variable to derive a posterior distribution of the parameters of the second variable;
a statistical sampler of the parameters of each posterior distribution for providing estimates of the parameters of the posterior distributions of each of the first and the second variables;
computing means for computing a predictive distribution of the first variable and a predictive distribution of the second variable from the said estimates, the computing means employing both the estimate of the parameters of the posterior distribution of the first variable and the estimate of the parameters of the posterior distribution of the second variable to compute the predictive distribution of the first variable, the computing means further employing both the estimate of the parameters of the posterior distribution of the first variable and the estimate of the parameters of the posterior distribution of the second variable to compute the predictive distribution of the second variable; and
a predictor for predicting data values for the first and second correlated variables from said predictive distributions and known values for the variables.
Dependent claims: 7, 8, 9
10. Data processing apparatus to process variables stored in an electronic database so as to provide prediction of future values of the variables, the apparatus comprising:
means for accessing a prior distribution over the parameters of a first plurality of the variables, the first variables being continuous variables;
means for accessing data values of the first variables to derive posterior distributions of the parameters of the first variables;
means for accessing prior distributions over the parameters of a second plurality of the variables, the second variables being discrete variables correlated with the first;
means for accessing data values of the second variables to derive posterior distributions of the parameters of the second variables;
a statistical sampler for sampling the parameters of each posterior distribution for providing estimates of the parameters of the posterior distributions of each of the first and the second plurality of variables;
computing means for computing a predictive distribution of the first variables and a predictive distribution of the second variables from said estimates, the computing means employing both the estimate of the parameters of the posterior distribution of the first variables and the estimate of the parameters of the posterior distribution of the second variables to compute the predictive distribution of the first variables, the computing means further employing both the estimate of the parameters of the posterior distribution of the first variables and the estimate of the parameters of the posterior distribution of the second variables to compute the predictive distribution of the second variables; and
a predictor for predicting data values for the first and second correlated variables from said predictive distributions and known values for the variables.
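Claims 5 and 10 distinguish continuous first variables from discrete second variables. A minimal sketch of how that mixed case might look follows. It is an illustration under stated assumptions, not the patented implementation: one binary variable (a hypothetical mortgage flag) with a Beta(1, 1) prior, one continuous variable (hypothetical monthly spend) modelled as Normal within each group with known noise and a wide Normal prior on each group mean, and invented names and data throughout.

```python
import math
import random
import statistics


def normal_pdf(x, mean, sd):
    """Density of a Normal(mean, sd^2) distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def sample_parameters(flags, values, noise_sd, rng, n_samples=2000):
    """Draw (theta, mu0, mu1) samples from conjugate posteriors.

    theta: P(flag = 1), Beta(1, 1) prior (assumed).
    mu_k:  mean of the continuous variable in group k,
           Normal(0, 100^2) prior with known noise_sd (assumed).
    """
    ones = sum(flags)
    zeros = len(flags) - ones
    post = {}
    for k in (0, 1):
        vals = [v for f, v in zip(flags, values) if f == k]
        precision = 1.0 / 100.0 ** 2 + len(vals) / noise_sd ** 2
        post[k] = (sum(vals) / noise_sd ** 2 / precision, precision ** -0.5)
    return [(rng.betavariate(1 + ones, 1 + zeros),
             rng.gauss(*post[0]),
             rng.gauss(*post[1]))
            for _ in range(n_samples)]


def predict_flag(draws, known_value, noise_sd):
    """Predictive P(flag = 1 | known continuous value), averaged over draws."""
    probs = []
    for theta, mu0, mu1 in draws:
        w1 = theta * normal_pdf(known_value, mu1, noise_sd)
        w0 = (1.0 - theta) * normal_pdf(known_value, mu0, noise_sd)
        probs.append(w1 / (w0 + w1))
    return statistics.fmean(probs)


def predict_value(draws):
    """Predictive mean of the continuous variable when the flag is unknown."""
    return statistics.fmean(t * m1 + (1.0 - t) * m0 for t, m0, m1 in draws)


# Hypothetical customer records, assumed for illustration.
FLAGS = [0, 0, 0, 1, 1, 1, 1, 0]
VALUES = [10.0, 12.0, 11.0, 30.0, 29.0, 31.0, 28.0, 9.0]
NOISE_SD = 2.0

rng = random.Random(1)
draws = sample_parameters(FLAGS, VALUES, NOISE_SD, rng)
p_high = predict_flag(draws, 29.0, NOISE_SD)
p_low = predict_flag(draws, 10.0, NOISE_SD)
mean_value = predict_value(draws)
print(p_high, p_low, mean_value)
```

In this sketch the discrete prediction weighs the sampled proportion against the sampled group means, and the continuous prediction mixes the group means by the sampled proportion, so each predictive quantity draws on the parameter estimates of both kinds of variable.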
Specification