Efficient gradient computation for conditional Gaussian graphical models
Abstract
The subject invention leverages standard probabilistic inference techniques to determine a log-likelihood for a conditional Gaussian graphical model of a data set with at least one continuous variable and with data not observed for at least one of the variables. This provides an efficient means to compute gradients for CG models with continuous variables and incomplete data observations. The subject invention allows gradient-based optimization processes to employ gradients to iteratively adapt parameters of models in order to improve incomplete data log-likelihoods and identify maximum likelihood estimates (MLE) and/or local maxima of the incomplete data log-likelihoods. Conditional Gaussian local gradients along with conditional multinomial local gradients determined by the subject invention can be utilized to facilitate in providing parameter gradients for full conditional Gaussian models.
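As background for the abstract's statement that gradients of the incomplete-data log-likelihood can be obtained through probabilistic inference, the standard identity below may help. It is a textbook result (often called Fisher's identity), stated here as context rather than quoted from the patent. The symbols are assumptions for illustration: $y$ is the observed part of a data case, $x$ is the corresponding complete case, and $\theta$ is the parameter vector.

```latex
\frac{\partial}{\partial \theta} \log p(y \mid \theta)
  = \mathbb{E}_{x \sim p(x \mid y, \theta)}
    \left[ \frac{\partial}{\partial \theta} \log p(x \mid \theta) \right]
```

Read this way, the gradient for an incomplete case is the complete-data gradient averaged over the posterior of the unobserved values, and that posterior is what an inference (propagation) step supplies.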
27 Claims
1. A system that facilitates statistical modeling in an artificial intelligence application, comprising:
  a processor;
  a memory communicatively coupled to the processor, the memory having stored therein computer-executable instructions configured to implement the system, including;
    a gradient determination component that utilizes a data set for a set of variables and probabilistic inference to determine parameter gradients for a log-likelihood of a conditional Gaussian (CG) graphical model over those variables with at least one continuous variable and with incomplete observation data for at least one of the variables, the CG graphical model employed to deduce a cause of a given outcome represented by the data set;
    wherein the parameter gradients comprise conditional multinomial local gradients and at least one conditional Gaussian local gradient, the gradient determination component determines the conditional multinomial local gradients by performing a line search to update the parameters of an exponential model representation, converting the updated parameterization to a non-exponential representation, and utilizing a propagation scheme on the non-exponential representation to compute the next gradient.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 27
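The loop recited in claim 1 (line search in an exponential representation, conversion to a non-exponential representation, inference to obtain the next gradient) can be sketched for a single multinomial variable. This is a minimal illustration under stated assumptions, not the patented implementation: every function name is hypothetical, an incomplete observation is modeled as a set of states the variable might have taken, the posterior is computed by direct enumeration as a stand-in for a propagation scheme, and a simple backtracking rule stands in for whatever line search is actually used.

```python
import math

def softmax(theta):
    """Convert exponential-form parameters to probabilities (non-exponential form)."""
    m = max(theta)
    weights = [math.exp(t - m) for t in theta]
    z = sum(weights)
    return [w / z for w in weights]

def log_likelihood(p, observations):
    """Incomplete-data log-likelihood; each observation is a set of candidate states."""
    return sum(math.log(sum(p[s] for s in obs)) for obs in observations)

def gradient(p, observations):
    """Gradient w.r.t. theta: expected state counts minus N * p.

    The posterior over states for each observation is computed by enumeration
    (an assumption standing in for a propagation scheme)."""
    expected = [0.0] * len(p)
    for obs in observations:
        z = sum(p[s] for s in obs)
        for s in obs:
            expected[s] += p[s] / z
    n = len(observations)
    return [expected[s] - n * p[s] for s in range(len(p))]

def fit(observations, num_states, iters=200):
    """Gradient ascent with a backtracking line search in theta space."""
    theta = [0.0] * num_states
    for _ in range(iters):
        p = softmax(theta)                  # convert to probabilities
        g = gradient(p, observations)       # inference gives the gradient
        base = log_likelihood(p, observations)
        step = 1.0
        while step > 1e-10:
            trial = [t + step * gi for t, gi in zip(theta, g)]
            if log_likelihood(softmax(trial), observations) > base:
                theta = trial               # accept the improving step
                break
            step *= 0.5                     # otherwise shrink the step
        else:
            break                           # no improving step found: stop
    return softmax(theta)

if __name__ == "__main__":
    data = [{0}, {0, 1}, {1, 2}, {2}, {0}]  # sets with >1 element are incomplete
    print(fit(data, 3))
```

Optimizing over `theta` rather than over the probabilities directly means every trial point in the line search is a valid distribution, so no projection or constraint handling is needed.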
12. A method for facilitating statistical modeling in an artificial intelligence application, comprising:
  employing a processor executing computer-executable instructions stored on a computer-readable storage medium to implement the following acts;
  receiving at least one data set representing an outcome, the at least one data set having incomplete observation data for a set of variables with at least one continuous variable;
  utilizing the data set and probabilistic inference to determine parameter gradients for a log-likelihood of a conditional Gaussian (CG) graphical model over those variables;
  representing the CG graphical model as a recursive exponential mixed model (REMM);
  determining a conditional Gaussian regression in the REMM representation model;
  utilizing a chain rule to transform a resulting parameter gradient expression into a parameter representation of $c$, $\beta$, and $\sigma$, representing an intercept, coefficients, and variance, respectively, for a regression on continuous parents;
  employing probabilistic inference to determine quantities in the parameter gradient expression;
  converting a resulting parameterization into a non-exponential representation;
  utilizing a propagation scheme on the non-exponential parameterization in order to compute at least one gradient;
  assigning, for all families $(X_v, X_{pa(v)})$ where $X_v$ is a continuous variable, each configuration of discrete parent variables $x^d_{pa(v)}$ a sum vector and a sum of squares matrix for expected posterior statistics associated with conditional distributions for the continuous variable and its continuous parents $(X_v, X^c_{pa(v)})$ given configurations of discrete parents;
  for each data case in a data set:
    utilizing a propagation scheme for Bayesian networks with conditional Gaussian distributions to determine a mean vector and a covariance matrix associated with a posterior marginal distribution for all families $(X_v, X_{pa(v)})$, where $X_v$ is a continuous variable;
    employing posterior conditional means and variances to determine expected values $(x_v^*, (x^c_{pa(v)})^*)$ and $((x_v x_v)^*, (x_v x^c_{pa(v)})^*, (x_v (x^c_{pa(v)})')^*, ((x^c_{pa(v)})' X^c_{pa(v)})^*)$ for continuous variables in each of these families given each configuration of discrete variables in the respective families;
    adding the expected values to the sum and the sum of squares expected posterior statistics; and
  determining the parameter gradients utilizing a formula for a sample of incomplete observations;
  employing the CG graphical model to deduce a cause of the given outcome in an artificial intelligence application; and
  displaying the deduced cause on a display device.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 26
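The accumulation of a sum vector and a sum of squares matrix described in claim 12 can be sketched for the smallest interesting family: one continuous child, one continuous parent, and one discrete parent. This is a hedged illustration only. All names are hypothetical, the discrete parent is assumed to be observed, posterior moments come from direct bivariate Gaussian conditioning rather than a junction-tree propagation scheme, and the gradient expressions are the ordinary derivatives of a linear-Gaussian log-density with respect to $c$, $\beta$, and $\sigma$ (variance), not a formula copied from the patent.

```python
def posterior_moments(case, mean, cov):
    """Posterior mean vector and covariance matrix of (x_v, x_c).

    `case` is a pair (x_v, x_c); None marks a missing value. `mean` and `cov`
    are the current joint moments for this discrete configuration."""
    xv, xc = case
    if xv is not None and xc is not None:
        return [xv, xc], [[0.0, 0.0], [0.0, 0.0]]      # fully observed
    if xv is None and xc is None:
        return list(mean), [row[:] for row in cov]      # nothing observed
    if xv is None:                                      # condition x_v on x_c
        m = mean[0] + cov[0][1] / cov[1][1] * (xc - mean[1])
        v = cov[0][0] - cov[0][1] ** 2 / cov[1][1]
        return [m, xc], [[v, 0.0], [0.0, 0.0]]
    m = mean[1] + cov[1][0] / cov[0][0] * (xv - mean[0])  # condition x_c on x_v
    v = cov[1][1] - cov[1][0] ** 2 / cov[0][0]
    return [xv, m], [[0.0, 0.0], [0.0, v]]

def expected_statistics(data, params):
    """Accumulate expected counts, sums, and sums of squares per discrete value.

    `data` is a list of (d, (x_v, x_c)); `params` maps d to (mean, cov)."""
    stats = {d: {"n": 0, "sum": [0.0, 0.0], "sq": [[0.0, 0.0], [0.0, 0.0]]}
             for d in params}
    for d, case in data:
        m, c = posterior_moments(case, *params[d])
        s = stats[d]
        s["n"] += 1
        for i in range(2):
            s["sum"][i] += m[i]
            for j in range(2):
                s["sq"][i][j] += c[i][j] + m[i] * m[j]  # E[x_i x_j]
    return stats

def regression_gradient(s, c, beta, sigma):
    """Gradient of the expected log-density of x_v ~ N(c + beta * x_c, sigma)."""
    n, (sv, sc) = s["n"], s["sum"]
    svv, svc, scc = s["sq"][0][0], s["sq"][0][1], s["sq"][1][1]
    g_c = (sv - n * c - beta * sc) / sigma
    g_beta = (svc - c * sc - beta * scc) / sigma
    resid = (svv - 2 * c * sv - 2 * beta * svc
             + n * c * c + 2 * c * beta * sc + beta * beta * scc)
    g_sigma = -n / (2 * sigma) + resid / (2 * sigma ** 2)
    return g_c, g_beta, g_sigma
```

For the result to be a gradient of the incomplete-data log-likelihood, the `mean` and `cov` passed in must be the joint moments implied by the same $c$, $\beta$, and $\sigma$ at which the gradient is evaluated; the sketch leaves that bookkeeping to the caller.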
25. A system that facilitates statistical modeling in an artificial intelligence application, comprising:
  a processor;
  a memory communicatively coupled to the processor, the memory having stored therein computer-executable instructions configured to implement the system, including;
    means for receiving at least one data set representing a given outcome, the at least one data set containing a set of variables;
    means for utilizing the data set for the set of variables and probabilistic inference to determine parameter gradients for a log-likelihood of a conditional Gaussian (CG) graphical model over those variables with at least one continuous variable and with incomplete observation data for at least one of the variables, the CG graphical model employed to deduce a cause of the given outcome in an artificial intelligence application;
    means for optimizing at least one gradient parameter via utilization of an exponential model representation that automatically enforces that $p^s \geq 0$ and $\sum_s p^s = 1$ to facilitate in optimizing at least one gradient parameter;
    means for converting the resulting parameterization into a non-exponential representation;
    means for utilizing a propagation scheme on the non-exponential parameterization in order to compute at least one gradient; and
    means for displaying the parameter gradients on a display device.
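One common exponential representation with the property claim 25 relies on is the softmax form below. It is offered as an illustrative assumption, since the claim does not spell out the exact parameterization; $\theta^s$ denotes an unconstrained real parameter for state $s$ and $\delta_{st}$ is the Kronecker delta.

```latex
p^s = \frac{\exp(\theta^s)}{\sum_{s'} \exp(\theta^{s'})},
\qquad
\frac{\partial p^s}{\partial \theta^t} = p^s \left( \delta_{st} - p^t \right)
```

Every choice of $\theta$ yields nonnegative (in fact strictly positive) probabilities that sum to one, so an optimizer can move freely in $\theta$ space, and the derivative on the right is the chain-rule factor that maps between gradients in the two representations.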
Specification