Training machine learning by sequential conditional generalized iterative scaling
Abstract
A system and method facilitating training machine learning systems utilizing sequential conditional generalized iterative scaling is provided. The invention includes an expected value update component that modifies an expected value based, at least in part, upon a feature function of an input vector and an output value, a sum of lambda variable and a normalization variable. The invention further includes an error calculator that calculates an error based, at least in part, upon the expected value and an observed value. The invention also includes a parameter update component that modifies a trainable parameter based, at least in part, upon the error. A variable update component that updates at least one of the sum of lambda variable and the normalization variable based, at least in part, upon the error is also provided.
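The four components named in the abstract can be sketched as a single pass over the feature functions. The Python below is a minimal illustration under stated assumptions, not the patented implementation: features are taken to be binary-valued and stored as lists of (j, y) pairs where they fire, `scgis_pass` and its argument names are hypothetical, and because the error equation is not reproduced in this text, the error is supplied by the caller as `error_fn`. The log-ratio used in the usage lines is an assumption borrowed from iterative-scaling practice for binary features, not the claimed equation.

```python
import math

def scgis_pass(features, observed, lam, s, z, error_fn):
    """One sequential pass over all feature functions (hypothetical helper).

    features[i] : list of (j, y) pairs where binary feature i is non-zero
    observed[i] : observed value of feature i in the training data
    lam[i]      : trainable parameter for feature i
    s[j][y]     : sum of lambda variable
    z[j]        : normalization variable, kept equal to sum_y exp(s[j][y])
    error_fn    : caller-supplied callable (observed, expected) -> error
    """
    for i, nonzero in enumerate(features):
        # Expected value update component.
        expected = 0.0
        for j, y in nonzero:
            expected += math.exp(s[j][y]) / z[j]
        # Error calculator (equation supplied by the caller).
        delta = error_fn(observed[i], expected)
        # Parameter update component.
        lam[i] += delta
        # Variable update component: refresh s and z right away so the
        # next feature already sees the new parameter.
        for j, y in nonzero:
            z[j] -= math.exp(s[j][y])
            s[j][y] += delta
            z[j] += math.exp(s[j][y])
    return lam

def log_ratio(obs, exp_value):
    # Assumed error function for binary features.
    return math.log(obs / exp_value)

# Usage: one feature firing for output 0 on four instances, three of
# which carry label 0 in this made-up training set.
features = [[(0, 0), (1, 0), (2, 0), (3, 0)]]
observed = [3.0]
lam = [0.0]
s = [[0.0, 0.0] for _ in range(4)]
z = [2.0] * 4  # two outputs, all parameters zero
for _ in range(200):
    scgis_pass(features, observed, lam, s, z, log_ratio)
```

In this toy setup the model's probability for output 0 moves toward 0.75, the observed rate, while `z[j]` stays consistent with `s[j]` without ever being recomputed from scratch.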
39 Citations
20 Claims
1. A system for training a machine learning system, comprising:
- an expected value update component that, for a plurality of outputs and for a plurality of instances in which a single feature function is non-zero, modifies an expected value based, at least in part, upon the single feature function of an input vector and an output value, a sum of lambda variable and a normalization variable;
- an error calculator that calculates an error based, at least in part, upon the expected value and an observed value, the error calculation further employing, at least in part, the following equation;
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A system for training a machine learning system, comprising:
- an expected value update component that, for a plurality of outputs and for a plurality of instances in which a single feature function is non-zero, modifies an expected value based, at least in part, upon the single feature function of an input vector and an output value, a sum of lambda variable and a normalization variable, modification of the expected value being based, at least in part, upon the following equation;
$$\text{expected value} = \text{expected value} + f_i(\bar{x}_j, y)\,\frac{e^{s[j,y]}}{z[j]}$$
where $f_i(\bar{x}_j, y)$ is the feature function, $\bar{x}_j$ is the input vector, $y$ is the output value, $s[j,y]$ is the sum of lambda variable, and $z[j]$ is the normalization variable;
- an error calculator that calculates an error based, at least in part, upon the expected value and an observed value;
- a parameter update component that modifies class trainable parameters or word trainable parameters based, at least in part, upon the error; and
- a variable update component that, for the plurality of outputs and for the plurality of instances in which the feature function is non-zero, sequentially updates at least one of the sum of lambda variable and the normalization variable based, at least in part, upon the error.
Dependent claims: 12, 13
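The expected value update recited in claim 11 can be illustrated with a short accumulation loop. This is a sketch under assumptions, not the claimed implementation: the function name `expected_value` and the representation of the non-zero instances as `(j, y, f_value)` triples are hypothetical choices made for the example.

```python
import math

def expected_value(nonzero, s, z):
    """Accumulate the expected value of one feature function.

    nonzero : list of (j, y, f_value) triples, one per instance j and
              output y for which the feature function is non-zero
    s[j][y] : sum of lambda variable
    z[j]    : normalization variable
    """
    expected = 0.0
    for j, y, f_value in nonzero:
        # expected value = expected value + f_i(x_j, y) * e^{s[j,y]} / z[j]
        expected += f_value * math.exp(s[j][y]) / z[j]
    return expected

# Usage: two instances, two outputs, all parameters zero, so every
# output has probability e^0 / 2 = 0.5 and each term contributes 0.5.
s = [[0.0, 0.0], [0.0, 0.0]]
z = [2.0, 2.0]
value = expected_value([(0, 0, 1.0), (1, 1, 1.0)], s, z)
```

Only the instances and outputs where the feature is non-zero are visited, which is why the claim restricts the update to that set.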
14. A method for training a machine learning system, comprising:
- for each feature function, updating an expected value based, at least in part, upon a feature function of an input vector and an output value, a sum of lambda variable and a normalization variable;
- for each feature function, calculating an error based, at least in part, upon the expected value and an observed value, the error calculation being based, at least in part, upon the following equation;
Dependent claims: 15
16. A method for training a machine learning system, comprising:
- updating an expected value based, at least in part, upon a feature function of an input vector and an output value, a sum of lambda variable and a normalization variable, for each output, for each instance that the feature function is not zero;
- calculating an error based, at least in part, upon the expected value and an observed value, the error calculation further employing, at least in part, the following equation;
Dependent claims: 17
18. A computer implemented method for training a learning system, comprising the following computer executable acts:
training trainable class parameters based, at least in part, upon sequential conditional generalized iterative scaling an input vector, an output value, and calculating an error employing, at least in part, the following equation;
19. A computer readable medium storing computer executable components of a system facilitating training of a machine learning system, comprising:
- an expected value update component that modifies an expected value for a plurality of outputs and for a plurality of instances in which a single feature function is non-zero based, at least in part, upon the single feature function of an input vector and an output value, a sum of lambda variable and a normalization variable;
- an error calculator component that calculates an error based, at least in part, upon the expected value and an observed value;
- a parameter update component that modifies a trainable parameter based, at least in part, upon the error; and
- a variable update component that sequentially updates at least one of the sum of lambda variable and the normalization variable for the plurality of outputs and for the plurality of instances in which the feature function is non-zero based, at least in part, upon the error, the updating of the sum of lambda variable and the normalization variable being based, at least in part, upon the following equations;
$$z[j] = z[j] - e^{s[j,y]}$$
$$s[j,y] = s[j,y] + \delta_i$$
$$z[j] = z[j] + e^{s[j,y]}$$
where $s[j,y]$ is the sum of lambda variable, $z[j]$ is the normalization variable, and $\delta_i$ is the error.
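The three equations of claim 19 amount to swapping one term of the normalization sum: the old contribution of s[j,y] is subtracted, s[j,y] is shifted by the error, and the new contribution is added back. A minimal sketch follows, assuming binary-valued features stored as `(j, y)` pairs; the function name `update_variables` is hypothetical.

```python
import math

def update_variables(nonzero, s, z, delta):
    """Apply the three update equations for one feature function.

    nonzero : list of (j, y) pairs where the feature function is non-zero
    s[j][y] : sum of lambda variable
    z[j]    : normalization variable
    delta   : error for this feature function
    """
    for j, y in nonzero:
        z[j] -= math.exp(s[j][y])   # z[j] = z[j] - e^{s[j,y]}
        s[j][y] += delta            # s[j,y] = s[j,y] + delta_i
        z[j] += math.exp(s[j][y])   # z[j] = z[j] + e^{s[j,y]}

# Usage: one instance with two outputs, starting from all-zero sums.
s = [[0.0, 0.0]]
z = [2.0]
update_variables([(0, 0)], s, z, math.log(2.0))
```

If `z[j]` equals the sum of `exp(s[j][y])` over all outputs before the call, it still does afterwards, so the normalization never has to be recomputed over every output.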
20. A training system for a machine learning system, comprising:
- means for modifying an expected value for a plurality of outputs and for a plurality of instances in which a feature function is non-zero based, at least in part, upon the feature function of an input vector and an output value, a sum of lambda variable and a normalization variable;
- means for calculating an error based, at least in part, upon the expected value and an observed value, the means for error calculation further employing, at least in part, the following equation;
Specification