Optimized training of linear machine learning models
First Claim
1. A system, comprising:
- one or more computing devices configured to:
receive, at a machine learning service of a provider network, an indication of a data source to be used for generating a linear prediction model, wherein, to generate a prediction, the linear prediction model is to utilize respective weights assigned to individual ones of a plurality of features derived from observation records of the data source, wherein the respective weights are stored in a parameter vector of the linear prediction model and updated in-memory during a machine training phase of the linear prediction model;
determine, based at least in part on examination of a particular set of observation records of the data source, respective weights for one or more features to be added to the parameter vector during a particular learning iteration of a plurality of learning iterations of the training phase of the linear prediction model, wherein the addition increases memory consumption during the machine training phase;
check, during one or more of the plurality of learning iterations, for a triggering condition to prune the parameter vector;
in response to a determination that the triggering condition has been met during the training phase, identify one or more pruning victims from a set of features whose weights are included in the parameter vector, based at least in part on a quantile analysis of the weights, wherein the quantile analysis is performed without a sort operation; and
remove at least a particular weight corresponding to a particular pruning victim of the one or more pruning victims from the parameter vector, wherein the removal reduces memory consumption during the training phase; and
generate, during a post-training-phase prediction run of the linear prediction model, a prediction using at least one feature for which a weight is determined after the particular weight of the particular pruning victim is removed from the parameter vector.
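The victim-selection step of the claim can be illustrated with a short sketch. It is not the patented implementation. It assumes that the parameter vector is a dictionary mapping feature names to weights, that the quantile analysis ranks features by absolute weight, and that a quickselect routine stands in for "without a sort operation". The names `kth_smallest` and `pick_pruning_victims` are hypothetical.

```python
import random


def kth_smallest(values, k):
    """Return the k-th smallest value (0-based) using quickselect, not a full sort."""
    items = list(values)
    while True:
        pivot = random.choice(items)
        lower = [v for v in items if v < pivot]
        equal = [v for v in items if v == pivot]
        if k < len(lower):
            items = lower                      # answer lies below the pivot
        elif k < len(lower) + len(equal):
            return pivot                       # the pivot is the answer
        else:
            k -= len(lower) + len(equal)       # answer lies above the pivot
            items = [v for v in items if v > pivot]


def pick_pruning_victims(weights, fraction):
    """Select features whose absolute weight falls in the lowest `fraction` quantile.

    `weights` is an assumed dict of feature name -> weight.
    """
    n_victims = int(len(weights) * fraction)
    if n_victims == 0:
        return set()
    # Quantile boundary found by selection rather than by sorting all weights.
    threshold = kth_smallest((abs(w) for w in weights.values()), n_victims - 1)
    victims = {f for f, w in weights.items() if abs(w) < threshold}
    # Features tied at the boundary fill any remaining victim slots.
    for f, w in weights.items():
        if len(victims) >= n_victims:
            break
        if abs(w) == threshold:
            victims.add(f)
    return victims


weights = {"a": 0.9, "b": -0.01, "c": 0.4, "d": 0.02}
for feature in pick_pruning_victims(weights, 0.5):
    del weights[feature]                       # removal shrinks the in-memory vector
```

Quickselect runs in expected linear time, so finding the boundary costs less than sorting every weight. An approximate quantile estimate computed from a sample of the weights would be another way to avoid the sort.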
Abstract
An indication of a data source to be used to train a linear prediction model is obtained. The model is to generate predictions using respective parameters assigned to a plurality of features derived from observation records of the data source. The parameter values are stored in a parameter vector. During a particular learning iteration of the training phase of the model, one or more features for which parameters are to be added to the parameter vector are identified. In response to a triggering condition, parameters for one or more features are removed from the parameter vector based on an analysis of relative contributions of the features represented in the parameter vector to the model's predictions. After the parameters are removed, at least one parameter is added to the parameter vector.
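The training flow summarized in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the method disclosed in the specification. It assumes a squared-error stochastic gradient update, a triggering condition based on the number of stored parameters, and absolute weight as a proxy for a feature's relative contribution. The names `train`, `predict`, `max_params`, and `prune_fraction` are hypothetical.

```python
import heapq


def train(records, max_params=4, prune_fraction=0.5, lr=0.1):
    """Train a sparse linear model, pruning the parameter vector when it grows too large."""
    params = {}  # parameter vector: feature name -> weight (assumed representation)
    for features, label in records:
        pred = sum(params.get(f, 0.0) * x for f, x in features.items())
        err = pred - label
        for f, x in features.items():
            # Features not seen before are added here, growing the vector.
            params[f] = params.get(f, 0.0) - lr * err * x
        if len(params) > max_params:  # assumed triggering condition
            n_victims = int(len(params) * prune_fraction)
            # Assumed contribution measure: smallest absolute weights go first.
            victims = heapq.nsmallest(n_victims, params, key=lambda f: abs(params[f]))
            for f in victims:
                del params[f]
    return params


def predict(params, features):
    """Post-training prediction using only the parameters that survived pruning."""
    return sum(params.get(f, 0.0) * x for f, x in features.items())


records = [
    ({"a": 1.0, "b": 1.0}, 1.0),
    ({"c": 1.0, "d": 1.0}, 0.0),
    ({"e": 1.0, "f": 1.0}, 1.0),  # pushes the vector past max_params, so pruning runs
    ({"g": 1.0}, 2.0),            # "g" is added after the pruning step
]
model = train(records)
```

In this run the zero-weight features "c" and "d" are removed when the third record triggers pruning, and the parameter for "g" is added afterwards, matching the abstract's last sentence.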
99 Citations
21 Claims
-
1. A system, comprising:
- one or more computing devices configured to:
receive, at a machine learning service of a provider network, an indication of a data source to be used for generating a linear prediction model, wherein, to generate a prediction, the linear prediction model is to utilize respective weights assigned to individual ones of a plurality of features derived from observation records of the data source, wherein the respective weights are stored in a parameter vector of the linear prediction model and updated in-memory during a machine training phase of the linear prediction model;
determine, based at least in part on examination of a particular set of observation records of the data source, respective weights for one or more features to be added to the parameter vector during a particular learning iteration of a plurality of learning iterations of the training phase of the linear prediction model, wherein the addition increases memory consumption during the machine training phase;
check, during one or more of the plurality of learning iterations, for a triggering condition to prune the parameter vector;
in response to a determination that the triggering condition has been met during the training phase, identify one or more pruning victims from a set of features whose weights are included in the parameter vector, based at least in part on a quantile analysis of the weights, wherein the quantile analysis is performed without a sort operation; and
remove at least a particular weight corresponding to a particular pruning victim of the one or more pruning victims from the parameter vector, wherein the removal reduces memory consumption during the training phase; and
generate, during a post-training-phase prediction run of the linear prediction model, a prediction using at least one feature for which a weight is determined after the particular weight of the particular pruning victim is removed from the parameter vector.
Dependent claims: 2, 3, 4, 5
-
6. A method, comprising:
- performing, by one or more computing devices:
receiving an indication of a data source to be used for training a machine learning model, wherein, to generate a prediction, the machine learning model is to utilize respective parameters assigned to individual ones of a plurality of features derived from observation records of the data source, wherein the respective parameters are stored in a parameter vector of the machine learning model and updated in-memory during a training phase of the machine learning model;
identifying one or more features for which respective parameters are to be added to the parameter vector during a particular learning iteration of a plurality of learning iterations of the training phase of the machine learning model, wherein the addition increases memory consumption during the training phase;
checking, during one or more of the plurality of learning iterations, for a triggering condition to prune the parameter vector;
in response to determining that the triggering condition has been met in the training phase, removing respective parameters of one or more pruning victim features from the parameter vector, wherein the removal reduces memory consumption during the training phase, and wherein the one or more pruning victim features are selected based at least in part on an analysis of relative contributions of features whose parameters are included in the parameter vector to predictions made using the machine learning model; and
generating, during a post-training-phase prediction run of the machine learning model, a particular prediction using at least one feature for which a parameter is determined after the one or more pruning victim features are selected.
Dependent claims: 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
-
17. A non-transitory computer-accessible storage medium storing program instructions that when executed on one or more processors implements a model generator of a machine learning service, wherein the model generator is configured to:
determine a data source to be used for generating a model, wherein, to generate a prediction, the model is to utilize respective parameters assigned to individual ones of a plurality of features derived from observation records of the data source, wherein the respective parameters are stored in a parameter vector of the model and updated in-memory during a training phase of the model;
identify one or more features for which parameters are to be added to the parameter vector during a particular learning iteration of a plurality of learning iterations of the training phase of the model, wherein the addition increases memory consumption during the training phase;
check, during one or more of the plurality of learning iterations, for a triggering condition to prune the parameter vector;
in response to a determination that the triggering condition has been met, remove respective parameters assigned to one or more pruning victim features from the parameter vector, wherein the removal reduces memory consumption during the training phase, and wherein the one or more pruning victim features are selected based at least in part on an analysis of relative contributions of features whose parameters are included in the parameter vector to predictions made using the model; and
add, subsequent to a removal from the parameter vector of at least one parameter assigned to a pruning victim feature, at least one parameter to the parameter vector.
Dependent claims: 18, 19, 20, 21
-
Specification