Feature processing tradeoff management
First Claim
1. A system, comprising:
- one or more computing devices configured to:
determine, via one or more programmatic interactions with a client of a machine learning service of a provider network, (a) one or more target variables to be predicted using a specified training data set, (b) one or more prediction quality metrics including a particular prediction quality metric, and (c) one or more prediction run-time goals including a particular prediction run-time goal;
identify a set of candidate feature processing transformations to derive a first set of processed variables from one or more input variables of the specified data set, wherein at least a subset of the first set of processed variables is usable to train a machine learning model to predict the one or more target variables, and wherein the set of candidate feature processing transformations includes a particular feature processing transformation;
determine (a) a quality estimate indicative of an effect, on the particular prediction quality metric, of implementing the particular candidate feature processing transformation, and (b) a cost estimate indicative of an effect, on a particular run-time performance metric associated with the particular prediction run-time goal, of implementing the particular candidate feature processing transformation;
generate, based at least in part on the quality estimate and at least in part on the cost estimate, a feature processing proposal to be provided to the client for approval, wherein the feature processing proposal includes a recommendation to implement the particular feature processing transformation; and
in response to an indication of approval from the client, execute a machine learning model trained using a particular processed variable obtained from the particular feature processing transformation.
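The flow recited in claim 1 (estimate quality and cost per candidate transformation, build a proposal, act only after client approval) can be sketched roughly as follows. This is a minimal illustration, not the claimed implementation: the `Candidate` record, the threshold-based selection rule, the function names, and all numbers are assumptions introduced here, and the claim itself does not specify how the estimates are combined.

```python
from dataclasses import dataclass


# Hypothetical record; field names and values are illustrative assumptions.
@dataclass(frozen=True)
class Candidate:
    name: str
    quality_gain: float      # estimated effect on the prediction quality metric
    added_latency_ms: float  # estimated effect on the run-time performance metric


def build_proposal(candidates, min_quality_gain, max_added_latency_ms):
    """Recommend candidates whose estimates satisfy both client-stated goals.

    A simple threshold rule is assumed here; other ways of weighing the
    quality estimate against the cost estimate are equally possible.
    """
    return [
        c.name
        for c in candidates
        if c.quality_gain >= min_quality_gain
        and c.added_latency_ms <= max_added_latency_ms
    ]


def run_if_approved(proposal, approved):
    """Stand-in for training and executing a model once the client approves."""
    if not approved:
        return None
    return {"trained_with": list(proposal)}


candidates = [
    Candidate("quantile_bin(age)", quality_gain=0.04, added_latency_ms=2.0),
    Candidate("ngram(review_text)", quality_gain=0.06, added_latency_ms=40.0),
    Candidate("log(income)", quality_gain=0.001, added_latency_ms=0.5),
]
proposal = build_proposal(candidates, min_quality_gain=0.01, max_added_latency_ms=10.0)
result = run_if_approved(proposal, approved=True)
```

In this sketch the n-gram transformation is left out because its assumed latency cost exceeds the run-time goal, and the log transformation is left out because its assumed quality gain is too small to justify it.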
Abstract
At a machine learning service, a set of candidate variables that can be used to train a model is identified, including at least one processed variable produced by a feature processing transformation. A cost estimate indicative of an effect of implementing the feature processing transformation on a performance metric associated with a prediction goal of the model is determined. Based at least in part on the cost estimate, a feature processing proposal that excludes the feature processing transformation is implemented.
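The cost-based exclusion described in the abstract can be illustrated with a small sketch. It assumes, purely for illustration, that each transformation has an estimated added cost in milliseconds and that the prediction goal is expressed as a total run-time budget; the function name, the cheapest-first ordering, and the sample figures are hypothetical and do not come from the patent.

```python
def split_by_budget(cost_estimates, budget_ms):
    """Partition transformations into kept and excluded under a run-time budget.

    cost_estimates maps a transformation name to its estimated added cost (ms).
    Transformations are admitted cheapest first; that ordering is an assumption.
    """
    kept, excluded, spent = [], [], 0.0
    for name, cost in sorted(cost_estimates.items(), key=lambda kv: kv[1]):
        if spent + cost <= budget_ms:
            kept.append(name)
            spent += cost
        else:
            excluded.append(name)
    return kept, excluded


# Illustrative cost estimates only.
costs = {
    "ngram(text)": 40.0,
    "quantile_bin(age)": 2.0,
    "cartesian(zip, age)": 6.0,
}
kept, excluded = split_by_budget(costs, budget_ms=10.0)
```

A proposal built from `kept` would exclude the transformation listed in `excluded`, which mirrors the abstract's case of a proposal that leaves out a transformation because of its cost estimate.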
21 Claims
-
1. A system, comprising:
- one or more computing devices configured to:
determine, via one or more programmatic interactions with a client of a machine learning service of a provider network, (a) one or more target variables to be predicted using a specified training data set, (b) one or more prediction quality metrics including a particular prediction quality metric, and (c) one or more prediction run-time goals including a particular prediction run-time goal;
identify a set of candidate feature processing transformations to derive a first set of processed variables from one or more input variables of the specified data set, wherein at least a subset of the first set of processed variables is usable to train a machine learning model to predict the one or more target variables, and wherein the set of candidate feature processing transformations includes a particular feature processing transformation;
determine (a) a quality estimate indicative of an effect, on the particular prediction quality metric, of implementing the particular candidate feature processing transformation, and (b) a cost estimate indicative of an effect, on a particular run-time performance metric associated with the particular prediction run-time goal, of implementing the particular candidate feature processing transformation;
generate, based at least in part on the quality estimate and at least in part on the cost estimate, a feature processing proposal to be provided to the client for approval, wherein the feature processing proposal includes a recommendation to implement the particular feature processing transformation; and
in response to an indication of approval from the client, execute a machine learning model trained using a particular processed variable obtained from the particular feature processing transformation.
Dependent claims: 2, 3, 4, 5
-
6. A method, comprising:
- performing, by one or more computing devices:
identifying, at a machine learning service, a set of candidate input variables usable to train a machine learning model to predict one or more target variables, wherein the set of candidate input variables includes at least a particular processed variable generated by a particular feature processing transformation applicable to one or more input variables of a training data set;
determining (a) a quality estimate indicative of an effect, on a particular prediction quality metric, of implementing the particular feature processing transformation, and (b) a cost estimate indicative of an effect, on a performance metric associated with a particular prediction goal, of implementing the particular feature processing transformation; and
implementing, based at least in part on the quality estimate and at least in part on the cost estimate, a feature processing plan that includes the particular feature processing transformation.
Dependent claims: 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17
-
18. A non-transitory computer-accessible storage medium storing program instructions that when executed on one or more processors:
identify, at a machine learning service, a set of candidate input variables usable to train a machine learning model to predict one or more target variables, wherein the set of candidate input variables includes at least a particular processed variable resulting from a particular feature processing transformation applicable to one or more input variables of a training data set;
determine a cost estimate indicative of an effect, on a performance metric associated with a particular prediction goal, of implementing the particular feature processing transformation; and
implement, based at least in part on the cost estimate, a feature processing proposal that excludes the particular feature processing transformation.
Dependent claims: 19, 20, 21
-
Specification