Staged training of neural networks for improved time series prediction performance
First Claim
1. An apparatus comprising a processor and a storage to store instructions that, when executed by the processor, cause the processor to perform operations comprising:
train, using initial neural network configuration data comprising neural network hyperparameters, a first neural network of a chain of neural networks to generate first neural network configuration data comprising the hyperparameters and first trained parameters learned by the first neural network, wherein:
the chain is to perform an analytical function to generate a set of output data values from a set of input data values;
the analytical function comprises the generation of a time series prediction that covers a selected full range of time;
the chain comprises a set of multiple neural networks that includes at least the first neural network and a last neural network;
the set of neural networks is ordered to form the chain starting with the first neural network at a head of the chain and ending with the last neural network at a tail of the chain;
each neural network in the chain comprises external inputs to receive the set of input data values;
each neural network in the chain comprises outputs at which the neural network outputs a portion of the set of output data values from the set of input data values during operation of the chain to perform the analytical function; and
the set of neural networks is interconnected within the chain such that each neural network in the chain, except the first neural network at the head of the chain, receives the outputs of a preceding neural network in the ordering of neural networks within the chain as additional inputs;
train, using the first neural network configuration data, a next neural network in the ordering of neural networks within the chain to generate a next neural network configuration data comprising the hyperparameters and next trained parameters learned by the next neural network;
use at least the first neural network configuration data, the next neural network configuration data, and additional data comprising an indication of interconnections among the neural networks within the chain to instantiate the chain of neural networks; and
operate the chain of neural networks to perform the analytical function.
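The staged training recited in the claim above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: each "network" is a single linear layer standing in for a real neural network, the forecast horizon is split evenly across the stages, and the names train_linear, predict, train_chain and run_chain are hypothetical. It also takes one possible reading of "train, using the first neural network configuration data": the already trained first stage supplies its outputs as additional inputs while the next stage is trained with the same hyperparameters.

```python
def train_linear(rows, targets, lr, epochs):
    """Fit outputs = W * row + b by per-sample gradient descent (stand-in network)."""
    n_in, n_out = len(rows[0]), len(targets[0])
    w = [[0.0] * n_in for _ in range(n_out)]
    b = [0.0] * n_out
    for _ in range(epochs):
        for x, y in zip(rows, targets):
            for j in range(n_out):
                err = b[j] + sum(wi * xi for wi, xi in zip(w[j], x)) - y[j]
                b[j] -= lr * err
                w[j] = [wi - lr * err * xi for wi, xi in zip(w[j], x)]
    return {"w": w, "b": b}


def predict(params, x):
    """Evaluate one stand-in network on a single feature row."""
    return [bj + sum(wi * xi for wi, xi in zip(wj, x))
            for wj, bj in zip(params["w"], params["b"])]


def train_chain(inputs, targets, hyper, n_stages):
    """Train stages head to tail; each stage covers one slice of the horizon."""
    horizon = len(targets[0])
    step = horizon // n_stages
    configs = []
    prev_outputs = [[] for _ in inputs]  # the head has no preceding network
    for s in range(n_stages):
        lo = s * step
        hi = horizon if s == n_stages - 1 else (s + 1) * step
        # External inputs plus the preceding stage's outputs as additional inputs.
        rows = [x + p for x, p in zip(inputs, prev_outputs)]
        stage_targets = [t[lo:hi] for t in targets]
        params = train_linear(rows, stage_targets, hyper["lr"], hyper["epochs"])
        configs.append({"hyper": hyper, "params": params})
        prev_outputs = [predict(params, r) for r in rows]
    return configs


def run_chain(configs, x):
    """Operate the chain: concatenate every stage's portion of the forecast."""
    forecast, prev = [], []
    for cfg in configs:
        prev = predict(cfg["params"], x + prev)
        forecast.extend(prev)
    return forecast


# Synthetic, noiseless linear series (an assumption made for the example).
series = [0.1 * t for t in range(16)]
inputs = [series[t:t + 2] for t in range(10)]        # two most recent values
targets = [series[t + 2:t + 6] for t in range(10)]   # next four values
hyper = {"lr": 0.1, "epochs": 2000}

configs = train_chain(inputs, targets, hyper, n_stages=2)
forecast = run_chain(configs, [0.3, 0.4])
```

Here the first stage predicts the first half of the four-step horizon and the second stage predicts the remainder, so the concatenated forecast covers the full selected range of time. A real system would replace train_linear with an actual neural network trainer.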
Abstract
An apparatus includes a processor to: train a first neural network of a chain to generate first configuration data including first trained parameters, wherein the chain performs an analytical function generating a set of output values from a set of input values, each neural network has inputs to receive the set of input values and outputs to output a portion of the set of output values, and the neural networks are ordered from the first at the head to a last neural network at the tail, and are interconnected so that each neural network additionally receives the outputs of a preceding neural network; train, using the first configuration data, a next neural network in the chain ordering to generate next configuration data including next trained parameters; and use at least the first and next configuration data and data indicating the interconnections to instantiate the chain to perform the analytical function.
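The final step of the abstract, instantiating the chain from per-network configuration data plus data indicating the interconnections, might look like the sketch below. The JSON layout, the field names ("networks", "links", "params"), and the functions order_chain and instantiate_chain are assumptions made for illustration; the source does not specify a storage format. Each network is again a single linear layer standing in for a real neural network.

```python
import json


def order_chain(names, links):
    """Return network names ordered head to tail from (source, destination) links."""
    successor = dict(links)
    destinations = set(successor.values())
    heads = [n for n in names if n not in destinations]
    if len(heads) != 1:
        raise ValueError("a chain needs exactly one head")
    order = [heads[0]]
    # The length guard stops the walk if the links accidentally form a cycle.
    while order[-1] in successor and len(order) <= len(names):
        order.append(successor[order[-1]])
    if len(order) != len(names):
        raise ValueError("links do not form a single chain over all networks")
    return order


def instantiate_chain(blob):
    """Build a callable chain from stored configuration and interconnection data."""
    spec = json.loads(blob)
    links = [tuple(link) for link in spec["links"]]
    order = order_chain(list(spec["networks"]), links)
    stages = [spec["networks"][name]["params"] for name in order]

    def run(x):
        outputs, prev = [], []
        for p in stages:
            # External inputs plus the preceding network's outputs.
            features = list(x) + prev
            prev = [b + sum(w * f for w, f in zip(row, features))
                    for row, b in zip(p["w"], p["b"])]
            outputs.extend(prev)
        return outputs

    return order, run


# Hand-written parameters; "tail" is listed first to show that the
# ordering comes from the links, not from the storage order.
blob = json.dumps({
    "networks": {
        "tail": {"params": {"w": [[0.0, 0.0, 2.0]], "b": [1.0]}},
        "head": {"params": {"w": [[1.0, 1.0]], "b": [0.0]}},
    },
    "links": [["head", "tail"]],
})

order, run = instantiate_chain(blob)
result = run([1.0, 2.0])  # head: 1 + 2 = 3.0; tail: 2 * 3.0 + 1 = 7.0
```

Keeping the interconnection data separate from the per-network parameters means each stage's configuration can be produced and stored independently as training proceeds, and the chain is only assembled once all stages exist.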
30 Claims
1. An apparatus comprising a processor and a storage to store instructions that, when executed by the processor, cause the processor to perform operations comprising:
train, using initial neural network configuration data comprising neural network hyperparameters, a first neural network of a chain of neural networks to generate first neural network configuration data comprising the hyperparameters and first trained parameters learned by the first neural network, wherein:
the chain is to perform an analytical function to generate a set of output data values from a set of input data values;
the analytical function comprises the generation of a time series prediction that covers a selected full range of time;
the chain comprises a set of multiple neural networks that includes at least the first neural network and a last neural network;
the set of neural networks is ordered to form the chain starting with the first neural network at a head of the chain and ending with the last neural network at a tail of the chain;
each neural network in the chain comprises external inputs to receive the set of input data values;
each neural network in the chain comprises outputs at which the neural network outputs a portion of the set of output data values from the set of input data values during operation of the chain to perform the analytical function; and
the set of neural networks is interconnected within the chain such that each neural network in the chain, except the first neural network at the head of the chain, receives the outputs of a preceding neural network in the ordering of neural networks within the chain as additional inputs;
train, using the first neural network configuration data, a next neural network in the ordering of neural networks within the chain to generate a next neural network configuration data comprising the hyperparameters and next trained parameters learned by the next neural network;
use at least the first neural network configuration data, the next neural network configuration data, and additional data comprising an indication of interconnections among the neural networks within the chain to instantiate the chain of neural networks; and
operate the chain of neural networks to perform the analytical function.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A computer-program product tangibly embodied in a non-transitory machine-readable storage medium, the computer-program product including instructions operable to cause a processor to perform operations comprising:
train, using initial neural network configuration data comprising neural network hyperparameters, a first neural network of a chain of neural networks to generate first neural network configuration data comprising the hyperparameters and first trained parameters learned by the first neural network, wherein:
the chain is to perform an analytical function to generate a set of output data values from a set of input data values;
the analytical function comprises the generation of a time series prediction that covers a selected full range of time;
the chain comprises a set of multiple neural networks that includes at least the first neural network and a last neural network;
the set of neural networks is ordered to form the chain starting with the first neural network at a head of the chain and ending with the last neural network at a tail of the chain;
each neural network in the chain comprises external inputs to receive the set of input data values;
each neural network in the chain comprises outputs at which the neural network outputs a portion of the set of output data values from the set of input data values during operation of the chain to perform the analytical function; and
the set of neural networks is interconnected within the chain such that each neural network in the chain, except the first neural network at the head of the chain, receives the outputs of a preceding neural network in the ordering of neural networks within the chain as additional inputs;
train, using the first neural network configuration data, a next neural network in the ordering of neural networks within the chain to generate a next neural network configuration data comprising the hyperparameters and next trained parameters learned by the next neural network;
use at least the first neural network configuration data, the next neural network configuration data, and additional data comprising an indication of interconnections among the neural networks within the chain to instantiate the chain of neural networks; and
operate the chain of neural networks to perform the analytical function.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
21. A computer-implemented method comprising:
training, by a processor, and using initial neural network configuration data comprising neural network hyperparameters, a first neural network of a chain of neural networks to generate first neural network configuration data comprising the hyperparameters and first trained parameters learned by the first neural network, wherein:
the chain is to perform an analytical function to generate a set of output data values from a set of input data values;
the analytical function comprises the generation of a time series prediction that covers a selected full range of time;
the chain comprises a set of multiple neural networks that includes at least the first neural network and a last neural network;
the set of neural networks is ordered to form the chain starting with the first neural network at a head of the chain and ending with the last neural network at a tail of the chain;
each neural network in the chain comprises external inputs to receive the set of input data values;
each neural network in the chain comprises outputs at which the neural network outputs a portion of the set of output data values from the set of input data values during operation of the chain to perform the analytical function; and
the set of neural networks is interconnected within the chain such that each neural network in the chain, except the first neural network at the head of the chain, receives the outputs of a preceding neural network in the ordering of neural networks within the chain as additional inputs;
training, by the processor, and using the first neural network configuration data, a next neural network in the ordering of neural networks within the chain to generate a next neural network configuration data comprising the hyperparameters and next trained parameters learned by the next neural network;
using at least the first neural network configuration data, the next neural network configuration data, and additional data comprising an indication of interconnections among the neural networks within the chain to instantiate the chain of neural networks; and
operating the chain of neural networks to perform the analytical function.
Dependent claims: 22, 23, 24, 25, 26, 27, 28, 29, 30
Specification