DISTRIBUTED MODEL TRAINING
Abstract
In one embodiment, a device determines that a machine learning model is to be trained by a plurality of devices in a network. A set of training devices are identified from among the plurality of devices to train the model, with each of the training devices having a local set of training data. An instruction is then sent to each of the training devices that is configured to cause a training device to receive model parameters from a first training device in the set, use the parameters with at least a portion of the local set of training data to generate new model parameters, and forward the new model parameters to a second training device in the set. Model parameters from the training devices are also received that have been trained using a global set of training data that includes the local sets of training data on the training devices.
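The flow described in the abstract — model parameters passed from device to device, each device refining them on local data that never leaves it — can be sketched as follows. This is an illustrative reconstruction, not the patented implementation: the linear model, gradient-descent update, learning rate, and the `local_update`/`ring_train` names are all assumptions made for the example.

```python
import numpy as np

def local_update(params, X, y, lr=0.1, epochs=5):
    """One training device's step: use the received parameters with the
    local set of training data to generate new parameters
    (here, least-squares gradient descent)."""
    w = params.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def ring_train(shards, n_features, rounds=10):
    """Coordinator view: circulate the parameters through every training
    device in the set; each device trains locally and forwards the new
    parameters to the next device."""
    params = np.zeros(n_features)
    for _ in range(rounds):
        for X, y in shards:              # "forward to a second training device"
            params = local_update(params, X, y)
    return params                        # finally received back by the coordinator

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
# Each device holds a private shard; together the shards form the
# global set of training data of the claims.
shards = []
for _ in range(4):
    X = rng.normal(size=(50, 2))
    shards.append((X, X @ true_w))

w = ring_train(shards, n_features=2)
```

Because every shard contributes updates each round, the returned parameters reflect the global training data even though no raw data is ever centralized.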
22 Claims
1. A method comprising:
determining, by a device, that a machine learning model is to be trained by a plurality of devices in a network;

identifying a set of training devices from among the plurality of devices in the network to train the machine learning model, wherein each of the training devices has a local set of training data;

sending an instruction to each of the training devices, wherein the instruction is configured to cause a training device to receive model parameters from a first training device in the set, use the received model parameters with at least a portion of the local set of training data to generate new model parameters, and forward the new model parameters to a second training device in the set; and

receiving model parameters from the training devices that have been trained using a global set of training data comprising the local sets of training data on the training devices.

Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10.
11. An apparatus, comprising:
one or more network interfaces to communicate in a computer network;

a processor coupled to the network interfaces and configured to execute one or more processes; and

a memory configured to store a process executable by the processor, the process when executed operable to:

determine that a machine learning model is to be trained by a plurality of devices in a network;

identify a set of training devices from among the plurality of devices in the network to train the machine learning model, wherein each of the training devices has a local set of training data;

send an instruction to each of the training devices, wherein the instruction is configured to cause a training device to receive model parameters from a first training device in the set, use the received model parameters with at least a portion of the local set of training data to generate new model parameters, and forward the new model parameters to a second training device in the set; and

receive model parameters from the training devices that have been trained using a global set of training data comprising the local sets of training data on the training devices.

Dependent claims: 12, 13, 14, 15, 16.
17. A method comprising:
receiving, from a first training device, model parameters for a machine learning model;

using the received model parameters to generate new model parameters by training the machine learning model with a local set of training data; and

sending the new model parameters to a second training device configured to use the sent model parameters and a second set of training data on the second training device to train the machine learning model.

Dependent claims: 18, 19.
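Claim 17 is the per-device view of the same ring: receive parameters, train on the local data, send the result onward. A minimal object sketch under the same assumptions as before — the `TrainingDevice` class, its method names, and the in-process hand-off are hypothetical; a real deployment would receive and forward parameters over network interfaces:

```python
import numpy as np

class TrainingDevice:
    """One ring participant: receives parameters from a first device,
    trains on its local data, forwards new parameters to a second device."""
    def __init__(self, X, y, next_device=None):
        self.X, self.y = X, y            # local data never leaves the device
        self.next_device = next_device
        self.received = None

    def receive(self, params):
        self.received = np.asarray(params, dtype=float)

    def train_and_forward(self, lr=0.1, epochs=5):
        w = self.received.copy()
        for _ in range(epochs):          # least-squares gradient descent
            w -= lr * 2.0 * self.X.T @ (self.X @ w - self.y) / len(self.y)
        if self.next_device is not None:
            self.next_device.receive(w)  # "sending the new model parameters"
        return w

rng = np.random.default_rng(1)
true_w = np.array([1.0, 3.0])
def shard():
    X = rng.normal(size=(40, 2))
    return X, X @ true_w

d2 = TrainingDevice(*shard())
d1 = TrainingDevice(*shard(), next_device=d2)

d1.receive(np.zeros(2))
for _ in range(20):
    d1.train_and_forward()               # d1 trains, forwards to d2
    w = d2.train_and_forward()           # d2 trains; end of this pass
    d1.receive(w)                        # close the ring for the next pass
```

Chaining more devices only requires extending the `next_device` links; the last device hands its parameters back to the first (or to the coordinator of claim 1).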
20. An apparatus, comprising:
one or more network interfaces to communicate in a computer network;

a processor coupled to the network interfaces and adapted to execute one or more processes; and

a memory configured to store a process executable by the processor, the process when executed operable to:

receive, from a first training device, model parameters for a machine learning model;

use the received model parameters to generate new model parameters by training the machine learning model with a local set of training data; and

send the new model parameters to a second training device configured to use the sent model parameters and a second set of training data on the second training device to train the machine learning model.

Dependent claims: 21.
22. A tangible, non-transitory, computer-readable media having software encoded thereon, the software when executed by a processor operable to:
determine that a machine learning model is to be trained by a plurality of devices in a network;

identify a set of training devices from among the plurality of devices in the network to train the machine learning model, wherein each of the training devices has a local set of training data;

send an instruction to each of the training devices, wherein the instruction is configured to cause a training device to receive model parameters from a first training device in the set, use the received model parameters with at least a portion of the local set of training data to generate new model parameters, and forward the new model parameters to a second training device in the set; and

receive model parameters from the training devices that have been trained using a global set of training data comprising the local sets of training data on the training devices.
Specification