PARALLELIZING THE TRAINING OF CONVOLUTIONAL NEURAL NETWORKS
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a convolutional neural network (CNN). The system includes a plurality of workers, wherein each worker is configured to maintain a respective replica of each of the convolutional layers of the CNN and a respective disjoint partition of each of the fully-connected layers of the CNN, wherein each replica of a convolutional layer includes all of the nodes in the convolutional layer, and wherein each disjoint partition of a fully-connected layer includes a portion of the nodes of the fully-connected layer.
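The worker layout the abstract describes (a full replica of every convolutional layer on each worker, and a disjoint partition of every fully-connected layer) can be sketched as follows. This is a minimal NumPy sketch; the worker count, layer sizes, and the `Worker` class are invented for illustration and are not taken from the patent.

```python
import numpy as np

NUM_WORKERS = 4                                  # hypothetical worker count
CONV_SHAPES = [(16, 3, 3, 3), (32, 16, 3, 3)]    # (filters, channels, h, w)
FC_DIMS = [(512, 256), (256, 64)]                # (inputs, nodes) per FC layer

class Worker:
    """One worker: a full replica of every convolutional layer and a
    disjoint partition (a slice of the nodes) of every fully-connected layer."""
    def __init__(self, index, rng):
        self.index = index
        # Replica: all of the nodes of each convolutional layer.
        self.conv = [rng.standard_normal(shape) for shape in CONV_SHAPES]
        # Disjoint partition: only this worker's equal share of each
        # fully-connected layer's nodes (here, a slice of the output units).
        self.fc = [rng.standard_normal((n_in, n_out // NUM_WORKERS))
                   for n_in, n_out in FC_DIMS]

rng = np.random.default_rng(0)
workers = [Worker(i, rng) for i in range(NUM_WORKERS)]

# Every worker replicates the conv layers in full ...
assert all(w.conv[0].shape == (16, 3, 3, 3) for w in workers)
# ... but holds only 256 // 4 = 64 of the first FC layer's 256 nodes.
assert workers[0].fc[0].shape == (512, 64)
```

Because the fully-connected partitions are disjoint and equal, the union of all workers' slices covers each fully-connected layer exactly once.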
20 Claims
1. A system for training a convolutional neural network on a plurality of batches of training examples, the convolutional neural network having a plurality of layers arranged in a sequence from lowest to highest, the sequence including one or more convolutional layers followed by one or more fully-connected layers, each convolutional layer and each fully-connected layer comprising a respective plurality of nodes, the system comprising:
a plurality of workers, wherein each worker is configured to maintain a respective replica of each of the convolutional layers and a respective disjoint partition of each of the fully-connected layers, wherein each replica of a convolutional layer includes all of the nodes in the convolutional layer, wherein each disjoint partition of a fully-connected layer includes a portion of the nodes of the fully-connected layer, and wherein each worker is configured to perform operations comprising:
receiving a batch of training examples assigned to the worker, wherein the batches of training examples are assigned such that each worker receives a respective batch of the plurality of batches;
training the convolutional layer replica maintained by the worker on the batch of training examples assigned to the worker; and
training the fully-connected layer partitions maintained by the worker on each of the plurality of batches of training examples.
(Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13)
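Claim 1 splits the work two ways: each worker trains its convolutional replica only on its own assigned batch (data parallelism), while each worker's fully-connected partition is trained on every batch (model parallelism). A rough NumPy sketch of one forward step, with the convolutional replica reduced to a single linear map plus ReLU and all sizes invented for illustration:

```python
import numpy as np

NUM_WORKERS = 4
BATCH, FEAT, FC_OUT = 8, 32, 16          # hypothetical sizes
rng = np.random.default_rng(0)

# One batch of training examples assigned to each worker.
batches = [rng.standard_normal((BATCH, FEAT)) for _ in range(NUM_WORKERS)]

# Stand-in for each worker's conv-layer replica (identical on all workers).
conv_w = rng.standard_normal((FEAT, FEAT))

# Each worker's disjoint FC partition: an equal slice of the layer's nodes.
share = FC_OUT // NUM_WORKERS
fc_parts = [rng.standard_normal((FEAT, share)) for _ in range(NUM_WORKERS)]

# Step 1: every worker runs its conv replica on *its own* batch only.
conv_acts = [np.maximum(b @ conv_w, 0.0) for b in batches]

# Step 2: the conv activations are pooled so each worker's FC partition
# processes *every* batch of training examples.
all_acts = np.concatenate(conv_acts, axis=0)      # (NUM_WORKERS*BATCH, FEAT)
fc_outs = [all_acts @ p for p in fc_parts]        # each worker: its node slice
full_fc_output = np.concatenate(fc_outs, axis=1)  # (NUM_WORKERS*BATCH, FC_OUT)

assert full_fc_output.shape == (NUM_WORKERS * BATCH, FC_OUT)
```

The final concatenation along the node axis reflects that the partitions are disjoint: stacking every worker's slice of output units reconstructs the full fully-connected layer's output for all batches.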
14. A method for training a convolutional neural network on a plurality of batches of training examples, the convolutional neural network having a plurality of layers arranged in a sequence from lowest to highest, the sequence including one or more convolutional layers followed by one or more fully-connected layers, each convolutional layer and each fully-connected layer comprising a respective plurality of nodes, the method comprising:
maintaining, by each of a plurality of workers, a respective replica of each of the convolutional layers, wherein each replica of a convolutional layer includes all of the nodes in the convolutional layer;
maintaining, by each of the workers, a respective disjoint partition of each of the fully-connected layers, wherein each disjoint partition of a fully-connected layer includes a portion of the nodes of the fully-connected layer;
receiving, by each of the workers, a batch of training examples assigned to the worker, wherein the batches of training examples are assigned such that each worker receives a respective batch of the plurality of batches;
training, by each of the workers, the convolutional layer replica maintained by the worker on the batch of training examples assigned to the worker; and
training, by each of the workers, the fully-connected layer partitions maintained by the worker on each of the plurality of batches of training examples.
(Dependent claims: 15, 16, 17, 18, 19)
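None of the claims shown here recites how the convolutional replicas stay identical across workers once each has been trained on a different batch. One common approach, assumed here for illustration rather than taken from the claims, is to average the per-worker gradients and apply the same update everywhere:

```python
import numpy as np

NUM_WORKERS = 4
FEAT = 32                                     # hypothetical conv-weight size
rng = np.random.default_rng(1)

conv_w = rng.standard_normal((FEAT, FEAT))    # identical starting replica
# Each worker derives a gradient from its own batch only, so the raw
# gradients differ from worker to worker.
grads = [rng.standard_normal((FEAT, FEAT)) for _ in range(NUM_WORKERS)]

# Assumed synchronization step (not recited in the claims above):
# averaging the gradients and applying the same update on every worker
# keeps all convolutional replicas identical after the step.
avg_grad = sum(grads) / NUM_WORKERS
lr = 0.01
replicas = [conv_w - lr * avg_grad for _ in range(NUM_WORKERS)]

assert all(np.allclose(r, replicas[0]) for r in replicas)
```

The fully-connected partitions need no such exchange of gradients between partitions: each worker owns its slice of nodes outright and updates it locally using every batch.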
20. One or more computer storage media encoded with instructions that, when executed by one or more computers, cause the one or more computers to perform operations for training a convolutional neural network on a plurality of batches of training examples, the convolutional neural network having a plurality of layers arranged in a sequence from lowest to highest, the sequence including one or more convolutional layers followed by one or more fully-connected layers, each convolutional layer and each fully-connected layer comprising a respective plurality of nodes, the operations comprising:
maintaining, by each of a plurality of workers, a respective replica of each of the convolutional layers, wherein each replica of a convolutional layer includes all of the nodes in the convolutional layer;
maintaining, by each of the workers, a respective disjoint partition of each of the fully-connected layers, wherein each disjoint partition of a fully-connected layer includes a portion of the nodes of the fully-connected layer;
receiving, by each of the workers, a batch of training examples assigned to the worker, wherein the batches of training examples are assigned such that each worker receives a respective batch of the plurality of batches;
training, by each of the workers, the convolutional layer replica maintained by the worker on the batch of training examples assigned to the worker; and
training, by each of the workers, the fully-connected layer partitions maintained by the worker on each of the plurality of batches of training examples.