Deep neural networks training for speech and pattern recognition
First Claim
1. A system comprising:
one or more processors; and
one or more computer storage media storing computer-executable instructions that are executable to cause the one or more processors to perform acts comprising:
providing a pipelined algorithm to train deep neural networks (DNNs) for performing data analysis based on training data, the DNNs being one of context-dependent DNNs or context-independent DNNs;
partitioning the training data into sample batches of a specific batch size based on rates of data transfers between the one or more processors for executing the pipelined algorithm and an execution speed of each of the one or more processors; and
pipelining an execution of the pipelined algorithm on the DNNs through the one or more processors to train the DNNs using the sample batches.
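The claim ties the batch size to inter-processor transfer rates and per-processor execution speed. A minimal sketch of one plausible selection rule, assuming hypothetical names and a made-up overhead heuristic (the patent does not give a formula): pick the smallest candidate batch size whose per-batch transfer cost is amortized below a fixed fraction of compute time, so small batches (which favor convergence) are preferred whenever the pipeline can afford them.

```python
def choose_batch_size(link_rate_bytes_s, link_latency_s, proc_rate_samples_s,
                      bytes_per_sample, candidates=(64, 128, 256, 512, 1024),
                      max_overhead=0.1):
    """Return the smallest candidate batch size whose transfer overhead
    (fixed link latency plus payload time) stays below `max_overhead`
    of the compute time for that batch on the slowest processor."""
    for b in sorted(candidates):
        compute_s = b / proc_rate_samples_s
        transfer_s = link_latency_s + b * bytes_per_sample / link_rate_bytes_s
        if transfer_s <= max_overhead * compute_s:
            return b
    # No candidate amortizes the link latency well enough; take the largest.
    return max(candidates)
```

With a 1 GB/s link, 1 ms latency, 10,000 samples/s of compute, and 4 KB activations per sample, batches of 64 or 128 leave the latency under-amortized, so the rule settles on 256.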
Abstract
The use of a pipelined algorithm that performs parallelized computations to train deep neural networks (DNNs) for performing data analysis may reduce training time. The DNNs may be one of context-independent DNNs or context-dependent DNNs. The training may include partitioning training data into sample batches of a specific batch size. The partitioning may be performed based on rates of data transfers between processors that execute the pipelined algorithm, considerations of accuracy and convergence, and the execution speed of each processor. Other techniques for training may include grouping layers of the DNNs for processing on a single processor, distributing a layer of the DNNs to multiple processors for processing, or modifying an execution order of steps in the pipelined algorithm.
68 Citations
20 Claims
1. A system comprising:
one or more processors; and
one or more computer storage media storing computer-executable instructions that are executable to cause the one or more processors to perform acts comprising:
providing a pipelined algorithm to train deep neural networks (DNNs) for performing data analysis based on training data, the DNNs being one of context-dependent DNNs or context-independent DNNs;
partitioning the training data into sample batches of a specific batch size based on rates of data transfers between the one or more processors for executing the pipelined algorithm and an execution speed of each of the one or more processors; and
pipelining an execution of the pipelined algorithm on the DNNs through the one or more processors to train the DNNs using the sample batches.
View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
10. A computer-implemented method, comprising:
providing a pipelined algorithm to train deep neural networks (DNNs) for performing data analysis based on training data, the DNNs being one of context-dependent DNNs or context-independent DNNs and including multiple layers;
determining that a ratio between a size of a top layer and a size of one or more of the multiple layers exceeds a predetermined threshold;
based at least in part on the determining, distributing the top layer of the DNNs across multiple processors through model striping for parallelized processing by the pipelined algorithm; and
pipelining an execution of the pipelined algorithm on the DNNs through the multiple processors to train the DNNs using sample batches of the training data.
View Dependent Claims (11, 12, 13, 14, 15)
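Model striping as claimed here splits an oversized top layer across processors. A minimal NumPy sketch, assuming the common column-wise split (the patent does not fix the split axis, and all function names are illustrative): each processor holds a stripe of the top-layer weight matrix and computes only its own slice of the output, and concatenating the partial outputs reproduces the full-layer result. The threshold test mirrors the claimed size-ratio check.

```python
import numpy as np

def should_stripe(top_size, hidden_size, threshold=4.0):
    # Claimed trigger: stripe when the top layer is disproportionately
    # large relative to the other layers (threshold value is illustrative).
    return top_size / hidden_size > threshold

def stripe_top_layer(weight, n_procs):
    # Split the top-layer weight matrix column-wise, one stripe per processor.
    return np.array_split(weight, n_procs, axis=1)

def striped_forward(hidden, stripes):
    # Each "processor" computes the output units for its own stripe;
    # concatenating the partial outputs equals the unstriped product.
    return np.concatenate([hidden @ w for w in stripes], axis=1)
```

In context-dependent speech DNNs the output (senone) layer can be several times wider than the hidden layers, which is exactly when the ratio test fires.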
16. A system, comprising:
a plurality of processors;
a memory that includes a plurality of computer-executable components that are executable by the plurality of processors, comprising:
a batch generation component that partitions training data into sample batches of a specific batch size; and
an algorithm execution component that pipelines an execution of a pipelined algorithm through the plurality of processors to train deep neural networks (DNNs) using the sample batches, the execution including executing a model update prior to an input data forward propagation in a computation iteration of the pipelined algorithm, the DNNs being one of context-dependent DNNs or context-independent DNNs,
wherein the algorithm execution component trains the DNNs based at least in part on performing gradient descent techniques,
wherein the DNNs include multiple layers, and
wherein the execution further includes streaming output data from a computation at a first processor of the plurality of processors that processes an upper layer to a second processor of the plurality of processors that processes a lower layer following a performance of an error back propagation of the computation iteration, the streaming of the output data occurring at least partially in parallel with one or more of the model update or the input data forward propagation.
View Dependent Claims (17, 18, 19, 20)
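Claim 16's modified execution order, sketched as an event trace (all names are hypothetical; this is an ordering illustration, not the patent's implementation): within each computation iteration a processor first applies the model update carried over from the previous iteration, then forward-propagates the new batch, then back-propagates the error, and finally streams output to the neighboring processor, where the streaming can overlap the next iteration's update and forward pass.

```python
def run_iteration(proc_id, pending_gradient, log):
    """One pipelined iteration on one processor, in the claimed order."""
    if pending_gradient is not None:
        log.append((proc_id, "model_update"))        # apply last iteration's gradient first
    log.append((proc_id, "forward_propagation"))     # then push the new batch forward
    log.append((proc_id, "error_back_propagation"))  # compute this batch's gradient
    log.append((proc_id, "stream_output"))           # stream results toward the neighbor,
                                                     # overlappable with the next update/forward
    return "gradient"  # held back and applied at the start of the next iteration

log, grad = [], None
for _ in range(2):
    grad = run_iteration(0, grad, log)
```

After two iterations the trace shows the key reordering: the second iteration's `model_update` precedes its `forward_propagation` instead of trailing the first iteration's backward pass.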
Specification