BATCH PROCESSING IN A NEURAL NETWORK PROCESSOR
First Claim
1. A method for generating a respective neural network output for each of a plurality of inputs, wherein the generating comprises processing each input through each of a plurality of neural network layers to generate the respective neural network output for the input, wherein the neural network layers are arranged in a directed graph structure, and wherein each neural network layer has a respective batch size, the method comprising, for each of the neural network layers:
receiving a plurality of inputs to be processed at the neural network layer;
forming one or more batches of inputs from the plurality of inputs, each batch having a number of inputs equal to the respective batch size for the neural network layer;
selecting a number of the one or more batches of inputs to process, where a count of the inputs in the number of the one or more batches is greater than, less than, or equal to the respective associated batch size of a subsequent layer in the directed graph structure; and
processing the number of the one or more batches of inputs to generate the respective neural network layer output.
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a respective neural network output for each of a plurality of inputs, the method comprising, for each of the neural network layers: receiving a plurality of inputs to be processed at the neural network layer; forming one or more batches of inputs from the plurality of inputs, each batch having a number of inputs up to the respective batch size for the neural network layer; selecting a number of the one or more batches of inputs to process, where a count of the inputs in the number of the one or more batches is greater than or equal to the respective associated batch size of a subsequent layer in the sequence; and processing the number of the one or more batches of inputs to generate the respective neural network layer output.
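The per-layer steps recited in the abstract (form batches, select enough batches to cover the next layer's batch size, then process them) can be illustrated with a minimal sketch. This is not the patented implementation. The function names `form_batches`, `select_batches`, and `run_layer` are hypothetical, and the layer is assumed to be a plain per-input Python callable. The sketch follows the abstract's variant, in which each batch holds up to the layer's batch size and the selected input count is greater than or equal to the subsequent layer's batch size whenever enough inputs are available.

```python
from typing import Callable, List, Sequence


def form_batches(inputs: Sequence[float], batch_size: int) -> List[List[float]]:
    """Split inputs into consecutive batches of at most batch_size items."""
    return [list(inputs[i:i + batch_size]) for i in range(0, len(inputs), batch_size)]


def select_batches(batches: List[List[float]], next_batch_size: int) -> List[List[float]]:
    """Pick the fewest leading batches whose input count reaches next_batch_size.

    If the batches hold fewer inputs than next_batch_size, all are selected.
    """
    selected: List[List[float]] = []
    count = 0
    for batch in batches:
        if count >= next_batch_size:
            break
        selected.append(batch)
        count += len(batch)
    return selected


def run_layer(
    layer_fn: Callable[[float], float],  # hypothetical stand-in for a layer
    inputs: Sequence[float],
    batch_size: int,
    next_batch_size: int,
) -> List[float]:
    """Form batches, select enough for the next layer, and process them."""
    batches = form_batches(inputs, batch_size)
    chosen = select_batches(batches, next_batch_size)
    outputs: List[float] = []
    for batch in chosen:
        outputs.extend(layer_fn(x) for x in batch)
    return outputs


if __name__ == "__main__":
    # Layer batch size 2, subsequent layer batch size 4: two batches are processed.
    print(run_layer(lambda x: x * 2, [1, 2, 3, 4, 5, 6, 7, 8], 2, 4))
```

In this sketch a layer with batch size 2 feeding a layer with batch size 4 processes two batches before handing off, so the downstream layer receives a full batch of its own size.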
42 Citations
27 Claims
1. A method for generating a respective neural network output for each of a plurality of inputs, wherein the generating comprises processing each input through each of a plurality of neural network layers to generate the respective neural network output for the input, wherein the neural network layers are arranged in a directed graph structure, and wherein each neural network layer has a respective batch size, the method comprising, for each of the neural network layers:
receiving a plurality of inputs to be processed at the neural network layer;
forming one or more batches of inputs from the plurality of inputs, each batch having a number of inputs equal to the respective batch size for the neural network layer;
selecting a number of the one or more batches of inputs to process, where a count of the inputs in the number of the one or more batches is greater than, less than, or equal to the respective associated batch size of a subsequent layer in the directed graph structure; and
processing the number of the one or more batches of inputs to generate the respective neural network layer output.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A system for generating a respective neural network output for each of a plurality of inputs, wherein the generating comprises processing each input through each of a plurality of neural network layers to generate the respective neural network output for the input, wherein the neural network layers are arranged in a directed graph structure, and wherein each neural network layer has a respective batch size, the system comprising:
one or more computers; and
computer-readable medium coupled to the one or more computers and having instructions stored thereon, which, when executed by the one or more computers, cause the one or more computers to, for each of the neural network layers, perform operations comprising:
receiving a plurality of inputs to be processed at the neural network layer;
forming one or more batches of inputs from the plurality of inputs, each batch having a number of inputs equal to the respective batch size for the neural network layer;
selecting a number of the one or more batches of inputs to process, where a count of the inputs in the number of the one or more batches is greater than, less than, or equal to the respective associated batch size of a subsequent layer in the directed graph structure; and
processing the number of the one or more batches of inputs to generate the respective neural network layer output.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. A computer-readable medium having instructions stored thereon, which, when executed by one or more computers, cause the one or more computers to perform operations for generating a respective neural network output for each of a plurality of inputs, wherein the generating comprises processing each input through each of a plurality of neural network layers to generate the respective neural network output for the input, wherein the neural network layers are arranged in a directed graph structure, and wherein each neural network layer has a respective batch size, the operations comprising, for each of the neural network layers:
receiving a plurality of inputs to be processed at the neural network layer;
forming one or more batches of inputs from the plurality of inputs, each batch having a number of inputs equal to the respective batch size for the neural network layer;
selecting a number of the one or more batches of inputs to process, where a count of the inputs in the number of the one or more batches is greater than, less than, or equal to the respective associated batch size of a subsequent layer in the directed graph structure; and
processing the number of the one or more batches of inputs to generate the respective neural network layer output.
Dependent claims: 20, 21, 22, 23, 24, 25, 26, 27
Specification