Batch processing in a neural network processor
First Claim
1. A method for performing neural network computations using hardware circuitry comprising a hardware matrix computation unit, the neural network computations being for a neural network having a plurality of neural network layers, the method comprising:
- obtaining, using the hardware circuitry, a plurality of layer inputs to be processed;
- based on (i) a size of a layer input to a particular neural network layer of the plurality of neural network layers and (ii) a weight reuse value representing a number of times that the hardware matrix computation unit of the hardware circuitry reuses weight inputs for neural network computations, determining, using the hardware circuitry, a batch size for the particular neural network layer, wherein the batch size represents a number of batches to be processed in parallel by the hardware matrix computation unit for the particular neural network layer; and
- processing, by the hardware matrix computation unit and for the particular neural network layer, one or more batches of layer inputs to generate one or more layer outputs, wherein each batch of the one or more batches includes a number of layer inputs corresponding to the batch size for the particular neural network layer.
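As a minimal sketch of the batch-size determination the claim recites: the function below assumes a simple cost model (each layer input occupies the matrix computation unit for a number of passes proportional to its size), and the function and parameter names are hypothetical, not language from the claims.

```python
import math

def determine_batch_size(layer_input_size: int, weight_reuse_value: int) -> int:
    """Derive a per-layer batch size from (i) the size of a layer input and
    (ii) how many times the matrix computation unit reuses loaded weights.

    The linear cost model below (one pass per unit of input size) is an
    assumption for illustration only.
    """
    passes_per_input = max(1, layer_input_size)  # assumed cost model
    return max(1, math.ceil(weight_reuse_value / passes_per_input))

# Hypothetical example: inputs of size 8 on hardware that reuses weights
# 1024 times yields a batch size of 128.
print(determine_batch_size(layer_input_size=8, weight_reuse_value=1024))  # 128
```

The intuition is that once weights are loaded into the matrix computation unit, keeping them resident for an entire batch amortizes the load across many inputs, so the batch size grows with the weight reuse value and shrinks as each input consumes more of that reuse budget.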
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a respective neural network output for each of a plurality of inputs, the method comprising, for each of the neural network layers: receiving a plurality of inputs to be processed at the neural network layer; forming one or more batches of inputs from the plurality of inputs, each batch having a number of inputs up to the respective batch size for the neural network layer; selecting a number of the one or more batches of inputs to process, where a count of the inputs in the number of the one or more batches is greater than or equal to the respective associated batch size of a subsequent layer in the sequence; and processing the number of the one or more batches of inputs to generate the respective neural network layer output.
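A hedged sketch of the per-layer loop the abstract describes follows; the helper names are hypothetical, and only the two batching rules (batches hold up to the layer's batch size; selected batches must cover at least the subsequent layer's batch size) come from the abstract.

```python
from typing import List, Sequence

def form_batches(inputs: Sequence, batch_size: int) -> List[list]:
    # Form one or more batches, each holding up to `batch_size` inputs.
    return [list(inputs[i:i + batch_size])
            for i in range(0, len(inputs), batch_size)]

def select_batches(batches: List[list], next_layer_batch_size: int) -> List[list]:
    # Select batches until their combined input count is greater than or
    # equal to the batch size of the subsequent layer in the sequence.
    selected, count = [], 0
    for batch in batches:
        selected.append(batch)
        count += len(batch)
        if count >= next_layer_batch_size:
            break
    return selected

# Hypothetical example: 10 inputs, layer batch size 4, next layer needs 8.
batches = form_batches(list(range(10)), batch_size=4)     # sizes 4, 4, 2
ready = select_batches(batches, next_layer_batch_size=8)  # first two batches
```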
27 Claims
1. A method for performing neural network computations using hardware circuitry comprising a hardware matrix computation unit, the neural network computations being for a neural network having a plurality of neural network layers, the method comprising:
- obtaining, using the hardware circuitry, a plurality of layer inputs to be processed;
- based on (i) a size of a layer input to a particular neural network layer of the plurality of neural network layers and (ii) a weight reuse value representing a number of times that the hardware matrix computation unit of the hardware circuitry reuses weight inputs for neural network computations, determining, using the hardware circuitry, a batch size for the particular neural network layer, wherein the batch size represents a number of batches to be processed in parallel by the hardware matrix computation unit for the particular neural network layer; and
- processing, by the hardware matrix computation unit and for the particular neural network layer, one or more batches of layer inputs to generate one or more layer outputs, wherein each batch of the one or more batches includes a number of layer inputs corresponding to the batch size for the particular neural network layer.

Dependent claims: 2–10.
11. A system for performing neural network computations using hardware circuitry comprising a hardware matrix computation unit, the neural network computations being for a neural network having a plurality of neural network layers, the system comprising:
- one or more processors; and
- a non-transitory computer-readable medium coupled to the one or more processors and having instructions stored thereon, which, when executed by the one or more processors, cause performance of operations comprising:
  - obtaining, using the hardware circuitry, a plurality of layer inputs to be processed;
  - based on (i) a size of a layer input to a particular neural network layer of the plurality of neural network layers and (ii) a weight reuse value representing a number of times that the hardware matrix computation unit of the hardware circuitry reuses weight inputs for neural network computations, determining, using the hardware circuitry, a batch size for the particular neural network layer, wherein the batch size represents a number of batches to be processed in parallel by the hardware matrix computation unit for the particular neural network layer; and
  - processing, by the hardware matrix computation unit and for the particular neural network layer, one or more batches of layer inputs to generate one or more layer outputs, wherein each batch of the one or more batches includes a number of layer inputs corresponding to the batch size for the particular neural network layer.

Dependent claims: 12–18.
19. A non-transitory computer-readable medium having instructions stored thereon, which, when executed by one or more processors, cause performance of operations comprising:
- obtaining, for a neural network having a plurality of neural network layers, a plurality of layer inputs to be processed, wherein the plurality of layer inputs are obtained using hardware circuitry comprising a hardware matrix computation unit;
- based on (i) a size of a layer input to a particular neural network layer of the plurality of neural network layers and (ii) a weight reuse value representing a number of times that the hardware matrix computation unit of the hardware circuitry reuses weight inputs for neural network computations, determining, using the hardware circuitry, a batch size for the particular neural network layer, wherein the batch size represents a number of batches to be processed in parallel by the hardware matrix computation unit for the particular neural network layer; and
- processing, by the hardware matrix computation unit and for the particular neural network layer, one or more batches of layer inputs to generate one or more layer outputs, wherein each batch of the one or more batches includes a number of layer inputs corresponding to the batch size for the particular neural network layer.

Dependent claims: 20–27.
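The processing step common to all three independent claims, running an entire batch through the matrix computation unit against a single resident set of weights, can be illustrated with NumPy standing in for the hardware; this is an assumption for exposition, not the patent's implementation.

```python
import numpy as np

def process_batch(weights: np.ndarray, batch_inputs: np.ndarray) -> np.ndarray:
    """Process a whole batch against one resident weight matrix.

    weights:      (input_size, output_size), loaded into the matrix unit once
    batch_inputs: (batch, input_size), each row is one layer input
    Returns:      (batch, output_size) layer outputs
    """
    # One weight load is amortized across every input in the batch,
    # which is the point of sizing batches to the weight reuse value.
    return batch_inputs @ weights
```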
Specification