Batch processing in a neural network processor

  • US 9,842,293 B2
  • Filed: 12/22/2016
  • Issued: 12/12/2017
  • Est. Priority Date: 05/21/2015
  • Status: Active Grant
First Claim

1. A method for generating a respective neural network output for each of a plurality of inputs, wherein the generating comprises processing each input through each of a plurality of neural network layers to generate the respective neural network output for the input, wherein the neural network layers are arranged in a directed graph structure, and wherein each neural network layer has a respective batch size, the method comprising, for each of the neural network layers:

  • receiving a plurality of inputs to be processed at the neural network layer;

  • forming one or more batches of inputs from the plurality of inputs, each batch having a number of inputs equal to the respective batch size for the neural network layer, where the respective batch size is based at least on a weight reuse value, the weight reuse value representing a number of times that weight inputs need to be reused for a compute time of output values using the weight inputs at a hardware matrix computation unit of a neural network hardware circuit to be longer than a load time of the weight inputs from memory;

  • selecting a number of the one or more batches of inputs to process, where a count of the inputs in the number of the one or more batches is greater than, less than, or equal to the respective associated batch size of a subsequent layer in the directed graph structure; and

  • processing, at the neural network hardware circuit and using the hardware matrix computation unit, the number of the one or more batches of inputs to generate the respective neural network layer output.
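The core idea in the claim is that a layer's batch size is derived from a weight reuse value: weights are expensive to load into the matrix unit, so enough inputs must be batched together that compute time on the batch exceeds the weight-load time. The sketch below is illustrative only, not the patented implementation; the function names, timing model, and numbers are assumptions chosen to make the relationship concrete.

```python
import math

def weight_reuse_value(load_time_us: float, compute_time_per_input_us: float) -> int:
    """Hypothetical model: the number of inputs that must reuse a loaded
    weight set so that total compute time exceeds the weight load time."""
    return math.ceil(load_time_us / compute_time_per_input_us)

def form_batches(inputs: list, batch_size: int) -> list:
    """Group a layer's pending inputs into batches of the layer's batch size."""
    return [inputs[i:i + batch_size] for i in range(0, len(inputs), batch_size)]

# Example: suppose loading a layer's weights takes 80 us and the matrix
# unit needs 10 us of compute per input. Then at least 8 inputs must be
# batched before compute time outweighs the load time.
batch_size = weight_reuse_value(80.0, 10.0)
batches = form_batches(list(range(20)), batch_size)
```

With these assumed timings, 20 pending inputs would be grouped into batches of 8, amortizing each weight load across multiple inputs before the next layer's weights are loaded.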
