Prefetching weights for use in a neural network processor
First Claim
1. A system for performing neural network computations for a neural network having a plurality of neural network layers, the system comprising:
- a matrix computation unit comprising M×N cells, wherein M and N are positive integers that are larger than one, and wherein each cell of the M×N cells includes respective circuitry configured to:
obtain a respective weight input for a neural network layer of the plurality of neural network layers;
obtain a respective activation input for the neural network layer;
determine whether to load the respective weight input in the respective cell or to provide the respective weight input to a next cell of the M×N cells;
in response to determining to load the respective weight input in the respective cell, determine a respective multiplication product based on the respective weight input and the respective activation input; and
in response to determining to provide the respective weight input to the next cell, provide the respective weight input to the next cell without determining a respective multiplication product based on the respective weight input and the respective activation input.
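The load-or-forward behavior recited for each cell can be sketched in Python as follows. This is a minimal illustration only, not the claimed circuitry: the `Cell` class, its `step` method, and the `load` flag are hypothetical names, and the claim does not state how a cell arrives at the load-or-forward decision, so the sketch takes that decision as an input.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Cell:
    """One cell of the M x N array (hypothetical model, not the claimed circuit)."""

    weight: Optional[float] = None  # weight register; None until a weight is loaded

    def step(
        self, weight_in: float, activation_in: float, load: bool
    ) -> Tuple[Optional[float], Optional[float]]:
        """Handle one weight input and return (product, weight_out).

        `load` stands in for the cell's load-or-forward decision. How the
        hardware reaches that decision is an assumption left open here.
        """
        if load:
            # Load branch: keep the weight in this cell and compute the product.
            self.weight = weight_in
            return self.weight * activation_in, None
        # Forward branch: hand the weight to the next cell, compute no product.
        return None, weight_in
```

In this sketch a call such as `Cell().step(3.0, 2.0, load=True)` yields a product and no outgoing weight, while the same call with `load=False` yields the unchanged weight for the next cell and no product, mirroring the two "in response to determining" branches of the claim.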
Abstract
A circuit for performing neural network computations for a neural network, the circuit comprising: a systolic array comprising a plurality of cells; a weight fetcher unit configured to, for each of the plurality of neural network layers: send, for the neural network layer, a plurality of weight inputs to cells along a first dimension of the systolic array; and a plurality of weight sequencer units, each weight sequencer unit coupled to a distinct cell along the first dimension of the systolic array, the plurality of weight sequencer units configured to, for each of the plurality of neural network layers: shift, for the neural network layer, the plurality of weight inputs to cells along the second dimension of the systolic array over a plurality of clock cycles and where each cell is configured to compute a product of an activation input and a respective weight input using multiplication circuitry.
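The weight movement described in the abstract can be sketched as a small simulation. Several things here are assumptions made for illustration: the first dimension is modeled as the top row of a grid, the second dimension as the downward direction, one row of weights enters per clock cycle, and one activation is paired with each row. The function names `shift_in_weights` and `cell_products` are hypothetical and do not come from the patent.

```python
from typing import List, Optional

Grid = List[List[Optional[float]]]


def shift_in_weights(weights: List[List[float]]) -> Grid:
    """Move an R x C weight matrix into an R x C grid, one row per clock cycle."""
    rows, cols = len(weights), len(weights[0])
    grid: Grid = [[None] * cols for _ in range(rows)]
    # Assumption: the last row is sent first so it ends up in the bottom row.
    for cycle in range(rows):
        incoming = weights[rows - 1 - cycle]
        # Shift every held weight one cell along the second dimension (down).
        for r in range(rows - 1, 0, -1):
            grid[r] = grid[r - 1]
        # Cells along the first dimension (top row) receive the new weights.
        grid[0] = list(incoming)
    return grid


def cell_products(grid: List[List[float]], activations: List[float]) -> List[List[float]]:
    """Each cell multiplies its weight by the activation assumed for its row."""
    return [[w * a for w in row] for row, a in zip(grid, activations)]
```

For a 2×2 weight matrix the sketch takes two clock cycles to place every weight, after which each cell holds the weight at its own grid position and can form its product. Accumulating the products along a column, as a full matrix multiply would require, is outside what the abstract states and is left out.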
20 Claims
1. A system for performing neural network computations for a neural network having a plurality of neural network layers, the system comprising:
- a matrix computation unit comprising M×N cells, wherein M and N are positive integers that are larger than one, and wherein each cell of the M×N cells includes respective circuitry configured to:
obtain a respective weight input for a neural network layer of the plurality of neural network layers;
obtain a respective activation input for the neural network layer;
determine whether to load the respective weight input in the respective cell or to provide the respective weight input to a next cell of the M×N cells;
in response to determining to load the respective weight input in the respective cell, determine a respective multiplication product based on the respective weight input and the respective activation input; and
in response to determining to provide the respective weight input to the next cell, provide the respective weight input to the next cell without determining a respective multiplication product based on the respective weight input and the respective activation input.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
16. A method for performing neural network computations using hardware circuitry comprising a matrix computation unit, the neural network computations being for a neural network having a plurality of neural network layers, the method comprising:
- for each cell of M×N cells of a matrix computation unit:
obtaining, using the hardware circuitry, a respective weight input for a neural network layer of the plurality of neural network layers;
obtaining, using the hardware circuitry, a respective activation input for the neural network layer;
determining, using the hardware circuitry, whether to load the respective weight input in the respective cell or to provide the respective weight input to a next cell of the M×N cells;
in response to determining to load the respective weight input in the respective cell, determining a respective multiplication product based on the respective weight input and the respective activation input; and
in response to determining to provide the respective weight input to the next cell, providing, using the hardware circuitry, the respective weight input to the next cell without determining a respective multiplication product based on the respective weight input and the respective activation input, wherein M and N are positive integers that are larger than one.
Dependent claims: 17, 18
19. A matrix computation unit for performing neural network computations for a neural network having a plurality of neural network layers, the matrix computation unit comprising:
- M×N cells, wherein M and N are positive integers that are larger than one, wherein each cell of the M×N cells includes respective circuitry configured to:
obtain a respective weight input for a neural network layer of the plurality of neural network layers;
obtain a respective activation input for the neural network layer;
determine whether to load the respective weight input in the respective cell or to provide the respective weight input to a next cell of the M×N cells;
in response to determining to load the respective weight input in the respective cell, determine a respective multiplication product based on the respective weight input and the respective activation input; and
in response to determining to provide the respective weight input to the next cell, provide the respective weight input to the next cell without determining a respective multiplication product based on the respective weight input and the respective activation input.
Dependent claims: 20
Specification