PREFETCHING WEIGHTS FOR USE IN A NEURAL NETWORK PROCESSOR
First Claim
1. A circuit for performing neural network computations for a neural network comprising a plurality of layers, the circuit comprising:
a systolic array comprising a plurality of cells;
a weight fetcher unit configured to, for each of the plurality of neural network layers:
send, for the neural network layer, a plurality of weight inputs to cells along a first dimension of the systolic array; and
a plurality of weight sequencer units, each weight sequencer unit coupled to a distinct cell along the first dimension of the systolic array, the plurality of weight sequencer units configured to, for each of the plurality of neural network layers:
shift, for the neural network layer, the plurality of weight inputs to cells along the second dimension of the systolic array over a plurality of clock cycles, where each weight input is stored inside a respective cell along the second dimension, and where each cell is configured to compute a product of an activation input and a respective weight input using multiplication circuitry.
Abstract
A circuit for performing neural network computations for a neural network, the circuit comprising: a systolic array comprising a plurality of cells; a weight fetcher unit configured to, for each of the plurality of neural network layers: send, for the neural network layer, a plurality of weight inputs to cells along a first dimension of the systolic array; and a plurality of weight sequencer units, each weight sequencer unit coupled to a distinct cell along the first dimension of the systolic array, the plurality of weight sequencer units configured to, for each of the plurality of neural network layers: shift, for the neural network layer, the plurality of weight inputs to cells along the second dimension of the systolic array over a plurality of clock cycles and where each cell is configured to compute a product of an activation input and a respective weight input using multiplication circuitry.
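The weight movement described above can be illustrated with a small software model. This is only a minimal sketch, not the patented circuit: it assumes the first dimension is the top row of the array and the second dimension runs down each column, and it assumes one row of weights enters per clock cycle. The names `shift_in_weights` and `cell_products` are hypothetical and do not appear in the source.

```python
from typing import List, Optional

Matrix = List[List[float]]


def shift_in_weights(layer_weights: Matrix) -> Matrix:
    """Model shifting one layer's weights into a rows x cols array.

    Assumption: on every clock cycle each cell passes its weight to the
    next cell along the second dimension (down its column), and a new row
    of weights enters the cells along the first dimension (the top row).
    """
    rows = len(layer_weights)
    cols = len(layer_weights[0])
    cells: List[List[Optional[float]]] = [[None] * cols for _ in range(rows)]

    for cycle in range(rows):
        # Shift existing weights one cell down, starting from the far end
        # so that no value is overwritten before it has moved.
        for r in range(rows - 1, 0, -1):
            cells[r] = list(cells[r - 1])
        # The last row is fed first because it has the furthest to travel.
        cells[0] = list(layer_weights[rows - 1 - cycle])

    # After `rows` cycles every cell holds (stores) its own weight.
    return cells


def cell_products(stored: Matrix, activations: List[float]) -> Matrix:
    """Each cell multiplies its stored weight by an activation input.

    Assumption: activation r is presented to every cell in row r.
    """
    return [[activations[r] * w for w in row] for r, row in enumerate(stored)]


if __name__ == "__main__":
    # Two hypothetical layers, processed one after the other.
    layers = [
        [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
        [[0.5, 0.5], [1.0, 1.0], [2.0, 2.0]],
    ]
    for weights in layers:
        stored = shift_in_weights(weights)
        print(cell_products(stored, [10.0, 20.0, 30.0]))
```

In this model a 3 x 2 array needs three clock cycles to load a layer, after which every cell holds the weight intended for its position. How the cell products are combined afterwards is outside what the abstract states, so the sketch stops at the per-cell multiplication.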
20 Claims
1. A circuit for performing neural network computations for a neural network comprising a plurality of layers, the circuit comprising:
a systolic array comprising a plurality of cells;
a weight fetcher unit configured to, for each of the plurality of neural network layers:
send, for the neural network layer, a plurality of weight inputs to cells along a first dimension of the systolic array; and
a plurality of weight sequencer units, each weight sequencer unit coupled to a distinct cell along the first dimension of the systolic array, the plurality of weight sequencer units configured to, for each of the plurality of neural network layers:
shift, for the neural network layer, the plurality of weight inputs to cells along the second dimension of the systolic array over a plurality of clock cycles, where each weight input is stored inside a respective cell along the second dimension, and where each cell is configured to compute a product of an activation input and a respective weight input using multiplication circuitry.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method for performing neural network computations for a neural network comprising a plurality of layers, the method comprising, for each of the plurality of neural network layers:
sending, at a weight fetcher unit, a plurality of weight inputs to cells along a first dimension of a systolic array comprising a plurality of cells;
shifting, at each of a plurality of weight sequencer units, each weight sequencer unit coupled to a distinct cell along the first dimension of the systolic array, the plurality of weight inputs to cells along the second dimension of the systolic array over a plurality of clock cycles, where each weight input is stored inside a respective cell along the second dimension, and where each cell is configured to compute a product of an activation input and a respective weight input using multiplication circuitry.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
Specification