Prefetching weights for use in a neural network processor

US 10,049,322 B2
Filed: 09/03/2015
Issued: 08/14/2018
Est. Priority Date: 05/21/2015
Status: Active Grant

Abstract

A circuit for performing neural network computations for a neural network, the circuit comprising: a systolic array comprising a plurality of cells; a weight fetcher unit configured to, for each of the plurality of neural network layers: send, for the neural network layer, a plurality of weight inputs to cells along a first dimension of the systolic array; and a plurality of weight sequencer units, each weight sequencer unit coupled to a distinct cell along the first dimension of the systolic array, the plurality of weight sequencer units configured to, for each of the plurality of neural network layers: shift, for the neural network layer, the plurality of weight inputs to cells along the second dimension of the systolic array over a plurality of clock cycles and where each cell is configured to compute a product of an activation input and a respective weight input using multiplication circuitry.
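
The weight-shifting scheme described in the abstract can be illustrated with a small software model. This is only a sketch and is not taken from the patent: it assumes the first dimension is the top row of the array, that weights move down one row per clock cycle, and that each row of cells shares one activation input. The names `shift_weights_in` and `cell_products` are hypothetical.

```python
from typing import List, Optional

Grid = List[List[Optional[int]]]


def shift_weights_in(weight_rows: List[List[int]], num_rows: int) -> Grid:
    """Shift rows of weights into a num_rows x N grid, one row per clock cycle.

    Assumption: the weight fetcher feeds the top row, and on every cycle
    each row hands its weights to the row below it.
    """
    num_cols = len(weight_rows[0])
    grid: Grid = [[None] * num_cols for _ in range(num_rows)]
    # The row destined for the bottom of the array must enter first.
    for incoming in reversed(weight_rows):
        # Move every row down by one, starting from the bottom.
        for r in range(num_rows - 1, 0, -1):
            grid[r] = grid[r - 1]
        grid[0] = list(incoming)
    return grid


def cell_products(grid: Grid, activations: List[int]) -> Grid:
    """Each cell multiplies its weight by the activation assumed for its row.

    Cells that hold no weight (None) produce no product.
    """
    return [
        [None if w is None else a * w for w in row]
        for a, row in zip(activations, grid)
    ]


grid = shift_weights_in([[1, 2], [3, 4], [5, 6]], num_rows=3)
products = cell_products(grid, [10, 20, 30])
```

After three simulated cycles every row of weights sits in its target row of the grid, so the weights are already in place before the products are formed.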

20 Claims

1. A system for performing neural network computations for a neural network having a plurality of neural network layers, the system comprising:
- a matrix computation unit comprising M×N cells, wherein M and N are positive integers that are larger than one, and wherein each cell of the M×N cells includes respective circuitry configured to:
  - obtain a respective weight input for a neural network layer of the plurality of neural network layers;
  - obtain a respective activation input for the neural network layer;
  - determine whether to load the respective weight input in the respective cell or to provide the respective weight input to a next cell of the M×N cells;
  - in response to determining to load the respective weight input in the respective cell, determine a respective multiplication product based on the respective weight input and the respective activation input; and
  - in response to determining to provide the respective weight input to the next cell, provide the respective weight input to the next cell without determining a respective multiplication product based on the respective weight input and the respective activation input.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
  - 2. The system of claim 1, wherein for M cells of the M×N cells, obtaining the respective activation input comprises obtaining the respective activation input from a respective value loader.
  - 3. The system of claim 1, wherein for N cells of the M×N cells, obtaining the respective weight input comprises obtaining the respective weight input from a weight fetcher interface.
  - 4. The system of claim 1, wherein for (M−1)×(N−1) cells of the M×N cells, obtaining the respective weight input comprises obtaining the respective weight input shifted from a respective first cell of the M×N cells, and obtaining the respective activation input comprises obtaining the respective activation input shifted from a respective second cell of the M×N cells that is different from the respective first cell.
  - 5. The system of claim 1, wherein determining whether to load the respective weight input in the respective cell or to provide the respective weight input to a next cell of the M×N cells comprises determining whether to load the respective weight input in the respective cell or to provide the respective weight input to a next cell of the M×N cells based on a control signal.
  - 6. The system of claim 5, wherein determining whether to load the respective weight input in the respective cell or to provide the respective weight input to a next cell of the M×N cells comprises:
    - determining that the control signal is equal to a predetermined value; and
    - in response to determining that the control signal is equal to the predetermined value, determining that the respective weight input is to be loaded in the respective cell.
  - 7. The system of claim 6, wherein the respective circuitry of each cell of the M×N cells comprises:
    - a respective weight control register configured to store the control signal; and
    - a respective weight register configured to load the respective weight input in response to determining that the control signal is equal to the predetermined value.
  - 8. The system of claim 5, wherein the matrix computation unit further comprises M weight sequencers, each weight sequencer of the M weight sequencers configured to provide a respective control signal to a corresponding cell of M cells of the M×N cells.
  - 9. The system of claim 8, wherein each weight sequencer includes decrement circuitry configured to decrement, at each clock cycle, a value of the respective control signal provided to the corresponding cell.
  - 10. The system of claim 8, wherein each weight sequencer of (M−1) weight sequencers of the M weight sequencers is configured to provide the respective control signal to a next weight sequencer of the M weight sequencers.
  - 11. The system of claim 1, wherein in response to determining to load the respective weight input in the respective cell, the respective circuitry is further configured to:
    - determine a respective accumulated value based at least on the respective multiplication product; and
    - provide the respective accumulated value for determining an output for the neural network layer.
  - 12. The system of claim 11, further comprising:
    - a first memory configured to provide activation inputs for the plurality of neural network layers; and
    - a second memory configured to provide weight inputs for the plurality of neural network layers.
  - 13. The system of claim 12, further comprising:
    - vector computation circuitry configured to:
      - determine a vector based on one or more accumulated values received from the matrix computation unit; and
      - provide the vector to the first memory.
  - 14. The system of claim 13, further comprising:
    - sequencer circuitry configured to provide one or more control signals to the first memory, the second memory, the vector computation circuitry, or the matrix computation unit to control a dataflow of the system.
  - 15. The system of claim 1, wherein the M×N cells form a systolic array.
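
The control-signal mechanism in claims 5 through 11 can be sketched in software. The model below is an illustration only, not the patented circuitry: it assumes the predetermined value is 0, that the control signal is decremented once per cell it passes, and that a chain of cells is visited one cell per clock cycle. The names `Cell`, `LOAD_VALUE`, and `send_weight` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

LOAD_VALUE = 0  # assumed "predetermined value"; the claims do not fix it


@dataclass
class Cell:
    """Hypothetical model of one cell's registers (compare claim 7)."""
    weight_control_register: Optional[int] = None
    weight_register: Optional[int] = None

    def load_or_pass(self, weight: int, control: int) -> bool:
        """Store the control signal; load the weight only if it matches."""
        self.weight_control_register = control
        if control == LOAD_VALUE:
            self.weight_register = weight
            return True   # the weight stays in this cell
        return False      # the weight goes on to the next cell

    def multiply_accumulate(self, activation: int, sum_in: int) -> int:
        """Add this cell's product to a running sum (compare claim 11)."""
        if self.weight_register is None:
            return sum_in  # no weight loaded, so no product is formed
        return sum_in + self.weight_register * activation


def send_weight(cells: List[Cell], weight: int, control: int) -> Optional[int]:
    """Pass a weight along a chain of cells, one cell per clock cycle.

    Returns the index of the cell that loaded it, or None if none did.
    """
    for index, cell in enumerate(cells):
        if cell.load_or_pass(weight, control):
            return index
        control -= 1  # stand-in for the sequencer's decrement circuitry
    return None


cells = [Cell() for _ in range(3)]
# Send the weight for the farthest cell first.
loaded_at = [send_weight(cells, w, c) for w, c in [(7, 2), (5, 1), (3, 0)]]

total = 0
for cell, activation in zip(cells, [1, 2, 3]):
    total = cell.multiply_accumulate(activation, total)
```

In this run `loaded_at` is `[2, 1, 0]`, the weight registers hold 3, 5, and 7, and `total` is 34. The initial control value acts as a hop count: a weight sent with control value k passes k cells before it is loaded.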

16. A method for performing neural network computations using hardware circuitry comprising a matrix computation unit, the neural network computations being for a neural network having a plurality of neural network layers, the method comprising:
- for each cell of M×N cells of a matrix computation unit:
  - obtaining, using the hardware circuitry, a respective weight input for a neural network layer of the plurality of neural network layers;
  - obtaining, using the hardware circuitry, a respective activation input for the neural network layer;
  - determining, using the hardware circuitry, whether to load the respective weight input in the respective cell or to provide the respective weight input to a next cell of the M×N cells;
  - in response to determining to load the respective weight input in the respective cell, determining a respective multiplication product based on the respective weight input and the respective activation input; and
  - in response to determining to provide the respective weight input to the next cell, providing, using the hardware circuitry, the respective weight input to the next cell without determining a respective multiplication product based on the respective weight input and the respective activation input, wherein M and N are positive integers that are larger than one.
- Dependent claims (17, 18)
  - 17. The method of claim 16, wherein determining whether to load the respective weight input in the respective cell or to provide the respective weight input to a next cell of the M×N cells comprises determining whether to load the respective weight input in the respective cell or to provide the respective weight input to a next cell of the M×N cells based on a control signal generated at the hardware circuitry.
  - 18. The method of claim 17, wherein determining whether to load the respective weight input in the respective cell or to provide the respective weight input to a next cell of the M×N cells comprises:
    - determining that the control signal generated at the hardware circuitry is equal to a predetermined value; and
    - in response to determining that the control signal is equal to the predetermined value, determining that the respective weight input is to be loaded in the respective cell.
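
The per-cell steps of the method claims can be read as a function with two mutually exclusive outcomes. The sketch below is a hedged illustration rather than the claimed hardware: it assumes the predetermined value is 0 and that each cell receives its own control signal. The names `StepResult`, `cell_step`, and `step_all` are hypothetical.

```python
from typing import List, NamedTuple, Optional

PREDETERMINED_VALUE = 0  # assumption; the claims leave the value open


class StepResult(NamedTuple):
    product: Optional[int]           # set only when the weight was loaded
    forwarded_weight: Optional[int]  # set only when the weight was passed on


def cell_step(weight: int, activation: int, control: int) -> StepResult:
    """Either load the weight and multiply, or forward it without multiplying."""
    if control == PREDETERMINED_VALUE:
        return StepResult(product=weight * activation, forwarded_weight=None)
    return StepResult(product=None, forwarded_weight=weight)


def step_all(
    weights: List[List[int]],
    activations: List[List[int]],
    controls: List[List[int]],
) -> List[List[StepResult]]:
    """Apply cell_step to every cell of an M x N grid of inputs."""
    return [
        [cell_step(w, a, c) for w, a, c in zip(w_row, a_row, c_row)]
        for w_row, a_row, c_row in zip(weights, activations, controls)
    ]


results = step_all(
    weights=[[3, 5], [7, 9]],
    activations=[[2, 2], [4, 4]],
    controls=[[0, 1], [1, 0]],
)
```

Returning both fields in one result makes it explicit that a forwarded weight never has a product attached to it.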

19. A matrix computation unit for performing neural network computations for a neural network having a plurality of neural network layers, the matrix computation unit comprising:
- M×N cells, wherein M and N are positive integers that are larger than one, wherein each cell of the M×N cells includes respective circuitry configured to:
  - obtain a respective weight input for a neural network layer of the plurality of neural network layers;
  - obtain a respective activation input for the neural network layer;
  - determine whether to load the respective weight input in the respective cell or to provide the respective weight input to a next cell of the M×N cells;
  - in response to determining to load the respective weight input in the respective cell, determine a respective multiplication product based on the respective weight input and the respective activation input; and
  - in response to determining to provide the respective weight input to the next cell, provide the respective weight input to the next cell without determining a respective multiplication product based on the respective weight input and the respective activation input.
- Dependent claims (20)
  - 20. The matrix computation unit of claim 19, wherein determining whether to load the respective weight input in the respective cell or to provide the respective weight input to a next cell of the M×N cells comprises determining whether to load the respective weight input in the respective cell or to provide the respective weight input to a next cell of the M×N cells based on a control signal.

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google LLC (Alphabet Inc.)
Inventors: Ross, Jonathan
Primary Examiner(s): Gonzales, Vincent
Application Number: US14/844,670
Publication Number: US 20160342892A1
Time in Patent Office: 1,076 Days
CPC Class Codes:
G06F 15/8046 Systolic arrays
G06N 3/063 using electronic means
