Depth concatenation using a matrix computation unit

US 9,691,019 B1
Filed: 03/07/2017
Issued: 06/27/2017
Est. Priority Date: 03/07/2017
Status: Active Grant


Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for depth concatenation using a matrix computation unit. One of the methods includes: receiving a request to process network inputs to a neural network using an integrated circuit, the neural network comprising a depth concatenation neural network layer; and generating instructions that, when executed by the integrated circuit, cause the integrated circuit to perform operations comprising: for each spatial location in a first input tensor to the depth concatenation layer and a second input tensor to the depth concatenation layer: multiplying, using the matrix computation unit, a second depth vector for the spatial location by a shift weight matrix for the depth concatenation layer to generate a shifted second depth vector; and adding the shifted second depth vector and a first input depth vector for the spatial location to generate a concatenated depth vector.
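
The multiply-then-add step in the abstract can be illustrated with a minimal pure-Python sketch. This is only an illustration under stated assumptions, not the patented implementation: the function names (`shift_matrix`, `vec_mat`, `depth_concat_vector`) are hypothetical, depth vectors are treated as row vectors zero-padded to length z₁+z₂, and the shift weight matrix is assumed to hold a one at row i, column z₁+i for each of the z₂ entries of the second depth vector.

```python
def shift_matrix(z1, z2):
    """Assumed layout: (z1+z2) x (z1+z2), ones at (i, z1 + i) for i < z2."""
    n = z1 + z2
    return [[1 if (i < z2 and j == i + z1) else 0 for j in range(n)]
            for i in range(n)]


def vec_mat(v, m):
    """Row vector times matrix, standing in for the matrix computation unit."""
    return [sum(v[i] * m[i][j] for i in range(len(v)))
            for j in range(len(m[0]))]


def depth_concat_vector(first, second):
    """Concatenate two depth vectors by a shift multiply followed by an add."""
    z1, z2 = len(first), len(second)
    padded_first = first + [0] * z2    # first z1 entries, then z2 zeroes
    padded_second = second + [0] * z1  # padded so it matches the matrix size
    shifted = vec_mat(padded_second, shift_matrix(z1, z2))
    return [a + b for a, b in zip(padded_first, shifted)]


print(depth_concat_vector([1, 2, 3], [4, 5]))
```

Because the shifted vector is zero in its first z₁ entries and the padded first vector is zero in its last z₂ entries, the element-wise sum places each input in its own slice of the output without overlap.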

20 Claims

1. A method comprising:
- receiving a request to process network inputs to a neural network using an integrated circuit that performs neural network computations in hardware using a matrix computation unit, the neural network comprising a depth concatenation neural network layer that specifies a concatenation of an input tensor having dimensions x₁ by y₁ by z₁ and an input tensor having dimensions x₁ by y₁ by z₂ along a depth dimension to generate an output tensor having dimensions x₁ by y₁ by (z₁+z₂); and
  
  generating instructions that, when executed by the integrated circuit, cause the integrated circuit to, during processing of a network input by the neural network, generate a layer output tensor that satisfies the specification of the depth concatenation neural network layer by performing operations comprising:
  
  for each spatial location in a first input tensor to the depth concatenation layer and a second input tensor to the depth concatenation layer:
  
  multiplying, using the matrix computation unit, a second depth vector for the spatial location in the second input tensor by a shift weight matrix for the depth concatenation layer to generate a shifted second depth vector that has zeroes as the first z₁ entries and entries of the second depth vector as the last z₂ entries; and
  
  adding the shifted second depth vector and a first input depth vector for the spatial location in the first input tensor to generate a concatenated depth vector, the first input depth vector having entries of the first input depth vector as the first z₁ entries of the first input depth vector and zeroes as the last z₂ entries of the first input depth vector.
  - 2. The method of claim 1, the operations further comprising:
    - moving the first input depth vector to a set of output sum-in registers of the matrix computation unit; and
      
      wherein adding the shifted second depth vector and the first input depth vector comprises:
      
      moving the shifted second depth vector into the set of output sum-in registers of the matrix computation unit while the first input depth vector is stored in the set of output sum-in registers of the matrix computation unit.
  - 3. The method of claim 2, wherein moving the first input depth vector comprises:
    - multiplying the first input depth vector by a modified identity weight matrix for the depth concatenation layer using the matrix computation unit.
  - 4. The method of claim 3, further comprising:
    - generating the modified identity weight matrix for the depth concatenation layer; and
      
      storing the modified identity weight matrix for the depth concatenation layer in a memory accessible to the special-purpose integrated circuit.
  - 5. The method of claim 1, further comprising:
    - generating the shift weight matrix for the depth concatenation layer; and
      
      storing the shift weight matrix for the depth concatenation layer in a memory accessible to the special-purpose integrated circuit.
  - 6. The method of claim 5, further comprising:
    - determining that the number of depth dimensions in the output tensor does not exceed a maximum vector length for the matrix computation unit; and
      
      generating the shift weight matrix for the depth concatenation in response to determining that the number of depth dimensions in the output tensor does not exceed the maximum vector length for the matrix computation unit.
  - 7. The method of claim 1, wherein the shift weight matrix for the depth concatenation layer is a (z₁+z₂) by (z₁+z₂) matrix having all entries be zero except for a diagonal row of ones starting at the first entry of the z₂-th column of the matrix.
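
Dependent claims 2, 3, and 6 can be read together as a small procedure: check the output depth against the unit's maximum vector length, place the first depth vector in the sum-in registers by multiplying it by a modified identity weight matrix, then move the shifted second depth vector into the same registers so the two accumulate. The sketch below is a software model under assumptions only. `MAX_VECTOR_LENGTH`, `SumInRegisters`, and the other names are hypothetical, the modified identity matrix is assumed to have ones in its first z₁ diagonal entries and zeroes elsewhere, and the claims here do not say what happens when the length check fails, so the sketch simply raises an error in that case.

```python
MAX_VECTOR_LENGTH = 8  # hypothetical limit; the claims do not give a value


def vec_mat(v, m):
    """Row vector times matrix, standing in for the matrix computation unit."""
    return [sum(v[i] * m[i][j] for i in range(len(v)))
            for j in range(len(m[0]))]


def modified_identity(z1, z2):
    """Assumed layout: ones on the first z1 diagonal entries, zeroes elsewhere."""
    n = z1 + z2
    return [[1 if (i == j and i < z1) else 0 for j in range(n)]
            for i in range(n)]


def maybe_shift_matrix(z1, z2):
    """Build the shift matrix only if the output depth fits the unit."""
    n = z1 + z2
    if n > MAX_VECTOR_LENGTH:
        return None
    return [[1 if (i < z2 and j == i + z1) else 0 for j in range(n)]
            for i in range(n)]


class SumInRegisters:
    """Toy model: moving a vector in adds it to what is already stored."""

    def __init__(self, n):
        self.values = [0] * n

    def move_in(self, vector):
        self.values = [a + b for a, b in zip(self.values, vector)]


def concat_with_registers(first, second):
    z1, z2 = len(first), len(second)
    shift = maybe_shift_matrix(z1, z2)
    if shift is None:
        raise ValueError("output depth exceeds the assumed maximum vector length")
    regs = SumInRegisters(z1 + z2)
    regs.move_in(vec_mat(first + [0] * z2, modified_identity(z1, z2)))
    regs.move_in(vec_mat(second + [0] * z1, shift))
    return regs.values


print(concat_with_registers([7, 8], [9]))
```

Modeling the addition as accumulation in the registers mirrors the wording of claim 2, where the add happens as a side effect of moving the second result in while the first is still stored.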

8. A system comprising one or more computers and one or more storage devices storing first instructions that when executed by the one or more computers cause the one or more computers to perform first operations comprising:
- receiving a request to process network inputs to a neural network using an integrated circuit that performs neural network computations in hardware using a matrix computation unit, the neural network comprising a depth concatenation neural network layer that specifies a concatenation of an input tensor having dimensions x₁ by y₁ by z₁ and an input tensor having dimensions x₁ by y₁ by z₂ along a depth dimension to generate an output tensor having dimensions x₁ by y₁ by (z₁+z₂); and
  
  generating second instructions that, when executed by the integrated circuit, cause the integrated circuit to, during processing of a network input by the neural network, generate a layer output tensor that satisfies the specification of the depth concatenation neural network layer by performing second operations comprising:
  
  for each spatial location in a first input tensor to the depth concatenation layer and a second input tensor to the depth concatenation layer:
  
  multiplying, using the matrix computation unit, a second depth vector for the spatial location in the second input tensor by a shift weight matrix for the depth concatenation layer to generate a shifted second depth vector that has zeroes as the first z₁ entries followed by entries of the second depth vector; and
  
  adding the shifted second depth vector and a first input depth vector for the spatial location in the first input tensor to generate a concatenated depth vector, the first input depth vector having entries of the first input depth vector as the first z₁ entries of the first input depth vector.
  - 9. The system of claim 8, the second operations further comprising:
    - moving the first input depth vector to a set of output sum-in registers of the matrix computation unit; and
      
      wherein adding the shifted second depth vector and the first input depth vector comprises:
      
      moving the shifted second depth vector into the set of output sum-in registers of the matrix computation unit while the first input depth vector is stored in the set of output sum-in registers of the matrix computation unit.
  - 10. The system of claim 9, wherein moving the first input depth vector comprises:
    - multiplying the first input depth vector by a modified identity weight matrix for the depth concatenation layer using the matrix computation unit.
  - 11. The system of claim 10, the first operations further comprising:
    - generating the modified identity weight matrix for the depth concatenation layer; and
      
      storing the modified identity weight matrix for the depth concatenation layer in a memory accessible to the special-purpose integrated circuit.
  - 12. The system of claim 8, the first operations further comprising:
    - generating the shift weight matrix for the depth concatenation layer; and
      
      storing the shift weight matrix for the depth concatenation layer in a memory accessible to the special-purpose integrated circuit.
  - 13. The system of claim 12, the first operations further comprising:
    - determining that the number of depth dimensions in the output tensor does not exceed a maximum vector length for the matrix computation unit; and
      
      generating the shift weight matrix for the depth concatenation in response to determining that the number of depth dimensions in the output tensor does not exceed the maximum vector length for the matrix computation unit.
  - 14. The system of claim 8, wherein the shift weight matrix for the depth concatenation layer is a matrix having all entries be zero except for a diagonal row of ones starting at the first entry of the z₂-th column of the matrix.

15. One or more non-transitory computer storage media encoded with first instructions that when executed by one or more computers cause the one or more computers to perform first operations comprising:
- receiving a request to process network inputs to a neural network using an integrated circuit that performs neural network computations in hardware using a matrix computation unit, the neural network comprising a depth concatenation neural network layer that specifies a concatenation of an input tensor having dimensions x₁ by y₁ by z₁ and an input tensor having dimensions x₁ by y₁ by z₂ along a depth dimension to generate an output tensor having dimensions x₁ by y₁ by (z₁+z₂); and
  
  generating second instructions that, when executed by the integrated circuit, cause the integrated circuit to, during processing of a network input by the neural network, generate a layer output tensor that satisfies the specification of the depth concatenation neural network layer by performing second operations comprising:
  
  for each spatial location in a first input tensor to the depth concatenation layer and a second input tensor to the depth concatenation layer:
  
  multiplying, using the matrix computation unit, a second depth vector for the spatial location in the second input tensor by a shift weight matrix for the depth concatenation layer to generate a shifted second depth vector that has zeroes as the first z₁ entries followed by entries of the second depth vector; and
  
  adding the shifted second depth vector and a first input depth vector for the spatial location in the first input tensor to generate a concatenated depth vector, the first input depth vector having entries of the first input depth vector as the first z₁ entries of the first input depth vector.
  - 16. The computer storage media of claim 15, the second operations further comprising:
    - moving the first input depth vector to a set of output sum-in registers of the matrix computation unit; and
      
      wherein adding the shifted second depth vector and the first input depth vector comprises:
      
      moving the shifted second depth vector into the set of output sum-in registers of the matrix computation unit while the first input depth vector is stored in the set of output sum-in registers of the matrix computation unit.
  - 17. The computer storage media of claim 16, wherein moving the first input depth vector comprises:
    - multiplying the first input depth vector by a modified identity weight matrix for the depth concatenation layer using the matrix computation unit.
  - 18. The computer storage media of claim 17, the first operations further comprising:
    - generating the modified identity weight matrix for the depth concatenation layer; and
      
      storing the modified identity weight matrix for the depth concatenation layer in a memory accessible to the special-purpose integrated circuit.
  - 19. The computer storage media of claim 15, the first operations further comprising:
    - generating the shift weight matrix for the depth concatenation layer; and
      
      storing the shift weight matrix for the depth concatenation layer in a memory accessible to the special-purpose integrated circuit.
  - 20. The computer storage media of claim 15, wherein the shift weight matrix for the depth concatenation layer is a matrix having all entries be zero except for a diagonal row of ones starting at the first entry of the z₂-th column of the matrix.
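
All three independent claims apply the per-location operation to every spatial location of the two input tensors, turning an x₁ by y₁ by z₁ tensor and an x₁ by y₁ by z₂ tensor into an x₁ by y₁ by (z₁+z₂) output. A rough sketch of that outer loop follows, using nested Python lists as tensors. The function name `depth_concat_tensor` and the placement of ones in the shift matrix (row i, column z₁+i) are assumptions made for illustration, not details taken from the claims.

```python
def depth_concat_tensor(a, b):
    """Depth-concatenate tensors a (x1 by y1 by z1) and b (x1 by y1 by z2)."""
    x1, y1 = len(a), len(a[0])
    if (len(b), len(b[0])) != (x1, y1):
        raise ValueError("inputs must share their spatial dimensions")
    z1, z2 = len(a[0][0]), len(b[0][0])
    n = z1 + z2
    # Assumed shift weight matrix, built once and reused at every location.
    shift = [[1 if (i < z2 and j == i + z1) else 0 for j in range(n)]
             for i in range(n)]
    out = []
    for x in range(x1):
        row = []
        for y in range(y1):
            second = b[x][y] + [0] * z1  # pad to the matrix size
            shifted = [sum(second[i] * shift[i][j] for i in range(n))
                       for j in range(n)]
            first = a[x][y] + [0] * z2   # zeroes as the last z2 entries
            row.append([p + q for p, q in zip(first, shifted)])
        out.append(row)
    return out


a = [[[1, 2], [3, 4]]]  # 1 by 2 by 2
b = [[[5], [6]]]        # 1 by 2 by 1
print(depth_concat_tensor(a, b))
```

The shift matrix depends only on z₁ and z₂, so in this sketch it is generated once per layer and shared across locations, which is consistent with the claims that generate and store it ahead of processing.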

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Young, Reginald Clifford; Gulland, William John
Primary Examiner(s): Holmes, Michael B
Application Number: US15/452,624
Time in Patent Office: 112 Days
Field of Search: 706/16
US Class Current
CPC Class Codes

G06F 17/16   Matrix or vector computatio...

G06F 9/46   Multiprogramming arrangements

G06N 3/063   using electronic means
