Low latency matrix multiply unit
First Claim
1. A matrix multiply unit configured to perform neural network computations of a neural network, the matrix multiply unit implemented as a systolic array of cells, the systolic array of cells arranged in a two-dimensional format, each cell of the array of cells comprising:
a weight matrix register configured to receive one of a first weight input of the neural network from a transposed weight shift register and a second weight input of the neural network from a non-transposed weight shift register;
the transposed weight shift register configured to receive the first weight input from a first direction of the two-dimensional format to be stored in the weight matrix register;
the non-transposed weight shift register configured to receive the second weight input from a second direction of the two-dimensional format to be stored in the weight matrix register, the second direction being perpendicular to the first direction; and
a multiply unit that is coupled to the weight matrix register and configured to multiply the received weight input with a vector data input of the neural network to obtain a multiplication result.
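The cell recited in claim 1 can be pictured with a small software model. The following is only a minimal sketch under stated assumptions: the class name `Cell`, its method names, the use of floats, and the explicit `latch` step that selects between the two shift registers are all hypothetical illustrations, not structures taken from the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cell:
    """Toy model of one systolic-array cell (all names are illustrative)."""

    # Holds a weight arriving along the first direction of the array.
    transposed_shift: Optional[float] = None
    # Holds a weight arriving along the perpendicular, second direction.
    non_transposed_shift: Optional[float] = None
    # The weight actually used by the multiply unit.
    weight_matrix_register: float = 0.0

    def shift_in_transposed(self, weight: float) -> Optional[float]:
        """Store a weight from the first direction; return the displaced value."""
        outgoing, self.transposed_shift = self.transposed_shift, weight
        return outgoing

    def shift_in_non_transposed(self, weight: float) -> Optional[float]:
        """Store a weight from the second direction; return the displaced value."""
        outgoing, self.non_transposed_shift = self.non_transposed_shift, weight
        return outgoing

    def latch(self, use_transposed: bool) -> None:
        """Copy one of the two shift registers into the weight matrix register."""
        source = self.transposed_shift if use_transposed else self.non_transposed_shift
        if source is None:
            raise ValueError("selected shift register holds no weight")
        self.weight_matrix_register = source

    def multiply(self, vector_input: float) -> float:
        """Multiply the stored weight with a vector data input."""
        return self.weight_matrix_register * vector_input


def demo() -> tuple:
    """Load both shift registers, then multiply using each source in turn."""
    cell = Cell()
    cell.shift_in_transposed(2.0)
    cell.shift_in_non_transposed(5.0)
    cell.latch(use_transposed=True)
    from_transposed = cell.multiply(3.0)
    cell.latch(use_transposed=False)
    from_non_transposed = cell.multiply(3.0)
    return from_transposed, from_non_transposed
```

In this sketch the weight matrix register receives exactly one of the two candidate weights at a time, which mirrors the "one of a first weight input ... and a second weight input" wording of the claim. How the selection is actually signalled in hardware is not stated in the claim and is left open here.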
Abstract
Methods, systems, and apparatus for a matrix multiply unit implemented as a systolic array of cells are disclosed. Each cell of the matrix multiply unit includes: a weight matrix register configured to receive a weight input from either a transposed or a non-transposed weight shift register; a transposed weight shift register configured to receive a weight input from a horizontal direction to be stored in the weight matrix register; a non-transposed weight shift register configured to receive a weight input from a vertical direction to be stored in the weight matrix register; and a multiply unit that is coupled to the weight matrix register and configured to multiply the weight input of the weight matrix register with a vector data input in order to obtain a multiplication result.
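One way to see why the two loading directions matter is to shift the same stream of weight rows into a small grid, once vertically and once horizontally. The sketch below is an illustration under assumptions, not the disclosed hardware: the function names, the square `n` by `n` grid, the shift order, and the column-wise summation of cell products are all hypothetical choices made for the example. The abstract itself only states that each cell multiplies its stored weight with a vector data input.

```python
def load_from_top(rows, n):
    """Shift weight rows in vertically: each step pushes existing rows down by one."""
    grid = [[0.0] * n for _ in range(n)]
    for row in reversed(rows):  # last row enters first so it ends at the bottom
        grid = [list(row)] + grid[:-1]
    return grid


def load_from_left(rows, n):
    """Shift the same rows in horizontally: each step pushes existing values right."""
    grid = [[0.0] * n for _ in range(n)]
    for row in reversed(rows):
        # Element r of the incoming row enters column 0 of grid row r.
        grid = [[row[r]] + grid[r][:-1] for r in range(n)]
    return grid


def cell_products(grid, vector):
    """Each cell multiplies its stored weight by the vector element for its row."""
    return [[w * vector[r] for w in grid_row] for r, grid_row in enumerate(grid)]


def column_sums(products):
    """Assumed accumulation step: add the cell products down each column."""
    return [sum(col) for col in zip(*products)]


weights = [[1.0, 2.0], [3.0, 4.0]]
vector = [10.0, 100.0]

vertical = load_from_top(weights, 2)     # [[1.0, 2.0], [3.0, 4.0]]
horizontal = load_from_left(weights, 2)  # [[1.0, 3.0], [2.0, 4.0]]

result_non_transposed = column_sums(cell_products(vertical, vector))
result_transposed = column_sums(cell_products(horizontal, vector))
```

Under these assumptions the vertically loaded grid holds the weights as given and the horizontally loaded grid holds their transpose, so the same vector input produces the product with the matrix in one case and with its transpose in the other, without reordering the weight stream beforehand.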
13 Claims
Claim 1 is set out in full under First Claim. Claims 2 through 13 depend on claim 1.
Specification