Low latency matrix multiply unit
First Claim
1. A matrix multiply unit configured to perform neural network computations of a neural network, the matrix multiply unit implemented as a systolic array of cells, the systolic array of cells arranged in a two-dimensional format, each cell of the array of cells comprising:
a weight matrix register configured to receive a weight input of the neural network from one or more weight storing registers;
the one or more weight storing registers, wherein the one or more weight storing registers are configured to receive weight inputs of the neural network to be stored in the weight matrix register from both a first direction of the two-dimensional format and a second direction of the two-dimensional format, the second direction being different from the first direction; and
a multiply unit that is coupled to the weight matrix register and configured to multiply the weight input of the weight matrix register with a vector data input of the neural network in order to obtain a multiplication result.
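The cell recited in claim 1 can be pictured with a small software model. The following is a minimal sketch under stated assumptions, not the patented hardware: the class name `Cell`, its field and method names, the use of exactly two weight storing registers (the claim says "one or more"), and the explicit `latch` step are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Toy model of one systolic-array cell (all names are hypothetical)."""

    shift_first: float = 0.0   # weight storing register fed from the first direction
    shift_second: float = 0.0  # weight storing register fed from the second direction
    weight: float = 0.0        # weight matrix register read by the multiply unit

    def shift_in(self, value: float, direction: str) -> float:
        """Store a weight arriving from one direction; return the value it displaces."""
        if direction == "first":
            displaced, self.shift_first = self.shift_first, value
        elif direction == "second":
            displaced, self.shift_second = self.shift_second, value
        else:
            raise ValueError("direction must be 'first' or 'second'")
        return displaced

    def latch(self, direction: str) -> None:
        """Copy the chosen weight storing register into the weight matrix register."""
        if direction == "first":
            self.weight = self.shift_first
        elif direction == "second":
            self.weight = self.shift_second
        else:
            raise ValueError("direction must be 'first' or 'second'")

    def multiply(self, vector_input: float) -> float:
        """Multiply unit: stored weight times the vector data input."""
        return self.weight * vector_input


cell = Cell()
cell.shift_in(3.0, "first")    # weight arriving from the first direction
cell.shift_in(5.0, "second")   # weight arriving from the second direction
cell.latch("second")           # select which stored weight is used
product = cell.multiply(2.0)   # 5.0 * 2.0
```

Keeping the storing registers separate from the weight matrix register in this model means a new weight can be shifted in while the multiply unit keeps using the previously latched one.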
2 Assignments
0 Petitions
Abstract
Methods, systems, and apparatus for a matrix multiply unit implemented as a systolic array of cells are disclosed. Each cell of the matrix multiply unit includes: a weight matrix register configured to receive a weight input from either a transposed or a non-transposed weight shift register; a transposed weight shift register configured to receive a weight input from a horizontal direction to be stored in the weight matrix register; a non-transposed weight shift register configured to receive a weight input from a vertical direction to be stored in the weight matrix register; and a multiply unit that is coupled to the weight matrix register and configured to multiply the weight input of the weight matrix register with a vector data input in order to obtain a multiplication result.
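To illustrate why the horizontally fed register is called "transposed", the sketch below shifts the same stream of weight vectors into a square grid from the top and from the left. It is a simplified software model, not the disclosed circuit: the function names `load_weights` and `matvec`, the assumption that one whole vector enters per step, and the shift order are all hypothetical.

```python
def load_weights(stream, n, direction):
    """Shift n-element weight vectors into an n x n grid of shift registers.

    "vertical":   each vector enters at the top row; earlier vectors move down.
    "horizontal": each vector enters at the left column; earlier vectors move right.
    """
    grid = [[0.0] * n for _ in range(n)]
    for vec in stream:
        if direction == "vertical":
            # New vector becomes row 0, the bottom row falls off the edge.
            grid = [list(vec)] + grid[:-1]
        elif direction == "horizontal":
            # New vector becomes column 0, the rightmost column falls off the edge.
            grid = [[vec[r]] + grid[r][:-1] for r in range(n)]
        else:
            raise ValueError("direction must be 'vertical' or 'horizontal'")
    return grid


def matvec(weights, x):
    """Each cell multiplies its weight by a vector input; products are summed per row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


stream = [[1.0, 2.0], [3.0, 4.0]]
non_transposed = load_weights(stream, 2, "vertical")    # [[3.0, 4.0], [1.0, 2.0]]
transposed = load_weights(stream, 2, "horizontal")      # [[3.0, 1.0], [4.0, 2.0]]

y_plain = matvec(non_transposed, [1.0, 1.0])            # [7.0, 3.0]
y_transposed = matvec(transposed, [1.0, 1.0])           # [4.0, 6.0]
```

In this model the two grids are transposes of each other, so choosing the shift direction at load time gives either the matrix or its transpose without rearranging the weight stream.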
10 Citations
12 Claims
Specification