Apparatus and methods for matrix multiplication
First Claim
1. An apparatus for matrix multiplication in a neural network, comprising:
a master computation module configured to receive, in response to an instruction, a first matrix, and transmit a row vector of the first matrix;
one or more slave computation modules respectively configured to store a column vector of a second matrix, receive the row vector of the first matrix, and multiply, in response to the instruction, the row vector of the first matrix with the stored column vector of the second matrix to generate a result element; and
an interconnection unit configured to combine the one or more result elements generated respectively by the one or more slave computation modules to generate a row vector of a result matrix, and transmit the row vector of the result matrix to the master computation module,
wherein each of the one or more slave computation modules includes:
a slave neuron caching unit configured to store the column vector of the second matrix,
one or more multipliers configured to respectively multiply one or more first elements in the row vector of the first matrix with one or more second elements in the stored column vector of the second matrix to generate one or more multiplication results;
an adder configured to add the one or more multiplication results to generate an intermediate value of the row vector of the result matrix; and
an accumulator configured to accumulate the one or more intermediate values to generate the result element.
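The slave computation module recited above can be illustrated with a minimal software sketch. This is not the patented hardware: the class name `SlaveModule`, the parameter `num_multipliers`, and the choice to process the vectors in groups of that size are assumptions made only for illustration.

```python
from typing import List, Sequence


class SlaveModule:
    """Hypothetical software model of one slave computation module."""

    def __init__(self, column: Sequence[float], num_multipliers: int = 4):
        if num_multipliers < 1:
            raise ValueError("num_multipliers must be at least 1")
        # Slave neuron caching unit: holds one column vector of the second matrix.
        self.column: List[float] = list(column)
        # Number of parallel multipliers (an assumed, illustrative parameter).
        self.num_multipliers = num_multipliers

    def compute(self, row: Sequence[float]) -> float:
        """Multiply a received row vector with the stored column vector."""
        if len(row) != len(self.column):
            raise ValueError("row and column lengths differ")
        accumulator = 0.0
        for start in range(0, len(row), self.num_multipliers):
            stop = start + self.num_multipliers
            # Multipliers: element-wise products for this group of elements.
            products = [a * b for a, b in zip(row[start:stop], self.column[start:stop])]
            # Adder: sum the products into one intermediate value.
            intermediate = sum(products)
            # Accumulator: fold the intermediate value into the running total.
            accumulator += intermediate
        # The accumulated total is the result element for this column.
        return accumulator
```

For example, `SlaveModule([1, 2, 3, 4, 5], num_multipliers=2).compute([1, 1, 1, 1, 1])` forms the intermediate values 3, 7, and 5 and accumulates them into one result element.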
1 Assignment
0 Petitions
Abstract
Aspects for matrix multiplication in a neural network are described herein. The aspects may include a master computation module configured to receive a first matrix and transmit a row vector of the first matrix. In addition, the aspects may include one or more slave computation modules respectively configured to store a column vector of a second matrix, receive the row vector of the first matrix, and multiply the row vector of the first matrix with the stored column vector of the second matrix to generate a result element. Further, the aspects may include an interconnection unit configured to combine the one or more result elements generated respectively by the one or more slave computation modules to generate a row vector of a result matrix and transmit the row vector of the result matrix to the master computation module.
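The data flow in the abstract can be sketched in software as follows. This is an illustrative model under stated assumptions, not the claimed apparatus: the function names `slave_multiply`, `interconnect_combine`, and `master_matmul` are hypothetical, one slave is assumed per column of the second matrix, and the slaves are simulated one after another, whereas hardware modules would presumably operate in parallel.

```python
from typing import Iterable, List, Sequence

Matrix = List[List[float]]


def slave_multiply(row: Sequence[float], column: Sequence[float]) -> float:
    """Slave module: multiply the received row with its stored column."""
    return sum(a * b for a, b in zip(row, column))


def interconnect_combine(result_elements: Iterable[float]) -> List[float]:
    """Interconnection unit: gather one result element per slave into a row."""
    return list(result_elements)


def master_matmul(first: Matrix, second: Matrix) -> Matrix:
    """Master module: send each row of `first` to the slaves, collect result rows."""
    if not first or not second:
        raise ValueError("matrices must be non-empty")
    if any(len(row) != len(second) for row in first):
        raise ValueError("inner dimensions do not match")
    # Assumption: each slave stores exactly one column of the second matrix.
    stored_columns = [list(col) for col in zip(*second)]
    result: Matrix = []
    for row in first:
        # The same row vector is transmitted to every slave.
        elements = [slave_multiply(row, col) for col in stored_columns]
        # The combined elements form one row vector of the result matrix.
        result.append(interconnect_combine(elements))
    return result
```

Calling `master_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])` sends two row vectors to two simulated slaves and collects two result rows, one per transmitted row.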
7 Citations
18 Claims
1. An apparatus for matrix multiplication in a neural network, comprising:
a master computation module configured to receive, in response to an instruction, a first matrix, and transmit a row vector of the first matrix;
one or more slave computation modules respectively configured to store a column vector of a second matrix, receive the row vector of the first matrix, and multiply, in response to the instruction, the row vector of the first matrix with the stored column vector of the second matrix to generate a result element; and
an interconnection unit configured to combine the one or more result elements generated respectively by the one or more slave computation modules to generate a row vector of a result matrix, and transmit the row vector of the result matrix to the master computation module,
wherein each of the one or more slave computation modules includes:
a slave neuron caching unit configured to store the column vector of the second matrix,
one or more multipliers configured to respectively multiply one or more first elements in the row vector of the first matrix with one or more second elements in the stored column vector of the second matrix to generate one or more multiplication results;
an adder configured to add the one or more multiplication results to generate an intermediate value of the row vector of the result matrix; and
an accumulator configured to accumulate the one or more intermediate values to generate the result element.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A method for matrix multiplication in a neural network, comprising:
receiving, in response to an instruction, by a master computation module, a first matrix;
transmitting, by the master computation module, a row vector of the first matrix to one or more slave computation modules;
storing, by the one or more slave computation modules, a column vector of a second matrix;
multiplying, in response to the instruction, by the one or more slave computation modules, the row vector of the first matrix with the stored column vector of the second matrix to generate a result element;
combining, by an interconnection unit, the one or more result elements generated respectively by the one or more slave computation modules to generate a row vector of a result matrix;
transmitting, by the interconnection unit, the row vector of the result matrix to the master computation module;
storing, by a slave neuron caching unit of each of the one or more slave computation modules, the column vector of the second matrix;
multiplying, by one or more multipliers of each of the one or more slave computation modules, one or more first elements in the row vector of the first matrix with one or more second elements in the stored column vector of the second matrix to generate one or more multiplication results;
adding, by an adder of each of the one or more slave computation modules, the one or more multiplication results to generate an intermediate value of the row vector of the result matrix; and
accumulating, by an accumulator of each of the one or more slave computation modules, the one or more intermediate values to generate the result element.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
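The multiplying, adding, and accumulating steps of the method can be written as a single formula. The notation is an assumption introduced here for illustration and does not appear in the claims: $A$ is the first matrix ($m \times n$), $B$ is the second matrix ($n \times p$), $C = AB$ is the result matrix, the slave module holding column $j$ of $B$ produces the result element $c_{ij}$ for the transmitted row $i$, and $L$ is the number of multipliers in that slave module.

```latex
c_{ij}
  = \sum_{k=1}^{n} a_{ik}\, b_{kj}
  = \sum_{g=0}^{\lceil n/L \rceil - 1}
    \underbrace{\sum_{k=gL+1}^{\min\left((g+1)L,\; n\right)} a_{ik}\, b_{kj}}_{\text{intermediate value } s_g}
```

Under this reading, each product $a_{ik} b_{kj}$ is a multiplication result, each inner sum $s_g$ is an intermediate value produced by the adder, and the outer sum is the accumulation that yields the result element. The interconnection unit would then assemble $(c_{i1}, \dots, c_{ip})$ into row $i$ of the result matrix.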
Specification