Processor and method for executing matrix multiplication operation on processor
Abstract
A processor and a method for executing a matrix multiplication operation on a processor. A specific implementation of the processor includes a data bus and an array processor having k processing units. The data bus is configured to sequentially read 1×n row vectors from an M×N multiplicand matrix and input them to each processing unit in the array processor, read an n×k submatrix from an N×K multiplier matrix and input each column vector of the submatrix to a corresponding processing unit in the array processor, and output the result obtained by each processing unit after executing a vector multiplication operation. Each processing unit in the array processor is configured to execute in parallel a vector multiplication operation on the input row and column vectors. Each processing unit includes a Wallace tree multiplier having n multipliers and n−1 adders. This implementation improves the processing efficiency of a matrix multiplication operation.
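The dataflow described in the abstract can be modeled in software as a tiled matrix multiplication. The following is a minimal sketch, not the patented hardware: the function names `dot` and `matmul_tiled` are hypothetical, the parallel processing units are simulated by an ordinary loop, N and K are assumed to be multiples of n and k, and the accumulation of partial results across tiles is an assumption that the abstract does not spell out.

```python
def dot(row, col):
    """Vector multiplication of a 1×n row vector and an n×1 column vector."""
    return sum(a * b for a, b in zip(row, col))


def matmul_tiled(A, B, n, k):
    """Multiply an M×N matrix A by an N×K matrix B using k simulated units.

    Assumptions (not from the source): N % n == 0, K % k == 0, and partial
    dot products from successive tiles are accumulated into the result.
    """
    M, N, K = len(A), len(B), len(B[0])
    assert N % n == 0 and K % k == 0, "sketch assumes exact tiling"
    C = [[0] * K for _ in range(M)]

    for k0 in range(0, K, k):          # pick k columns of the multiplier matrix
        for n0 in range(0, N, n):      # pick n rows of the multiplier matrix
            # n×k submatrix: column j is loaded into processing unit j.
            cols = [[B[n0 + i][k0 + j] for i in range(n)] for j in range(k)]
            for m in range(M):         # rows are read sequentially
                row = A[m][n0:n0 + n]  # the same 1×n row vector goes to every unit
                for j in range(k):     # in hardware these k units run in parallel
                    C[m][k0 + j] += dot(row, cols[j])
    return C
```

For example, `matmul_tiled(A, B, n=2, k=2)` on a 2×4 matrix `A` and a 4×4 matrix `B` gives the same result as an ordinary triple-loop product, while each submatrix of `B` is loaded once and reused for every row of `A`.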
12 Claims
1. A processor comprising:
- a data bus; and
- an array processor having k processing units;
- the data bus configured to sequentially read 1×n row vectors from an M×N multiplicand matrix and input the 1×n row vectors to each processing unit in the array processor, read an n×k submatrix from an N×K multiplier matrix and input each of n×1 column vectors of the n×k submatrix to a corresponding processing unit in the array processor, and output a result obtained by each processing unit after executing a vector multiplication operation; and
- the each processing unit in the array processor configured to execute in parallel the vector multiplication operation on the input 1×n row vectors and the input n×1 column vectors, and the each processing unit comprising a Wallace tree multiplier having n multipliers and n−1 adders, the Wallace tree multiplier in the each processing unit being configured to execute in parallel a multiplication operation and an addition operation in the vector multiplication operation,
wherein n, k, M, and N are integers greater than 1.
Dependent claims: 2, 3, 4, 5
6. A method for executing a matrix multiplication operation on a processor, the processor comprising an array processor having k processing units, the method comprising:
- reading 1×n row vectors in an M×N multiplicand matrix to each processing unit in the array processor, the each processing unit comprising a Wallace tree multiplier having n multipliers and n−1 adders;
- reading each of n×1 column vectors in an n×k submatrix in an N×K multiplier matrix to a corresponding processing unit in the array processor respectively;
- executing in parallel a vector multiplication operation on each of the n×1 column vectors and each of the 1×n row vectors by using the processing units, the Wallace tree multiplier in the each processing unit being configured to execute in parallel a multiplication operation and an addition operation in the vector multiplication operation; and
- outputting a result obtained by the each processing unit after executing the vector multiplication operation,
wherein n, k, M, and N are integers greater than 1.
Dependent claims: 7, 8, 9, 10, 11, 12
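The count of n multipliers and n−1 adders in each processing unit can be illustrated with a word-level reduction tree. This is only a sketch under stated assumptions: the function name `vector_multiply_tree` is hypothetical, the n products are formed first and then summed pairwise level by level, and the code does not model the bit-level carry-save structure of a hardware Wallace tree.

```python
def vector_multiply_tree(row, col):
    """Dot product of a 1×n row and an n×1 column via a pairwise adder tree.

    Returns (result, adders_used). Word-level model only; a real Wallace
    tree reduces partial-product bits with carry-save adders.
    """
    assert len(row) == len(col) and len(row) > 1
    # Stage 1: n independent multiplications (n multipliers, all in parallel).
    level = [a * b for a, b in zip(row, col)]
    adders_used = 0
    # Stage 2: pairwise additions; each tree level could run in parallel.
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            nxt.append(level[i] + level[i + 1])
            adders_used += 1
        if len(level) % 2:  # an unpaired value passes through to the next level
            nxt.append(level[-1])
        level = nxt
    return level[0], adders_used
```

Whatever the value of n, reducing n products to a single sum takes exactly n−1 two-input additions, which matches the adder count recited in the claims; the tree arrangement only shortens the depth to roughly log2(n) levels.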
Specification