Complex Matrix Multiplication Operations with Data Pre-Conditioning in a High Performance Computing Architecture
First Claim
1. A method, in a data processing system comprising a processor, for performing a complex matrix multiplication operation, comprising:
performing, by the processor, a vector load operation to load a first vector operand of the complex matrix multiplication operation to a first target vector register of the data processing system, the first vector operand comprising a real part of a first complex vector value and an imaginary part of the first complex vector value;
performing, by the processor, a complex load and splat operation to load a second complex vector value of a second vector operand and replicate the second complex vector value within a second target vector register of the data processing system, wherein the second complex vector value has a real part and an imaginary part;
performing, by the processor, a cross multiply add operation on elements of the first target vector register and elements of the second target vector register to generate a partial product of the complex matrix multiplication operation; and
accumulating, by the processor, the partial product of the complex matrix multiplication operation with other partial products of the complex matrix multiplication operation and storing a resulting accumulated partial product in a result vector register.
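The arithmetic behind the cross multiply add and accumulation steps can be written out as a short worked equation. The symbols here are illustrative assumptions, not notation taken from the claim: a = a_r + i a_i stands for one complex element of the first vector operand, b = b_r + i b_i for the splatted complex value of the second operand, and c_{mn} for one element of the accumulated result.

```latex
\begin{aligned}
a\,b   &= (a_r + i\,a_i)(b_r + i\,b_i) \\
       &= (a_r b_r - a_i b_i) + i\,(a_r b_i + a_i b_r), \\
c_{mn} &= \sum_{k} a_{mk}\, b_{kn}.
\end{aligned}
```

Each term of the sum is one partial product. The imaginary part is built from the "cross" terms a_r b_i and a_i b_r, which is presumably why both the real and the imaginary part of the second value need to be present in every slot pair of the second register.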
Abstract
Mechanisms for performing a complex matrix multiplication operation are provided. A vector load operation is performed to load a first vector operand of the complex matrix multiplication operation to a first target vector register. The first vector operand comprises a real and imaginary part of a first complex vector value. A complex load and splat operation is performed to load a second complex vector value of a second vector operand and replicate the second complex vector value within a second target vector register. The second complex vector value has a real and imaginary part. A cross multiply add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the complex matrix multiplication operation. The partial product is accumulated with other partial products and a resulting accumulated partial product is stored in a result vector register.
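The four operations described above can be emulated in plain Python to show how they combine into a full complex matrix product. This is a minimal sketch under stated assumptions, not the patented implementation: the register layout (interleaved real and imaginary parts), the column-at-a-time traversal, and the helper names `vector_load`, `complex_load_and_splat`, `cross_multiply_add`, and `complex_matmul` are all hypothetical, and Python lists stand in for vector registers.

```python
# Illustrative emulation only. Register width, element layout, and all
# function names are assumptions made for this sketch.

def vector_load(column):
    """Pack complex values into a register laid out as [re0, im0, re1, im1, ...]."""
    reg = []
    for z in column:
        reg.extend([z.real, z.imag])
    return reg


def complex_load_and_splat(z, width):
    """Replicate one complex value across a register with `width` real slots."""
    return [z.real, z.imag] * (width // 2)


def cross_multiply_add(acc, a_reg, b_reg):
    """Return acc + a_reg * b_reg, treating each (re, im) slot pair as one complex value."""
    out = list(acc)
    for i in range(0, len(a_reg), 2):
        a_re, a_im = a_reg[i], a_reg[i + 1]
        b_re, b_im = b_reg[i], b_reg[i + 1]
        out[i] += a_re * b_re - a_im * b_im      # real part
        out[i + 1] += a_re * b_im + a_im * b_re  # imaginary part (cross terms)
    return out


def complex_matmul(a, b):
    """Compute C = A * B for matrices given as lists of rows of complex numbers."""
    n_rows, n_inner, n_cols = len(a), len(b), len(b[0])
    width = 2 * n_rows  # one register holds a whole column of A in this sketch
    columns = []
    for j in range(n_cols):
        acc = [0.0] * width  # stands in for the result vector register
        for k in range(n_inner):
            a_reg = vector_load([a[i][k] for i in range(n_rows)])
            b_reg = complex_load_and_splat(b[k][j], width)
            acc = cross_multiply_add(acc, a_reg, b_reg)  # accumulate partial product
        columns.append([complex(acc[2 * i], acc[2 * i + 1]) for i in range(n_rows)])
    # The loop produced columns of C; transpose back to a list of rows.
    return [[columns[j][i] for j in range(n_cols)] for i in range(n_rows)]


A = [[1 + 2j, 3 - 1j], [0 + 1j, 2 + 0j]]
B = [[2 + 0j, 1 + 1j], [1 - 1j, 0 + 3j]]
C = complex_matmul(A, B)
```

In this sketch each pass of the inner loop adds one column of A, scaled by a single replicated element of B, into the accumulator. Real hardware would use fixed-width registers and would likely split the cross multiply add across more than one instruction; neither detail is modeled here.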
116 Citations
20 Claims
1. A method, in a data processing system comprising a processor, for performing a complex matrix multiplication operation, comprising:
performing, by the processor, a vector load operation to load a first vector operand of the complex matrix multiplication operation to a first target vector register of the data processing system, the first vector operand comprising a real part of a first complex vector value and an imaginary part of the first complex vector value;
performing, by the processor, a complex load and splat operation to load a second complex vector value of a second vector operand and replicate the second complex vector value within a second target vector register of the data processing system, wherein the second complex vector value has a real part and an imaginary part;
performing, by the processor, a cross multiply add operation on elements of the first target vector register and elements of the second target vector register to generate a partial product of the complex matrix multiplication operation; and
accumulating, by the processor, the partial product of the complex matrix multiplication operation with other partial products of the complex matrix multiplication operation and storing a resulting accumulated partial product in a result vector register.
Dependent Claims (2, 3, 4, 5, 6, 7, 8)
9. A computer program product comprising a computer readable storage medium having a computer readable program recorded thereon, wherein the computer readable program, when executed on a computing device, causes the computing device to:
perform a vector load operation to load a first vector operand of a complex matrix multiplication operation to a first target vector register, the first vector operand comprising a real part of a first complex vector value and an imaginary part of the first complex vector value;
perform a complex load and splat operation to load a second complex vector value of a second vector operand and replicate the second complex vector value within a second target vector register, wherein the second complex vector value has a real part and an imaginary part;
perform a cross multiply add operation on elements of the first target vector register and elements of the second target vector register to generate a partial product of the complex matrix multiplication operation; and
accumulate the partial product of the complex matrix multiplication operation with other partial products of the complex matrix multiplication operation and storing a resulting accumulated partial product in a result vector register.
Dependent Claims (10, 11, 12, 13, 14, 15)
16. An apparatus, comprising:
a processor; and
a memory coupled to the processor, wherein the memory comprises instructions which, when executed by the processor, cause the processor to:
perform a vector load operation to load a first vector operand of a complex matrix multiplication operation to a first target vector register, the first vector operand comprising a real part of a first complex vector value and an imaginary part of the first complex vector value;
perform a complex load and splat operation to load a second complex vector value of a second vector operand and replicate the second complex vector value within a second target vector register, wherein the second complex vector value has a real part and an imaginary part;
perform a cross multiply add operation on elements of the first target vector register and elements of the second target vector register to generate a partial product of the complex matrix multiplication operation; and
accumulate the partial product of the complex matrix multiplication operation with other partial products of the complex matrix multiplication operation and storing a resulting accumulated partial product in a result vector register.
Dependent Claims (17, 18, 19, 20)
Specification