Vector multiplication with accumulation in large register space
Abstract
An apparatus is described having an instruction execution pipeline that has a vector functional unit to support a vector multiply add instruction. The vector multiply add instruction multiplies respective K bit elements of two vectors and accumulates a portion of each of their respective products with another respective input operand in an X bit accumulator, where X is greater than K.
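As a rough illustration of the arithmetic described in the abstract, the sketch below models the operation lane by lane in plain Python. The widths K = 52 and X = 64, the function name `vec_mul_add_low`, and the choice of the low K bits as the accumulated portion are assumptions made for illustration only; the abstract fixes just that X is greater than K.

```python
K = 52  # assumed element width in bits (illustrative only)
X = 64  # assumed accumulator width in bits; must be greater than K


def vec_mul_add_low(a, b, acc):
    """Per lane: add the low K bits of a[i] * b[i] to acc[i], kept in X bits."""
    k_mask = (1 << K) - 1
    x_mask = (1 << X) - 1
    out = []
    for ai, bi, ci in zip(a, b, acc):
        product = (ai & k_mask) * (bi & k_mask)  # up to 2K bits wide
        portion = product & k_mask               # keep one K-bit portion
        out.append((ci + portion) & x_mask)      # the sum may need more than K bits
    return out


a = [(1 << K) - 1, 3]
b = [(1 << K) - 1, 5]
acc = [(1 << K) - 1, 10]
result = vec_mul_add_low(a, b, acc)
```

In lane 0 the low portion of the product is 1, so the sum is 2^52, which no longer fits in K bits but does fit in the wider accumulator. Under these assumed widths the 12 spare bits leave room for up to 4096 such additions, starting from a zero accumulator, before the accumulator itself could overflow.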
21 Claims
1. An apparatus comprising:
- a decoder to decode a single vector multiply add instruction into a decoded single vector multiply add instruction; and
- an instruction execution pipeline having a vector functional unit to execute the decoded single vector multiply add instruction to multiply respective K bit elements of two vectors and accumulate a portion of each of their respective products with another respective input operand in an X bit accumulator, wherein X is greater than K to store any carry, and the portion is a first portion when a field of the single vector multiply add instruction is a first value and the portion is a non-overlapping second portion when the field is a second value.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. An apparatus comprising:
- a decoder to decode a single vector multiply add instruction into a decoded single vector multiply add instruction; and
- an instruction execution pipeline having a vector functional unit to execute the decoded single vector multiply add instruction to multiply respective K bit elements of two vectors and accumulate a portion of each of their respective products with another respective input operand in an X bit accumulator, wherein X is greater than K, and the portion is a first portion when a field of the single vector multiply add instruction is a first value and the portion is a non-overlapping second portion when the field is a second value.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20, 21
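The independent claims recite a field that selects between a first portion and a non-overlapping second portion of each product. One way to picture this, offered only as a sketch, is a single-lane model in which a hypothetical `select` argument stands in for that field: 0 picks the low K bits of the product and 1 picks the next K bits. The widths, the encoding of `select`, and the function name `mul_add_portion` are assumptions and are not taken from the claims.

```python
K = 52  # assumed element width in bits (illustrative only)
X = 64  # assumed accumulator width in bits; must be greater than K


def mul_add_portion(a, b, acc, select):
    """Add one K-bit portion of a * b to acc, kept in X bits.

    select == 0 takes the low K bits of the product,
    select == 1 takes the next K bits (hypothetical encoding of the field).
    """
    k_mask = (1 << K) - 1
    product = (a & k_mask) * (b & k_mask)
    portion = (product >> (K * select)) & k_mask
    return (acc + portion) & ((1 << X) - 1)


a = 0xFFFFFFFFFFFFF  # a 52-bit operand
b = 0xABCDEF0123456  # another 52-bit operand
low = mul_add_portion(a, b, 0, 0)
high = mul_add_portion(a, b, 0, 1)

# Because the two portions do not overlap, they reassemble into the full product.
full = (high << K) | low
```

With a zero accumulator, two invocations that differ only in `select` recover the entire 2K-bit product, which suggests why non-overlapping portions are useful for multi-word multiplication.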
Specification