Multiply-sum dot product instruction with mask and splat
Abstract
An instruction, corresponding methods, and circuitry for efficiently performing partial dot sum products are provided. The instruction may include a source select field for specifying one or more source word elements to participate in the dot sum operation. The instruction may also include a target select field for specifying one or more (or none) target word elements for storing the result of the dot sum operation.
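The behaviour summarised in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit or any real instruction encoding: the function name `masked_dot_splat`, the four-element register width, and the convention that the most significant mask bit corresponds to word element 0 are all hypothetical choices made for illustration.

```python
from typing import List, Sequence

WORDS = 4  # assumed number of word elements per register (not stated in the source)


def masked_dot_splat(
    src_a: Sequence[float],
    src_b: Sequence[float],
    target: Sequence[float],
    src_mask: int,
    tgt_mask: int,
) -> List[float]:
    """Model a partial dot product whose result is written to selected target words.

    src_mask: bit i (counted from the most significant of WORDS bits) selects
              word element i of both sources for the multiply-sum. Assumed layout.
    tgt_mask: same bit layout; selected target words receive the sum, all
              other target words keep their previous contents.
    """
    if not (len(src_a) == len(src_b) == len(target) == WORDS):
        raise ValueError(f"all registers must hold {WORDS} word elements")

    def selected(mask: int, i: int) -> bool:
        return bool((mask >> (WORDS - 1 - i)) & 1)

    # Multiply only the word elements chosen by the source select field.
    total = sum(a * b for i, (a, b) in enumerate(zip(src_a, src_b)) if selected(src_mask, i))

    # "Splat" the sum into none, one, or several target word elements.
    return [total if selected(tgt_mask, i) else old for i, old in enumerate(target)]
```

With a source mask of `0b1110` only the first three products contribute, and a target mask of `0b0000` leaves the target register untouched, which corresponds to the "none" case for target word elements.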
21 Claims
1. A method of generating a dot product sum, comprising:
receiving an instruction specifying at least two source registers and a target register;
generating a dot product sum by multiplying word elements contained in each source register and summing the products of the multiplication, wherein the word elements that participate in the multiplication are specified by one or more bits in the instruction; and
storing the dot product sum in none, one, or more word elements contained in the target register.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A method of generating a dot product sum with accumulate, comprising:
receiving an instruction specifying at least two source registers and a target register;
generating a dot product sum by multiplying word elements contained in each source register and summing the products of the multiplication, wherein the word elements that participate in the multiplication are specified by one or more bits in the instruction;
adding the dot product sum to a value contained in an accumulate register to generate an accumulated sum; and
storing the accumulated sum in none, one, or more word elements contained in the target register.
Dependent claims: 9, 10, 11
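The accumulate variant recited in claim 8 can be sketched in software as follows. This is an illustrative model only: the name `masked_dot_accumulate`, the treatment of the accumulate register as a single scalar value, and the most-significant-bit-first mask layout are assumptions that the claim text does not fix.

```python
from typing import List, Sequence


def masked_dot_accumulate(
    src_a: Sequence[float],
    src_b: Sequence[float],
    acc: float,
    target: Sequence[float],
    src_mask: int,
    tgt_mask: int,
) -> List[float]:
    """Model the steps of claim 8: masked multiply-sum, accumulate, then store.

    acc is modelled as one scalar taken from an accumulate register (assumption).
    Mask bit i, counted from the most significant bit, maps to word element i
    (assumption).
    """
    n = len(target)
    if not (len(src_a) == len(src_b) == n):
        raise ValueError("source and target registers must have the same width")

    def selected(mask: int, i: int) -> bool:
        return bool((mask >> (n - 1 - i)) & 1)

    # Step 1: dot product sum over the word elements named by the instruction bits.
    dot = sum(src_a[i] * src_b[i] for i in range(n) if selected(src_mask, i))

    # Step 2: add the value held in the accumulate register.
    accumulated = dot + acc

    # Step 3: store the accumulated sum in none, one, or more target words.
    return [accumulated if selected(tgt_mask, i) else target[i] for i in range(n)]
```

Calling the function repeatedly and feeding one result word back in as `acc` would model a running sum across several partial dot products, though the source does not say how the accumulate register is chosen or updated.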
12. A circuit for executing a dot product sum instruction, comprising:
mask logic configured to select word elements from at least two source registers to participate in a calculation of a dot product sum based on one or more bits contained in the instruction;
multiply sum logic configured to perform the calculation of the dot product sum based on the word elements selected by the mask logic; and
target routing logic configured to store the dot product sum calculated by the multiply sum logic in none, one, or all word elements of a target register.
Dependent claims: 13, 14, 15, 16, 17
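The three circuit elements of claim 12 map naturally onto three functions chained in sequence. The sketch below is a behavioural model under assumptions, not a description of the actual hardware: all function names are hypothetical, the mask logic is assumed to work by forcing unselected word elements to zero, and the routing selector (`None`, an index, or `"all"`) is an invented encoding for the "none, one, or all" cases.

```python
from typing import List, Optional, Sequence, Union


def mask_logic(src: Sequence[int], mask: int) -> List[int]:
    """Pass selected word elements through and zero the rest (assumed mechanism).

    Mask bit i, counted from the most significant bit, maps to word element i.
    """
    n = len(src)
    return [x if (mask >> (n - 1 - i)) & 1 else 0 for i, x in enumerate(src)]


def multiply_sum_logic(a: Sequence[int], b: Sequence[int]) -> int:
    """Multiply corresponding word elements and sum the products."""
    return sum(x * y for x, y in zip(a, b))


def target_routing_logic(
    target: Sequence[int], value: int, select: Optional[Union[int, str]]
) -> List[int]:
    """Write value to no word (None), one word (an index), or every word ("all")."""
    if select is None:
        return list(target)
    if select == "all":
        return [value] * len(target)
    out = list(target)
    out[select] = value
    return out


def execute(
    src_a: Sequence[int],
    src_b: Sequence[int],
    target: Sequence[int],
    src_mask: int,
    select: Optional[Union[int, str]],
) -> List[int]:
    """Chain the three stages in the order the claim lists them."""
    a = mask_logic(src_a, src_mask)
    b = mask_logic(src_b, src_mask)
    return target_routing_logic(target, multiply_sum_logic(a, b), select)
```

Zeroing works cleanly for integers. For IEEE floating-point operands a masked-out infinity or NaN multiplied by zero would still yield NaN, so a floating-point model would need to skip the unselected products instead of zeroing them.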
18. A circuit for executing a dot product sum with accumulate instruction, comprising:
mask logic configured to select word elements from at least two source registers to participate in a calculation of a dot product sum based on one or more bits contained in the instruction;
multiply-sum-accumulate logic configured to perform the calculation of the dot product sum based on the word elements selected by the mask logic and add the dot product sum to the contents of an accumulate register to generate an accumulated sum; and
target routing logic configured to store the accumulated sum in none, one, or all word elements of a target register.
Dependent claims: 19, 20, 21
Specification