PROCESSOR AND METHOD FOR OUTER PRODUCT ACCUMULATE OPERATIONS
First Claim
1. A processor for multiplying each one of a plurality n of multiplier operands each having a bit width of b bits and having an aggregate width of r bits where r=n*b with each one of a plurality n of multiplicand operands each having a bit width of b bits and having an aggregate width of r bits where r=n*b, the processor comprising:
a register file having a bit width of r bits;
an array of multipliers arranged in rows and columns, each column coupled to receive one multiplier operand, each row coupled to receive one multiplicand operand, whereby each multiplier receives a multiplier operand and a multiplicand operand and multiplies them together to provide a plurality n² of multiplication results having an aggregate bit width greater than r bits;
an array of adders arranged in rows and columns, each adder being coupled to a corresponding multiplier;
an array of accumulators arranged in rows and columns, each accumulator being coupled to a corresponding adder; and
wherein the multiplication result from each multiplier is added to any previous multiplication result stored in the accumulator and provided to the corresponding accumulator and to thereby provide an accumulation result.
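The width relationships recited in the claim can be checked with a little arithmetic. The following is a minimal sketch, not taken from the patent: the function name `outer_product_widths` and the example values n = 8 and b = 16 are hypothetical, and it assumes each multiplier keeps the full 2b-bit product of two unsigned b-bit operands.

```python
def outer_product_widths(n: int, b: int) -> dict:
    """Bit-width bookkeeping for an n-by-n multiplier array with b-bit operands."""
    r = n * b                      # aggregate operand width, r = n*b (from the claim)
    num_products = n * n           # one product per multiplier in the array
    product_bits = 2 * b           # assumption: full-width unsigned product is kept
    aggregate_bits = num_products * product_bits  # equals 2 * n * r
    return {
        "r": r,
        "num_products": num_products,
        "product_bits": product_bits,
        "aggregate_bits": aggregate_bits,
    }


if __name__ == "__main__":
    widths = outer_product_widths(n=8, b=16)   # hypothetical example sizes
    print(widths)
    # The aggregate result width exceeds the register file width r.
    print(widths["aggregate_bits"] > widths["r"])
```

Under these assumptions the n² results occupy 2·n·r bits in total, which is why the results cannot be held in a single r-bit register and are instead kept in the accumulator array.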
Abstract
A processor and method are disclosed for performing outer product and outer product accumulation operations on vector operands requiring large numbers of multiplies and accumulations.
23 Claims
11. In a processor for multiplying each one of a plurality n of multiplier operands each having a bit width of b bits and having an aggregate width of r bits where r=n*b with each one of a plurality n of multiplicand operands each having a bit width of b bits and having an aggregate width of r bits where r=n*b, a tile of the processor comprising:
a multiplier coupled to receive one of a plurality n of multiplier operands and one of a plurality n of multiplicand operands and multiply them together to provide a multiplication result having a bit width greater than r bits;
an adder coupled to the multiplier; and
an accumulator coupled to the adder; and
wherein the multiplication result from the multiplier is provided to the adder and added to any previous multiplication result stored in the accumulator to thereby provide an accumulation result.
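The data path of a single tile (multiplier, then adder, then accumulator) can be modelled in a few lines. This is an illustrative sketch only: the class name `Tile` and the method name `step` are hypothetical, and Python integers are unbounded, so the hardware bit widths are not modelled.

```python
class Tile:
    """Hypothetical software model of one tile: multiplier, adder, accumulator."""

    def __init__(self) -> None:
        self.accumulator = 0  # holds any previous accumulation result

    def step(self, multiplier_operand: int, multiplicand_operand: int) -> int:
        # Multiplier: form the product of the two operands.
        product = multiplier_operand * multiplicand_operand
        # Adder: add the product to the value already in the accumulator.
        total = self.accumulator + product
        # Accumulator: store the new accumulation result.
        self.accumulator = total
        return total


if __name__ == "__main__":
    tile = Tile()
    print(tile.step(3, 4))   # 12
    print(tile.step(5, 6))   # 42, since 12 + 30 = 42
```

Each call to `step` corresponds to one multiply-accumulate operation of the tile; the accumulator value persists between calls.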
15. In a processor having an array of multipliers arranged in rows and columns, an array of accumulators arranged in rows and columns, and an array of adders arranged in rows and columns, each multiplier having an associated adder and an associated accumulator, the processor having a register file with a bit width of r bits, a method of performing an outer product of a vector multiplier operand and a vector multiplicand operand comprising:
loading a first multiplier operand and a first multiplicand operand into each of the multipliers, the multiplier at location [i, j] in the array receiving first multiplier operand i and first multiplicand operand j;
at each multiplier performing a multiplication of the first multiplier operand i and the first multiplicand operand j to produce a first multiplication result i*j which is wider than r bits; and
providing the first multiplication result to the associated adder;
adding the first multiplication result to any previous multiplication result to provide an accumulated multiplication result to the associated accumulator.
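The method steps can be sketched in software as follows. This is a sequential illustration under stated assumptions, not the claimed hardware: the function name `outer_product_accumulate` and the list-of-lists accumulator are hypothetical, the nested loops stand in for multipliers that would all operate at once, and bit widths are not modelled.

```python
def outer_product_accumulate(acc, multiplier_operands, multiplicand_operands):
    """Add the outer product of the two operand vectors into acc, in place."""
    for i, x in enumerate(multiplier_operands):
        for j, y in enumerate(multiplicand_operands):
            # Location [i, j] receives multiplier operand i and multiplicand
            # operand j, multiplies them, and adds the product to the value
            # previously held in its accumulator.
            acc[i][j] += x * y
    return acc


if __name__ == "__main__":
    n = 2
    acc = [[0] * n for _ in range(n)]          # accumulators start cleared
    outer_product_accumulate(acc, [1, 2], [3, 4])
    print(acc)                                  # [[3, 4], [6, 8]]
    outer_product_accumulate(acc, [1, 1], [1, 1])
    print(acc)                                  # [[4, 5], [7, 9]]
```

The second call shows the accumulation: each entry grows by the new product rather than being overwritten.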
Specification