RECONFIGURABLE MULTI-PRECISION INTEGER DOT-PRODUCT HARDWARE ACCELERATOR FOR MACHINE-LEARNING APPLICATIONS
First Claim
1. A configurable integrated circuit to compute vector dot products between a first N-bit vector and a second N-bit vector in a plurality of precision modes, the circuit comprising:
M slices, each slice to calculate the vector dot products between a corresponding segment of the first and the second N-bit vectors, each slice to output a plurality of intermediary multiplier results for a first precision mode;
a plurality of adder trees to sum up the plurality of intermediate multiplier results from the M slices, each of the plurality of adder trees to produce a respective adder out result; and
an accumulator to merge the adder out result from a first adder tree with the adder out result from a second adder tree to produce a vector dot product of the first and the second N-bit vector for a second precision mode, wherein the second precision mode is of higher precision than the first precision mode.
Abstract
A configurable integrated circuit computes vector dot products between a first N-bit vector and a second N-bit vector in a plurality of precision modes. An embodiment includes M slices, each of which calculates the vector dot products between a corresponding segment of the first and the second N-bit vectors. Each slice outputs intermediate multiplier results for the lower precision modes, but not for the highest precision mode. A plurality of adder trees sums the intermediate multiplier results, with each adder tree producing a respective adder out result. An accumulator merges the adder out result from a first adder tree with the adder out result from a second adder tree to produce the vector dot product of the first and the second N-bit vectors in the highest precision mode.
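One plausible reading of the merge step, offered as an illustration rather than as the patent's stated arithmetic: if each high-precision operand is split into a high half and a low half, the high-precision dot product can be rebuilt from sums of low-precision products. The symbols below are assumptions introduced for this sketch: k is the low-precision operand width, the operands are unsigned, and the superscripts H and L denote the high and low halves.

```latex
a_i = 2^{k} a_i^{H} + a_i^{L}, \qquad b_i = 2^{k} b_i^{H} + b_i^{L}

\sum_i a_i b_i
  = 2^{2k} \sum_i a_i^{H} b_i^{H}
  + 2^{k} \left( \sum_i a_i^{H} b_i^{L} + \sum_i a_i^{L} b_i^{H} \right)
  + \sum_i a_i^{L} b_i^{L}
```

Under these assumptions, each inner sum is the kind of quantity an adder tree could produce from low-precision multiplier outputs, and the powers of two correspond to the shifts applied when the adder tree outputs are merged.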
3 Citations
20 Claims
1. A configurable integrated circuit to compute vector dot products between a first N-bit vector and a second N-bit vector in a plurality of precision modes, the circuit comprising:
M slices, each slice to calculate the vector dot products between a corresponding segment of the first and the second N-bit vectors, each slice to output a plurality of intermediary multiplier results for a first precision mode;
a plurality of adder trees to sum up the plurality of intermediate multiplier results from the M slices, each of the plurality of adder trees to produce a respective adder out result; and
an accumulator to merge the adder out result from a first adder tree with the adder out result from a second adder tree to produce a vector dot product of the first and the second N-bit vector for a second precision mode, wherein the second precision mode is of higher precision than the first precision mode.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A configurable integrated circuit to compute vector dot products between a first N-bit vector and a second N-bit vector in a plurality of precision modes, the circuit comprising:
M slices, each slice to calculate a plurality of intermediary multiplier results between a corresponding segment of the first and the second N-bit vectors, each of the plurality of intermediary multiplier results corresponding to a quadrant usable to build a larger multiplier result;
a plurality of adder trees to sum up the plurality of intermediate multiplier results from the M slices based on the corresponding quadrants of each intermediary multiplier result to generate a plurality of multiplier sums; and
a multiply merge circuitry to merge the multiplier sums from all of the plurality of adder trees, including bit-shifting at least some of the multiplier sums, to produce a vector dot product of the first and the second N-bit vector in the highest precision mode.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
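The data flow recited in claim 10 (per-slice quadrant products, one adder tree per quadrant, then a shifted merge) can be modeled in software. The following Python is a minimal behavioral sketch under stated assumptions, not the claimed circuit: it assumes unsigned 8-bit elements built from 4-bit multiplies, a vector length divisible by the slice count, and hypothetical names (`HALF_BITS`, `slice_quadrants`, `adder_tree`, `dot_product_high_precision`) that do not appear in the patent.

```python
# Behavioral sketch only. Widths and names are assumptions, not from the claims.
HALF_BITS = 4                      # assumed low-precision operand width
MASK = (1 << HALF_BITS) - 1        # selects the low half of an 8-bit element
QUADRANTS = ("HH", "HL", "LH", "LL")


def slice_quadrants(a_seg, b_seg):
    """Model one slice: four low-precision products (quadrants) per element pair."""
    quads = {name: [] for name in QUADRANTS}
    for a, b in zip(a_seg, b_seg):
        a_hi, a_lo = a >> HALF_BITS, a & MASK
        b_hi, b_lo = b >> HALF_BITS, b & MASK
        quads["HH"].append(a_hi * b_hi)
        quads["HL"].append(a_hi * b_lo)
        quads["LH"].append(a_lo * b_hi)
        quads["LL"].append(a_lo * b_lo)
    return quads


def adder_tree(values):
    """Sum values by pairwise reduction, mimicking a tree of two-input adders."""
    values = list(values)
    if not values:
        return 0
    while len(values) > 1:
        if len(values) % 2:
            values.append(0)       # pad odd levels with a zero input
        values = [values[i] + values[i + 1] for i in range(0, len(values), 2)]
    return values[0]


def dot_product_high_precision(a_vec, b_vec, num_slices):
    """Merge per-quadrant adder tree sums, shifted by quadrant weight."""
    seg = len(a_vec) // num_slices  # assumes the length divides evenly
    per_quadrant = {name: [] for name in QUADRANTS}
    for s in range(num_slices):
        lo, hi = s * seg, (s + 1) * seg
        quads = slice_quadrants(a_vec[lo:hi], b_vec[lo:hi])
        for name in QUADRANTS:
            per_quadrant[name].extend(quads[name])
    sums = {name: adder_tree(vals) for name, vals in per_quadrant.items()}
    return (
        (sums["HH"] << (2 * HALF_BITS))
        + ((sums["HL"] + sums["LH"]) << HALF_BITS)
        + sums["LL"]
    )
```

For unsigned inputs, `dot_product_high_precision(a, b, num_slices)` should agree with `sum(x * y for x, y in zip(a, b))`. Signed operands would need sign handling in the high-half quadrants, which this sketch leaves out.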
Specification