Computation Engine with Strided Dot Product
First Claim
1. A system comprising:
a processor configured to issue a first instruction to a computation engine;
the computation engine coupled to the processor, wherein:
the computation engine comprises:
a first memory storing, during use, a first plurality of input vectors that include first vector elements, and
a second memory storing, during use, a second plurality of input vectors that include second vector elements; and
the computation engine is configured to compute a dot product of a subset of the first vector elements and each of the second vector elements, wherein respective elements of the subset of the first vector elements are separated in the first plurality of input vectors by a stride corresponding to the first instruction.
Abstract
In an embodiment, a computation engine may perform dot product computations on input vectors. The dot product operation may have a first operand and a second operand, and the dot product may be performed on a subset of the vector elements in the first operand and each of the vector elements in the second operand. The subset of vector elements may be separated in the first operand by a stride that skips one or more elements between each element to which the dot product operation is applied. More particularly, in an embodiment, the input operands of the dot product operation may be a first vector having second vectors as elements, and the stride may select a specified element of each second vector.
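The strided selection described in the abstract can be illustrated with a minimal Python sketch. It is not the patented hardware: the function name `strided_dot`, the flattened list layout of the first operand, and the `offset` parameter (standing in for the "specified element of each second vector") are all assumptions made for illustration.

```python
def strided_dot(first, second, stride, offset=0):
    """Dot product of a strided subset of `first` with every element of `second`.

    Hypothetical model: `first` is a flat list holding the inner vectors
    back to back, `stride` is the inner-vector length, and `offset` picks
    which element of each inner vector takes part in the dot product.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    # Take every `stride`-th element starting at `offset`, then keep only
    # as many elements as the second operand has.
    selected = first[offset::stride][:len(second)]
    if len(selected) < len(second):
        raise ValueError("first operand is too short for this stride")
    return sum(a * b for a, b in zip(selected, second))


# First operand: four inner vectors of three elements each, flattened.
first = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
second = [1, 1, 2, 2]

# stride=3, offset=1 selects element 1 of each inner vector: 2, 5, 8, 11.
result = strided_dot(first, second, stride=3, offset=1)
print(result)  # 2*1 + 5*1 + 8*2 + 11*2 = 45
```

With `stride=1` the function reduces to an ordinary dot product over the leading elements of `first`, which is one way to see the stride as a generalization rather than a separate operation.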
20 Claims
1. A system comprising:
a processor configured to issue a first instruction to a computation engine;
the computation engine coupled to the processor, wherein:
the computation engine comprises:
a first memory storing, during use, a first plurality of input vectors that include first vector elements, and
a second memory storing, during use, a second plurality of input vectors that include second vector elements; and
the computation engine is configured to compute a dot product of a subset of the first vector elements and each of the second vector elements, wherein respective elements of the subset of the first vector elements are separated in the first plurality of input vectors by a stride corresponding to the first instruction.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A circuit comprising:
a first input memory storing a first plurality of input vectors, during use;
a second input memory storing a second plurality of input vectors, during use; and
a compute circuit coupled to the first input memory and the second input memory, wherein the compute circuit is configured, responsive to a first instruction, to multiply selected vector elements of the first plurality of input vectors by the second plurality of input vectors, wherein the selected vector elements are separated in the first plurality of input vectors by a stride associated with the first instruction.
Dependent claims: 9, 10, 11, 12, 13, 14, 15, 16, 17
18. A system comprising:
a processor configured to issue a first instruction to a computation engine;
the computation engine coupled to the processor, wherein:
the computation engine comprises:
a first memory storing, during use, a first plurality of input vectors that include first vector elements,
a second memory storing, during use, a second plurality of input vectors that include second vector elements, and
a third memory storing, during use, a plurality of results; and
the computation engine further comprises a plurality of multiply accumulate (MAC) circuits, wherein the plurality of MAC circuits are configured to multiply selected first vector elements by second vector elements and to sum the multiplication results with the plurality of results, and the computation engine performs the multiplications and additions in response to the first instruction, and wherein the selected first vector elements are identified using a stride corresponding to the first instruction.
Dependent claims: 19, 20
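The multiply-accumulate behavior recited in claim 18 can be sketched in software as follows. This is only an illustrative model under stated assumptions: the name `mac_strided`, the `offset` parameter, the flat list layout, and the one-to-one pairing of selected first elements with second elements are hypothetical and are not taken from the claims, which do not fix how elements are paired across the MAC circuits.

```python
def mac_strided(results, first, second, stride, offset=0):
    """Accumulate products of strided `first` elements and `second` elements.

    Hypothetical model: `results` plays the role of the third memory, and
    each loop iteration stands in for one MAC circuit that multiplies a
    selected first element by a second element and adds the product to
    the result already stored at the same position.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    # Selected first-operand elements, identified using the stride.
    selected = first[offset::stride]
    if len(selected) < len(results) or len(second) < len(results):
        raise ValueError("operands are too short for the result memory")
    for i in range(len(results)):
        results[i] += selected[i] * second[i]  # multiply, then accumulate
    return results


results = [10, 20, 30]          # prior contents of the result memory
first = [1, 2, 3, 4, 5, 6]
second = [2, 2, 2]

# stride=2 selects 1, 3, 5 from `first`.
mac_strided(results, first, second, stride=2)
print(results)  # [12, 26, 40]
```

Because the products are summed into the existing results rather than overwriting them, issuing the same operation again continues accumulating, which is how a sequence of such instructions could build up a longer dot product.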
Specification