PROCESSORS, METHODS, AND SYSTEMS FOR A CONFIGURABLE SPATIAL ACCELERATOR WITH MEMORY SYSTEM PERFORMANCE, POWER REDUCTION, AND ATOMICS SUPPORT FEATURES
First Claim
1. A processor comprising:
a plurality of processing elements;
an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements is to perform an operation when an incoming operand set arrives at the plurality of processing elements; and
a streamer element to prefetch the incoming operand set from two or more levels of a memory system.
Abstract
Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements is to perform an operation when an incoming operand set arrives at the plurality of processing elements. The processor also includes a streamer element to prefetch the incoming operand set from two or more levels of a memory system.
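The firing rule and the streamer described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed hardware: the names `ProcessingElement`, `Streamer`, `try_fire`, `prefetch`, and `run` are hypothetical, the interconnect is modeled as plain FIFO queues, and the "two or more levels of a memory system" are modeled as two dictionaries (a small cache and a backing memory). The example graph computes (a + b) * c over streamed operands.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, Iterable, List


@dataclass
class ProcessingElement:
    """Hypothetical PE: one dataflow operator with queue-based inputs and outputs."""
    name: str
    op: Callable[..., int]
    inputs: List[Deque[int]]
    outputs: List[Deque[int]] = field(default_factory=list)

    def try_fire(self) -> bool:
        # Dataflow firing rule: execute only when every input queue holds an operand.
        if not all(self.inputs):
            return False
        operands = [q.popleft() for q in self.inputs]
        result = self.op(*operands)
        for q in self.outputs:
            q.append(result)
        return True


class Streamer:
    """Hypothetical streamer: fetches operands from the first memory level that holds them."""

    def __init__(self, cache: Dict[int, int], memory: Dict[int, int]) -> None:
        self.levels = [("cache", cache), ("memory", memory)]
        self.hits = {"cache": 0, "memory": 0}

    def prefetch(self, addresses: Iterable[int], dest: Deque[int]) -> None:
        # Push each operand into the destination queue ahead of its use.
        for addr in addresses:
            for level_name, store in self.levels:
                if addr in store:
                    self.hits[level_name] += 1
                    dest.append(store[addr])
                    break
            else:
                raise KeyError(f"address {addr} not found in any memory level")


def run(pes: List[ProcessingElement]) -> None:
    """Sweep the PEs repeatedly until no operator can fire."""
    progress = True
    while progress:
        progress = False
        for pe in pes:
            if pe.try_fire():
                progress = True


# Queues stand in for interconnect channels between operators.
a_q, b_q, c_q, sum_q, out_q = (deque() for _ in range(5))

# Overlay a two-node graph onto two PEs: out = (a + b) * c.
add = ProcessingElement("add", lambda x, y: x + y, [a_q, b_q], [sum_q])
mul = ProcessingElement("mul", lambda x, y: x * y, [sum_q, c_q], [out_q])

cache = {0: 1, 1: 2}                                # small, fast level
memory = {0: 1, 1: 2, 2: 10, 3: 20, 4: 3, 5: 4}     # larger backing level

streamer = Streamer(cache, memory)
streamer.prefetch([0, 1], a_q)   # a = 1, 2 (served from the cache)
streamer.prefetch([2, 3], b_q)   # b = 10, 20 (served from memory)
streamer.prefetch([4, 5], c_q)   # c = 3, 4 (served from memory)

run([mul, add])                  # list order does not matter; readiness drives execution
results = list(out_q)
print(results)
```

Listing `mul` before `add` is deliberate: in this model no program counter orders the operators, so `mul` simply waits until `add` has produced a sum.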
16 Claims
1. A processor comprising:
a plurality of processing elements;
an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements is to perform an operation when an incoming operand set arrives at the plurality of processing elements; and
a streamer element to prefetch the incoming operand set from two or more levels of a memory system.
Dependent claims: 2, 3, 4, 5
6-10. (canceled)
11. A method comprising:
receiving an input of a dataflow graph comprising a plurality of nodes;
overlaying the dataflow graph into a plurality of processing elements of the processor and an interconnect network between the plurality of processing elements of the processor with each node represented as a dataflow operator in the plurality of processing elements;
prefetching, by a streamer element, an incoming operand set from two or more levels of a memory system; and
performing an operation of the dataflow graph with the interconnect network and the plurality of processing elements when the incoming operand set arrives at the plurality of processing elements.
Dependent claims: 12, 13, 14, 15
16-20. (canceled)
Specification