Processors, methods, and systems with a configurable spatial accelerator
First Claim
1. An apparatus comprising:
- a first tile and a second tile, each comprising a plurality of processing elements and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements of the first tile and the second tile with each node represented as a dataflow operator in the interconnect network and the plurality of processing elements of the first tile or the second tile, and the plurality of processing elements of the first tile and the second tile are to perform an operation when an incoming operand set arrives at the plurality of processing elements of the first tile and the second tile;
a synchronizer circuit coupled between the interconnect network of the first tile and the interconnect network of the second tile and comprising storage to store data to be sent between the interconnect network of the first tile and the interconnect network of the second tile, the synchronizer circuit to convert the data from the storage between a first voltage or a first frequency of the first tile and a second voltage or a second frequency of the second tile to generate converted data, and send the converted data between the interconnect network of the first tile and the interconnect network of the second tile; and
one of;
a second synchronizer circuit coupled between the interconnect network of the first tile and the interconnect network of the second tile and comprising storage to store second data to be sent from the interconnect network of the second tile into the interconnect network of the first tile, the second synchronizer circuit to convert the second data from the storage from the second voltage or the second frequency of the second tile to the first voltage or the first frequency of the first tile to generate second converted data, and send the second converted data into the interconnect network of the first tile, wherein the synchronizer circuit is coupled between the interconnect network of the first tile and the interconnect network of the second tile and comprises storage to store data to be sent from the interconnect network of the first tile into the interconnect network of the second tile, the synchronizer circuit to convert the data from the storage from the first voltage or the first frequency of the first tile to the second voltage or the second frequency of the second tile to generate the converted data, and send the converted data into the interconnect network of the second tile,orwherein the synchronizer circuit is to send a backpressure signal from a downstream processing element of the second tile to a processing element of the first tile to stall execution of the processing element of the first tile, wherein the backpressure signal indicates that storage in the downstream processing element is not available for an output of the processing element.
1 Assignment
0 Petitions
Accused Products
Abstract
Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a synchronizer circuit coupled between an interconnect network of a first tile and an interconnect network of a second tile and comprising storage to store data to be sent between the interconnect network of the first tile and the interconnect network of the second tile, the synchronizer circuit to convert the data from the storage between a first voltage or a first frequency of the first tile and a second voltage or a second frequency of the second tile to generate converted data, and send the converted data between the interconnect network of the first tile and the interconnect network of the second tile
279 Citations
24 Claims
-
1. An apparatus comprising:
-
a first tile and a second tile, each comprising a plurality of processing elements and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements of the first tile and the second tile with each node represented as a dataflow operator in the interconnect network and the plurality of processing elements of the first tile or the second tile, and the plurality of processing elements of the first tile and the second tile are to perform an operation when an incoming operand set arrives at the plurality of processing elements of the first tile and the second tile; a synchronizer circuit coupled between the interconnect network of the first tile and the interconnect network of the second tile and comprising storage to store data to be sent between the interconnect network of the first tile and the interconnect network of the second tile, the synchronizer circuit to convert the data from the storage between a first voltage or a first frequency of the first tile and a second voltage or a second frequency of the second tile to generate converted data, and send the converted data between the interconnect network of the first tile and the interconnect network of the second tile; and one of; a second synchronizer circuit coupled between the interconnect network of the first tile and the interconnect network of the second tile and comprising storage to store second data to be sent from the interconnect network of the second tile into the interconnect network of the first tile, the second synchronizer circuit to convert the second data from the storage from the second voltage or the second frequency of the second tile to the first voltage or the first frequency of the first tile to generate second converted data, and send the second converted data into the interconnect network of the first tile, wherein the synchronizer circuit is coupled between the interconnect network of the first tile and the interconnect network of the second tile and comprises storage to store data to be sent from the interconnect network of the first tile into the interconnect network of the second tile, the synchronizer circuit to convert the data from the storage from the first voltage or the first frequency of the first tile to the second voltage or the second frequency of the second tile to generate the converted data, and send the converted data into the interconnect network of the second tile, or wherein the synchronizer circuit is to send a backpressure signal from a downstream processing element of the second tile to a processing element of the first tile to stall execution of the processing element of the first tile, wherein the backpressure signal indicates that storage in the downstream processing element is not available for an output of the processing element. - View Dependent Claims (2, 3, 4, 5, 6)
-
-
7. A method comprising:
-
providing a first tile and a second tile, each comprising a plurality of processing elements and an interconnect network between the plurality of processing elements, having a dataflow graph comprising a plurality of nodes overlaid into the first tile and the second tile, with each node represented as a dataflow operator in the interconnect network and the plurality of processing elements of the first tile or the second tile; storing data to be sent between the interconnect network of the first tile and the interconnect network of the second tile in storage with a synchronizer circuit coupled between the interconnect network of the first tile and the interconnect network of the second tile; converting the data from the storage between a first voltage or a first frequency of the first tile and a second voltage or a second frequency of the second tile to generate converted data with the synchronizer circuit; sending the converted data with the synchronizer circuit between the interconnect network of the first tile and the interconnect network of the second tile; and one of; providing a second synchronizer circuit coupled between the interconnect network of the first tile and the interconnect network of the second tile, storing second data to be sent from the interconnect network of the second tile into the interconnect network of the first tile in storage of the second synchronizer circuit, converting the second data from the storage from the second voltage or the second frequency of the second tile to the first voltage or the first frequency of the first tile to generate second converted data with the second synchronizer circuit, and sending the second converted data into the interconnect network of the first tile, wherein the synchronizer circuit is coupled between the interconnect network of the first tile and the interconnect network of the second tile and comprises storage to store data to be sent from the interconnect network of the first tile into the interconnect network of the second tile, the synchronizer circuit to convert the data from the storage from the first voltage or the first frequency of the first tile to the second voltage or the second frequency of the second tile to generate the converted data, and send the converted data into the interconnect network of the second tile, or sending, with the synchronizer circuit, a backpressure signal from a downstream processing element of the second tile to a processing element of the first tile to stall execution of the processing element of the first tile, the backpressure signal indicating that storage in the downstream processing element is not available for an output of the processing element. - View Dependent Claims (8, 9, 10, 11, 12)
-
-
13. An apparatus comprising:
-
a first data path network between a plurality of processing elements in a first tile; a second data path network between a plurality of processing elements in a second tile; a first flow control path network between the plurality of processing elements of the first tile; a second flow control path network between the plurality of processing elements of the second tile, the first data path network, the second data path network, the first flow control path network, and the second flow control path network are to receive an input of a dataflow graph comprising a plurality of nodes, the dataflow graph is to be overlaid into the first data path network, the second data path network, the first flow control path network, the second flow control path network, the plurality of processing elements of the first tile, and the plurality of processing elements of the second tile with each node represented as a dataflow operator in the plurality of processing elements of the first tile or the plurality of processing elements of the second tile to perform an operation by a respective, incoming operand set arriving at each of the dataflow operators of the plurality of processing elements of the first tile, and the plurality of processing elements of the second tile; a synchronizer circuit coupled between the first data path network of the first tile and the second data path network of the second tile, and comprising storage to store data to be sent between the first data path network of the first tile and the second data path network of the second tile, the synchronizer circuit to convert the data from the storage between a first voltage or a first frequency of the first tile and a second voltage or a second frequency of the second tile to generate converted data, and send the converted data between the first data path network of the first tile and the second data path network of the second tile; and one of; a second synchronizer circuit coupled between the first flow control path network of the first tile and the second flow control path network of the second tile, and comprising storage to store control data to be sent from the second flow control path network of the second tile into the first flow control path network of the first tile, the second synchronizer circuit to convert the control data from the storage from the second voltage or the second frequency of the second tile to the first voltage or the first frequency of the first tile to generate converted control data, and send the converted control data into the first flow control path network of the first tile, or wherein the synchronizer circuit is to send a backpressure control signal as control data from a downstream processing element of the second tile to a processing element of the first tile to stall execution of the processing element of the first tile, wherein the backpressure control signal indicates that storage in the downstream processing element is not available for an output of the processing element. - View Dependent Claims (14, 15, 16, 17, 18)
-
-
19. A method comprising:
-
providing a first tile and a second tile having a dataflow graph comprising a plurality of nodes overlaid into a first data path network between a plurality of processing elements in the first tile, a second data path network between a plurality of processing elements in the second tile, a first flow control path network between the plurality of processing elements of the first tile, a second flow control path network between the plurality of processing elements of the second tile, the plurality of processing elements of the first tile, and the plurality of processing elements of the second tile with each node represented as a dataflow operator in the plurality of processing elements of the first tile or the plurality of processing elements of the second tile; storing data to be sent between the first data path network of the first tile and the second data path network of the second tile in storage with a synchronizer circuit coupled between the first data path network of the first tile and the second data path network of the second tile; converting the data from the storage between a first voltage or a first frequency of the first tile and a second voltage or a second frequency of the second tile to generate converted data with the synchronizer circuit; sending the converted data with the synchronizer circuit between the first data path network of the first tile and the second data path network of the second tile; and one of; providing a second synchronizer circuit coupled between the first flow control path network of the first tile and the second flow control path network of the second tile, storing control data to be sent from the second flow control path network of the second tile into the first flow control path network of the first tile in storage of the second synchronizer circuit, converting the control data from the storage from the second voltage or the second frequency of the second tile to the first voltage or the first frequency of the first tile to generate converted control data with the second synchronizer circuit, and sending the converted control data into the first flow control path network of the first tile, or sending, with the synchronizer circuit, a backpressure control signal as control data from a downstream processing element of the second tile to a processing element of the first tile to stall execution of the processing element of the first tile, wherein the backpressure control signal indicates that storage in the downstream processing element is not available for an output of the processing element. - View Dependent Claims (20, 21, 22, 23, 24)
-
Specification