Method of synchronizing parallel processors employing channels and compiling method minimizing cross-processor data dependencies
First Claim
1. A compiling method for scheduling data-dependent operations, describable as a directed acyclic graph of nodes, representing operations, and edges, representing data dependencies, into a plurality of parallel instruction streams for execution by a plurality of respective digital processors and for scheduling synchronizing data transfers on a plurality of inter-processor data transfer means, each being usable for transfer of synchronizing data from one to another of said processors, said method comprising:
- scheduling all nodes of said graph into said plurality of streams with each edge of said graph being describable as either an intra-stream edge directed between nodes scheduled in the same stream or an inter-stream edge directed between nodes scheduled in different streams, said nodes being scheduled in a manner that the intra-stream edges are directed in the same direction;
identifying synchronization redundant edges among said inter-stream edges; and
scheduling inter-stream edges which are not synchronization redundant as synchronizing data transfers on said inter-processor data transfer means.
6 Assignments
0 Petitions
Accused Products
Abstract
A method of synchronizing the parallel processors of a multiple instruction stream multiprocessor employs a limited number of register channels, which may be re-used, for enforcing cross-stream data or event dependencies by passing data or event notifications in a synchronizing fashion. Cross-stream dependencies which by virtue of identified "synchronization redundancey" do not require enforcement by register channels are passed by writing to and reading from ordinary shared memory. A compiling method schedules the instructions into parallel instruction streams by reference to a directed acyclic graph (DAG), in a manner to minimize the production of cross-stream dependencies. The scheduling is determined beginning from the highest nodes in the DAG and proceeding to nodes in order of descending node height in a manner tending and tends to assign whole sub-graphs of the DAG to different processors.
102 Citations
15 Claims
-
1. A compiling method for scheduling data-dependent operations, describable as a directed acyclic graph of nodes, representing operations, and edges, representing data dependencies, into a plurality of parallel instruction streams for execution by a plurality of respective digital processors and for scheduling synchronizing data transfers on a plurality of inter-processor data transfer means, each being usable for transfer of synchronizing data from one to another of said processors, said method comprising:
-
scheduling all nodes of said graph into said plurality of streams with each edge of said graph being describable as either an intra-stream edge directed between nodes scheduled in the same stream or an inter-stream edge directed between nodes scheduled in different streams, said nodes being scheduled in a manner that the intra-stream edges are directed in the same direction; identifying synchronization redundant edges among said inter-stream edges; and scheduling inter-stream edges which are not synchronization redundant as synchronizing data transfers on said inter-processor data transfer means. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
-
-
9. A compiling method for scheduling data-dependent operations, describable as a directed acyclic graph of nodes, representing operations, and edges, representing data dependencies, into a plurality of parallel instruction streams for execution by a plurality of respective digital processors comprising:
-
scheduling all nodes of said graph into said plurality of streams with each edge of said graph being describable as either an intra-stream edge directed between nodes in the same stream or an inter-stream edge directed between nodes in different streams, said nodes being scheduled in a manner such that the intra-stream edges are directed in the same direction; and wherein said scheduling all nodes is performed in a manner for minimizing the number of inter-stream edges. - View Dependent Claims (10, 11, 12)
-
-
13. In a multiprocessor having first and second parallel processors for performing sequential operations specified in respective first and second instruction streams having cross-stream data dependencies, a register channel accessible by said processors characterized by a synchronization bit indicating whether said register channel has been written to, and a memory accessible to said processors not having a synchronization bit, the method of passing plural cross-stream dependent data between said processors in a synchronizing fashion comprising:
-
first writing, by said first processor to said memory, first cross-stream dependent data available to said first processor; second writing, by said first processor to said register channel, not earlier than said first writing, second cross-stream dependent data available to said first processor; waiting, if necessary, by said second processor until said synchronization bit indicates that said second writing has occurred; first reading, by said second processor, the second cross-stream dependent data from said register channel; and second reading, by said second processor, not earlier than said first reading of the first cross-stream dependent data from said memory.
-
-
14. In a multiprocessor having first, second and third parallel processors for performing sequential operations specified in respective first, second and third instruction streams having cross-stream data dependencies, first and second register channels accessible by said processors each characterized by a synchronization bit indicating whether said register channel has been written to, and a memory accessible to said processors not having a synchronization bit, the method of passing plural cross-stream dependent data between said processors in synchronizing fashion comprising:
-
first writing, by said first processor to said memory, first cross-stream dependent data available to said first processor; second writing by said first processor to said first register channel, not earlier than said first writing, second cross-stream dependent data available to said first processor; first waiting, if necessary, by said second processor until said synchronization bit of said first register channel indicates that said second writing has occurred; first reading by said second processor of the second cross-stream dependent data from said first register channel; third writing, by said second processor to said second register channel, not earlier than said second reading, third cross-stream dependent data available to said second processor; second waiting, if necessary, by said third processor until said synchronization bit of said second register channel indicates that said third writing has occurred; second reading, by said third processor, the third cross-stream dependent data from said second register channel; and third reading by said third processor, not earlier than said second reading, the first cross-stream dependent data from said memory.
-
-
15. In a multiprocessor having a plurality of parallel processors for performing sequential operations specified respectively in a plurality of instruction streams having cross-stream data dependencies, a plurality of register channels accessible by said processors for synchronizing data passing between said processors, and a memory accessible to said processors for non-synchronizing data passing between said processors, the method for passing plural cross-stream dependent data between processors in a synchronized fashion comprising:
-
first writing, by one of said processors to said memory means, first cross-stream dependent data available to said first processor; last reading said first cross-stream dependent data from said memory means by another of said processors; and enforcing other cross-stream data dependencies by a sequence of one or more writing-reading pairs associated respectively with one or more register channels, said sequence beginning with a second writing to a register channel, not earlier than said first writing, and ending with a next-to-last reading, from the same or different register channel not later than said last reading.
-
Specification