Method of synchronizing parallel processors employing channels and compiling method minimizing cross-processor data dependencies

US 5,317,734 A
Filed: 08/29/1989
Issued: 05/31/1994
Est. Priority Date: 08/29/1989
Status: Expired due to Term

First Claim

Patent Images

1. A compiling method for scheduling data-dependent operations, describable as a directed acyclic graph of nodes, representing operations, and edges, representing data dependencies, into a plurality of parallel instruction streams for execution by a plurality of respective digital processors and for scheduling synchronizing data transfers on a plurality of inter-processor data transfer means, each being usable for transfer of synchronizing data from one to another of said processors, said method comprising:

scheduling all nodes of said graph into said plurality of streams with each edge of said graph being describable as either an intra-stream edge directed between nodes scheduled in the same stream or an inter-stream edge directed between nodes scheduled in different streams, said nodes being scheduled in a manner that the intra-stream edges are directed in the same direction;

identifying synchronization redundant edges among said inter-stream edges; and

scheduling inter-stream edges which are not synchronization redundant as synchronizing data transfers on said inter-processor data transfer means.

View all claims

6 Assignments

Timeline View

Assignment View

0 Petitions

Accused Products

Abstract

A method of synchronizing the parallel processors of a multiple instruction stream multiprocessor employs a limited number of register channels, which may be re-used, for enforcing cross-stream data or event dependencies by passing data or event notifications in a synchronizing fashion. Cross-stream dependencies which by virtue of identified "synchronization redundancey" do not require enforcement by register channels are passed by writing to and reading from ordinary shared memory. A compiling method schedules the instructions into parallel instruction streams by reference to a directed acyclic graph (DAG), in a manner to minimize the production of cross-stream dependencies. The scheduling is determined beginning from the highest nodes in the DAG and proceeding to nodes in order of descending node height in a manner tending and tends to assign whole sub-graphs of the DAG to different processors.

102 Citations

View as Search Results

15 Claims

1. A compiling method for scheduling data-dependent operations, describable as a directed acyclic graph of nodes, representing operations, and edges, representing data dependencies, into a plurality of parallel instruction streams for execution by a plurality of respective digital processors and for scheduling synchronizing data transfers on a plurality of inter-processor data transfer means, each being usable for transfer of synchronizing data from one to another of said processors, said method comprising:
- scheduling all nodes of said graph into said plurality of streams with each edge of said graph being describable as either an intra-stream edge directed between nodes scheduled in the same stream or an inter-stream edge directed between nodes scheduled in different streams, said nodes being scheduled in a manner that the intra-stream edges are directed in the same direction;
  
  identifying synchronization redundant edges among said inter-stream edges; and
  
  scheduling inter-stream edges which are not synchronization redundant as synchronizing data transfers on said inter-processor data transfer means.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
- - 2. The method of claim 1 wherein said scheduling all nodes is performed in a manner for substantially minimizing the number of inter-stream edges.
  - 3. The method of claim 1 wherein said scheduling all nodes is determined generally inversely from the order in which the operations represented by said nodes are to be performed by said processors and comprises:
    - identifying, among unscheduled nodes of said graph, a plurality of nodes having the greatest height; and
      
      scheduling nodes from said plurality of nodes of greatest height into different streams, while substantially minimizing the number of resulting inter-stream edges directed to scheduled nodes.
  - 4. The method of claim 3 wherein said scheduling all nodes further comprises:
    - identifying, for a plurality of said scheduled nodes from said plurality of nodes of greatest height, any sub-graphs rooted at said scheduled nodes; and
      
      scheduling, if possible, an equal number of nodes of the respective identified sub-graphs into the same respective streams as the scheduled nodes of greatest height rooting said respective sub-graphs, said scheduling of nodes of the respective identified sub-graphs being determined in order of descending node height from said scheduled nodes rooting said respective identified sub-graphs.
  - 5. The method of claim 4 further comprising repeating said identifying among unscheduled nodes of said graph, a plurality of nodes having the greatest height after said scheduling, if possible, an equal number of nodes of the respective identified sub-graphs.
  - 6. The method of claim 1 wherein said scheduling inter-stream edges which are not synchronization redundant comprises:
    - identifying among said inter-stream edges which are not synchronization redundant, edges representing data dependencies which must be resolved in a definite order as candidates for scheduling by re-use of the same inter-processor data transfer means.
  - 7. The method of claim 6 wherein said identifying among said inter-stream edges which are not synchronization redundant, edges as candidates for scheduling by re-use of the same inter-processor data transfer means, is such that re-use will not cause implicit synchronizations due to the possibility of blocking to writing on said data transfer means.
  - 8. The method of claim 6 further comprising:
    - identifying implicit synchronizations due to the possibility of blocking to reading on re-used inter-processor data transfer means; and
      
      identifying further inter-stream edges that would be rendered synchronization redundant by said implicit synchronizations.

9. A compiling method for scheduling data-dependent operations, describable as a directed acyclic graph of nodes, representing operations, and edges, representing data dependencies, into a plurality of parallel instruction streams for execution by a plurality of respective digital processors comprising:
- scheduling all nodes of said graph into said plurality of streams with each edge of said graph being describable as either an intra-stream edge directed between nodes in the same stream or an inter-stream edge directed between nodes in different streams, said nodes being scheduled in a manner such that the intra-stream edges are directed in the same direction; and
  
  wherein said scheduling all nodes is performed in a manner for minimizing the number of inter-stream edges.
- View Dependent Claims (10, 11, 12)
- - 10. The method of claim 9 wherein said scheduling all nodes is determined generally inversely from the order in which the operations represented by said nodes are to be performed by said processors and comprises:
    - identifying, among unscheduled nodes of said graph, a plurality of nodes having the greatest height in said graph; and
      
      scheduling nodes from said plurality of nodes of greatest height into different streams while minimizing the number of resulting inter-stream edges directed to scheduled nodes.
  - 11. The method of claim 10 further comprising:
    - identifying, for a plurality of said nodes scheduled from said plurality of nodes of greatest height, any sub-graphs rooted at each said node; and
      
      scheduling, if possible, an equal number of nodes of the respective identified sub-graphs into the same respective streams as the scheduled nodes from said plurality of nodes of greatest height rooting said respective sub-graphs, said scheduling of nodes of the respective identified sub-graphs being determined in order of descending node height from said scheduled nodes rooting said respective identified sub-graphs.
  - 12. The method of claim 11 further comprising repeating said identifying among unscheduled nodes of said graph, a plurality of nodes having the greatest height after said scheduling, if possible, an equal number of nodes of the respective identified sub-graphs.

13. In a multiprocessor having first and second parallel processors for performing sequential operations specified in respective first and second instruction streams having cross-stream data dependencies, a register channel accessible by said processors characterized by a synchronization bit indicating whether said register channel has been written to, and a memory accessible to said processors not having a synchronization bit, the method of passing plural cross-stream dependent data between said processors in a synchronizing fashion comprising:
- first writing, by said first processor to said memory, first cross-stream dependent data available to said first processor;
  
  second writing, by said first processor to said register channel, not earlier than said first writing, second cross-stream dependent data available to said first processor;
  
  waiting, if necessary, by said second processor until said synchronization bit indicates that said second writing has occurred;
  
  first reading, by said second processor, the second cross-stream dependent data from said register channel; and
  
  second reading, by said second processor, not earlier than said first reading of the first cross-stream dependent data from said memory.

14. In a multiprocessor having first, second and third parallel processors for performing sequential operations specified in respective first, second and third instruction streams having cross-stream data dependencies, first and second register channels accessible by said processors each characterized by a synchronization bit indicating whether said register channel has been written to, and a memory accessible to said processors not having a synchronization bit, the method of passing plural cross-stream dependent data between said processors in synchronizing fashion comprising:
- first writing, by said first processor to said memory, first cross-stream dependent data available to said first processor;
  
  second writing by said first processor to said first register channel, not earlier than said first writing, second cross-stream dependent data available to said first processor;
  
  first waiting, if necessary, by said second processor until said synchronization bit of said first register channel indicates that said second writing has occurred;
  
  first reading by said second processor of the second cross-stream dependent data from said first register channel;
  
  third writing, by said second processor to said second register channel, not earlier than said second reading, third cross-stream dependent data available to said second processor;
  
  second waiting, if necessary, by said third processor until said synchronization bit of said second register channel indicates that said third writing has occurred;
  
  second reading, by said third processor, the third cross-stream dependent data from said second register channel; and
  
  third reading by said third processor, not earlier than said second reading, the first cross-stream dependent data from said memory.

15. In a multiprocessor having a plurality of parallel processors for performing sequential operations specified respectively in a plurality of instruction streams having cross-stream data dependencies, a plurality of register channels accessible by said processors for synchronizing data passing between said processors, and a memory accessible to said processors for non-synchronizing data passing between said processors, the method for passing plural cross-stream dependent data between processors in a synchronized fashion comprising:
- first writing, by one of said processors to said memory means, first cross-stream dependent data available to said first processor;
  
  last reading said first cross-stream dependent data from said memory means by another of said processors; and
  
  enforcing other cross-stream data dependencies by a sequence of one or more writing-reading pairs associated respectively with one or more register channels, said sequence beginning with a second writing to a register channel, not earlier than said first writing, and ending with a next-to-last reading, from the same or different register channel not later than said last reading.

Specification

Resources

Litigation Campaign Assessment

Current Assignee
NXP B.V. (NXP Semiconductors NV)
Original Assignee
North American Philips Corp. (Koninklijke Philips N.V.)
Inventors
Gupta, Rajiv
Primary Examiner(s)
Shaw, Gareth D.
Assistant Examiner(s)
Chavis, John Q.

Application Number

US07/400,178
Time in Patent Office

1,736 Days
Field of Search

364/200, 364/900
US Class Current

717/161
CPC Class Codes

G06F 8/45 Exploiting coarse grain par...

G06F 8/458 Synchronisation, e.g. post-...

Method of synchronizing parallel processors employing channels and compiling method minimizing cross-processor data dependencies

First Claim

6 Assignments

0 Petitions

Accused Products

Abstract

102 Citations

15 Claims

Specification

Solutions

Use Cases

Quick Links

Method of synchronizing parallel processors employing channels and compiling method minimizing cross-processor data dependencies

First Claim

6 Assignments

Subscription Required

Subscription Required

0 Petitions

Subscription Required

Accused Products

Subscription Required

Abstract

102 Citations

15 Claims

Specification

Subscription Required

Solutions

Use Cases

Quick Links