Scalable processor to processor and processor-to-I/O interconnection network and method for parallel processing arrays
Abstract
A massively parallel computer system is disclosed having a global router network in which pipeline registers are spatially distributed to increase the messaging speed of the global router network. The global router network includes an expansion tap for processor to I/O messaging so that I/O messaging bandwidth matches interprocessor messaging bandwidth. A route-opening message packet includes protocol bits which are treated homogeneously with steering bits. The route-opening packet further includes redundant address bits for imparting a multiple-crossbars personality to router chips within the global router network. A structure and method for spatially supporting the processors of the massively parallel system and the global router network are also disclosed.
11 Claims
1. A multi-stage interconnect network (MIN) for a parallel processor array comprising:
first, second and third switching stages for forming routing paths between processor elements (PEs) of the parallel processor array, each stage resolving one or more bits of a data routing header; and
address bit duplicating means for duplicating bits resolved in a first stage such that the same bits are again resolved in a later stage to balance data routing loading;
wherein;
each PE is identified as belonging to a cluster of a plurality of PEs;
each cluster is identified as belonging to one of a plurality of PE circuit boards; and
said multi-stage interconnect network is divided into first, second, third and fourth resolving stages for resolving a plurality of route-requesting bits identifying each target PE, the second resolving stage being implemented in said second switching stage for resolving route requests according to the PE board on which the target PE resides, the fourth resolving stage being implemented in said each cluster of PEs for resolving the bits of a route requesting signal according to the location of the target PE within a specified PE cluster, and the first and third resolving stages being implemented in said first and third switching stages respectively for resolving the cluster number of the target PE.
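The division of claim 1 into four resolving stages can be illustrated with a minimal sketch. It assumes that the routing header carries the cluster number twice (once for the first stage and once, duplicated, for the third stage), followed by the position of the PE within its cluster. The field widths, the field order, and the names `Target`, `build_header` and `resolve` are hypothetical; the claim does not fix any of them.

```python
from typing import List, NamedTuple

# Hypothetical field widths; the claim does not specify them.
CLUSTER_BITS = 4
BOARD_BITS = 4
PE_BITS = 4


class Target(NamedTuple):
    board: int    # PE circuit board holding the target PE
    cluster: int  # cluster number of the target PE
    pe: int       # position of the target PE inside its cluster


def to_bits(value: int, width: int) -> List[int]:
    """Return `value` as `width` bits, most significant bit first."""
    if not 0 <= value < (1 << width):
        raise ValueError(f"{value} does not fit in {width} bits")
    return [(value >> shift) & 1 for shift in range(width - 1, -1, -1)]


def from_bits(bits: List[int]) -> int:
    """Inverse of to_bits."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def build_header(target: Target) -> List[int]:
    """Build a header whose cluster bits appear twice (assumed layout)."""
    cluster = to_bits(target.cluster, CLUSTER_BITS)
    board = to_bits(target.board, BOARD_BITS)
    pe = to_bits(target.pe, PE_BITS)
    # Stage 1: cluster, stage 2: board, stage 3: cluster again, stage 4: PE.
    return cluster + board + cluster + pe


def resolve(header: List[int]) -> Target:
    """Consume the header one resolving stage at a time."""
    widths = [CLUSTER_BITS, BOARD_BITS, CLUSTER_BITS, PE_BITS]
    if len(header) != sum(widths):
        raise ValueError("unexpected header length")
    fields = []
    pos = 0
    for width in widths:
        fields.append(from_bits(header[pos:pos + width]))
        pos += width
    cluster_first, board, cluster_third, pe = fields
    if cluster_first != cluster_third:
        raise ValueError("duplicated cluster bits disagree")
    return Target(board=board, cluster=cluster_third, pe=pe)


header = build_header(Target(board=1, cluster=2, pe=3))
print(header)
print(resolve(header))
```

In this sketch the first and third stages see identical bits, which is one reading of "the same bits are again resolved in a later stage"; how the duplication balances load in the actual switching hardware is not modeled.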
3. A global router network for a massively parallel array of processing elements, the routing network comprising a plurality of data-routing stages, wherein each of said data-routing stages comprises:
a route requesting input wire (RRW-x) for receiving a route-requesting header signal;
a pipeline latch (612) having a data input terminal (D) and a data output terminal (Q);
a first tristate buffer (611) for selectively coupling the route request input wire (RRW-x) to the data input terminal of the pipeline latch (612);
a switching matrix (615) having a router header-in line (621x), horizontal data input lines (650x), vertical output lines (654Y) and switching cells (620) for selectively coupling any one of said horizontal input lines (650x) to one of the vertical output lines (654Y);
a second tristate buffer (652) for selectively coupling the output terminal (Q) of the pipeline latch (612) to said horizontal input line (650x) of the switching matrix (615) during a forward messaging mode;
a third tristate buffer (657) for selectively coupling said horizontal data line (650x) to the data input terminal (D) of the pipeline latch (612) during a reverse messaging mode; and
a fourth tristate buffer (658) for selectively coupling the output terminal (Q) of the pipeline latch (612) to the route requesting wire (RRW-x) during the reverse messaging mode.
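The way the four tristate buffers of claim 3 share a single pipeline latch between the two messaging directions can be sketched behaviorally. The model below is an assumption-laden illustration, not the claimed circuit: it assumes buffer 611 is enabled in the forward mode (the claim names no mode for it), treats each call as one clock cycle, and uses the hypothetical names `Mode`, `RouterPort` and `cycle`. The switching matrix (615) is not modeled.

```python
from enum import Enum
from typing import Optional, Tuple


class Mode(Enum):
    FORWARD = "forward"
    REVERSE = "reverse"


class RouterPort:
    """Behavioral model of latch 612 and tristate buffers 611, 652, 657, 658."""

    def __init__(self) -> None:
        self.q: Optional[int] = None  # value currently held at latch output Q

    def cycle(self, mode: Mode, rrw: Optional[int] = None,
              line: Optional[int] = None) -> Tuple[Optional[int], Optional[int]]:
        """Run one clock cycle.

        `rrw` and `line` are the values driven externally onto RRW-x and 650x.
        Returns (value this port drives onto RRW-x, value it drives onto 650x);
        None stands for an output buffer left in high impedance.
        """
        held = self.q
        if mode is Mode.FORWARD:
            # Buffer 652 drives Q onto 650x; buffer 611 feeds RRW-x into D
            # (assumed pairing for the forward mode).
            driven = (None, held)
            self.q = rrw
        else:
            # Buffer 658 drives Q onto RRW-x; buffer 657 feeds 650x into D.
            driven = (held, None)
            self.q = line
        return driven


port = RouterPort()
port.cycle(Mode.FORWARD, rrw=1)         # latch captures 1, nothing driven yet
print(port.cycle(Mode.FORWARD, rrw=0))  # the 1 appears on 650x one cycle later
```

Each value appears on the far side of the latch one cycle after it was presented, which is the pipelining effect the latch provides in either direction.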
5. A method for routing data in a global router system between any one processor element (PE) of an array of processor elements (PEs) and any other PE of the array, comprising the steps of:
providing an interconnection network for establishing data routing paths between a set of source PEs and a set of target PEs;
furnishing said PEs with respective parity identities having precomputed values based on the array addresses of the respective PEs;
generating route requesting signals to be propagated at least in part through said interconnection network from the set of source PEs to the set of target PEs for establishing data carrying routes through said interconnection network in accordance with address information in said route requesting signals, each of said route requesting signals including a protocol bit for indicating to said interconnection network the presence of a route requesting signal;
generating parity bits respectively associated with said route requesting signals for propagating through said interconnection network to indicate respectively an odd or even parity of the addresses in said route requesting signals; and
comparing in each PE of the set of target PEs receiving a parity bit the parity identity thereof with the received parity bit to indicate an error condition in the event the parity identity of said each PE and the parity bit received by said each PE are unequal.
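The parity comparison recited in claim 5 can be sketched as follows. This is a minimal illustration under stated assumptions: the parity identity is taken to be the XOR of all address bits, the route request is represented as a plain dictionary, and the names `parity`, `ProcessorElement`, `detects_error` and `make_route_request` are hypothetical.

```python
def parity(address: int) -> int:
    """Return 1 if `address` has an odd number of 1 bits, otherwise 0."""
    if address < 0:
        raise ValueError("address must be non-negative")
    return bin(address).count("1") & 1


class ProcessorElement:
    def __init__(self, address: int) -> None:
        self.address = address
        # Parity identity, precomputed once from the PE's array address.
        self.parity_identity = parity(address)

    def detects_error(self, received_parity_bit: int) -> bool:
        """Return True when the received bit and the parity identity differ."""
        return received_parity_bit != self.parity_identity


def make_route_request(target_address: int) -> dict:
    """Build a route request with a protocol bit and a parity bit (assumed layout)."""
    return {
        "protocol": 1,  # signals that a route request is present
        "address": target_address,
        "parity": parity(target_address),
    }


request = make_route_request(0b0101)
intended = ProcessorElement(0b0101)
misrouted = ProcessorElement(0b0100)  # differs from the intended address in one bit
print(intended.detects_error(request["parity"]))
print(misrouted.detects_error(request["parity"]))
```

A single parity bit of this kind flags a request that lands on a PE whose address differs from the requested one in an odd number of bits; a misroute that differs in an even number of bits would pass the comparison.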
11. In a parallel processor having an array of processor elements, an interconnection network for indirectly routing data from one set of the processor elements to another set of the processor elements comprising:
a first bidirectional latch having a set of first ports and a set of second ports;
first bidirectional routing path segments respectively connected to the first ports of the first latch, the first routing path segments including a first switch stage responsive to header data from the processor elements for configuring the first routing path segments;
second bidirectional routing path segments respectively connected to the second ports of the first latch, the second routing path segments including a second switch stage responsive to header data from the first switch stage for configuring the second routing path segments;
a second bidirectional latch having a set of first ports and a set of second ports, the first ports thereof being connected to the processor elements and the second ports thereof being respectively connected to the first routing path segments;
a third bidirectional latch having a set of first ports and a set of second ports, the first ports thereof being respectively connected to the second routing path segments, and the second ports thereof being connected to the processor elements;
third and fourth bidirectional routing path segments, wherein;
the first ports of the second bidirectional latch are respectively connected to the processor elements by the third bidirectional routing path segments; and
the second ports of the third bidirectional latch are respectively connected to the processor elements by the fourth bidirectional routing path segments; and
means for operating the first, second and third latches and the processor elements to transfer data between one set of the processor elements and another set of the processor elements in either direction along routing paths comprising the configured first, second, third and fourth routing path segments.
Specification