Reliable communication between multi-processor clusters of multi-cluster computer systems
Abstract
Improved techniques are provided for detecting and correcting errors and skew in inter-cluster communications within computer systems having a plurality of multi-processor clusters. The local nodes of each cluster include a plurality of processors and an interconnection controller. Intra-cluster links are formed between the local nodes, including the interconnection controller, within a cluster. Inter-cluster links are formed between interconnection controllers of different clusters. Intra-cluster packets may be serialized and encapsulated as inter-cluster packets for transmission on inter-cluster links, preferably with link-layer encapsulation. Each inter-cluster packet may include a sequence identifier and error information computed for that packet. Clock data may be embedded in symbols sent on each bit lane of the inter-cluster links. Copies of transmitted inter-cluster packets may be stored until an acknowledgement is received. The use of inter-cluster packets on an inter-cluster link is preferably transparent to other links and to the protocol layer.
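The abstract's statement that copies of transmitted inter-cluster packets may be stored until an acknowledgement is received can be illustrated with a small buffer keyed by sequence identifier. This is a minimal sketch only: the class name `RetransmitBuffer`, the fixed capacity, the cumulative-acknowledgement behavior, and the replay method are all assumptions made for illustration and are not taken from the patent text.

```python
from collections import OrderedDict


class RetransmitBuffer:
    """Hypothetical holder for copies of sent packets awaiting acknowledgement."""

    def __init__(self, capacity=8):
        self.capacity = capacity          # assumed limit on unacknowledged packets
        self.pending = OrderedDict()      # sequence id -> packet bytes, in send order

    def store(self, seq, packet):
        """Keep a copy of a packet that has just been transmitted."""
        if len(self.pending) >= self.capacity:
            raise BufferError("no free slot; sender must wait for an acknowledgement")
        self.pending[seq] = packet

    def acknowledge(self, seq):
        """Assumed cumulative ack: release every copy sent up to and including seq."""
        if seq not in self.pending:
            return                        # unknown or already released; ignore
        for key in list(self.pending):
            del self.pending[key]
            if key == seq:
                break

    def replay_from(self, seq):
        """Return the stored copies from seq onward, in send order, for resending."""
        keys = list(self.pending)
        if seq not in self.pending:
            return []
        return [self.pending[k] for k in keys[keys.index(seq):]]
```

Keeping the copies in send order means the sketch does not need to compare sequence numbers numerically, so it is unaffected by a sequence identifier wrapping around.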
25 Claims
1. A computer-implemented method for detecting errors in a computer system comprising a plurality of clusters, each cluster including a plurality of local nodes and an interconnection controller interconnected by point-to-point intra-cluster links, communications between the local nodes and the interconnection controller made via an intra-cluster protocol using intra-cluster packets, the interconnection controller of each cluster interconnected by point-to-point inter-cluster links with the interconnection controller of other clusters, the computer-implemented method comprising:
forming an inter-cluster packet by encapsulating an intra-cluster packet;
encoding a sequence identifier in the inter-cluster packet;
calculating first cyclic redundancy code check data based only upon the inter-cluster packet;
encoding the first cyclic redundancy code check data in the inter-cluster packet; and
transmitting the inter-cluster packet from a first cluster to a second cluster on a point-to-point inter-cluster link.
Dependent claims: 2, 3, 4
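The sender-side steps of claim 1 (encapsulate, encode a sequence identifier, compute and encode the check data) might be sketched as follows. The claim does not fix a packet layout or a CRC polynomial, so everything concrete here is an assumption: a hypothetical 3-byte header (1-byte sequence identifier, 2-byte payload length), a 4-byte trailer, and the CRC-32 provided by Python's `zlib.crc32`.

```python
import struct
import zlib

HEADER = struct.Struct(">BH")   # assumed layout: sequence id, payload length
TRAILER = struct.Struct(">I")   # assumed 32-bit check field appended at the end


def form_inter_cluster_packet(intra_packet, seq):
    """Wrap an intra-cluster packet with a sequence id and a CRC (illustrative only)."""
    # Encapsulate: prepend a header carrying the sequence identifier.
    body = HEADER.pack(seq & 0xFF, len(intra_packet)) + intra_packet
    # The check data covers this packet alone, not any earlier traffic.
    crc = zlib.crc32(body) & 0xFFFFFFFF
    # Encode the check data in the packet itself.
    return body + TRAILER.pack(crc)
```

Transmission on the inter-cluster link is outside the scope of this sketch; the returned bytes are what a link layer would hand to the physical interface.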
5. An apparatus for detecting errors in a computer system comprising a plurality of clusters, each cluster including a plurality of local nodes and an interconnection controller interconnected by point-to-point intra-cluster links, communications between the local nodes and the interconnection controller made via an intra-cluster protocol using intra-cluster packets, the interconnection controller of each cluster interconnected by point-to-point inter-cluster links with the interconnection controller of other clusters, the apparatus comprising:
means for forming an inter-cluster packet by encapsulating an intra-cluster packet;
means for encoding a sequence identifier in the inter-cluster packet;
means for calculating first cyclic redundancy code check data based only upon the inter-cluster packet;
means for encoding the first cyclic redundancy code check data in the inter-cluster packet; and
means for transmitting the inter-cluster packet from a first interconnection controller to a second interconnection controller on a point-to-point inter-cluster link.
6. A computer system, comprising:
a first cluster including a first plurality of processors and a first interconnection controller, the first plurality of processors and the first interconnection controller interconnected by first point-to-point intra-cluster links; and
a second cluster including a second plurality of processors and a second interconnection controller, the second plurality of processors and the second interconnection controller interconnected by second point-to-point intra-cluster links, the first interconnection controller coupled to the second interconnection controller by point-to-point inter-cluster links, communications on the first and second intra-cluster links made via an intra-cluster protocol by intra-cluster packets;
wherein the first interconnection controller is configured to:
receive an intra-cluster packet from a first processor in the first plurality of processors;
store the intra-cluster packet in a buffer;
add a header, including a sequence identifier, to the intra-cluster packet to form a high-speed link packet;
compute a first cyclic redundancy code check based only upon the high-speed link packet;
encode first cyclic redundancy code check data in the high-speed link packet; and
transmit the high-speed link packet to the second interconnection controller in the second cluster;
wherein the second interconnection controller is configured to:
receive the high-speed link packet;
compute a second cyclic redundancy code check based only upon the high-speed link packet;
compare results of the second cyclic redundancy code check with the encoded first cyclic redundancy code check data in the high-speed link packet; and
notify the first interconnection controller regarding the results of the comparison.
Dependent claims: 7, 8, 9, 10, 11, 12, 13, 14, 15
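The receiving controller's role in claim 6 (recompute the check, compare it with the encoded value, report the result) could look roughly like the sketch below. The frame layout (1-byte sequence identifier, 2-byte length, payload, 4-byte CRC-32 trailer), the function name, and the `"ack"`/`"nack"` strings are hypothetical; the claim only requires that the first controller be notified of the comparison result.

```python
import struct
import zlib


def check_high_speed_link_packet(packet):
    """Return (status, seq, payload) for a received packet (illustrative only)."""
    if len(packet) < 7:                       # shorter than header plus trailer
        return ("nack", None, None)
    body = packet[:-4]
    received_crc = struct.unpack(">I", packet[-4:])[0]
    # Second CRC, computed only over the received packet.
    computed_crc = zlib.crc32(body) & 0xFFFFFFFF
    if computed_crc != received_crc:
        # The header may itself be corrupted, so no sequence id is reported.
        return ("nack", None, None)
    seq, length = struct.unpack(">BH", body[:3])
    return ("ack", seq, body[3:3 + length])


# Usage: build a well-formed frame by hand, then corrupt one bit of it.
body = struct.pack(">BH", 7, 2) + b"hi"
good = body + struct.pack(">I", zlib.crc32(body) & 0xFFFFFFFF)
bad = bytes([good[0] ^ 0x01]) + good[1:]
good_result = check_high_speed_link_packet(good)
bad_result = check_high_speed_link_packet(bad)
```

The status value stands in for whatever notification the second controller sends back; how that notification is carried on the link is not specified here.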
16. An interconnection controller, comprising:
an intra-cluster interface configured for coupling with intra-cluster links to a plurality of local processors arranged in a point-to-point architecture in a local cluster;
an inter-cluster interface configured for coupling with an inter-cluster link to a non-local interconnection controller in a non-local cluster;
a transceiver configured to:
receive an intra-cluster packet from a local processor via an intra-cluster link;
encode a sequence identifier in a header of the intra-cluster packet;
compute cyclic redundancy code check data based only on the encoded packet; and
encode the cyclic redundancy code check data in the encoded packet; and
a serializer/deserializer configured to serialize the encoded packet and forward the encoded, serialized packet to the inter-cluster interface for transmission to the non-local interconnection controller via an inter-cluster link.
Dependent claims: 17, 18, 19, 20, 21, 22, 23, 24, 25
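As a rough software analogy for the serializer/deserializer element of claim 16, the sketch below flattens an encoded packet into a stream of bits and rebuilds it on the far side. The most-significant-bit-first ordering and the function names are assumptions; the claim does not specify a bit order, a lane count, or a line code, and real hardware would do this in dedicated circuitry.

```python
def serialize(packet):
    """Flatten packet bytes into a list of bits, most significant bit first (assumed)."""
    return [(byte >> shift) & 1 for byte in packet for shift in range(7, -1, -1)]


def deserialize(bits):
    """Rebuild packet bytes from a bit list produced by serialize()."""
    if len(bits) % 8:
        raise ValueError("bit stream is not a whole number of bytes")
    out = bytearray()
    for i in range(0, len(bits), 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit    # shift in one bit at a time
        out.append(value)
    return bytes(out)
```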
Specification