Dynamic modification of cluster communication parameters in clustered computer system
Abstract
An apparatus, program product and method support the dynamic modification of cluster communication parameters through a distributed protocol whereby individual nodes locally confirm initiation and status information for every node participating in a parameter modification operation. By doing so, individual nodes are also able to locally determine the need to undo locally-performed parameter modifications should any other node be incapable of performing a parameter modification. Moreover, specifically with respect to cluster communication parameters such as heartbeat parameters, such parameters may be dynamically modified by configuring a sending node to send a heartbeat message to a receiving node, with the heartbeat message indicating that a heartbeat parameter is to be modified. In response to the heartbeat message, the receiving node may then send an acknowledgment message to the sending node that indicates whether the heartbeat parameter has been modified in the receiving node. Further, modification of the heartbeat parameter in the sending node may be deferred until the acknowledgment message from the receiving node indicates that the heartbeat parameter has been modified in the receiving node.
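The distributed protocol summarized above can be sketched as a small in-process simulation. This is only an illustration under stated assumptions, not the patented implementation: the `Node` class, the `modify_cluster_param` function, the `max_allowed` limit (used solely to provoke a failure) and the `unreachable` argument are all hypothetical names, and message passing is replaced by direct method calls.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional, Set


@dataclass
class Node:
    name: str
    params: Dict[str, int]
    max_allowed: int = 10_000  # hypothetical local limit, only used to force a failure
    seen_init: Set[str] = field(default_factory=set)         # nodes known to hold the initiation message
    statuses: Dict[str, bool] = field(default_factory=dict)  # status reported by each node
    saved: Optional[Dict[str, int]] = None                   # old value kept for a possible undo

    def begin(self) -> None:
        """Reset the per-operation bookkeeping."""
        self.seen_init.clear()
        self.statuses.clear()
        self.saved = None

    def try_modify(self, key: str, value: int) -> bool:
        """Apply the change locally and remember the previous value."""
        if value > self.max_allowed:
            return False
        self.saved = {key: self.params[key]}
        self.params[key] = value
        return True

    def undo(self) -> None:
        """Restore the previous value if this node actually changed it."""
        if self.saved is not None:
            self.params.update(self.saved)
            self.saved = None


def modify_cluster_param(nodes: List[Node], key: str, value: int,
                         unreachable: Optional[Iterable[str]] = None) -> bool:
    """Return True if every node kept the new value, False otherwise."""
    lost = set(unreachable or ())
    names = {n.name for n in nodes}
    for n in nodes:
        n.begin()
    # Initiation: each receipt of the message is recorded locally by every node.
    for receiver in nodes:
        if receiver.name in lost:
            continue  # simulated: this node never received the message
        for observer in nodes:
            observer.seen_init.add(receiver.name)
    # Each node confirms locally that all nodes received the message.
    if any(n.seen_init != names for n in nodes):
        return False  # nothing was modified, so nothing needs undoing
    # Local modification, followed by a status report to every node.
    for n in nodes:
        ok = n.try_modify(key, value)
        for peer in nodes:
            peer.statuses[n.name] = ok
    # Each node inspects the statuses locally and undoes its own change on any failure.
    committed = True
    for n in nodes:
        if not all(n.statuses.values()):
            n.undo()
            committed = False
    return committed
```

In this sketch no coordinator decides the outcome: every node holds the full set of receipts and statuses, so each one reaches the same keep-or-undo decision on its own.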
29 Claims
1. A method of dynamically modifying a cluster communication parameter in a clustered computer system, the method comprising:
(a) initiating a cluster communication parameter modification by transmitting a message to a plurality of nodes in the clustered computer system;
(b) locally confirming, within each node, receipt of the message by each of the plurality of nodes;
(c) in response to confirming receipt of the message by each of the plurality of nodes, invoking a local cluster communication parameter modification operation on each node;
(d) transmitting from each node a status of the local cluster communication parameter modification invoked on that node;
(e) locally detecting, within each node, an unsuccessful status for the local cluster communication parameter modification on any node; and
(f) in response to detecting an unsuccessful status for any node, locally undoing, in each node for which the local cluster communication operation was performed, the local cluster communication parameter modification operation performed on that node;
wherein the cluster communication parameter is selected from the group consisting of heartbeat message time out, heartbeat acknowledgment message time out, heartbeat frequency or interval, heartbeat failure threshold, heartbeat acknowledgment failure threshold, receive/send timer ratio, maximum fragment size, message retry timer value, maximum message retry time, send queue overflow threshold, message send window size, and combinations thereof.
6. An apparatus, comprising:
(a) a memory; and
(b) a program resident in the memory, the program configured to dynamically modify a cluster communication parameter on a local node among a plurality of nodes in a clustered computer system, the program configured to locally confirm, for the local node, successful receipt of an initiation message by each of the plurality of nodes, and a status for a local cluster communication parameter modification operation performed by each of the plurality of nodes, the program further configured to undo a local cluster communication parameter modification operation performed on the local node in response to detection of an unsuccessful status for a local cluster communication parameter modification on any node;
wherein the cluster communication parameter is selected from the group consisting of heartbeat message time out, heartbeat acknowledgment message time out, heartbeat frequency or interval, heartbeat failure threshold, heartbeat acknowledgment failure threshold, receive/send timer ratio, maximum fragment size, message retry timer value, maximum message retry time, send queue overflow threshold, message send window size, and combinations thereof.
11. A clustered computer system, comprising:
(a) a plurality of nodes coupled to one another over a network; and
(b) a plurality of programs, each local to a node among the plurality of nodes, each program configured to dynamically modify a cluster communication parameter on its respective local node, each program further configured to locally confirm, for its respective local node, successful receipt of an initiation message by each of the plurality of nodes, and a status for a local cluster communication parameter modification operation performed by each of the plurality of nodes, and each program further configured to undo a local cluster communication parameter modification operation performed on its respective local node in response to detection of an unsuccessful status for a local cluster communication parameter modification on any node;
wherein the cluster communication parameter is selected from the group consisting of heartbeat message time out, heartbeat acknowledgment message time out, heartbeat frequency or interval, heartbeat failure threshold, heartbeat acknowledgment failure threshold, receive/send timer ratio, maximum fragment size, message retry timer value, maximum message retry time, send queue overflow threshold, message send window size, and combinations thereof.
12. A program product, comprising:
(a) a program configured to dynamically modify a cluster communication parameter on a local node among a plurality of nodes in a clustered computer system, the program configured to locally confirm, for the local node, successful receipt of an initiation message by each of the plurality of nodes, and a status for a local cluster communication parameter modification operation performed by each of the plurality of nodes, the program further configured to undo a local cluster communication parameter modification operation performed on the local node in response to detection of an unsuccessful status for a local cluster communication parameter modification on any node; and
(b) a tangible computer readable medium bearing the program;
wherein the cluster communication parameter is selected from the group consisting of heartbeat message time out, heartbeat acknowledgment message time out, heartbeat frequency or interval, heartbeat failure threshold, heartbeat acknowledgment failure threshold, receive/send timer ratio, maximum fragment size, message retry timer value, maximum message retry time, send queue overflow threshold, message send window size, and combinations thereof.
17. A method of dynamically modifying a heartbeat parameter in a node among a plurality of nodes in a clustered computer system, the plurality of nodes including first and second nodes, the first node configured to send a heartbeat message to the second node, and the second node configured to send an acknowledgment message to the first node in response to receiving the heartbeat message, the method comprising:
(a) sending a heartbeat message from the first node to the second node, the heartbeat message indicating that a heartbeat parameter is to be modified; and
(b) deferring modification of the heartbeat parameter in the first node until receipt of an acknowledgment message sent from the second node to the first node that indicates that the heartbeat parameter has been modified in the second node;
wherein the heartbeat parameter is selected from the group consisting of heartbeat message time out, heartbeat acknowledgment message time out, heartbeat frequency or interval, heartbeat failure threshold, heartbeat acknowledgment failure threshold, receive/send timer ratio, and combinations thereof.
24. An apparatus, comprising:
(a) a memory; and
(b) a program resident in the memory and configured to dynamically modify a heartbeat parameter in a first node among a plurality of nodes in a clustered computer system by sending a heartbeat message to a second node among the plurality of nodes that indicates that the heartbeat parameter is to be modified and thereafter deferring modification of the heartbeat parameter in the first node only after receiving an acknowledgment message from the second node indicating that the heartbeat parameter has been modified in the second node;
wherein the heartbeat parameter is selected from the group consisting of heartbeat message time out, heartbeat acknowledgment message time out, heartbeat frequency or interval, heartbeat failure threshold, heartbeat acknowledgment failure threshold, receive/send timer ratio, and combinations thereof.
29. A program product, comprising:
(a) a program configured to dynamically modify a heartbeat parameter in a first node among a plurality of nodes in a clustered computer system by sending a heartbeat message to a second node among the plurality of nodes that indicates that the heartbeat parameter is to be modified and thereafter deferring modification of the heartbeat parameter in the first node only after receiving an acknowledgment message from the second node indicating that the heartbeat parameter has been modified in the second node; and
(b) a tangible computer readable medium bearing the program;
wherein the heartbeat parameter is selected from the group consisting of heartbeat message time out, heartbeat acknowledgment message time out, heartbeat frequency or interval, heartbeat failure threshold, heartbeat acknowledgment failure threshold, receive/send timer ratio, and combinations thereof.
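The heartbeat handshake recited in claims 17, 24 and 29 can be illustrated with a minimal sketch. It assumes a single heartbeat parameter (an interval in milliseconds) and models messages as plain objects; the `Heartbeat`, `Ack`, `Sender` and `Receiver` names, the `new_interval_ms` and `modified` fields, and the positivity check are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Heartbeat:
    seq: int
    new_interval_ms: Optional[int] = None  # set only when a change is requested


@dataclass
class Ack:
    seq: int
    modified: bool = False  # True when the receiver applied the requested change


class Receiver:
    """Second node: applies the change and reports it in the acknowledgment."""

    def __init__(self, interval_ms: int) -> None:
        self.interval_ms = interval_ms

    def on_heartbeat(self, hb: Heartbeat) -> Ack:
        if hb.new_interval_ms is None:
            return Ack(hb.seq)  # ordinary heartbeat, nothing to change
        if hb.new_interval_ms <= 0:  # hypothetical validity check
            return Ack(hb.seq, modified=False)
        self.interval_ms = hb.new_interval_ms
        return Ack(hb.seq, modified=True)


class Sender:
    """First node: defers its own change until the receiver confirms it."""

    def __init__(self, interval_ms: int) -> None:
        self.interval_ms = interval_ms
        self.pending: Optional[int] = None
        self.seq = 0

    def request_change(self, new_interval_ms: int) -> None:
        self.pending = new_interval_ms  # recorded, but not applied yet

    def next_heartbeat(self) -> Heartbeat:
        self.seq += 1
        return Heartbeat(self.seq, self.pending)

    def on_ack(self, ack: Ack) -> None:
        # Apply the pending value only once the receiver reports success.
        if self.pending is not None and ack.modified:
            self.interval_ms = self.pending
            self.pending = None
```

Because the pending value rides on every heartbeat until an acknowledgment reports success, a lost heartbeat or acknowledgment in this sketch simply leaves the sender on its old setting and the request is repeated on the next heartbeat.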
Specification