Node shutdown in clustered computer system
Abstract
A clustered computer system, apparatus, program product and method utilize a group member-initiated shutdown process to terminate clustering on a node in an automated and orderly fashion, typically in the event of a failure detected by a group member residing on that node. As a component of such a process, node leave operations are initiated on the other nodes in a clustered computer system, thereby permitting any dependency failovers to occur in an automated fashion. Moreover, other group members on a node to be shut down are preemptively terminated prior to local detection of the failure within those other group members, so that termination of clustering on the node may be initiated to complete a shutdown operation.
27 Claims
1. A method of shutting down a node in a clustered computer system, the method comprising:
(a) detecting a failure in a first node among a plurality of nodes in a clustered computer system, wherein detecting the failure is performed by a first group member resident on the first node;
(b) in response to detecting the failure, transmitting a signal to each of the other nodes in the plurality of nodes to initiate on each of the other nodes a node leave operation that terminates clustering with the first node; and
(c) in response to detecting the failure, preemptively terminating a second group member resident on the first node prior to any detection of the failure by the second group member.
(Dependent claims: 2-12)
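Steps (a) through (c) of claim 1 can be illustrated with a minimal in-memory sketch. This is not the patented implementation: the names `GroupMember`, `Node`, and `shut_down_node`, and the use of direct method calls in place of network signals, are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class GroupMember:
    name: str
    running: bool = True
    saw_failure: bool = False  # True only if this member detected the failure itself

    def terminate(self) -> None:
        self.running = False


@dataclass
class Node:
    name: str
    members: list = field(default_factory=list)
    peers: set = field(default_factory=set)  # names of nodes this node clusters with
    clustering: bool = True

    def node_leave(self, leaving: str) -> None:
        # Node leave operation: stop clustering with the leaving node.
        self.peers.discard(leaving)


def shut_down_node(failed: Node, detector: GroupMember, cluster: dict) -> None:
    """Hypothetical shutdown sequence following steps (a)-(c) of claim 1."""
    # Step (a): the failure is detected by a member resident on the failed node.
    detector.saw_failure = True
    # Step (b): signal every other node to run a node leave operation.
    # (Assumption: a direct call stands in for a message sent over the network.)
    for name, node in cluster.items():
        if name != failed.name:
            node.node_leave(failed.name)
    # Step (c): terminate the other local members before they detect anything.
    for member in failed.members:
        if member is not detector:
            member.terminate()
    # Assumption: clustering on the failed node ends once the members are stopped.
    failed.clustering = False
```

In this sketch the second group member ends with `running == False` while `saw_failure` stays `False`, which is the "preemptive" part of step (c): it is stopped without ever having observed the failure.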
13. A method of shutting down a node in a clustered computer system, the method comprising:
in a group member resident on a first node among a plurality of nodes in a clustered computer system, initiating a shutdown of the first node;
shutting down the first node in response to initiation of the shutdown by the group member; and
detecting a failure in the first node with the group member, wherein initiating the shutdown of the first node is performed in response to detecting the failure;
wherein shutting down the first node comprises transmitting a signal to each of the other nodes in the plurality of nodes to initiate on each of the other nodes a node leave operation that terminates clustering with the first node; and
preemptively terminating a second group member resident on the first node prior to any detection of the failure by the second group member.
(Dependent claims: 14-17)
18. An apparatus, comprising:
(a) a memory accessible by a first node among a plurality of nodes in a clustered computer system; and
(b) first and second group members resident in the memory, the first group member configured to detect a failure in the first node; and
(c) a program resident in the memory, the program configured to shut down the first node in response to the detected failure by transmitting a signal to each of the other nodes in the plurality of nodes to initiate on each of the other nodes a node leave operation that terminates clustering with the first node, and preemptively terminating the second group member resident on the first node prior to any detection of the failure by the second group member.
(Dependent claims: 19-23)
24. A clustered computer system, comprising first and second nodes coupled to one another over a network, wherein:
(a) the first node is configured to shut down in response to a failure detected in the first node by a first group member resident on the first node by transmitting a signal to the second node and preemptively terminating a second group member resident on the first node prior to any detection of the failure by the second group member; and
(b) the second node is configured to initiate a node leave operation that terminates clustering with the first node in response to the signal from the first node.
(Dependent claims: 25-26)
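The second node's side of claim 24, element (b), might look like the following sketch. It assumes a simple membership view and a resource-to-primary map for the dependency failover mentioned in the abstract; `ShutdownSignal`, `SecondNode`, `on_signal`, and the `backups`/`owned` fields are hypothetical names, not terms from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class ShutdownSignal:
    sender: str  # name of the node that is shutting down


@dataclass
class SecondNode:
    name: str
    view: set = field(default_factory=set)       # nodes currently clustered with
    backups: dict = field(default_factory=dict)  # resource -> primary node name
    owned: set = field(default_factory=set)      # resources this node now serves

    def on_signal(self, signal: ShutdownSignal) -> bool:
        """Run a node leave operation for the sender; return True if the view changed."""
        if signal.sender not in self.view:
            return False  # already left, or never a member: nothing to do
        self.view.remove(signal.sender)
        # Hypothetical failover: take over resources whose primary just left.
        for resource, primary in self.backups.items():
            if primary == signal.sender:
                self.owned.add(resource)
        return True
```

Making the handler a no-op for a sender that is no longer in the view is a design choice of this sketch, so that a repeated signal does not trigger a second failover.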
27. A program product, comprising:
(a) first and second group members, the first group member configured to detect a failure in a first node among a plurality of nodes in a clustered computer system;
(b) a program configured to shut down the first node in response to the detected failure by transmitting a signal to each of the other nodes in the plurality of nodes to initiate on each of the other nodes a node leave operation that terminates clustering with the first node, and preemptively terminating the second group member resident on the first node prior to any detection of the failure by the second group member; and
(c) a computer readable medium bearing the program.
Specification