Automated node restart in clustered computer system
Abstract
An apparatus, program product and method initiate a restart of a node in a clustered computer system using a member of a clustering group that resides on a different node from that to be restarted. Typically, a restart operation is initiated by the member in response to a membership change message sent by another group member that is resident on the node to be restarted, with an indicator associated with the membership change message that indicates that a restart should be initiated. Typically, the restart is implemented in much the same manner as a start operation that is performed when a node is initially added to a cluster, with additional functionality utilized to preclude repeated restart attempts upon a failure of a prior restart operation.
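The restart indicator described above can be illustrated with a minimal sketch. It assumes the indicator is a boolean carried on the membership change message, and that repeated restart attempts are avoided by clearing that indicator when the failing start was itself a restart. That policy, and the names `MembershipChange`, `build_membership_change`, and `on_membership_change`, are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MembershipChange:
    """Hypothetical membership change message sent by the failing node."""
    leaving_node: str
    restart: bool  # indicator: peers should initiate a restart of leaving_node


def build_membership_change(node: str, started_by_restart: bool) -> MembershipChange:
    # Assumed policy: only ask for a restart if the failed start was not
    # already a restart, so a failing restart is not retried indefinitely.
    return MembershipChange(leaving_node=node, restart=not started_by_restart)


def on_membership_change(local_node: str, msg: MembershipChange) -> Optional[str]:
    """Handler run by a group member on a node other than the failing one."""
    if msg.restart and msg.leaving_node != local_node:
        # Modeled here as a string; a real system would issue a start request.
        return f"start {msg.leaving_node} (restart)"
    return None  # ordinary membership change, no restart initiated
```

For example, a member on node "B" receiving `build_membership_change("A", False)` would initiate a restart of "A", while a message built with `started_by_restart=True` would be treated as an ordinary leave.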
38 Claims
1. A method of restarting a node in a clustered computer system, wherein the clustered computer system hosts a group including first and second members that reside respectively on first and second nodes, the method comprising:
(a) in response to a clustering failure on the first node, notifying the second member of the group using the first member; and
(b) in response to the notification, initiating a restart of the first node using the second member.
15. A method of restarting a node among a plurality of nodes in a clustered computer system, wherein the clustered computer system hosts a cluster control group including a plurality of cluster control members, each residing respectively on a different node from the plurality of nodes, the method comprising:
(a) detecting a clustering failure on a first node among the plurality of nodes;
(b) in response to detecting the clustering failure on the first node, issuing a membership change request from the first node to the cluster control member on each other node in the plurality of nodes, the membership change request indicating that the membership change request is for the purpose of restarting the first node;
(c) terminating clustering on the first node after issuing the membership change request;
(d) in response to the membership change request, selecting a second node from the plurality of nodes that is different from the first node;
(e) issuing a start node request using the selected second node, the start node request indicating that the purpose of the start node request is for restarting the first node; and
(f) in response to the start node request, initiating clustering on the first node.
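The sequence recited in claim 15 can be sketched as a small in-memory simulation. This is an illustration under stated assumptions, not the patented implementation: the `Node` and `Cluster` classes, the dictionary message layout, and the rule for selecting the second node (lowest node name among the surviving nodes) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str
    clustering_active: bool = True
    received: List[dict] = field(default_factory=list)  # messages seen by this node


class Cluster:
    def __init__(self, names: List[str]) -> None:
        self.nodes: Dict[str, Node] = {n: Node(n) for n in names}

    def handle_clustering_failure(self, failed_name: str) -> Optional[str]:
        """Run the restart sequence; return the name of the node that issued the start."""
        failed = self.nodes[failed_name]
        peers = [n for n in self.nodes.values()
                 if n.name != failed_name and n.clustering_active]

        # The failing node issues a membership change request, flagged as a
        # restart, to the cluster control member on each other node.
        change = {"type": "membership_change", "leaving": failed_name, "restart": True}
        for peer in peers:
            peer.received.append(change)

        # Clustering is terminated on the failing node after the request is issued.
        failed.clustering_active = False
        if not peers:
            return None  # no other node is available to initiate the restart

        # A second node is selected (assumption: lowest name wins).
        selected = min(peers, key=lambda n: n.name)

        # The selected node issues a start node request flagged as a restart.
        start = {"type": "start_node", "target": failed_name,
                 "restart": True, "issued_by": selected.name}
        failed.received.append(start)

        # In response to the start node request, clustering is initiated again.
        failed.clustering_active = True
        return selected.name
```

With nodes "A", "B", and "C", calling `Cluster(["A", "B", "C"]).handle_clustering_failure("A")` notifies "B" and "C", stops clustering on "A", and has "B" issue the start request that brings "A" back.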
17. An apparatus, comprising:
(a) a memory accessible by a node in a clustered computer system; and
(b) a program resident in the memory, the program configured to initiate a restart of another node in the clustered computer system in response to a notification from the other node of a clustering failure on the other node.
24. A clustered computer system, comprising:
(a) first and second nodes coupled to one another over a network; and
(b) a group including first and second members, the first member resident on the first node and the second member resident on the second node, wherein the first member is configured to notify the second member in response to a clustering failure on the first node, and wherein the second member is configured to initiate a restart of the first node in response to the notification.
34. A program product, comprising:
(a) a program configured to reside on a node in a clustered computer system, the program configured to initiate a restart of another node in the clustered computer system in response to a notification from the other node of a clustering failure on the other node; and
(b) a signal bearing medium bearing the program.
36. A program product, comprising:
(a) first and second programs respectively configured to reside on first and second nodes in a clustered computer system, the first and second programs respectively operating as first and second members of a group, the first program configured to notify the second program in response to a clustering failure on the first node, and the second program configured to initiate a restart of the first node in response to the notification; and
(b) at least one signal bearing medium bearing the first and second programs.
Specification