QUORUM-BASED POWER-DOWN OF UNRESPONSIVE SERVERS IN A COMPUTER CLUSTER
Abstract
A quorum-based server power-down mechanism allows a manager in a computer cluster to power down unresponsive servers in a manner that assures that an unresponsive server does not become responsive again. For a manager to power down servers in the cluster, the cluster must have quorum, meaning that a majority of the computers in the cluster must be responsive. If the cluster has quorum and the manager server did not fail, the manager causes the failed server(s) to be powered down. If the manager server did fail, the new manager causes all unresponsive servers in the cluster to be powered down. If the power-down is successful, the resources on the failed server(s) may be failed over to other servers in the cluster that were not powered down. If the power-down is not successful, the cluster is disabled.
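The decision flow described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the `ClusterView` type, the function names, the `power_down` callback, and the returned status strings are all hypothetical and do not appear in the source.

```python
from dataclasses import dataclass
from typing import Callable, Set


@dataclass
class ClusterView:
    """What one server currently believes about the cluster (hypothetical model)."""
    members: Set[str]     # every server configured in the cluster
    responsive: Set[str]  # servers still responding, including this one
    manager: str          # manager at the time the failure was reported


def has_quorum(view: ClusterView) -> bool:
    # Quorum: the responsive group is a strict majority of the cluster.
    return 2 * len(view.responsive) > len(view.members)


def servers_to_power_down(view: ClusterView, failed: str) -> Set[str]:
    """Return the servers to power down, or an empty set when there is no quorum."""
    if not has_quorum(view):
        return set()
    if view.manager not in view.responsive:
        # Manager failed: power down every unresponsive server.
        return view.members - view.responsive
    # Manager did not fail: power down only the reported server.
    return {failed}


def handle_failure(view: ClusterView, failed: str,
                   power_down: Callable[[str], bool]) -> str:
    """Apply the flow; `power_down` is an assumed callback returning True on success."""
    targets = servers_to_power_down(view, failed)
    if not targets:
        return "no-quorum"
    results = [power_down(name) for name in sorted(targets)]
    # Fail over only if every power-down succeeded; otherwise disable the cluster.
    return "failover" if all(results) else "cluster-disabled"
```

For example, in a five-server cluster where three servers remain responsive and the manager is among the two that are not, `servers_to_power_down` returns both unresponsive servers rather than only the one that was reported.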
23 Claims
1. An apparatus comprising:
at least one processor;
a memory coupled to the at least one processor;
a server process residing in the memory and executed by the at least one processor;
a cluster engine residing in the memory and executed by the at least one processor, the cluster engine handling communications between the server process and other servers in a cluster; and
a quorum-based server power-down mechanism residing in the memory and executed by the at least one processor, the quorum-based server power-down mechanism determining whether the server process is part of a group of servers that include a majority of servers in the cluster, and if so, the quorum-based server power-down mechanism determines whether a manager of the cluster failed when an indication of a server failure is received, and if so, the quorum-based server power-down mechanism issues at least one command to power down all unresponsive servers in the cluster, wherein an unresponsive server is a server that fails to send a periodic message that indicates the server is functioning properly, and if the manager of the cluster did not fail, the quorum-based server power-down mechanism issues at least one command to power down a server corresponding to the received indication of server failure.
9. A networked computer system comprising:
a plurality of servers coupled together via a network into a cluster, each server comprising:
a cluster engine that handles communications between servers in the cluster; and
a quorum-based server power-down mechanism that determines whether a server is part of a group of servers that includes a majority of servers in the cluster, and if so, the quorum-based server power-down mechanism determines whether a manager of the cluster failed when an indication of a server failure is received, and if so, the quorum-based server power-down mechanism issues at least one command to power down all unresponsive servers in the cluster, wherein an unresponsive server is a server that fails to send a periodic message that indicates the server is functioning properly, and if the manager of the cluster did not fail, the quorum-based server power-down mechanism issues at least one command to power down a server corresponding to the received indication of server failure.
17. A computer readable recordable medium bearing a computer program, the computer program comprising:
a quorum-based server power-down mechanism that determines whether a server is part of a group of servers that include a majority of servers in a cluster, and if so, the quorum-based server power-down mechanism determines whether a manager of the cluster failed when an indication of a server failure is received, and if so, the quorum-based server power-down mechanism issues at least one command to power down all unresponsive servers in the cluster, wherein an unresponsive server is a server that fails to send a periodic message that indicates the server is functioning properly, and if the manager of the cluster did not fail, the quorum-based server power-down mechanism issues at least one command to power down a server corresponding to the received indication of server failure.
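Each independent claim defines an unresponsive server as one that fails to send a periodic message indicating that it is functioning properly. One common way to approximate that test is a heartbeat timeout, sketched below. The claims do not specify a message period or a timeout, so the `timeout` parameter, the timestamp map, and the function name are assumptions made for illustration.

```python
from typing import Dict, Set


def unresponsive_servers(last_heartbeat: Dict[str, float],
                         now: float, timeout: float) -> Set[str]:
    """Servers whose most recent periodic message is older than `timeout` seconds.

    `last_heartbeat` maps a server name to the time its last message arrived
    (hypothetical bookkeeping; the source does not describe this structure).
    """
    return {name for name, seen in last_heartbeat.items() if now - seen > timeout}


# Usage: server "c" was last heard from 21 seconds ago, beyond a 10-second timeout.
last_seen = {"a": 100.0, "b": 95.0, "c": 80.0}
late = unresponsive_servers(last_seen, now=101.0, timeout=10.0)
```

A real cluster would also have to choose the timeout relative to the message period, so that a single delayed message does not trigger a power-down; the source gives no figures for either value.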
Specification