System and method for splitting a cluster for disaster recovery
Abstract
The present invention provides a system and method for disaster recovery split of a node from a cluster to enable cluster management operations using quorum-based data replication services to continue. A split command is executed on a selected node and a new site list data structure describing the cluster is generated. The site list data structure marks all nodes other than the selected node as ineligible, thereby placing the selected node in quorum.
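The effect of the site list described in the abstract can be illustrated with a minimal sketch. It assumes that quorum is evaluated only over nodes marked eligible, which the abstract implies but does not spell out. The names `SiteEntry`, `in_quorum`, and `split` are hypothetical and chosen for illustration; they are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class SiteEntry:
    """One node in a hypothetical site list."""
    name: str
    healthy: bool
    eligible: bool = True  # whether the node counts toward quorum


def in_quorum(site_list):
    """Assumed rule: a strict majority of the eligible nodes must be healthy."""
    eligible = [s for s in site_list if s.eligible]
    healthy = [s for s in eligible if s.healthy]
    return len(eligible) > 0 and 2 * len(healthy) > len(eligible)


def split(site_list, selected):
    """Build a new site list in which every node except `selected` is ineligible."""
    if not any(s.name == selected and s.healthy for s in site_list):
        raise ValueError("selected node must be a healthy member of the cluster")
    return [
        SiteEntry(s.name, s.healthy, eligible=(s.name == selected))
        for s in site_list
    ]


# Three-node cluster in which only node "a" survived the disaster.
sites = [SiteEntry("a", True), SiteEntry("b", False), SiteEntry("c", False)]
before = in_quorum(sites)        # one healthy node out of three eligible
new_sites = split(sites, "a")
after = in_quorum(new_sites)     # one healthy node out of one eligible
```

Under this assumed rule the surviving node cannot reach quorum before the split, and reaches it trivially afterward because it is the only eligible node left.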
38 Claims
1. A method for operating a cluster of servers, comprising:
detecting that an error condition has occurred in a plurality of servers, the plurality of servers making up the cluster, the error condition initiated in response to the cluster's inability to meet a conventional quorum requirement and preventing any write operations from occurring on any one server of the plurality of servers and wherein the conventional quorum state requires a majority of the nodes to be healthy to have quorum;
determining that a selected server of the plurality of servers is functioning correctly;
executing a split command by a user to override the quorum requirement by designating the selected server as a full read/write replica of the cluster and forming a cluster of one server from the selected server, the cluster now having a stand alone server and a cluster configuration of one to one thereby allowing the server to be modified without use of a voting system; and
assigning a workload of the cluster to the cluster of one server.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14
13. A cluster of servers, comprising:
a plurality of servers that develops an error condition, the plurality of servers making up the cluster, the error condition is detected, the error condition initiated in response to the cluster's inability to meet a conventional quorum requirement and preventing any write operations from occurring on any one server of the plurality of servers and wherein the conventional quorum state requires a majority of the nodes to be healthy to have quorum;
a server which is determined to be functioning correctly is selected, hereinafter the selected server;
a split command that is executed by a user to override the quorum requirement by designating the selected server as a full read/write replica of the cluster and forming a cluster of one server from the selected server, the cluster now having a stand alone server and a cluster configuration of one to one thereby allowing the server to be modified without use of a voting system; and
a workload of the cluster assigned to the cluster of one server.
Dependent claims: 15, 16, 17, 18, 19, 20, 21, 22, 23, 24
25. A computer readable media, comprising:
said computer readable media containing instructions for execution on a processor for a method of operating a cluster of servers, the method having the steps of:
detecting that an error condition has occurred in a plurality of servers, the plurality of servers making up the cluster, the error condition initiated in response to the cluster's inability to meet a conventional quorum requirement and preventing any write operations from occurring on any one server of the plurality of servers and wherein the conventional quorum state requires a majority of the nodes to be healthy to have quorum;
determining that a selected server of the plurality of servers is functioning correctly;
executing a split command to override the quorum requirement by designating the selected server as a full read/write replica of the cluster and forming a cluster of one server from the selected server computer, the cluster now having a stand alone server and a cluster configuration of one to one thereby allowing the server to be modified without use of a voting system; and
assigning a workload of the cluster to the cluster of one server.
26. A method for operating a cluster of servers, comprising:
detecting that an error condition has occurred in a plurality of servers, the plurality of servers making up the cluster;
determining that a failure of the cluster resulted in the impossibility of a conventional quorum being achieved preventing a write operation from occurring on any one server of the plurality of servers and wherein the conventional quorum state requires a majority of the nodes to be healthy to have quorum;
executing a split command by a user to override the quorum requirement by designating a selected server as a full read/write replica of the cluster and forming a cluster of one server from the selected server, the cluster now having a stand alone server and a cluster configuration of one to one thereby allowing the server to be modified without use of a voting system; and
assigning a workload of the cluster to the cluster of one server.
Dependent claims: 27, 28, 29, 30, 31, 32, 33, 34, 35, 36
37. A cluster of servers, comprising:
a plurality of servers that develops an error condition, the plurality of servers making up the cluster;
a server which determined that a failure of the cluster resulted in the impossibility of a conventional quorum being achieved preventing a write operation from occurring on any one server of the plurality of servers and wherein the conventional quorum state requires a majority of the nodes to be healthy to have quorum;
a split command that is executed by a user to override the quorum requirement by designating a selected server as a full read/write replica of the cluster and forming a cluster of one server from the selected server, the cluster now having a stand alone server and a cluster configuration of one to one thereby allowing the server to be modified without use of a voting system; and
a workload of the cluster assigned to the cluster of one server.
38. A computer readable media, comprising:
said computer readable media containing instructions for execution on a processor for a method of operating a cluster of servers, the method having the steps of:
detecting that an error condition has occurred in a plurality of servers, the plurality of servers making up the cluster;
determining that a failure of the cluster resulted in the impossibility of a conventional quorum being achieved preventing a write operation from occurring on any one server of the plurality of servers and wherein the conventional quorum state requires a majority of the nodes to be healthy to have quorum;
executing a split command by a user to override the quorum requirement by designating a selected server as a full read/write replica of the cluster and forming a cluster of one server from the selected server, the cluster now having a stand alone server and a cluster configuration of one to one thereby allowing the server to be modified without use of a voting system; and
assigning a workload of the cluster to the cluster of one server.
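The sequence recited in the independent method claims (detect the loss of quorum, confirm a healthy server, split, reassign the workload) can be sketched roughly as follows. This is an illustrative model only, not the patented implementation: the `Cluster` class, the `split_cluster` function, and the rule that a split is refused while quorum still holds are all assumptions introduced here.

```python
class Cluster:
    """Toy cluster in which writes require a majority of servers to be healthy."""

    def __init__(self, servers, healthy):
        self.servers = list(servers)
        self.healthy = set(healthy)
        self.workload = None
        self.data = {}

    def has_quorum(self):
        # Conventional quorum: a strict majority of configured servers is healthy.
        live = [s for s in self.servers if s in self.healthy]
        return 2 * len(live) > len(self.servers)

    def write(self, key, value):
        # The error condition: without quorum, no server accepts writes.
        if not self.has_quorum():
            raise RuntimeError("no quorum: writes are blocked")
        self.data[key] = value


def split_cluster(cluster, selected):
    """User-invoked split: form a cluster of one from a healthy server."""
    if cluster.has_quorum():
        # Assumed safeguard, not stated in the claims.
        raise RuntimeError("cluster still has quorum; split is not needed")
    if selected not in cluster.healthy:
        raise ValueError("selected server is not functioning correctly")
    solo = Cluster([selected], [selected])  # configuration of one to one
    solo.data = dict(cluster.data)          # selected server becomes the full replica
    solo.workload = cluster.workload        # workload moves to the cluster of one
    return solo


# Three servers, two of which were lost in a disaster.
cluster = Cluster(["s1", "s2", "s3"], healthy=["s1"])
cluster.workload = "management-db"
try:
    cluster.write("k", 1)
    blocked = False
except RuntimeError:
    blocked = True

solo = split_cluster(cluster, "s1")
solo.write("k", 1)  # succeeds: a cluster of one is always in quorum
```

In this model the single-server cluster satisfies the majority rule by construction, so writes proceed without any voting among peers.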
Specification