Automated failover in a cluster of geographically dispersed server nodes using data replication over a long distance communication link
Abstract
An embodiment of the invention is a method for performing an automated failover from a remote server node to a local server node, the remote server node and the local server node being in a cluster of geographically dispersed server nodes. The local server node is selected to be recipient of a failover from a remote server node by a cluster service software. The local server node is coupled to a local storage system and a local replication module external to the local storage system. The remote server node is coupled to a remote storage system and a remote replication module external to the remote storage system. The local and remote replication modules are in long distance communication with each other to perform data replication between the local and remote storage systems. A controlling cluster resource is brought online at the local server node, the controlling cluster resource being a base dependency of dependent cluster resources in a cluster group. The state of the controlling cluster resource is set to online pending to delay the dependent cluster resources in the cluster group from going online at the local server node. Configuration information of the controlling cluster resource is then verified.
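The sequence in the abstract, together with the remaining steps recited in claim 1, can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the patented implementation: the names `ControllingResource`, `FakeReplicationModule`, `initiate_failover`, `failover_complete`, the configuration keys, the bounded polling loop, and the `FAILED` state on error are all hypothetical and do not come from the source.

```python
import socket
from enum import Enum


class State(Enum):
    OFFLINE = "offline"
    ONLINE_PENDING = "online pending"
    ONLINE = "online"
    FAILED = "failed"  # assumed error state; the source does not name one


class FakeReplicationModule:
    """Hypothetical stand-in for the replication module external to storage."""

    def __init__(self, polls_until_done=2):
        self._remaining = polls_until_done
        self.commands = []

    def initiate_failover(self, node_name):
        # "First command": ask the local module to start failover of data.
        self.commands.append(("initiate", node_name))

    def failover_complete(self):
        # "Second command": ask whether failover of data has finished.
        self.commands.append(("check",))
        self._remaining -= 1
        return self._remaining <= 0


class ControllingResource:
    # Assumed configuration fields, chosen only for illustration.
    REQUIRED_KEYS = ("replication_address", "consistency_group")

    def __init__(self, config, replication, node_name=None, max_polls=10):
        self.config = config
        self.replication = replication
        self.node_name = node_name
        self.max_polls = max_polls
        self.state = State.OFFLINE
        self.history = []

    def _set_state(self, state):
        self.state = state
        self.history.append(state)

    def bring_online(self):
        # Online pending holds back the dependent resources in the group.
        self._set_state(State.ONLINE_PENDING)

        # Verify configuration information before doing anything else.
        if not all(self.config.get(key) for key in self.REQUIRED_KEYS):
            self._set_state(State.FAILED)
            return self.state

        # Determine the name of the node receiving the failover.
        node_name = self.node_name or socket.gethostname()

        # First command: initiate failover of data.
        self.replication.initiate_failover(node_name)

        # Second command: check for completion (bounded polling is assumed).
        for _ in range(self.max_polls):
            if self.replication.failover_complete():
                self._set_state(State.ONLINE)
                return self.state

        self._set_state(State.FAILED)
        return self.state
```

For example, `ControllingResource({"replication_address": "10.0.0.1", "consistency_group": "cg1"}, FakeReplicationModule(), node_name="node-a").bring_online()` passes through online pending before reaching online, while an empty configuration ends in the assumed failed state without any command being sent.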
20 Claims
1. A method comprising:
selecting a first server node to be recipient of a failover from a second server node using a cluster service software, the first and second server nodes being programmatically connected by the cluster service software, the first server node being coupled to a first storage system and a first replication module external to the first storage system, the second server node being coupled to a second storage system and a second replication module external to the second storage system, the first and second replication modules being in communication with each other via a long distance communication link to perform data replication between the first and second storage systems;
bringing a controlling cluster resource online at the first server node, the controlling cluster resource being a base dependency of dependent cluster resources in a cluster group;
setting the state of the controlling cluster resource to online pending;
verifying configuration information of the controlling cluster resource;
if the configuration information is correct, determining the name of the first server node;
sending a first command from the controlling cluster resource to the first replication module to initiate failover of data; and
sending a second command from the controlling cluster resource to the local replication module to check for completion of failover of data.
9. A system comprising:
a first server node including a controlling cluster resource and dependent cluster resources in a cluster group, the controlling cluster resource being a base dependency of the dependent cluster resources in the cluster group;
a first storage system coupled to the first server node;
a first replication module coupled to the first server node, the first replication module being external to the first storage system;
a second server node including a copy of the controlling cluster resource and copies of the cluster resources in the cluster group;
a second storage system coupled to the second server node;
a second replication module coupled to the second server node, the second replication module being external to the second storage system;
wherein the first and second server nodes are programmatically connected by a cluster service software, and the first server node is selected by the cluster service software to be recipient of a failover from the second server node, the first and second replication modules are in communication with each other via a long distance communication link to perform data replication between the first and second storage systems, and wherein the controlling cluster resource controls the failover.
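The role of the controlling cluster resource as a base dependency can be illustrated with a small sketch. It is an assumption-laden illustration, not the claimed system: the `Resource` class, the `ready_to_start` helper, the string state names, and the resource names `failover_control`, `disk`, and `database` are hypothetical. It shows only that dependent resources stay offline while the resource they depend on is still online pending.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str
    depends_on: list = field(default_factory=list)
    state: str = "offline"


def ready_to_start(group):
    """Return names of offline resources whose dependencies are all online."""
    by_name = {r.name: r for r in group}
    return [
        r.name
        for r in group
        if r.state == "offline"
        and all(by_name[dep].state == "online" for dep in r.depends_on)
    ]


# Hypothetical cluster group: everything ultimately depends on the
# controlling resource, which is still waiting on the data failover.
control = Resource("failover_control", state="online pending")
disk = Resource("disk", depends_on=["failover_control"])
database = Resource("database", depends_on=["disk"])
group = [control, disk, database]

blocked = ready_to_start(group)      # nothing may start yet

control.state = "online"             # data failover has completed
after_control = ready_to_start(group)

disk.state = "online"
after_disk = ready_to_start(group)
```

Here `blocked` is empty, `after_control` contains only `disk`, and `after_disk` contains only `database`, so the dependents come online in order only after the controlling resource leaves the pending state.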
17. An article of manufacture comprising:
a machine-accessible medium including data that, when accessed by a machine, cause the machine to perform operations comprising:
selecting a first server node to be recipient of a failover from a second server node, the first and second server nodes being programmatically connected by a cluster service software, the first server node being coupled to a first storage system and a first replication module external to the first storage system, the second server node being coupled to a second storage system and a second replication module external to the second storage system, the first and second replication modules being in communication with each other via a long distance communication link to perform data replication between the first and second storage systems;
bringing a controlling cluster resource online at the first server node, the controlling cluster resource being a base dependency of dependent cluster resources in a cluster group;
setting the state of the controlling cluster resource to online pending;
verifying configuration information of the controlling cluster resource;
if the configuration information is correct, determining the name of the first server node;
sending a first command from the controlling cluster resource to the first replication module to initiate failover of data; and
sending a second command from the controlling cluster resource to the local replication module to check for completion of failover of data.
Specification