Synchronous local and cross-site failover in clustered storage systems
Abstract
Synchronous local and cross-site switchover and switchback operations of a node in a disaster recovery (DR) group are described. In one embodiment, during switchover, a takeover node receives a failover request and responsively identifies a first partner node in a first cluster and a second partner node in a second cluster. The first partner node and the takeover node form a first high-availability (HA) group and the second partner node and a third partner node in the second cluster form a second HA group. The first and second HA groups form the DR group and share a storage fabric. The takeover node synchronously restores client access requests associated with a failed partner node at the takeover node.
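The partner identification step described above can be illustrated with a minimal sketch. It assumes a four-node DR group in which each node records one local HA partner and one remote DR partner. The `Node` class, the `identify_partners` function, and the node and cluster names are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    cluster: str
    ha_partner: str  # hypothetical field: local partner in the same HA group
    dr_partner: str  # hypothetical field: partner in the other cluster


def identify_partners(takeover_name, nodes):
    """Return the names of the first, second, and third partner nodes.

    first  = the takeover node's HA partner in its own cluster
    second = the takeover node's DR partner in the other cluster
    third  = the HA partner of that DR partner (same remote HA group)
    """
    takeover = nodes[takeover_name]
    first = nodes[takeover.ha_partner]
    second = nodes[takeover.dr_partner]
    third = nodes[second.ha_partner]
    return first.name, second.name, third.name


# Assumed topology: two clusters, one HA pair in each, forming one DR group.
nodes = {
    n.name: n
    for n in [
        Node("A1", "cluster_a", ha_partner="A2", dr_partner="B1"),
        Node("A2", "cluster_a", ha_partner="A1", dr_partner="B2"),
        Node("B1", "cluster_b", ha_partner="B2", dr_partner="A1"),
        Node("B2", "cluster_b", ha_partner="B1", dr_partner="A2"),
    ]
}

partners = identify_partners("A1", nodes)
```

Here `A1` plays the takeover node, so `partners` names its HA partner, its DR partner, and the DR partner's own HA partner, in that order.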
30 Claims
1. A method comprising:
receiving, by a takeover node in a first cluster at a first site of a cross-site clustered storage system, a failover request;
processing, by the takeover node, the failover request to identify a first partner node in the first cluster and a second partner node in a second cluster at a second site, the first partner node and the takeover node forming a first high-availability (HA) group, the second partner node and a third partner node in the second cluster forming a second HA group, the first HA group and the second HA group forming a disaster recovery (DR) group and sharing a storage fabric with each other; and
resuming, by the takeover node, client access requests associated with a failed partner node synchronously at the takeover node.
(Dependent claims: 2-15)
16. A storage node for use in a first cluster of a clustered storage system, the storage node comprising:
an interface configured to receive a cluster switchover request to failover from a second cluster located at a second site of the clustered storage system to the first cluster located at a first site;
a node management module configured to process the cluster switchover request to identify a first partner node in the first cluster and a second partner node in the second cluster, and assign ownership of a first storage container in a shared storage fabric from the second partner node to the storage node in response to the cluster switchover request, wherein the first storage container is located at the first site, the first partner node and the storage node form a first high-availability (HA) group, the second partner node and a third partner node in the second cluster form a second HA group, and the first HA group and the second HA group forming a disaster recovery (DR) group and share the storage fabric with each other; and
a cache system configured to store cache data associated with the storage node, replicated cache data associated with the first partner node, and replicated cache data associated with the second partner node.
(Dependent claims: 17-24)
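The ownership reassignment recited in claim 16 can be sketched as a simple table update. This is an illustrative model only: the `switchover` function, the container names, and the dictionary-based ownership table are assumptions, not the patented implementation.

```python
def switchover(ownership, failed_nodes, dr_partner_of):
    """Return a new ownership table after a cluster switchover.

    Every storage container owned by a node in the failed cluster is
    assigned to that node's DR partner in the surviving cluster.
    Containers owned by surviving nodes are left unchanged.
    """
    updated = dict(ownership)
    for container, owner in ownership.items():
        if owner in failed_nodes:
            updated[container] = dr_partner_of[owner]
    return updated


# Hypothetical state before the switchover: container -> owning node.
ownership = {"container_b1": "B1", "container_b2": "B2", "container_a1": "A1"}

# Assumed DR pairing between the failed cluster (B) and the surviving one (A).
dr_partner_of = {"B1": "A1", "B2": "A2"}

after = switchover(ownership, failed_nodes={"B1", "B2"}, dr_partner_of=dr_partner_of)
```

The sketch returns a new table instead of mutating the old one, so the pre-switchover state stays available for a later switchback.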
25. A method comprising:
receiving, by a source cluster, a client write request including write data and an indication of a logical container indicating a location to which to write the write data;
identifying, by the source cluster, a source node in the source cluster, the source node associated with the logical container and including a source cache system;
identifying, by the source cluster, a first partner node in the source cluster and a second partner node in a destination cluster, the first partner node having a first cache system and the second partner node having a second cache system; and
concurrently writing, by the source cluster, the client write data to the source cache system, the first cache system, and the second cache system.
(Dependent claims: 26-29)
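The concurrent write in claim 25 can be approximated with a thread pool that issues the same write to all three cache systems and returns only once every write has completed. The `CacheSystem` class and the `replicate_write` function are hypothetical stand-ins, and an in-memory dictionary takes the place of a real cache such as NVRAM.

```python
from concurrent.futures import ThreadPoolExecutor


class CacheSystem:
    """Hypothetical in-memory stand-in for a node's cache system."""

    def __init__(self, name):
        self.name = name
        self.entries = {}

    def write(self, container, data):
        self.entries[container] = data
        return self.name  # acknowledgement


def replicate_write(container, data, caches):
    """Write to every cache concurrently and wait for all acknowledgements."""
    with ThreadPoolExecutor(max_workers=len(caches)) as pool:
        futures = [pool.submit(cache.write, container, data) for cache in caches]
        # result() re-raises any exception, so a failed replica fails the write.
        return [future.result() for future in futures]


source = CacheSystem("source")
ha_partner = CacheSystem("first_partner")   # same cluster as the source node
dr_partner = CacheSystem("second_partner")  # destination cluster

acks = replicate_write("vol1", b"payload", [source, ha_partner, dr_partner])
```

Waiting for every acknowledgement before returning is what would make the replication synchronous in this sketch: the client write is not considered complete until all three caches hold the data.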
30. A cluster for use in a clustered storage system, the cluster comprising:
a cluster management module configured to maintain configuration data associated with the cluster and replicated configuration data associated with a second cluster, wherein the configuration data associated with the cluster is replicated to the second cluster; and
a first high-availability (HA) group including a first node having a first cache system and a second node having a second cache system, the first HA group in communication with a second HA group located in a second cluster, the first HA group and the second HA group forming a disaster recovery (DR) group that shares a common storage fabric, the second HA group including a third node having a third cache system and a fourth node having a fourth cache system, wherein the cluster is configured to synchronously replicate storage data directed to the first node to the second node and the third node and synchronously replicate storage data directed to the second node to the first node and the fourth node.
Specification