Synchronous local and cross-site failover in clustered storage systems

US 8,904,231 B2
Filed: 08/08/2012
Issued: 12/02/2014
Est. Priority Date: 08/08/2012
Status: Active Grant

Abstract

Synchronous local and cross-site switchover and switchback operations of a node in a disaster recovery (DR) group are described. In one embodiment, during switchover, a takeover node receives a failover request and responsively identifies a first partner node in a first cluster and a second partner node in a second cluster. The first partner node and the takeover node form a first high-availability (HA) group and the second partner node and a third partner node in the second cluster form a second HA group. The first and second HA groups form the DR group and share a storage fabric. The takeover node synchronously restores client access requests associated with a failed partner node at the takeover node.
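
The topology in the abstract can be sketched as a small data model. This is an illustrative sketch only, not the patented implementation: the class names, the `identify_partners` helper, and the rule that DR partners are paired by position within their HA groups are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    cluster: str


@dataclass(frozen=True)
class HAGroup:
    first: Node
    second: Node

    def partner_of(self, node: Node) -> Node:
        """Return the other member of this two-node HA group."""
        if node == self.first:
            return self.second
        if node == self.second:
            return self.first
        raise ValueError(f"{node.name} is not in this HA group")


@dataclass(frozen=True)
class DRGroup:
    local: HAGroup   # HA group in the first cluster (first site)
    remote: HAGroup  # HA group in the second cluster (second site)


def identify_partners(dr: DRGroup, takeover: Node):
    """Return (first_partner, second_partner, third_partner) for a takeover node."""
    # First partner: the HA partner in the same cluster.
    first_partner = dr.local.partner_of(takeover)
    # Assumption: the cross-site partner sits at the same position
    # in the remote HA group as the takeover node does locally.
    if takeover == dr.local.first:
        second_partner = dr.remote.first
    else:
        second_partner = dr.remote.second
    # Third partner: the HA partner of the second partner at the remote site.
    third_partner = dr.remote.partner_of(second_partner)
    return first_partner, second_partner, third_partner
```

For a DR group built from nodes A1 and A2 at the first site and B1 and B2 at the second, `identify_partners(dr, a1)` would return A2, B1, and B2 under the positional-pairing assumption.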

257 Citations

30 Claims

1. A method comprising:
- receiving, by a takeover node in a first cluster at a first site of a cross-site clustered storage system, a failover request;
  
  processing, by the takeover node, the failover request to identify a first partner node in the first cluster and a second partner node in a second cluster at a second site, the first partner node and the takeover node forming a first high-availability (HA) group, the second partner node and a third partner node in the second cluster forming a second HA group, the first HA group and the second HA group forming a disaster recovery (DR) group and sharing a storage fabric with each other; and
  
  resuming, by the takeover node, client access requests associated with a failed partner node synchronously at the takeover node.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
- - 2. The method of claim 1, further comprising:
    - synchronously replicating, by the takeover node, cache data associated with the takeover node to the first partner node at the first site and the second partner node at the second site during non-failover conditions, wherein the first site and the second site are geographically remote with respect to each other.
  - 3. The method of claim 2, wherein synchronously replicating the cache data comprises synchronously replicating, by the takeover node, the cache data associated with the takeover node to the first partner node using direct memory access (DMA) operations and to the second partner node using remote direct memory access (RDMA) operations.
  - 4. The method of claim 1, further comprising:
    - assigning, by the takeover node, ownership of a first storage container in the shared storage fabric from the failed partner node to the takeover node, the first storage container geographically co-located at the first site with the takeover node.
  - 5. The method of claim 4, further comprising:
    - prior to said restoring, writing, by the takeover node, cache data associated with the failed partner node from a takeover cache system in the takeover node to the first storage container, wherein the cache data associated with the failed partner node is synchronously replicated from the failed partner node to the takeover node during normal non-failover conditions.
  - 6. The method of claim 5, wherein the failover request comprises a cluster switchover request indicating a cluster failure at the second cluster and that the failed partner node comprises the second partner node.
  - 7. The method of claim 6, wherein cache data associated with the second partner node synchronously replicates from the second partner node to the takeover node and the third partner node during non-failover conditions.
  - 8. The method of claim 6, further comprising:
    - prior to said restoring, activating replicated configuration information associated with the second cluster at the first cluster, wherein the replicated configuration information associated with the second cluster synchronously replicates from the second cluster to the first cluster during non-failover conditions.
  - 9. The method of claim 8, wherein said activating the replicated configuration information further comprises:
    - activating one or more virtual volumes associated with the second cluster at the first cluster; and
      
      activating one or more virtual servers associated with the second cluster at the first cluster.
  - 10. The method of claim 8, further comprising:
    - responsive to receiving a switchback request, deactivating replicated configuration information associated with the second cluster at the first cluster.
  - 11. The method of claim 6, further comprising notifying, by the takeover node, the first partner node of the switchover request.
  - 12. The method of claim 6, further comprising notifying, by the takeover node, non-HA partner nodes in the first cluster of the switchover request.
  - 13. The method of claim 1, wherein the failover request comprises a local failover message and the failed node comprises the first partner node.
  - 14. The method of claim 13, further comprising:
    - assigning, by the takeover node, ownership of a second storage container in the shared storage fabric from the first partner node to the takeover node, wherein the second storage container is a mirror copy of a first storage container located at the second site.
  - 15. The method of claim 13, wherein cache data associated with the first partner node is synchronously replicated from the first partner node to the takeover node and the third partner node during non-failover conditions.
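
The ordering recited in claims 1, 4, 5, 6, and 13 (pick the failed partner from the request type, take ownership of its storage containers, write its replicated cache data to storage, then resume client access) might be sketched as follows. This is a minimal sketch under stated assumptions: `TakeoverNode`, `handle_failover`, the request-type strings, and the dictionary-based ownership and container models are hypothetical names and structures chosen for illustration, not anything defined by the patent.

```python
from dataclasses import dataclass, field


@dataclass
class TakeoverNode:
    name: str
    first_partner: str    # HA partner in the same cluster
    second_partner: str   # DR partner in the remote cluster
    # Replicated cache: failed-node name -> list of (container, data) writes.
    replicated_cache: dict = field(default_factory=dict)
    # Names of failed partners whose client requests this node now serves.
    serving: set = field(default_factory=set)


def handle_failover(node, kind, ownership, containers):
    """Process a failover request and return the name of the failed partner.

    ownership:  container name -> owning node name (mutated in place)
    containers: container name -> list of data written (mutated in place)
    """
    # Identify the failed partner from the request type (claims 6 and 13).
    if kind == "local_failover":
        failed = node.first_partner
    elif kind == "cluster_switchover":
        failed = node.second_partner
    else:
        raise ValueError(f"unknown failover request: {kind}")

    # Assign ownership of the failed partner's containers (claim 4).
    for container in list(ownership):
        if ownership[container] == failed:
            ownership[container] = node.name

    # Write the failed partner's replicated cache data to storage
    # before resuming service (claim 5).
    for container, data in node.replicated_cache.pop(failed, []):
        containers.setdefault(container, []).append(data)

    # Resume client access for the failed partner at this node (claim 1).
    node.serving.add(failed)
    return failed
```

The sketch omits site placement of containers and the configuration activation of claims 8 to 10; it only shows the ordering of ownership transfer, cache write-out, and resumption.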

16. A storage node for use in a first cluster of a clustered storage system, the storage node comprising:
- an interface configured to receive a cluster switchover request to failover from a second cluster located at a second site of the clustered storage system to the first cluster located at a first site;
  
  a node management module configured to process the cluster switchover request to identify a first partner node in the first cluster and a second partner node in the second cluster, and assign ownership of a first storage container in a shared storage fabric from the second partner node to the storage node in response to the cluster switchover request, wherein the first storage container is located at the first site, the first partner node and the storage node form a first high-availability (HA) group, the second partner node and a third partner node in the second cluster form a second HA group, and the first HA group and the second HA group form a disaster recovery (DR) group and share the storage fabric with each other; and
  
  a cache system configured to store cache data associated with the storage node, replicated cache data associated with the first partner node, and replicated cache data associated with the second partner node.
- Dependent Claims (17, 18, 19, 20, 21, 22, 23, 24)
- - 17. The storage node of claim 16, the cache system further configured to write the replicated cache data associated with the second partner node to the first storage container after taking ownership of the first container.
  - 18. The storage node of claim 16, the cache system further configured to synchronously replicate the cache data associated with the storage node to the first partner node and to the second partner node during non-failover conditions.
  - 19. The storage node of claim 16, the node management module further configured to activate replicated configuration information associated with the second cluster at the first cluster, wherein the replicated configuration information associated with the second cluster is synchronously replicated from the second cluster to the first cluster during non-failover conditions.
  - 20. The storage node of claim 19, wherein to activate replicated configuration information associated with the second cluster at the first cluster, the node management module activates one or more virtual volumes associated with the second cluster and one or more virtual servers associated with the second cluster at the first cluster.
  - 21. The storage node of claim 19, the node management module further configured to resume client access requests associated with the second partner node after the replicated configuration information associated with the second cluster is activated at the second cluster.
  - 22. The storage node of claim 16, wherein the first site and the second site are geographically remote with respect to each other.
  - 23. The storage node of claim 22, wherein the cache system comprises non-volatile random access memory.
  - 24. The storage node of claim 23, wherein the non-volatile random access memories are battery backed.

25. A method comprising:
- receiving, by a source cluster, a client write request including write data and an indication of a logical container indicating a location to which to write the write data;
  
  identifying, by the source cluster, a source node in the source cluster, the source node associated with the logical container and including a source cache system;
  
  identifying, by the source cluster, a first partner node in the source cluster and a second partner node in a destination cluster, the first partner node having a first cache system and the second partner node having a second cache system; and
  
  concurrently writing, by the source cluster, the client write data to the source cache system, the first cache system, and the second cache system.
- Dependent Claims (26, 27, 28, 29)
- - 26. The method of claim 25, wherein the logical container comprises a virtual volume.
  - 27. The method of claim 25, further comprising periodically writing data in the source cache system to storage containers assigned to the source node in a storage fabric shared by the source cluster and the destination cluster.
  - 28. The method of claim 25, further comprising:
    - identifying, by the source cluster, a switchback notification indicating a switchback from the destination cluster to the source cluster, the switchback occurring responsive to a healing of the source cluster after a cluster failover event;
      
      identifying, by the source cluster, storage containers in a shared storage fabric owned by the source node prior to the cluster failover event; and
      
      assigning, by the source cluster, ownership of the storage containers to the source node.
  - 29. The method of claim 25 further comprising:
    - activating, by the source cluster, one or more virtual volumes associated with the source cluster at the source cluster; and
      
      activating, by the source cluster, one or more virtual servers associated with the source cluster at the source cluster.
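
The concurrent write of claim 25 could look roughly like the sketch below, which submits the same write to three cache systems in parallel and returns only once all three have completed. The `CacheSystem` class, the `replicate_write` function, and the use of a thread pool are assumptions made for illustration; the claim does not specify a mechanism, and a real system would use a transport such as the DMA and RDMA operations mentioned in claim 3.

```python
from concurrent.futures import ThreadPoolExecutor


class CacheSystem:
    """Hypothetical in-memory stand-in for a node's cache system."""

    def __init__(self, name):
        self.name = name
        self.entries = []

    def write(self, container, data):
        self.entries.append((container, data))
        return self.name


def replicate_write(container, data, source, first_partner, second_partner):
    """Write to the source cache and both partner caches concurrently.

    Returns the names of the caches that completed the write, in
    submission order. The call blocks until every write has finished,
    so the client is acknowledged only after all three copies exist.
    """
    caches = (source, first_partner, second_partner)
    with ThreadPoolExecutor(max_workers=len(caches)) as pool:
        futures = [pool.submit(c.write, container, data) for c in caches]
        # result() re-raises any exception from a failed write.
        return [f.result() for f in futures]
```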

30. A cluster for use in a clustered storage system, the cluster comprising:
- a cluster management module configured to maintain configuration data associated with the cluster and replicated configuration data associated with a second cluster, wherein the configuration data associated with the cluster is replicated to the second cluster; and
  
  a first high-availability (HA) group including a first node having a first cache system and a second node having a second cache system, the first HA group in communication with a second HA group located in a second cluster, the first HA group and the second HA group forming a disaster recovery (DR) group that shares a common storage fabric, the second HA group including a third node having a third cache system and a fourth node having a fourth cache system, wherein the cluster is configured to synchronously replicate storage data directed to the first node to the second node and the third node and synchronously replicate storage data directed to the second node to the first node and the fourth node.
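
The replication fan-out recited in claim 30 (data for the first node goes to the second and third nodes; data for the second node goes to the first and fourth nodes) can be written as a small mapping. The function name and the tuple-based representation of the HA groups are assumptions made for this sketch.

```python
def replication_targets(local_ha, remote_ha):
    """Map each local node to its (HA partner, DR partner) replication targets.

    local_ha and remote_ha are 2-tuples of node names; the DR partner is
    taken to be the remote node at the same position, as in claim 30.
    """
    targets = {}
    for i, node in enumerate(local_ha):
        targets[node] = (local_ha[1 - i], remote_ha[i])
    return targets
```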

Current Assignee: NetApp, Inc.
Original Assignee: NetApp, Inc.
Inventors: Coatney, Susan; Reddy, Sreelatha S.; Bolt, Thomas B.; Lambert, Laurent; Ramasubramaniam, Vaiapuri; Patel, Chaitanya; Keremane, Hrishikesh; Kadayam, Harihara
Primary Examiner(s): KO, CHAE M
Application Number: US13/569,874
Publication Number: US 20140047263A1
Time in Patent Office: 846 Days
Field of Search: 714/6.3, 714/4.11
US Class Current: 714/6.3
CPC Class Codes:
- G06F 11/20   using active fault-masking,...
- G06F 11/2023   Failover techniques
- G06F 11/2058   using more than 2 mirrored ...
- G06F 11/2092   Techniques of failing over ...
- G06F 11/2097   maintaining the standby con...
- G06F 12/0802   Addressing of a memory leve...
- G06F 12/0866   for peripheral storage syst...
- G06F 2212/286   Mirrored cache memory
