Failover processing in a storage system

US 7,039,827 B2
Filed: 02/13/2002
Issued: 05/02/2006
Est. Priority Date: 02/13/2001
Status: Expired due to Term



Abstract

Failover processing in a storage server system uses policies to manage fault tolerance (FT) and high availability (HA) configurations. The approach encapsulates the knowledge of failover recovery between components within a storage server and between storage server systems. This knowledge includes which components participate in a Failover Set, how they are configured for failover, what the Fail-Stop policy is, and which steps to perform when “failing over” a component.
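
As an illustration only, the knowledge the abstract lists (participating components, failover configuration, Fail-Stop policy, and failover steps) could be grouped into a single record. The sketch below is not taken from the patent: the class name `FailoverSet`, its field names, and the `fail_over` method are all hypothetical, and the steps are modeled as plain callables that return a description of what they would do.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class FailoverSet:
    """Hypothetical record of the failover knowledge named in the abstract."""
    members: List[str]                 # components participating in the set
    configuration: Dict[str, str]      # how the members are configured for failover
    fail_stop_policy: str              # label for the Fail-Stop policy (assumed to be a string)
    failover_steps: List[Callable[[str], str]] = field(default_factory=list)

    def fail_over(self, component: str) -> List[str]:
        """Run every configured step, in order, for a failing component."""
        if component not in self.members:
            raise ValueError(f"{component} is not a member of this Failover Set")
        return [step(component) for step in self.failover_steps]
```

A caller would build one such record per Failover Set and invoke `fail_over` with the name of the failing component; the returned list records the steps in the order they ran.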

10 Claims

1. A method for supporting failover between networked storage systems, coupled between a first storage system and a second storage system and a set of one or more storage systems, comprising:
- providing a single homogeneous environment distributed across a plurality of processors, cards, and storage systems;
  
  identifying member candidates using a standard protocol;
  
  creating Failover Sets, each Failover Set comprising one or more of said member candidates;
  
  using a database to store and synchronize a configuration on all member candidates in a Failover Set;
  
  for each Failover Set, designating one of the member candidates as a Primary, designating one of the member candidates as a Secondary, and designating remaining member candidates as Alternates;
  
  performing startup processing of the member candidates; and
  
  providing policies for run-time member behavior including fault characterization and detection, health monitoring, compatibility requirements, corrective action during failover, member restart and re-integration, and member failure limit exceeded condition.
- Dependent claims:
  - 2. The method of claim 1 wherein said storage systems include a single chassis-based product.
  - 3. The method of claim 1 wherein said storage systems include a single stack-based product.
  - 4. The method of claim 1 wherein said storage systems include two or more chassis-based products.
  - 5. The method of claim 1 wherein said storage systems include two or more stack-based products.
  - 6. The method of claim 1 wherein redundant network links between said networked storage systems are employed by:
    - a Discovery Service to identify said member candidates and verify connectivity by confirming information exchanged in each network;
      
      an Arbitration Service to ensure that a member candidate's role is Primary, a member candidate's role is Secondary, and remaining member candidates' roles are Alternates, by supplying a member role in information exchanged in each network;
      
      a Boot Service to coordinate said member role during startup using the type of boot by exchanging said member role in each network; and
      
      a Policy Manager within a Failover Service to distinguish between a communications link failure between member candidates and a real member failure by sending a self-test using the redundant network to determine if said member candidate is functioning according to a specification.
  - 7. The method of claim 6 wherein said network links include different network protocols.
  - 8. The method of claim 6 wherein user configuration and management requests are load balanced across all of said member candidates.
  - 9. The method of claim 6 wherein multi-path programming for attached host and storage devices is load balanced across all of said member candidates and comprises:
    - a port failover policy which is used to intelligently match server storage requests to compatible storage devices comprising;
      
      an Active-Active policy where all paths to an exported virtual device can transfer commands and data simultaneously; and
      
      an Active-Passive policy where only one path to said exported virtual device can transfer commands and data at a time.
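
Two of the dependent claims describe decisions that are easy to state in code: claim 6 separates a communications link failure from a real member failure by running a self-test over the redundant network, and claim 9 contrasts Active-Active and Active-Passive path policies. The following is a minimal sketch under stated assumptions, not the patented implementation. The names `Fault`, `classify_fault`, `PathPolicy`, and `usable_paths` are hypothetical, the self-test is modeled as a callable returning a boolean, and treating a member that is unreachable on both links as failed is an assumption the claims do not spell out.

```python
from enum import Enum
from typing import Callable, List


class Fault(Enum):
    NONE = "none"
    LINK_FAILURE = "link failure"
    MEMBER_FAILURE = "member failure"


def classify_fault(primary_link_ok: bool,
                   self_test_over_redundant_link: Callable[[], bool]) -> Fault:
    """Decide whether an unreachable peer lost a link or actually failed."""
    if primary_link_ok:
        return Fault.NONE
    # The peer is silent on the primary link, so ask it to run a self-test
    # over the redundant link. A passing self-test means only the link failed.
    try:
        passed = self_test_over_redundant_link()
    except ConnectionError:
        # Assumption: unreachable on both links counts as a member failure.
        passed = False
    return Fault.LINK_FAILURE if passed else Fault.MEMBER_FAILURE


class PathPolicy(Enum):
    ACTIVE_ACTIVE = "active-active"
    ACTIVE_PASSIVE = "active-passive"


def usable_paths(policy: PathPolicy, healthy_paths: List[str]) -> List[str]:
    """Return the paths allowed to carry commands and data right now."""
    if policy is PathPolicy.ACTIVE_ACTIVE:
        return list(healthy_paths)   # all paths may transfer simultaneously
    return healthy_paths[:1]         # only one path may transfer at a time
```

Under Active-Passive the sketch simply picks the first healthy path; a real system would need its own rule for choosing and switching the active path.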

10. A system for supporting failover between networked storage systems, coupled between a first storage system and a second storage system and a set of one or more storage systems, comprising:
- a Services Framework to provide a single homogeneous environment distributed across a plurality of processors, cards, and storage systems;
  
  a set of configuration and management software called Services that execute on top of the Services Framework comprising;
  
  a Discovery Service to identify member candidates using a standard protocol; and
  
  a Failover Service to organize the members into various compositions called Failover Sets, including Single, Hierarchical and N-way compositions;
  
  a database management system to store and synchronize the configuration on all members in the failover set;
  
  an Arbitration Service to determine that one member's role is Primary, one member's role is Secondary, and the remaining members' roles are Alternates;
  
  a Boot Service to coordinate the member role during startup using the type of boot; and
  
  a Policy Manager within the Failover Service to provide policies for run-time member behavior including fault characterization and detection, health monitoring, compatibility requirements, corrective action during failover, member restart and re-integration, and the member failure limit exceeded condition.
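
The role designation performed by the Arbitration Service (one Primary, one Secondary, remaining members as Alternates) can be sketched as follows. This is a hedged illustration, not the patent's arbitration algorithm: the function name `designate_roles` is hypothetical, and choosing roles by sorted member ID is an assumption made only so the example is deterministic, since the claims do not say how the choice is made.

```python
from typing import Dict, List


def designate_roles(member_ids: List[str]) -> Dict[str, str]:
    """Assign exactly one Primary, at most one Secondary, and Alternates."""
    roles: Dict[str, str] = {}
    # Assumption: ordering by member ID decides who gets which role.
    for index, member in enumerate(sorted(member_ids)):
        if index == 0:
            roles[member] = "Primary"
        elif index == 1:
            roles[member] = "Secondary"
        else:
            roles[member] = "Alternate"
    return roles
```

With a single-member Failover Set the sketch yields only a Primary, which is one possible reading of a set "comprising one or more" member candidates.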

Current Assignee: NetApp, Inc.
Original Assignee: Network Appliance Incorporated (NetApp, Inc.)
Inventors: Gusev, Andrey; Meyer, Richard; Ng, Chan; Gajjar, Kumar
Primary Examiner(s): LE, DIEU MINH T
Application Number: US10/076,906
Publication Number: US 20020188711A1
Time in Patent Office: 1,539 Days
Field of Search: 714/4, 714/5, 714/6, 714/7, 714/8, 714/13, 714/42, 714/43, 714/12
US Class Current: 714/4.11
CPC Class Codes:

G06F 11/0724: in a multiprocessor or a mu...
G06F 11/0727: in a storage system, e.g. i...
G06F 11/0766: Error or fault reporting or...
G06F 11/0793: Remedial or corrective acti...
G06F 11/1482: by means of middleware or O...
G06F 11/201: between storage system comp...
G06F 11/2025: using centralised failover ...
G06F 11/2089: Redundant storage control f...
G06F 11/2092: Techniques of failing over ...
G06F 11/2094: Redundant storage or storag...
G06F 11/2097: maintaining the standby con...
G06F 12/1458: by checking the subject acc...
G06F 2201/815: Virtual
G06F 3/0605: by facilitating the interac...
G06F 3/0607: by facilitating the process...
G06F 3/0614: Improving the reliability o...
G06F 3/0629: Configuration or reconfigur...
G06F 3/0632: by initialisation or re-ini...
G06F 3/0635: by changing the path, e.g. ...
G06F 3/0665: at area level, e.g. provisi...
G06F 3/067: Distributed or networked st...
H04L 49/357: Fibre channel switches
H04L 67/1097: for distributed storage of ...
