Failover processing in a storage system
First Claim
1. A method for supporting failover between networked storage systems, coupled between a first storage system and a second storage system and a set of one or more storage systems, comprising:
providing a single homogeneous environment distributed across a plurality of processors, cards, and storage systems;
identifying member candidates using a standard protocol;
creating Failover Sets, each Failover Set comprising one or more of said member candidates;
using a database to store and synchronize a configuration on all member candidates in a Failover Set;
for each Failover Set, designating one of the member candidates as a Primary, designating one of the member candidates as a Secondary, and designating remaining member candidates as Alternates;
performing startup processing of the member candidates; and
providing policies for run-time member behavior including fault characterization and detection, health monitoring, compatibility requirements, corrective action during failover, member restart and re-integration, and member failure limit exceeded condition.
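The role-designation step of the claim can be illustrated with a minimal sketch. The names `Role`, `FailoverSet`, and `designate_roles` are hypothetical, and ranking candidates by sorted member id is an assumption made only for illustration; the claim does not state how the Primary and Secondary are chosen.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    PRIMARY = "Primary"
    SECONDARY = "Secondary"
    ALTERNATE = "Alternate"


@dataclass
class FailoverSet:
    """One Failover Set: member candidates mapped to their designated roles."""
    name: str
    roles: dict = field(default_factory=dict)  # member id -> Role


def designate_roles(name, candidates):
    """Designate one Primary, one Secondary, and the rest as Alternates.

    Assumption: members are ranked by sorted id. The claim leaves the
    selection rule open, so this ordering is purely illustrative.
    """
    members = sorted(set(candidates))
    if not members:
        raise ValueError("a Failover Set needs at least one member candidate")
    failover_set = FailoverSet(name)
    for rank, member in enumerate(members):
        if rank == 0:
            failover_set.roles[member] = Role.PRIMARY
        elif rank == 1:
            failover_set.roles[member] = Role.SECONDARY
        else:
            failover_set.roles[member] = Role.ALTERNATE
    return failover_set


fs = designate_roles("fs-a", ["card-3", "card-1", "card-2", "card-4"])
for member, role in fs.roles.items():
    print(member, role.value)
```

A set with a single candidate ends up with a Primary only, which is consistent with the claim language that a Failover Set comprises "one or more" member candidates.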
Abstract
Failover processing in a storage server system utilizes policies for managing fault tolerance (FT) and high availability (HA) configurations. The approach encapsulates the knowledge of failover recovery between components within a storage server and between storage server systems. This knowledge includes which components participate in a Failover Set, how they are configured for failover, what the Fail-Stop policy is, and which steps to perform when “failing over” a component.
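As a rough illustration of the failover knowledge listed above, the sketch below keeps the participants, their roles, and a Fail-Stop flag in one record and derives the steps to perform when a member fails. The record layout, the name `fail_over`, the reading of Fail-Stop as "halt the faulted member", and the rule of promoting the first Alternate are all assumptions, not details taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FailoverSetRecord:
    """Hypothetical record of the failover knowledge for one Failover Set."""
    primary: str
    secondary: Optional[str]
    alternates: list = field(default_factory=list)
    fail_stop: bool = True                      # assumed: a faulted member is halted
    stopped: set = field(default_factory=set)   # members taken out of service


def _next_alternate(record):
    """Take the first Alternate, if any (assumed promotion order)."""
    return record.alternates.pop(0) if record.alternates else None


def fail_over(record, failed):
    """Apply the failover steps for one failed member and return them."""
    if failed == record.primary and record.secondary is None:
        raise RuntimeError("no Secondary available to take over")
    steps = []
    if record.fail_stop:
        record.stopped.add(failed)
        steps.append(f"stop {failed}")
    if failed == record.primary:
        # The Secondary takes over, and an Alternate backfills the Secondary role.
        record.primary = record.secondary
        steps.append(f"promote {record.primary} to Primary")
        record.secondary = _next_alternate(record)
        if record.secondary is not None:
            steps.append(f"promote {record.secondary} to Secondary")
    elif failed == record.secondary:
        record.secondary = _next_alternate(record)
        if record.secondary is not None:
            steps.append(f"promote {record.secondary} to Secondary")
    elif failed in record.alternates:
        record.alternates.remove(failed)
        steps.append(f"drop {failed} from Alternates")
    return steps


record = FailoverSetRecord("card-1", "card-2", ["card-3"])
steps = fail_over(record, "card-1")
print(steps)
```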
10 Claims
10. A system for supporting failover between networked storage systems, coupled between a first storage system and a second storage system and a set of one or more storage systems, comprising:
a Services Framework to provide a single homogeneous environment distributed across a plurality of processors, cards, and storage systems;
a set of configuration and management software called Services that execute on top of the Services Framework, comprising:
a Discovery Service to identify member candidates using a standard protocol; and
a Failover Service to organize the members into various compositions called Failover Sets, including Single, Hierarchical and N-way compositions;
a database management system to store and synchronize the configuration on all members in the Failover Set;
an Arbitration Service to determine that one member's role is Primary, one member's role is Secondary, and the remaining members' roles are Alternates;
a Boot Service to coordinate the member role during startup using the type of boot; and
a Policy Manager within the Failover Service to provide policies for run-time member behavior including fault characterization and detection, health monitoring, compatibility requirements, corrective action during failover, member restart and re-integration, and the member failure limit exceeded condition.
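One of the run-time policies named for the Policy Manager, the member failure limit exceeded condition, can be sketched as a simple counter. The class name `FailureLimitPolicy`, the action strings, and the rule "restart until the count exceeds the limit, then exclude" are assumptions for illustration; the claim names the policy but does not define its behavior.

```python
from collections import Counter


class FailureLimitPolicy:
    """Hypothetical policy: restart a failed member until its failure
    count exceeds a configured limit, then exclude it from the set."""

    def __init__(self, limit):
        self.limit = limit
        self.failures = Counter()  # member id -> failures seen so far

    def on_failure(self, member):
        """Record one failure and return the corrective action to take."""
        self.failures[member] += 1
        if self.failures[member] > self.limit:
            return "exclude"   # failure limit exceeded
        return "restart"       # restart and re-integrate the member


policy = FailureLimitPolicy(limit=2)
actions = [policy.on_failure("card-2") for _ in range(3)]
print(actions)
```

Counts are kept per member, so one member exceeding its limit does not affect how failures of other members are handled.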
Specification