Third vote consensus in a cluster using shared storage devices
First Claim
1. A system comprising:
a cluster having a plurality of nodes, each node having a processor;
a storage array having one or more shared storage devices coupled to each node of the cluster;
a local storage device coupled to each node of the cluster; and
a storage I/O stack executing on the processor of each node of the cluster, the storage I/O stack configured to:
maintain a local copy of configuration information of the cluster on the local storage device, the local copy of the configuration information representing a quorum vote for each node of the cluster;
obtain ownership of a shared copy of the configuration information stored on the storage array by an owner node of the cluster, the shared copy of the configuration information representing an additional vote used to establish a quorum for the cluster; and
in response to the owner node failing, claim the additional vote represented by the shared copy of the configuration information at a surviving node of the cluster by fencing the failed node from the shared storage devices of the storage array.
Abstract
A third vote consensus technique enables a first node, i.e., a surviving node, of a two-node cluster to establish a quorum and continue to operate in response to failure of a second node of the cluster. Each node maintains configuration information organized as a cluster database (CDB) which may be changed according to a consensus-based protocol. Changes to the CDB are logged on a third copy file system (TCFS) stored on a local copy of TCFS (L-TCFS). A shared copy of the TCFS (i.e., S-TCFS) may be stored on shared storage devices of one or more storage arrays coupled to the nodes. The local copy of the TCFS (i.e., L-TCFS) represents a quorum vote for each node of the cluster, while the S-TCFS represents an additional “tie-breaker” vote of a consensus-based protocol. The additional vote may be obtained from the shared storage devices by the surviving node as a third vote to establish the quorum and enable the surviving node to cast two of three votes (i.e., a majority of votes) needed to continue operation of the cluster. That is, the majority of votes allows the surviving node to update the CDB with the configuration information changes so as to continue proper operation of the cluster.
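The vote arithmetic described in the abstract can be illustrated with a minimal Python sketch. It assumes a two-node cluster with three votes in total (one local L-TCFS vote per node plus the shared S-TCFS vote). The names `SharedStorage`, `fence`, `claim`, `votes_for`, and `survive_failure` are hypothetical and chosen only for illustration; they do not come from the patent, and the fencing shown here is a simple in-memory flag rather than a real storage-level mechanism.

```python
from typing import Optional, Set

TOTAL_VOTES = 3  # assumption: two local votes plus one shared tie-breaker vote


def has_quorum(votes: int, total: int = TOTAL_VOTES) -> bool:
    """A quorum is a strict majority of all votes."""
    return votes > total // 2


class SharedStorage:
    """Hypothetical stand-in for the shared storage devices holding the S-TCFS."""

    def __init__(self, owner: Optional[str] = None) -> None:
        self.owner = owner            # node currently holding the additional vote
        self.fenced: Set[str] = set()  # nodes cut off from the shared devices

    def fence(self, node: str) -> None:
        """Block a node from the shared storage devices."""
        self.fenced.add(node)

    def claim(self, claimant: str) -> bool:
        """Take ownership of the shared copy if the claimant is not fenced
        and the current owner (if any) has been fenced."""
        if claimant in self.fenced:
            return False
        if self.owner is not None and self.owner not in self.fenced:
            return False
        self.owner = claimant
        return True


def votes_for(node: str, alive: Set[str], shared: SharedStorage) -> int:
    """Count the votes a node can cast: its local vote plus the shared
    vote when it owns the shared copy and is not fenced."""
    votes = 1 if node in alive else 0
    if shared.owner == node and node not in shared.fenced:
        votes += 1
    return votes


def survive_failure(survivor: str, failed: str, shared: SharedStorage) -> bool:
    """Fence the failed node, claim the additional vote, and report
    whether the survivor now holds a majority."""
    shared.fence(failed)
    shared.claim(survivor)
    return has_quorum(votes_for(survivor, {survivor}, shared))
```

In this sketch a surviving node that starts with only its local vote (one of three) reaches two of three votes once the failed owner is fenced and the shared vote is claimed. The fenced node cannot later reclaim the shared vote, which is what keeps both nodes from holding a majority at the same time.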
55 Citations
20 Claims
1. A system comprising:
a cluster having a plurality of nodes, each node having a processor;
a storage array having one or more shared storage devices coupled to each node of the cluster;
a local storage device coupled to each node of the cluster; and
a storage I/O stack executing on the processor of each node of the cluster, the storage I/O stack configured to:
maintain a local copy of configuration information of the cluster on the local storage device, the local copy of the configuration information representing a quorum vote for each node of the cluster;
obtain ownership of a shared copy of the configuration information stored on the storage array by an owner node of the cluster, the shared copy of the configuration information representing an additional vote used to establish a quorum for the cluster; and
in response to the owner node failing, claim the additional vote represented by the shared copy of the configuration information at a surviving node of the cluster by fencing the failed node from the shared storage devices of the storage array.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A method comprising:
organizing a plurality of nodes as a cluster, each node of the cluster coupled to a storage array having one or more shared storage devices, each node further coupled to a local storage device;
maintaining a local copy of configuration information of the cluster on the local storage device, the local copy of the configuration information representing a quorum vote for each node of the cluster;
obtaining ownership of a shared copy of the configuration information stored on the storage array by an owner node of the cluster, the shared copy of the configuration information representing an additional vote used to establish a quorum for the cluster; and
in response to the owner node failing, claiming the additional vote represented by the shared copy of the configuration information at a surviving node of the cluster by fencing the failed node from the shared storage devices of the storage array.
Dependent claims: 14, 15, 16, 17, 18, 19
20. A non-transitory computer readable medium including program instructions for execution on a processor of a storage system, the program instructions configured to:
organize a plurality of nodes as a cluster, each node of the cluster coupled to a storage array having one or more shared storage devices, each node further coupled to a local storage device;
maintain a local copy of configuration information of the cluster on the local storage device, the local copy of the configuration information representing a quorum vote for each node of the cluster;
obtain ownership of a shared copy of the configuration information stored on the storage array by an owner node of the cluster, the shared copy of the configuration information representing an additional vote used to establish a quorum for the cluster; and
in response to the owner node failing, claim the additional vote represented by the shared copy of the configuration information at a surviving node of the cluster by fencing the failed node from the shared storage devices of the storage array.
Specification