Distributed control protocol for high availability in multi-node storage cluster
Abstract
A distributed control protocol dynamically establishes high availability (HA) partner relationships for nodes in a cluster. An HA partner relationship may be established by copying (mirroring) information maintained in a non-volatile random access memory (NVRAM) of a node over an HA interconnect to the NVRAM of a partner node in the cluster. The distributed control protocol leverages a Cluster Liveliness and Availability Manager (CLAM) utility of a storage operating system executing on the nodes to rebalance NVRAM mirroring and alter HA partner relationships of the nodes in the cluster. The CLAM utility is configured to manage various cluster-related issues, such as CLAM quorum events, addition or subtraction of a node in the cluster, and other changes in configuration of the cluster. Notably, the CLAM utility is an event-based manager that implements the control protocol to keep the nodes informed of any cluster changes through event generation and propagation.
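The event generation and propagation described in the abstract can be illustrated with a minimal sketch. Everything named here is hypothetical: `EventManager` is only a stand-in for an event-based manager such as CLAM, the event labels are invented, and the ring-style partner assignment (each node mirrors to the next node in sorted order) is an assumption made for illustration, not the rebalancing policy of the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ClusterEvent:
    kind: str      # hypothetical labels: "node_added" or "node_removed"
    node_id: str


Listener = Callable[[ClusterEvent, Dict[str, str]], None]


class EventManager:
    """Hypothetical stand-in for an event-based cluster manager."""

    def __init__(self) -> None:
        self.members: List[str] = []
        self._listeners: List[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def _ring_partners(self) -> Dict[str, str]:
        # Assumption: each node mirrors its NVRAM to the next node in
        # sorted order, wrapping around at the end of the list.
        ordered = sorted(self.members)
        if len(ordered) < 2:
            return {}
        return {n: ordered[(i + 1) % len(ordered)] for i, n in enumerate(ordered)}

    def _publish(self, event: ClusterEvent) -> None:
        # Propagate the event and the rebalanced partner map to every listener.
        partners = self._ring_partners()
        for listener in self._listeners:
            listener(event, partners)

    def add_node(self, node_id: str) -> None:
        self.members.append(node_id)
        self._publish(ClusterEvent("node_added", node_id))

    def remove_node(self, node_id: str) -> None:
        self.members.remove(node_id)
        self._publish(ClusterEvent("node_removed", node_id))
```

A listener registered with `subscribe` receives each event together with the recomputed partner map, which is one simple way to keep every node informed of membership changes.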
18 Claims
1. A method comprising:
receiving, by a first node computing device, a plurality of quorum change messages from a plurality of nodes via a cluster switching fabric, wherein the plurality of nodes comprises at least a second node of an existing cluster and a third node;
determining, by the first node computing device, when there is a quorum change event comprising a new node join event based on absence of one or more parameters from one of the quorum change messages received from the third node; and
establishing, by the first node computing device, one or more high availability (HA) partner relationships in a new cluster based on one or more other parameters from another one of the quorum change messages received from the second node in order to join the third node to the new cluster, when the determining indicates that the quorum change event comprises a new node join event.
Dependent Claims: 2, 3, 4, 5, 6
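The three steps of claim 1 (receive, determine, establish) can be sketched as follows. This is an illustrative reading only, under stated assumptions: the `QuorumChangeMessage` type, its `partner_map` field, and the rule for splicing the joining node into the mirroring chain are all hypothetical. The claim says only that a join is inferred from the absence of one or more parameters in the third node's message, and that the relationships are built from other parameters in the second node's message.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class QuorumChangeMessage:
    sender: str
    # Hypothetical parameter: the HA partner map known to the sender.
    # A node that has not yet joined a cluster has none to report.
    partner_map: Optional[Dict[str, str]] = None


def is_new_node_join(msg: QuorumChangeMessage) -> bool:
    # Determining step: the join is inferred from the missing parameter.
    return msg.partner_map is None


def handle_quorum_change(
    messages: List[QuorumChangeMessage],
) -> Optional[Dict[str, str]]:
    """Return a partner map for the new cluster, or None if no join occurred."""
    joiners = [m.sender for m in messages if is_new_node_join(m)]
    existing = [m for m in messages if not is_new_node_join(m)]
    if not joiners or not existing:
        return None

    # Establishing step: start from the parameters the existing member sent.
    anchor = existing[0].sender
    new_map = dict(existing[0].partner_map or {})
    for joiner in joiners:
        # Assumption: insert the joiner into the mirroring chain directly
        # after the existing member whose message supplied the map.
        new_map[joiner] = new_map.get(anchor, anchor)
        new_map[anchor] = joiner
    return new_map
```

For example, if an existing member `node2` reports the map `{"node1": "node2", "node2": "node1"}` and `node3` sends a message with no map, the sketch returns a three-node chain in which `node2` mirrors to `node3` and `node3` mirrors to `node1`.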
7. A node computing device, comprising:
one or more processors; and
a memory coupled to the one or more processors and containing machine readable medium comprising machine executable code having stored thereon instructions for dynamically establishing high availability (HA) partner relationships, the one or more processors configured to execute the machine executable code to cause the one or more processors to:
receive a plurality of quorum change messages from a plurality of nodes via a cluster switching fabric, wherein the plurality of nodes comprises at least a first node of an existing cluster and a second node;
determine when there is a quorum change event comprising a new node join event based on an absence of one or more parameters from one of the quorum change messages received from the second node; and
establish one or more HA partner relationships in a new cluster based on one or more other parameters from another one of the quorum change messages received from the first node in order to join the second node to the new cluster, when the determining indicates that the quorum change event comprises a new node join event.
Dependent Claims: 8, 9, 10, 11, 12
13. A non-transitory machine readable medium having stored thereon instructions for dynamically establishing high availability (HA) partner relationships comprising machine executable code which when executed by at least one machine causes the machine to:
receive a plurality of quorum change messages from a plurality of nodes via a cluster switching fabric, wherein the plurality of nodes comprises at least a first node of an existing cluster and a second node;
determine when there is a quorum change event comprising a new node join event based on an absence of one or more parameters from one of the quorum change messages received from the second node; and
establish one or more HA partner relationships in a new cluster based on one or more other parameters from another one of the quorum change messages received from the first node in order to join the second node to the new cluster, when the determining indicates that the quorum change event comprises a new node join event.
Dependent Claims: 14, 15, 16, 17, 18
Specification