Leadership lease protocol for data replication groups
Abstract
Data replication groups may be used to store data in a distributed computing environment. A data replication group may include a set of nodes executing a consensus protocol to maintain data durably. To increase the efficiency and performance of data replication, a particular node of the data replication group may be assigned the role of master node. The role of master node may be leased in accordance with a consensus protocol. If the lease is not renewed within an interval of time, election or selection of a new master node may be commenced.
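The lease idea in the abstract can be illustrated with a minimal sketch. The `Lease` class, its field names, and the use of a single fixed `duration` are assumptions made for illustration only; the source does not specify how a lease is represented.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    """Hypothetical record of which node holds the master role and until when."""
    holder: str        # node currently holding the master role
    duration: float    # seconds a lease stays valid after renewal (assumed)
    expires_at: float  # absolute time at which the lease lapses

    def renew(self, now: float) -> None:
        # A successful renewal pushes the expiry forward by one duration.
        self.expires_at = now + self.duration

    def is_expired(self, now: float) -> bool:
        # Once the lease lapses, a new master may be elected or selected.
        return now >= self.expires_at


lease = Lease(holder="node-a", duration=5.0, expires_at=5.0)
lease.renew(now=4.0)                    # renewed in time
election_needed = lease.is_expired(now=9.0)
```

Passing `now` explicitly, rather than reading a clock inside the class, keeps the sketch deterministic and easy to test.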
62 Citations
20 Claims
1. A computer-implemented method, comprising:
- determining a particular node of a plurality of nodes of a data replication group has been elected as a master node and setting a current state to a leased state, the plurality of nodes implementing a consensus protocol for replicating data across the plurality of nodes;
- transmitting, by the master node, heartbeat messages to individual nodes of the plurality of nodes;
- on a condition that responses to the heartbeat messages are not received by the master node from a quorum of the plurality of nodes within a heartbeat interval, suspending transmission of further heartbeat messages;
- on a condition that responses to the heartbeat messages are not received by the master node from a quorum of the plurality of nodes within a wait period, the wait interval being a period during which the master node waits to receive responses to pending heartbeat messages, the wait interval being greater than the heartbeat interval, setting the current state to an expiring state; and
- after expiration of a safety interval, the safety interval being greater than the wait interval, commencing election of a new master node.
Dependent claims: 2, 3, 4
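The master-side behavior recited in claim 1 can be sketched as a small state machine. This is an illustration under stated assumptions, not the patented implementation: the names `MasterLease`, `State`, `start_round`, `on_ack`, and `tick` are hypothetical, a quorum is assumed to be a simple majority, all three intervals are assumed to be measured from the start of the current heartbeat round, and a quorum of late responses arriving inside the wait interval is assumed to renew the lease.

```python
from enum import Enum


class State(Enum):
    LEASED = "leased"
    EXPIRING = "expiring"
    ELECTING = "electing"


class MasterLease:
    """Hypothetical master-side lease tracker (sketch only)."""

    def __init__(self, node_count, heartbeat_interval, wait_interval, safety_interval):
        # The claim requires heartbeat interval < wait interval < safety interval.
        if not heartbeat_interval < wait_interval < safety_interval:
            raise ValueError("expected heartbeat < wait < safety interval")
        self.quorum = node_count // 2 + 1  # assumption: simple majority
        self.heartbeat_interval = heartbeat_interval
        self.wait_interval = wait_interval
        self.safety_interval = safety_interval
        self.state = State.LEASED
        self.sending_heartbeats = True
        self.round_started = 0.0
        self.acks = set()

    def start_round(self, now):
        # Begin a new heartbeat round: reset the clock and the response set.
        self.round_started = now
        self.acks = set()

    def on_ack(self, node_id, now):
        # Responses only matter while the lease is still held.
        if self.state is not State.LEASED:
            return
        self.acks.add(node_id)
        if len(self.acks) >= self.quorum:
            # Quorum reached: the lease is renewed and heartbeats resume.
            self.sending_heartbeats = True
            self.start_round(now)

    def tick(self, now):
        elapsed = now - self.round_started
        if self.state is State.LEASED:
            if elapsed >= self.wait_interval:
                # No quorum within the wait interval: the lease is expiring.
                self.state = State.EXPIRING
                self.sending_heartbeats = False
            elif elapsed >= self.heartbeat_interval:
                # No quorum within the heartbeat interval: stop sending
                # further heartbeats and wait for pending responses.
                self.sending_heartbeats = False
        if self.state is State.EXPIRING and elapsed >= self.safety_interval:
            # Safety interval has passed: a new election may begin.
            self.state = State.ELECTING
        return self.state
```

A caller would invoke `start_round` when heartbeats go out, feed each response into `on_ack`, and call `tick` periodically with the current time. Keeping the safety interval longer than the wait interval gives the other nodes time to observe that the lease has lapsed before any election starts.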
5. A system, comprising:
- one or more processors; and
- memory that includes instructions that, as a result of being executed by the one or more processors, cause the system to:
  - during a first interval, transmit a set of heartbeat messages to individual nodes of a plurality of nodes of a data replication group, the plurality of nodes implementing a consensus protocol, where a current state corresponds to a role of master node being leased; and
  - in response to a failure to receive a set of responses to the set of heartbeat messages from a quorum of the plurality of nodes during a second interval:
    - modify the current state such that the current state indicates that the role of master node is expiring and no longer transmits heartbeat messages;
    - transmit, after a third interval, a set of election requests to the plurality of nodes.
Dependent claims: 6, 7, 8, 9, 10, 11, 12
13. A set of one or more non-transitory computer-readable storage media having stored thereon executable instructions that, as a result of being executed by one or more processors of a computer system, cause the computer system to:
- transmit a set of messages to a plurality of nodes of a data replication group, the plurality of nodes implementing a consensus protocol including at least one node having a role of master node of the data replication group, wherein receipt of a response from a quorum of the plurality of nodes during a first interval results in a renewal of a lease;
- after not receiving a set of responses to the set of messages from a quorum of the plurality of nodes within a second interval, indicate that the lease of the role of master node is expiring; and
- at the expiration of a third interval, select a node of the plurality of nodes to obtain the role of master node.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
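The quorum condition that drives lease renewal in claim 13 can be illustrated with a short helper. The function name `has_quorum` and the choice of a simple majority are assumptions; the claims do not define how large a quorum must be.

```python
def has_quorum(responders, nodes):
    """Return True if responses came from a majority of group members.

    Assumption: a quorum is a simple majority of the group. Duplicate
    responses and responses from non-members are not counted.
    """
    members = set(nodes)
    valid = set(responders) & members
    return len(valid) >= len(members) // 2 + 1


group = ["a", "b", "c", "d", "e"]
renewed = has_quorum(["b", "c", "c", "e"], group)   # three distinct members
expiring = not has_quorum(["b", "x"], group)        # "x" is not a member
```

Deduplicating responders guards against a retransmitted response being counted twice toward the quorum.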
Specification