Dynamic recovery from a split-brain failure in edge nodes
First Claim
1. A non-transitory machine readable medium of a first edge node of a network storing a program which when executed by at least one processing unit of the edge node determines whether the first edge node should be an active edge node or a standby edge node, the program comprising sets of instructions for:
sending a first message to a controller cluster of the network in response to the first edge node transitioning from a standby state to an active state;
after sending the first message, receiving, from the controller cluster, a second message that identifies a state of the controller cluster;
receiving, from the controller cluster, a third message that identifies a state of a second edge node of the network;
determining, based on the received second and third messages, that the first edge node should not be an active edge node; and
changing a state of the first edge node to standby from active, in response to the determination that the first edge node should not be an active edge node.
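The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The class names, the method names (`report_transition`, `cluster_status`, `peer_status`), and the in-memory `FakeController` are hypothetical stand-ins for the three messages. The decision rule (step down only when the controller cluster reports itself healthy and reports the second edge node as active) is an assumption made for illustration, since the claim does not state how the determination is made.

```python
from dataclasses import dataclass
from enum import Enum


class NodeState(Enum):
    ACTIVE = "active"
    STANDBY = "standby"


@dataclass
class ClusterStatus:
    """Hypothetical payload of the 'second message' (controller cluster state)."""
    healthy: bool


@dataclass
class PeerStatus:
    """Hypothetical payload of the 'third message' (state of the second edge node)."""
    node_id: str
    state: NodeState


def should_step_down(cluster: ClusterStatus, peer: PeerStatus) -> bool:
    # Assumed rule, not taken from the claim: trust the peer report only
    # when the controller cluster itself is healthy.
    return cluster.healthy and peer.state is NodeState.ACTIVE


class FakeController:
    """In-memory stand-in for the controller cluster (illustration only)."""

    def __init__(self, healthy: bool, peer_state: NodeState):
        self.healthy = healthy
        self.peer_state = peer_state
        self.reports = []  # first messages received from edge nodes

    def report_transition(self, node_id: str, state: NodeState) -> None:
        self.reports.append((node_id, state))

    def cluster_status(self) -> ClusterStatus:
        return ClusterStatus(healthy=self.healthy)

    def peer_status(self, node_id: str) -> PeerStatus:
        return PeerStatus(node_id="edge-2", state=self.peer_state)


class EdgeNode:
    def __init__(self, node_id: str, controller: FakeController):
        self.node_id = node_id
        self.controller = controller
        self.state = NodeState.STANDBY

    def become_active(self) -> NodeState:
        # Transition from standby to active, then send the first message.
        self.state = NodeState.ACTIVE
        self.controller.report_transition(self.node_id, self.state)
        # Receive the second message (cluster state) and third message (peer state).
        cluster = self.controller.cluster_status()
        peer = self.controller.peer_status(self.node_id)
        # Determine whether this node should remain active; if not, step down.
        if should_step_down(cluster, peer):
            self.state = NodeState.STANDBY
        return self.state
```

Under these assumptions, a node that becomes active while the peer is also reported active by a healthy cluster returns to standby, which resolves the split-brain condition. A node whose peer is reported as standby, or whose controller cluster is unhealthy, stays active.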
Abstract
Some embodiments provide a method for employing the management and control system of a network to dynamically recover from a split-brain condition in the edge nodes of the network. The method of some embodiments takes a corrective action to automatically recover from a split-brain failure that has occurred at a pair of high availability (HA) edge nodes of the network. The HA edge nodes include an active machine and a standby machine. The active edge node passes the network traffic (e.g., north-south traffic for a logical network), while the standby edge node is synchronized and ready to transition to the active state should a failure occur. Both HA nodes share the same configuration settings, and only one is active until a path, link, or system failure occurs. The active edge node also provides stateful services (e.g., stateful firewall, load balancing, etc.) to the data compute nodes of the network.
320 Citations
16 Claims
1. A non-transitory machine readable medium of a first edge node of a network storing a program which when executed by at least one processing unit of the edge node determines whether the first edge node should be an active edge node or a standby edge node, the program comprising sets of instructions for:
sending a first message to a controller cluster of the network in response to the first edge node transitioning from a standby state to an active state;
after sending the first message, receiving, from the controller cluster, a second message that identifies a state of the controller cluster;
receiving, from the controller cluster, a third message that identifies a state of a second edge node of the network;
determining, based on the received second and third messages, that the first edge node should not be an active edge node; and
changing a state of the first edge node to standby from active, in response to the determination that the first edge node should not be an active edge node.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A method for determining whether a first edge node of a network should be an active edge node or a standby edge node, the method comprising:
sending a first message to a controller cluster of the network in response to the first edge node transitioning from a standby state to an active state;
after sending the first message, receiving, from the controller cluster, a second message that identifies a state of the controller cluster;
receiving, from the controller cluster, a third message that identifies a state of a second edge node of the network;
determining, based on the received second and third messages, that the first edge node should not be an active edge node; and
changing a state of the first edge node to standby from active, in response to the determining that the first edge node should not be an active edge node.
Dependent claims: 11, 12, 13, 14, 15, 16
Specification