Systems and methods for resolving split-brain scenarios in computer clusters
First Claim
1. A computer-implemented method for resolving split-brain scenarios in computer clusters, the method being performed by a coordination point server comprising at least one processor, the method comprising:
identifying a plurality of nodes within a computer cluster that are configured to collectively perform at least one task;
receiving, at the coordination point server from a node within the computer cluster, a failure notification that identifies a link-based communication failure experienced by the node that prevents the nodes within the computer cluster from collectively performing the task;
upon receiving the failure notification at the coordination point server, immediately causing the coordination point server to initiate an arbitration event within the computer cluster in order to identify a subset of the nodes that is to assume responsibility for performing the task subsequent to the link-based communication failure;
wherein the coordination point server initiates the arbitration event by immediately prompting each node within the computer cluster to participate in the arbitration event.
Abstract
A computer-implemented method for resolving split-brain scenarios in computer clusters may include (1) identifying a plurality of nodes within a computer cluster that are configured to collectively perform at least one task, (2) receiving, from a node within the computer cluster, a failure notification that identifies a link-based communication failure experienced by the node that prevents the nodes within the computer cluster from collectively performing the task, and, upon receiving the failure notification, (3) immediately prompting each node within the computer cluster to participate in an arbitration event in order to identify a subset of the nodes that is to assume responsibility for performing the task subsequent to the link-based communication failure. Various other methods, systems, and computer-readable media are also disclosed.
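The three steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class and field names, the in-memory `reachability` map that stands in for each node's reply, and the rule for choosing the surviving subset (largest partition, ties broken by lowest node name) are all assumptions made for illustration, since the text above does not specify how the subset is selected.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FailureNotification:
    """Hypothetical message: a node reports peers it can no longer reach."""
    reporter: str
    unreachable: frozenset[str]


@dataclass
class CoordinationPointServer:
    nodes: set[str]                    # step (1): nodes that share the task
    reachability: dict[str, set[str]]  # assumed stand-in for each node's reply
    prompted: list[str] = field(default_factory=list)

    def on_failure_notification(self, note: FailureNotification) -> set[str]:
        # Step (2): a single report is enough to trigger arbitration.
        if note.reporter not in self.nodes:
            raise ValueError(f"unknown node: {note.reporter}")
        # Step (3): arbitrate at once instead of waiting for the other
        # nodes to detect the failure on their own.
        return self.arbitrate()

    def arbitrate(self) -> set[str]:
        partitions: set[frozenset[str]] = set()
        for node in sorted(self.nodes):
            self.prompted.append(node)  # every node is prompted, not only the reporter
            partitions.add(frozenset(self.reachability[node] | {node}))
        # Assumed selection rule: largest partition wins; ties go to the
        # partition containing the alphabetically lowest node name.
        winner = min(partitions, key=lambda p: (-len(p), min(p)))
        return set(winner)


# Node "c" loses its links to "a" and "b" and reports the failure.
server = CoordinationPointServer(
    nodes={"a", "b", "c"},
    reachability={"a": {"b"}, "b": {"a"}, "c": set()},
)
survivors = server.on_failure_notification(
    FailureNotification(reporter="c", unreachable=frozenset({"a", "b"}))
)
print(sorted(survivors))  # ['a', 'b']
```

In this sketch the server reacts to the first notification it receives and prompts all three nodes, which mirrors the "immediately prompting each node" language; a real deployment would exchange network messages rather than read a dictionary.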
20 Claims
1. A computer-implemented method for resolving split-brain scenarios in computer clusters, the method being performed by a coordination point server comprising at least one processor, the method comprising:
identifying a plurality of nodes within a computer cluster that are configured to collectively perform at least one task;
receiving, at the coordination point server from a node within the computer cluster, a failure notification that identifies a link-based communication failure experienced by the node that prevents the nodes within the computer cluster from collectively performing the task;
upon receiving the failure notification at the coordination point server, immediately causing the coordination point server to initiate an arbitration event within the computer cluster in order to identify a subset of the nodes that is to assume responsibility for performing the task subsequent to the link-based communication failure;
wherein the coordination point server initiates the arbitration event by immediately prompting each node within the computer cluster to participate in the arbitration event.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A system for resolving split-brain scenarios in computer clusters, the system comprising:
an identification module installed on a coordination point server and programmed to identify a plurality of nodes within a computer cluster that are configured to collectively perform at least one task;
an arbitration module installed on the coordination point server and programmed to:
receive, at the coordination point server from a node within the computer cluster, a failure notification that identifies a link-based communication failure experienced by the node that prevents the nodes within the computer cluster from collectively performing the task;
upon receiving the failure notification at the coordination point server, immediately cause the coordination point server to initiate an arbitration event within the computer cluster in order to identify a subset of the nodes that is to assume responsibility for performing the task subsequent to the link-based communication failure;
wherein the arbitration module causes the coordination point server to initiate the arbitration event by immediately prompting each node within the computer cluster to participate in the arbitration event;
at least one processor configured to execute the identification module and the arbitration module.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. A non-transitory computer-readable-storage medium comprising one or more computer-executable instructions that, when executed by at least one processor of a coordination point server, cause the coordination point server to:
identify a plurality of nodes within a computer cluster that are configured to collectively perform at least one task;
receive, at the coordination point server from a node within the computer cluster, a failure notification that identifies a link-based communication failure experienced by the node that prevents the nodes within the computer cluster from collectively performing the task;
upon receiving the failure notification at the coordination point server, immediately cause the coordination point server to initiate an arbitration event within the computer cluster in order to identify a subset of the nodes that is to assume responsibility for performing the task subsequent to the link-based communication failure;
wherein the coordination point server initiates the arbitration event by immediately prompting each node within the computer cluster to participate in the arbitration event.
Dependent claims: 20
Specification