High-availability computer cluster with failover support based on a resource map
Abstract
Embodiments of the invention relate to handling failures in a cluster of computer resources. The resources are represented as nodes in a dependency graph in which some nodes are articulation points and the removal of any articulation point due to a resource failure results in a disconnected graph. The embodiments perform a failover when a resource corresponding to an articulation point fails. The failover is to a local resource if the failed resource does not affect all local resources. The failover is to a remote resource if no local resource can meet all resource requirements of the failed resource, and to a remote resource running in a degraded mode if the remote resource cannot meet all of the requirements.
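As an illustration of the graph property the abstract relies on, the following is a minimal sketch of finding articulation points with the standard depth-first low-link technique. It is not taken from the patent: the function name `articulation_points`, the undirected edge-list input, and the sample resource names (`app`, `ip`, `fs`, `disk`, `nic_a`, `nic_b`, `net`) are all assumptions made for the example.

```python
from collections import defaultdict


def articulation_points(edges):
    """Return the set of articulation points of an undirected graph.

    `edges` is an iterable of (resource, resource) pairs. This hypothetical
    helper uses the classic DFS discovery-time / low-link method.
    """
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)

    disc, low, points = {}, {}, set()
    counter = 0

    def dfs(node, parent):
        nonlocal counter
        disc[node] = low[node] = counter
        counter += 1
        children = 0
        for nxt in graph[node]:
            if nxt == parent:
                continue
            if nxt in disc:
                # Back edge: node can reach an earlier-discovered resource.
                low[node] = min(low[node], disc[nxt])
            else:
                children += 1
                dfs(nxt, node)
                low[node] = min(low[node], low[nxt])
                # Non-root node: the subtree under nxt cannot bypass node.
                if parent is not None and low[nxt] >= disc[node]:
                    points.add(node)
        # Root node: an articulation point only if it has 2+ DFS subtrees.
        if parent is None and children > 1:
            points.add(node)

    for node in list(graph):
        if node not in disc:
            dfs(node, None)
    return points


# Hypothetical resource group: an application needs an IP address and a
# file system; the IP address can use either of two redundant adapters.
edges = [
    ("app", "ip"), ("app", "fs"), ("fs", "disk"),
    ("ip", "nic_a"), ("ip", "nic_b"),
    ("nic_a", "net"), ("nic_b", "net"),
]
critical = articulation_points(edges)
```

In this sample graph the redundant adapters `nic_a` and `nic_b` are not articulation points, because losing either one leaves the graph connected, while losing `app`, `fs`, or `ip` disconnects it.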
17 Claims
1. A method comprising:
- representing a cluster of computer resources as nodes in a dependency graph, the nodes including a plurality of articulation points, wherein removal of an articulation point due to a resource failure results in a disconnected dependency graph;
- if a failed resource corresponds to an articulation point, performing a failover for a resource group that the failed resource is a member,
- wherein all of the nodes in a cluster are aware of status of all resources on all other nodes, and
- if a failover remote resource only satisfies a portion of all resource requirements, then failover still occurs and the failover remote resource runs in a degraded mode,
- wherein the resources comprise local resources and the failover is to a local resource if the failed resource does not affect all of the local resources, and
- upon a resource failing, analyzing the dependency graph to determine if an operational state of a primary node is reachable.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11.
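The target-selection rules recited in claim 1 and in the abstract (local first, otherwise remote, and remote in degraded mode when only part of the requirements can be met) can be sketched as below. This is an illustrative reading, not the patented implementation: the `Candidate` class, the `choose_failover` function, the modeling of requirements as sets of strings, and the assumption that `candidates` lists only resources unaffected by the failure are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A surviving resource that could take over (hypothetical model)."""
    name: str
    is_local: bool
    capabilities: frozenset


def choose_failover(requirements, candidates):
    """Return (candidate, mode), where mode is "full" or "degraded".

    Returns (None, None) when no candidate meets any requirement.
    """
    required = frozenset(requirements)

    # 1. Prefer a local resource that meets every requirement.
    for c in candidates:
        if c.is_local and required <= c.capabilities:
            return c, "full"

    # 2. Otherwise use a remote resource that meets every requirement.
    remotes = [c for c in candidates if not c.is_local]
    for c in remotes:
        if required <= c.capabilities:
            return c, "full"

    # 3. Otherwise fail over anyway to the remote resource covering the
    #    largest share of the requirements, running in degraded mode.
    best = max(remotes, key=lambda c: len(required & c.capabilities),
               default=None)
    if best is not None and required & best.capabilities:
        return best, "degraded"
    return None, None
```

For example, with requirements `{"storage", "network"}` and a single remote candidate offering only `"network"`, the sketch still returns that candidate, tagged `"degraded"`.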
12. A system comprising:
- a plurality of interconnected computer resources;
- logic using a hardware processor for representing the computer resources as nodes in a dependency graph, the nodes including a plurality of articulation points wherein removal of an articulation point due to a resource failure results in a disconnected dependency graph; and
- logic for performing a failover for a resource group that a failed resource is a member if the failed resource corresponds to an articulation point,
- wherein all of the nodes in a cluster are aware of status of all resources on all other nodes,
- wherein if a failover of a remote resource only satisfies a portion of all resource requirements, then failover still occurs and the failover remote resource runs in a degraded mode,
- wherein the resources comprise local resources and the failover is to a local resource if the failed resource does not affect all of the local resources,
- wherein upon a resource failing, analyzing the dependency graph to determine if an operational state of a primary node is reachable, and
- wherein resource dependencies determine an order that specific resources within a resource group are brought online or offline when the resource group is brought online or offline.
Dependent claims: 13, 14.
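Claim 12 states that resource dependencies determine the order in which resources of a group are brought online or offline. One common way to derive such an order is a topological sort of the dependency graph; the sketch below uses Python's standard `graphlib` module for that. The claim does not name this technique, and the function names and the sample `group` mapping are assumptions for illustration.

```python
from graphlib import TopologicalSorter


def online_order(depends_on):
    """Order in which to start resources: dependencies come first.

    `depends_on` maps each resource to the set of resources it requires.
    """
    return list(TopologicalSorter(depends_on).static_order())


def offline_order(depends_on):
    """Order in which to stop resources: dependents go down first."""
    return list(reversed(online_order(depends_on)))


# Hypothetical resource group: the application needs a file system and an
# IP address, which in turn need a disk and a network adapter.
group = {
    "app": {"fs", "ip"},
    "fs": {"disk"},
    "ip": {"nic"},
}
start_sequence = online_order(group)
stop_sequence = offline_order(group)
```

Here `disk` is started before `fs`, `nic` before `ip`, and `app` last; the stop sequence is the exact reverse. A cyclic dependency would make `static_order()` raise `graphlib.CycleError`.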
15. A computer program product comprising a computer readable hardware storage device having computer readable program code embodied therewith, the computer readable program code comprising:
- computer readable program code configured to represent the computer resources as nodes in a dependency graph, the nodes including a plurality of articulation points wherein removal of an articulation point due to a resource failure results in a disconnected dependency graph; and
- computer readable program code configured to perform a failover for a resource group that a failed resource is a member if the failed resource corresponds to an articulation point,
- wherein all of the nodes in a cluster are aware of status of all resources on all other nodes,
- wherein if a failover remote resource only satisfies a portion of all resource requirements, then failover still occurs and the failover remote resource runs in a degraded mode,
- wherein the resources comprise local resources and the failover is to a local resource if the failed resource does not affect all of the local resources,
- wherein upon a resource failing, analyzing the dependency graph to determine if an operational state of a primary node is reachable, and
- wherein resource dependencies determine an order that specific resources within a resource group are brought online or offline when the resource group is brought online or offline.
Dependent claims: 16, 17.
Specification