Method and system for coordinated multiple cluster failover
Abstract
A hypercluster is a cluster of clusters. Each cluster has one or more associated resource groups, and independent node failures within a cluster are handled by platform-specific clustering software. The management of coordinated failovers across dependent or independent resources running on heterogeneous platforms is contemplated. A hypercluster manager running on every node in a cluster communicates with the platform-specific clustering software regarding any failure conditions and, using a rule-based decision-making system, determines the actions to take on the node. A plug-in extends exit points definable in non-hypercluster clustering technologies. The failure notification is passed to other affected resource groups in the hypercluster.
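The rule-based decision step described above can be illustrated with a minimal sketch. It is not the patented implementation: the `ActionCode` type, the `RULES` table, the event names, and the `handle_failure` function are all hypothetical, chosen only to show a failure condition being looked up in a rules list and the resulting action, with its activation sequence, being handed to a cluster manager.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class ActionCode:
    """Hypothetical action code: a name plus an ordered activation sequence."""
    name: str
    activation_sequence: Tuple[str, ...]


# Hypothetical rules list mapping a failure condition to the action to take.
RULES: Dict[str, ActionCode] = {
    "NODE_FAILURE": ActionCode("FAILOVER_TO_STANDBY", ("database", "app_server")),
    "RESOURCE_GROUP_DOWN": ActionCode("RESTART_GROUP", ("app_server",)),
}


def handle_failure(
    event: str,
    rules: Dict[str, ActionCode],
    send_to_cluster_manager: Callable[[ActionCode], None],
) -> Optional[ActionCode]:
    """Look up the action for an event and pass it on for execution.

    Returns the action that was sent, or None when no rule matches.
    """
    action = rules.get(event)
    if action is None:
        return None  # no rule for this event, so nothing is sent
    send_to_cluster_manager(action)
    return action


# Usage: a plain list stands in for the cluster manager's inbox.
inbox = []
handle_failure("NODE_FAILURE", RULES, inbox.append)
print(inbox[0].name, inbox[0].activation_sequence)
```

A dictionary lookup is the simplest possible rules list; a real rule engine would presumably also weigh which resource groups depend on one another, which this sketch leaves out.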
32 Claims
1. A method for coordinating availability of data processing resources between first and second clusters of nodes, the method comprising:
receiving a disruption event associated with the first cluster;
deriving a local action code from a hypercluster rules list, the local action code corresponding to the disruption event and containing a cluster activation sequence; and
transmitting the local action code to an active cluster manager for execution of the cluster activation sequence.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
15. An apparatus for coordinating availability of data processing resources between a local node in a first cluster and a remote node, the apparatus comprising:
a local event receiver for capturing local disruption events;
an event translator for translating the local disruption event to a universal event code;
a hypercluster event receiver for capturing remote disruption events from one of the nodes of the second cluster; and
a router for correlating the universal event code to a cluster activation sequence in accordance with a set of hypercluster rules.
Dependent claims: 16, 17, 18, 19, 20, 21, 22, 23, 24, 25
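The translator and router elements of claim 15 can be pictured with the short sketch below. It is an illustration under stated assumptions, not the claimed apparatus: the platform names, the platform-specific codes, the universal event codes, the rule contents, and every function name are hypothetical, and remote events are assumed to arrive already expressed as universal event codes.

```python
from typing import Dict, List, Optional

# Hypothetical per-platform tables: platform-specific code -> universal event code.
TRANSLATION_TABLES: Dict[str, Dict[str, str]] = {
    "platform_a": {"E101": "RESOURCE_GROUP_DOWN", "E200": "NODE_DOWN"},
    "platform_b": {"svc-stop": "RESOURCE_GROUP_DOWN"},
}

# Hypothetical hypercluster rules: universal event code -> activation sequence.
HYPERCLUSTER_RULES: Dict[str, List[str]] = {
    "RESOURCE_GROUP_DOWN": ["stop_dependents", "start_on_standby"],
    "NODE_DOWN": ["fence_node", "start_on_standby"],
}


def translate(platform: str, local_code: str) -> Optional[str]:
    """Event translator: map a platform-specific code to a universal code."""
    return TRANSLATION_TABLES.get(platform, {}).get(local_code)


def route(universal_code: Optional[str]) -> List[str]:
    """Router: correlate a universal code with an activation sequence.

    Unknown or missing codes yield an empty sequence.
    """
    if universal_code is None:
        return []
    return list(HYPERCLUSTER_RULES.get(universal_code, []))


def on_local_event(platform: str, local_code: str) -> List[str]:
    """Local event receiver path: translate first, then route."""
    return route(translate(platform, local_code))


def on_remote_event(universal_code: str) -> List[str]:
    """Hypercluster event receiver path: the code is assumed universal already."""
    return route(universal_code)


print(on_local_event("platform_a", "E101"))
print(on_remote_event("NODE_DOWN"))
```

Translating to a shared code before routing means the rules need to be written only once, however many clustering platforms take part.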
26. An article of manufacture comprising a program storage medium readable by a computer, the medium tangibly embodying one or more programs of instructions executable by the computer to perform a method for coordinating availability of data processing resources between first and second clusters of nodes, the method comprising:
receiving a disruption event associated with the first cluster;
deriving a local action code from a hypercluster rules list, the local action code corresponding to the disruption event and containing a cluster activation sequence; and
transmitting the local action code to an active cluster manager for execution of the cluster activation sequence.
Dependent claims: 27, 28, 29, 30, 31, 32
Specification