Method and apparatus for managing operations of clustered computer systems
Abstract
Improved techniques for managing operations of clustered computing systems are disclosed. The improved techniques provide protection against potential problems encountered in the operation of clustered computing systems. More particularly, the improved techniques can be implemented as an integral solution that provides protection against undesired partitions in space and partitions in time. The improved techniques do not require any human intervention.
28 Claims
1. A method for managing operation of a clustered computing system, the clustered computing system including at least a cluster of computing nodes and at least one peripheral device, wherein said clustered computing system is configured to interact with a user as a single entity, said method comprising:
(a) determining whether one or more of the computing nodes in the cluster have become one or more non-responsive nodes;
(b) determining a sub-cluster vote for a sub-cluster of one or more responsive computing nodes, wherein the sub-cluster represents a portion of the cluster that remains responsive;
(c) obtaining a total votes for the clustered computing system;
(d) determining whether the sub-cluster vote is at least a majority of the total votes;
(e) initiating shut down of the one or more computing nodes within the sub-cluster when said determining (d) determines that the sub-cluster vote is not at least a majority of the total votes; and
wherein said determining of sub-cluster vote includes soliciting a proxy vote from the at least one device using a reservation key.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
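The vote check recited in steps (b) through (e) of claim 1 can be illustrated with a short Python sketch. It is not taken from the patent: the `QuorumDevice` class, its `solicit` method, the first-claimant rule for the reservation key, and the reading of "at least a majority" as strictly more than half of the total votes are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuorumDevice:
    """Hypothetical shared peripheral device that carries a proxy vote."""
    proxy_votes: int = 1
    holder_key: Optional[str] = None  # reservation key currently registered

    def solicit(self, reservation_key: str) -> int:
        # Assumed rule: the first key to reserve the device keeps its vote.
        if self.holder_key is None:
            self.holder_key = reservation_key
        return self.proxy_votes if self.holder_key == reservation_key else 0


def has_majority(votes: int, total_votes: int) -> bool:
    # Assumption: "at least a majority" means strictly more than half.
    return 2 * votes > total_votes


def must_shut_down(responsive_node_votes, total_votes, device, reservation_key):
    # Step (b): node votes of the responsive sub-cluster plus any proxy vote won.
    sub_cluster_vote = sum(responsive_node_votes) + device.solicit(reservation_key)
    # Steps (d) and (e): shut down when the sub-cluster lacks a majority.
    return not has_majority(sub_cluster_vote, total_votes)


# Two nodes with one vote each plus one device vote: total of 3 votes.
device = QuorumDevice()
print(must_shut_down([1], 3, device, "key-A"))  # False: A holds 2 of 3 votes
print(must_shut_down([1], 3, device, "key-B"))  # True: B holds 1 of 3 votes
```

In this two-node example the proxy vote acts as the tie-breaker: whichever sub-cluster reserves the device first reaches 2 of 3 votes, and the other sub-cluster shuts itself down.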
15. A method for managing startup operations of a cluster of computing nodes in a clustered computing system including at least one peripheral device, said method comprising:
(a) determining a cluster vote for the cluster with each node being assigned a node vote and each at least one peripheral device being assigned a proxy vote and wherein the cluster vote includes the node votes and proxy votes associated with the cluster;
(b) obtaining a total votes for the clustered computing system wherein the total votes include each node vote and each proxy vote in the clustered computing system;
(c) determining whether the cluster vote is at least a majority of the total votes; and
(d) initiating shut down of the computing nodes within the cluster when said (c) determining determines that the cluster vote is not at least a majority of the total votes.
Dependent claims: 16
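The startup check of claim 15 reduces to simple arithmetic over node votes and proxy votes, sketched below in Python. The function name `startup_allowed`, its parameters, and the example vote assignments are hypothetical, and "at least a majority" is again assumed to mean strictly more than half of the total votes.

```python
def startup_allowed(cluster_node_votes, cluster_proxy_votes,
                    all_node_votes, all_proxy_votes):
    # Step (a): votes held by the nodes and devices present in the starting cluster.
    cluster_vote = sum(cluster_node_votes) + sum(cluster_proxy_votes)
    # Step (b): every node vote and proxy vote configured in the whole system.
    total_votes = sum(all_node_votes) + sum(all_proxy_votes)
    # Step (c): assumed strict-majority reading of "at least a majority".
    return 2 * cluster_vote > total_votes


# Four nodes with one vote each and one device with one proxy vote: total of 5.
all_nodes, all_devices = [1, 1, 1, 1], [1]

# Two nodes boot and acquire the device: 3 of 5 votes, so startup proceeds.
print(startup_allowed([1, 1], [1], all_nodes, all_devices))  # True

# Two nodes boot without the device: 2 of 5 votes, so step (d) shuts them down.
print(startup_allowed([1, 1], [], all_nodes, all_devices))   # False
```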
17. A clustered computing system, comprising:
a cluster of computing nodes having at least two computing nodes and at least one peripheral device, with each node being assigned a node vote and said at least one peripheral device being assigned a proxy vote; and
an integrity protector provided on each one of the computing nodes, the integrity protector determining a vote count for a set of computing nodes in the cluster, the set of nodes representing at least a portion of the cluster, and the integrity protector determining whether the set of computing nodes should be shut down based on the vote count.
Dependent claims: 18, 19, 20, 21, 22, 23, 24, 25
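One way to picture the integrity protector of claim 17 is as a small per-node component that knows the configured votes and evaluates the set of nodes it can still reach. The Python sketch below is illustrative only: claim 17 does not state the decision rule, so the strict-majority test is borrowed from the method claims as an assumption, and the class layout and the node and device names are hypothetical.

```python
class IntegrityProtector:
    """Hypothetical per-node component holding the configured vote table."""

    def __init__(self, node_votes, device_votes):
        self.node_votes = node_votes      # e.g. {"A": 1, "B": 1}
        self.device_votes = device_votes  # e.g. {"disk0": 1}

    def vote_count(self, nodes, devices=()):
        # Votes of the given set of nodes plus any proxy votes it holds.
        return (sum(self.node_votes[n] for n in nodes)
                + sum(self.device_votes[d] for d in devices))

    def should_shut_down(self, nodes, devices=()):
        total = sum(self.node_votes.values()) + sum(self.device_votes.values())
        # Assumed rule: the set survives only with more than half of all votes.
        return 2 * self.vote_count(nodes, devices) <= total


# Three nodes and one device, one vote each: total of 4 votes.
protector = IntegrityProtector({"A": 1, "B": 1, "C": 1}, {"disk0": 1})

print(protector.should_shut_down({"A", "B"}, {"disk0"}))  # False: 3 of 4 votes
print(protector.should_shut_down({"C"}))                  # True: 1 of 4 votes
print(protector.should_shut_down({"A", "B"}))             # True: 2 of 4 is a tie
```

Because every node applies the same rule to the same vote table, at most one partition can hold more than half of the votes, which is how a scheme of this kind avoids two sub-clusters running at once.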
26. A computer readable media including computer program code for managing operation of a clustered computing system, the clustered computing system including at least one cluster of computing nodes and a peripheral device, said computer readable media comprising:
computer program code for determining whether one of the computing nodes in the cluster has become a non-responsive node in a non-responsive sub-cluster;
computer program code for determining a sub-cluster vote for a sub-cluster wherein the sub-cluster votes include votes for said computing nodes and said peripheral device, wherein the sub-cluster representing a portion of the cluster that remains responsive;
computer program code for obtaining a total votes for said clustered computing system, wherein the total votes include votes for the computing nodes and said peripheral device;
computer program code for determining whether the sub-cluster vote is at least a majority of the total votes; and
computer program code for initiating shut down of the computing nodes within the sub-cluster when said computer program code for determining whether the sub-cluster vote is at least a majority of the total votes determines that the sub-cluster vote is not at least a majority of total votes.
Dependent claims: 27, 28
Specification