Method and apparatus for improving software availability of cluster computer system
Abstract
The invention relates to a method and apparatus for improving the software availability of a cluster computer system through software rejuvenation, a technique in which a program is temporarily stopped at a suitable time that the manager of a cluster computer system made up of several servers can anticipate, and is then restarted. The invention considers both software and hardware aspects: it applies software rejuvenation as a proactive fault-tolerance technique and improves availability by determining the optimal rejuvenation period from the software instability rate and the hardware failure rate of the cluster system, so that the features of a high-availability computer system can be provided cost-efficiently.
14 Claims
1. A method for improving software availability of a cluster computer system including a number of primary servers and spare servers, said method comprising the following steps of:
collecting system state information about the number of primary servers to monitor unstableness of the servers;
if at least one of the servers is judged unstable as a result of monitoring, judging existence of a spare server or other primary server having spare capacity;
if at least one of the spare servers or the primary servers having spare capacity exists, duplexing all processes of the unstable primary server to the spare server or the other primary server having spare capacity according to a currently set operation mode; and
upon completing duplexing, providing the unstable server with a system rejuvenation control signal for executing rejuvenation.
Dependent claims: 2, 3, 4, 5, 6, 7
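The four steps of claim 1 can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the patented implementation: the `Server` class, its `capacity` and `unstable` fields, the preference for dedicated spares over primaries, and the modelling of rejuvenation as clearing the process list are all hypothetical. The "currently set operation mode" is left out because the claims shown here do not define it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Server:
    name: str
    is_spare: bool = False
    capacity: int = 4                  # hypothetical: max processes a server can host
    processes: List[str] = field(default_factory=list)
    unstable: bool = False             # hypothetical: result of the monitoring step

    def free_slots(self) -> int:
        return self.capacity - len(self.processes)


def find_target(unstable: Server, servers: List[Server]) -> Optional[Server]:
    """Step 2: find a spare server, or another stable primary with spare capacity."""
    needed = len(unstable.processes)
    candidates = [
        s for s in servers
        if s is not unstable and not s.unstable and s.free_slots() >= needed
    ]
    # Assumption: dedicated spares are preferred over primaries with spare capacity.
    candidates.sort(key=lambda s: not s.is_spare)
    return candidates[0] if candidates else None


def handle_cluster(servers: List[Server]) -> List[str]:
    """Run one pass over the cluster; return names of rejuvenated servers."""
    rejuvenated = []
    for server in servers:
        # Step 1: only primary servers judged unstable are acted on.
        if server.is_spare or not server.unstable:
            continue
        target = find_target(server, servers)
        if target is None:
            continue                   # no spare capacity exists: nothing to do
        # Step 3: duplex all processes of the unstable primary onto the target.
        target.processes.extend(server.processes)
        # Step 4: rejuvenate only after duplexing has completed.
        # Assumption: rejuvenation is modelled as a restart with no processes.
        server.processes = []
        server.unstable = False
        rejuvenated.append(server.name)
    return rejuvenated
```

For example, with an unstable primary `p1`, a stable primary `p2`, and a spare `s1`, one call to `handle_cluster` duplexes the processes of `p1` onto `s1` and then rejuvenates `p1`. If no server has enough spare capacity, the unstable server is left untouched, which mirrors the conditional wording of the claim.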
8. An apparatus for improving software availability of a cluster computer system including a number of primary servers and spare servers, said apparatus comprising:
system monitoring means for collecting system state information about the number of primary servers to grasp an unstable state of each of the servers;
cluster controlling means for providing a control signal for duplexing all processes of a primary server to a spare server or other primary server having spare capacity according to a currently set operation mode if the primary server is unstable as a result of system monitoring in said system monitoring means, and for providing the unstable primary server with a rejuvenation signal for system rejuvenation if the unstable primary server maintains an unstable system state for a certain time period; and
duplexing means for duplexing all processes of the unstable primary server to the spare server or the other server having spare capacity according to a duplexing control signal about the set mode provided from said cluster controlling means.
Dependent claims: 9, 10, 11, 12, 13
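Claim 8 differs from claim 1 in that the rejuvenation signal is sent only if the primary server stays unstable for a certain time period. The following Python sketch illustrates how a controller might turn monitoring reports into duplexing and rejuvenation signals. The names `Signal`, `ClusterController`, `hold_seconds`, and `on_report` are hypothetical, and requiring a duplexing signal before any rejuvenation signal is an assumption borrowed from the ordering in claim 1.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Signal:
    kind: str                       # "duplex" or "rejuvenate"
    source: str                     # the unstable primary server
    target: Optional[str] = None    # duplexing destination; None for rejuvenation


class ClusterController:
    """Hypothetical stand-in for the 'cluster controlling means'."""

    def __init__(self, hold_seconds: float):
        self.hold_seconds = hold_seconds        # assumed "certain time period"
        self.state: Dict[str, dict] = {}        # per-server instability tracking

    def on_report(self, server: str, unstable: bool, now: float,
                  target: Optional[str] = None) -> List[Signal]:
        """Consume one report from the monitoring means; return control signals."""
        if not unstable:
            self.state.pop(server, None)        # server recovered: forget it
            return []
        entry = self.state.setdefault(server, {"since": now, "duplexed": False})
        signals: List[Signal] = []
        if not entry["duplexed"]:
            # Ask the duplexing means to copy the processes once a target exists.
            if target is not None:
                entry["duplexed"] = True
                signals.append(Signal("duplex", server, target))
        elif now - entry["since"] >= self.hold_seconds:
            # Still unstable after the hold period: request rejuvenation.
            signals.append(Signal("rejuvenate", server))
            del self.state[server]
        return signals
```

In this sketch a server that recovers before `hold_seconds` elapses is never rejuvenated, and a server with no available target never receives a rejuvenation signal.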
14. A record medium readable by a digital processing apparatus and containing programs of command languages which can be executed by the digital processing apparatus for execution of a method for improving software availability of a cluster computer system including a number of primary servers and spare servers, said programs in the record medium can be executed in the following steps of:
collecting system state information about the number of primary servers to monitor unstableness of the servers;
if at least one of the servers is judged unstable as a result of monitoring, judging existence of a spare server or other primary server having spare capacity;
if at least one of the spare servers or the primary servers having spare capacity exists, duplexing all processes of the unstable primary server to the spare server or the other primary server having spare capacity according to a currently set operation mode; and
upon completing duplexing, providing the unstable server with a system rejuvenation control signal for executing rejuvenation.
Specification