Fault tolerance software system with periodic external self-test failure detection
Abstract
Fault tolerance is improved in a computing system which includes one or more computing machines by (i) executing a control thread or other control program in conjunction with a fault tolerance software system running on at least one of the machines, and (ii) initiating via the control program a test script program which sends one or more requests to a monitored program. The test script program also processes corresponding responses to the one or more requests, and generates a return value utilizable by the control program to indicate a failure condition in the monitored program. The computing system may be configured in accordance with a client-server architecture, with the fault tolerance software system and the monitored program both running on a server of the system. The test script program is preferably implemented in an object-oriented programming language such as Java, such that one or more components of the test script program comprise a base class from which one or more other components of the test script program are generatable for use with the monitored program.
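The arrangement described in the abstract can be illustrated with a minimal Java sketch. Every name in it (TestScript, EchoTestScript, ControlProgram, the OK and FAILURE values, and the "ok:" response format) is a hypothetical chosen for illustration; the abstract does not specify class names, return-value conventions, or how requests are transported. The monitored program is modeled as a plain function from request to response, which stands in for whatever interface a real monitored program would expose.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical base class from which program-specific test scripts are derived.
abstract class TestScript {
    static final int OK = 0;       // assumed convention: 0 means no failure seen
    static final int FAILURE = 1;  // assumed convention: non-zero means failure

    // Requests this script sends to the monitored program.
    protected abstract List<String> requests();

    // Decides whether the response to a given request is acceptable.
    protected abstract boolean isValid(String request, String response);

    // Sends each request, processes the response, and generates a return value.
    final int run(Function<String, String> monitored) {
        for (String request : requests()) {
            String response;
            try {
                response = monitored.apply(request);
            } catch (RuntimeException e) {
                return FAILURE; // an exception while answering counts as a failure
            }
            if (response == null || !isValid(request, response)) {
                return FAILURE;
            }
        }
        return OK;
    }
}

// A script specialized for one (imaginary) monitored program.
class EchoTestScript extends TestScript {
    @Override
    protected List<String> requests() {
        return Arrays.asList("ping", "status");
    }

    @Override
    protected boolean isValid(String request, String response) {
        return response.equals("ok:" + request);
    }
}

// Stand-in for the control program: initiates the script and reads its return value.
class ControlProgram {
    static boolean failureDetected(TestScript script, Function<String, String> monitored) {
        return script.run(monitored) != TestScript.OK;
    }
}

class SelfTestSketch {
    public static void main(String[] args) {
        Function<String, String> healthy = req -> "ok:" + req;
        Function<String, String> unresponsive = req -> null;
        TestScript script = new EchoTestScript();
        System.out.println(ControlProgram.failureDetected(script, healthy));      // false
        System.out.println(ControlProgram.failureDetected(script, unresponsive)); // true
    }
}
```

In this sketch the base class owns the request and response loop, so a script for a different monitored program only needs to supply its own requests and validity check. A control thread could call failureDetected on a timer to obtain periodic testing; the scheduling and any recovery action are left out here.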
18 Claims
1. A method for use in providing improved fault tolerance in a computing system comprising at least one computing machine, the method comprising the steps of:
executing a control program in conjunction with a fault tolerance software system running on the at least one computing machine; and
initiating via the control program a test script program which sends one or more requests to a monitored program, wherein the test script program processes corresponding responses to the one or more requests, and generates at least one return value utilizable by the control program to indicate a failure condition in the monitored program.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
17. An apparatus for use in providing improved fault tolerance in a computing system, the apparatus comprising:
at least one computing machine having a processor and a memory, the processor being operatively coupled to the memory, wherein the processor is operative:
(i) to execute a control program in conjunction with a fault tolerance software system running on the at least one computing machine, and
(ii) to initiate via the control program a test script program which sends one or more requests to a monitored program, wherein the test script program processes corresponding responses to the one or more requests, and generates at least one return value utilizable by the control program to indicate a failure condition in the monitored program.
18. A storage medium for storing program code for use in providing improved fault tolerance in a computing system comprising at least one computing machine, wherein the program code when executed on the at least one computing machine performs the steps of:
executing a control program in conjunction with a fault tolerance software system running on the at least one computing machine; and
initiating via the control program a test script program which sends one or more requests to a monitored program, wherein the test script program processes corresponding responses to the one or more requests, and generates at least one return value utilizable by the control program to indicate a failure condition in the monitored program.
Specification