Software-based watchdog method and apparatus
Abstract
In a computer system that allows simultaneous operation of multiple processes, a software watchdog process monitors a primary process through operating system calls. If the response to an operating system call shows that the primary process is not operating or is overutilizing CPU time, the primary process is restarted. The software watchdog process may also check and correct configuration and data files before restarting the primary process. Alternatively, rather than using operating system calls, the software watchdog process and the primary process may communicate through a loopback TCP/IP address for monitoring purposes.
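The operating-system-call variant described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `ProcessInfo`, `default_probe`, `is_healthy`, and `Watchdog`, and the 0.9 CPU limit, are assumptions made for illustration. The liveness check uses `subprocess.Popen.poll()`; CPU measurement is platform-specific, so the sketch leaves it to a caller-supplied probe and reports 0.0 by default.

```python
import subprocess
import sys
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ProcessInfo:
    """What the watchdog learns about the primary process from the OS."""
    running: bool
    cpu_fraction: float  # recent share of one CPU, 0.0 to 1.0 (assumed unit)


def default_probe(proc: subprocess.Popen) -> ProcessInfo:
    # poll() asks the OS whether the child has exited (None means still alive).
    # CPU usage is reported as 0.0 here because measuring it is
    # platform-specific; a caller can pass a probe that fills it in.
    return ProcessInfo(running=proc.poll() is None, cpu_fraction=0.0)


def is_healthy(info: ProcessInfo, cpu_limit: float) -> bool:
    """Healthy means: still running and not overutilizing CPU time."""
    return info.running and info.cpu_fraction <= cpu_limit


class Watchdog:
    def __init__(self, command: List[str],
                 probe: Callable[[subprocess.Popen], ProcessInfo] = default_probe,
                 cpu_limit: float = 0.9) -> None:
        self.command = command
        self.probe = probe
        self.cpu_limit = cpu_limit  # hypothetical threshold
        self.restarts = 0
        self.proc = subprocess.Popen(command)

    def check_once(self) -> bool:
        """Run one monitoring pass; return True if a restart was performed."""
        if is_healthy(self.probe(self.proc), self.cpu_limit):
            return False
        self.stop()
        # The optional check and repair of configuration and data files
        # would go here, before the primary process is started again.
        self.proc = subprocess.Popen(self.command)
        self.restarts += 1
        return True

    def stop(self) -> None:
        if self.proc.poll() is None:
            self.proc.kill()
        self.proc.wait()


if __name__ == "__main__":
    # Hypothetical primary process: a Python child that simply sleeps.
    dog = Watchdog([sys.executable, "-c", "import time; time.sleep(60)"])
    try:
        for _ in range(3):
            time.sleep(0.2)
            print("restarted" if dog.check_once() else "ok")
    finally:
        dog.stop()
```

Passing the probe as a parameter keeps the health decision separate from how the operating system is queried, so a platform-specific CPU reading can be substituted without changing the restart logic.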
16 Claims
1. A method for monitoring a software process in a computer system having an operating system, the method comprising the steps of:
issuing an operating system call for information regarding processes being executed by the computer system;
determining whether the software process is executing properly based upon the response to the operating system call;
restarting the software process when a determination is made that the software process is not executing properly.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A method for monitoring a software process comprising the steps of:
opening a communication channel with the software process;
sending a message to the software process on the communication channel, wherein the software process responds to the message;
receiving a response to the message; and
restarting the software process if a response to the message is not received.
Dependent claims: 9
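The message-and-response steps of claim 8 can be sketched in Python over a loopback TCP connection, the channel the abstract mentions. This is an illustrative sketch rather than the patented implementation: the `PING`/`PONG` payloads, the function names, the timeouts, and the use of a thread to stand in for the primary process are all assumptions.

```python
import socket
import threading
from typing import Callable

LOOPBACK = "127.0.0.1"


def serve_heartbeat(server: socket.socket, stop: threading.Event) -> None:
    """Runs inside the primary process: answer each PING with PONG."""
    server.settimeout(0.1)  # wake up regularly to notice the stop flag
    while not stop.is_set():
        try:
            conn, _ = server.accept()
        except socket.timeout:
            continue
        with conn:
            conn.settimeout(1.0)
            try:
                if conn.recv(16) == b"PING":
                    conn.sendall(b"PONG")
            except OSError:
                pass  # a broken client must not take the responder down
    server.close()


def primary_responds(port: int, timeout: float = 1.0) -> bool:
    """Runs inside the watchdog: True if a reply arrives in time."""
    try:
        with socket.create_connection((LOOPBACK, port), timeout=timeout) as chan:
            chan.sendall(b"PING")
            return chan.recv(16) == b"PONG"
    except OSError:  # refused connection, timeout, or reset
        return False


def watchdog_pass(port: int, restart: Callable[[], None],
                  timeout: float = 1.0) -> bool:
    """One monitoring pass; calls restart() when no response is received."""
    if primary_responds(port, timeout):
        return False
    restart()
    return True


if __name__ == "__main__":
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((LOOPBACK, 0))  # port 0: let the OS pick a free port
    server.listen()
    port = server.getsockname()[1]
    stop = threading.Event()
    worker = threading.Thread(target=serve_heartbeat, args=(server, stop))
    worker.start()
    print(watchdog_pass(port, lambda: print("restarting primary")))
    stop.set()   # simulate the primary process going silent
    worker.join()
    print(watchdog_pass(port, lambda: print("restarting primary")))
```

In this sketch a missing reply, a refused connection, and a timeout are all treated the same way, as "no response", which matches the single restart condition in the claim.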
10. A fault tolerant computer system comprising:
a central processing unit for executing a plurality of processes;
an operating system for controlling the execution of the plurality of processes on the central processing unit and providing information regarding execution of the plurality of processes in response to an operating system call;
a first process to be executed on the central processing unit;
a second process to be executed on the central processing unit, the second process including:
means for issuing an operating system call;
means for determining whether the first process is executing properly based upon a response to the operating system call;
means for restarting the first process when a determination is made that the first process is not executing properly.
Dependent claims: 11, 12, 13, 14, 15
16. A fault tolerant computer system comprising:
a central processing unit for executing a plurality of processes;
a communication link allowing communication between two of the plurality of processes executing on the central processing unit;
a first process executing on the central processing unit, the first process including means for responding on the communication link to a message received on the communication link when the first process is executing;
a second process executing on the central processing unit, the second process including:
means for transmitting a message to the first process on the communication link;
means for receiving a response on the communication link from the first process;
means for restarting the first process when a response is not received.
Specification