Fault recovery in systems utilizing redundant processor arrangements
Abstract
Apparatus to prevent endless switchover attempts between processors in redundant processor systems where each processor resets an associated watch dog timer (WDT). Whenever a WDT times out, that WDT sends a restart signal to its associated processor and a failure signal to switchover control logic. If the standby processor is healthy and is properly resetting its WDT, the switchover control logic generates a signal to cause a switchover from the active processor to the standby processor and a signal to increment a switchover counter. If the standby processor is not healthy, the switchover control logic generates a signal to cause a cold reboot of the entire system. The value of the switchover counter is compared with a predetermined threshold value. If the value of the switchover counter matches the predetermined threshold value, a signal is generated to cause a cold reboot of the entire system. A timer associated with the switchover counter periodically clears the switchover counter. Thus, if the system is switching back and forth between the redundant processors at a rate which causes the switchover counter to reach the predetermined threshold before the switchover counter can be cleared by the timer, the system will perform a cold reboot.
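The behavior described in the abstract can be illustrated with a minimal software sketch. The abstract describes hardware-style control logic; the Python model below is only an approximation under stated assumptions. The names `Action`, `SwitchoverControl`, `on_active_wdt_timeout`, and `on_timer` are hypothetical, as is the threshold value, and the sketch assumes that reaching the threshold replaces the switchover with a cold reboot.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    SWITCHOVER = "switchover"
    COLD_REBOOT = "cold_reboot"


@dataclass
class SwitchoverControl:
    """Toy model of the switchover control logic (all names are hypothetical)."""

    threshold: int  # the "predetermined threshold value"; the actual value is not given
    count: int = 0  # the switchover counter

    def on_active_wdt_timeout(self, standby_ok: bool) -> Action:
        """React to a failure signal from the active processor's WDT."""
        if not standby_ok:
            # Standby is not resetting its WDT either: reboot the whole system.
            return Action.COLD_REBOOT
        # Standby is healthy: a switchover is requested and counted.
        self.count += 1
        if self.count >= self.threshold:
            # Too many switchovers before the timer cleared the counter.
            # Assumption: the reboot takes precedence over the switchover.
            return Action.COLD_REBOOT
        return Action.SWITCHOVER

    def on_timer(self) -> None:
        """Periodic timing signal: clear the switchover counter."""
        self.count = 0


# Rapid flapping with no timer tick in between ends in a cold reboot.
flapping = SwitchoverControl(threshold=3)
flapping_actions = [flapping.on_active_wdt_timeout(standby_ok=True) for _ in range(3)]

# The same number of failures, spread out so the timer clears the counter, does not.
spaced = SwitchoverControl(threshold=3)
spaced_actions = [spaced.on_active_wdt_timeout(standby_ok=True) for _ in range(2)]
spaced.on_timer()
spaced_actions += [spaced.on_active_wdt_timeout(standby_ok=True) for _ in range(2)]
```

In this sketch the counter acts as a rate limiter: only switchovers that accumulate within one timer interval count toward the threshold, which matches the abstract's statement that a reboot occurs when the counter reaches the threshold before the timer can clear it.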
5 Claims
1. Apparatus for use in a system comprised of a first processing system and a second processing system, the first and second processing systems being redundant processing systems whereby one of the first and second processing systems is an active processing system and the other one of the first and second processing systems is a standby processing system, the apparatus being used to prevent endless switchovers between the active processing system and the standby processing system, a switchover being an event wherein the active processing system becomes the standby processing system and vice versa, wherein the first processing system is associated with a first watch dog timer (WDT) and the second processing system is associated with a second WDT, wherein each of the first and second processing systems attempts to reset the WDT associated therewith within a predetermined period of time, and wherein the WDT associated with each processing system sends a restart signal to that processing system if the WDT has not been reset thereby, the apparatus comprising:
each of the WDTs further comprising means for outputting a YESORNO-OK signal which indicates whether or not the WDT has issued a restart signal and for applying the YESORNO-OK signal from both of the WDTs to switchover control logic means;
the switchover control logic means being means, in response to the YESORNO-OK signals from both of the WDTs, for generating at least one switchover signal to cause a switchover wherein the active processing system becomes the standby processing system and vice versa if the YESORNO-OK signal from the WDT associated with the active processing system indicates that the active processing system has issued a restart signal and the YESORNO-OK signal from the WDT associated with the standby processing system indicates that the standby processing system has not issued a restart signal and, if the standby processing system has also issued a restart signal, the switchover control logic means being means for generating a reboot signal to cause a reboot of the system;
the switchover control logic means further comprising:
timer means for generating a timing signal at a predetermined time interval; and
switchover counter means, responsive to at least one of the at least one switchover signal, for counting the number of switchovers which occur, the switchover counter means being further responsive to the timing signal for clearing the count of switchovers;
wherein the switchover counter means further comprises means for generating the reboot signal to cause a reboot of the system whenever the count of switchovers maintained by the switchover counter means exceeds a predetermined amount.
Dependent claims: 2, 3, 4
5. Apparatus for use in a system comprised of a first processing system and a second processing system, the first and second processing systems being redundant processing systems whereby one of the first and second processing systems is an active processing system and the other one of the first and second processing systems is a standby processing system, the apparatus being used to prevent endless switchovers between the active processing system and the standby processing system, a switchover being an event wherein the active processing system becomes the standby processing system and vice versa, wherein the first processing system is associated with a first watch dog timer (WDT) and the second processing system is associated with a second WDT, wherein each of the first and second processing systems attempts to reset the WDT associated therewith within a predetermined period of time, and wherein the WDT associated with each processing system sends a restart signal to that processing system if the WDT has not been reset thereby, the apparatus comprising:
each of the WDTs further comprising means for outputting a YESORNO-OK signal which indicates whether or not the WDT has issued a restart signal and for applying the YESORNO-OK signal from the first and second WDT to both the first and second switchover control logic means;
the first and second switchover control logic means being means, in response to the YESORNO-OK signals from both of the WDTs, for generating at least one switchover signal to cause a switchover wherein the active processing system becomes the standby processing system and vice versa if the YESORNO-OK signal from the WDT associated with the active processing system indicates that the active processing system has issued a restart signal and the YESORNO-OK signal from the WDT associated with the standby processing system indicates that the standby processing system has not issued a restart signal and, if the standby processing system has also issued a restart signal, the switchover control logic means being means for generating a reboot signal to cause a reboot of the system;
the first and second switchover control logic means further comprising:
timer means for generating a timing signal at a predetermined time interval; and
switchover counter means, responsive to at least one of the at least one switchover signal, for counting the number of switchovers which occur, the switchover counter means being further responsive to the timing signal for clearing the count of switchovers;
wherein the switchover counter means further comprises means for generating the reboot signal to cause a reboot of the system whenever the count of switchovers maintained by the switchover counter means exceeds a predetermined amount.
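As an illustration only, the arrangement recited in claim 5, in which both YESORNO-OK signals are applied to a first and a second switchover control logic, might be modeled as below. This is a sketch under assumptions, not the claimed apparatus: `decide`, `ControlLogic`, `run_pair`, and the `limit` parameter are hypothetical names, each YESORNO-OK signal is reduced to a boolean "restart issued" flag, and the case the claim does not address (only the standby WDT has issued a restart) is treated as "no action".

```python
def decide(active_restarted: bool, standby_restarted: bool) -> str:
    """Stateless decision on the two YESORNO-OK signals (modeled as booleans)."""
    if not active_restarted:
        return "none"  # assumption: the claim does not specify this case
    if standby_restarted:
        return "reboot"  # both WDTs have issued restart signals
    return "switchover"  # active failed, standby healthy


class ControlLogic:
    """One switchover control logic unit with its own switchover counter."""

    def __init__(self, limit: int) -> None:
        self.limit = limit  # the "predetermined amount"; the value is an assumption
        self.count = 0

    def step(self, active_restarted: bool, standby_restarted: bool) -> str:
        action = decide(active_restarted, standby_restarted)
        if action == "switchover":
            self.count += 1
            if self.count > self.limit:  # the claim says "exceeds"
                return "reboot"
        return action

    def timing_signal(self) -> None:
        """Timer means: clear the count of switchovers."""
        self.count = 0


def run_pair(signals, limit):
    """Feed the same signal pairs to a first and a second control logic unit."""
    first, second = ControlLogic(limit), ControlLogic(limit)
    return [(first.step(a, s), second.step(a, s)) for a, s in signals]


# Three consecutive active failures with a healthy standby and limit=2.
pair_result = run_pair([(True, False)] * 3, limit=2)
```

Because both units see identical inputs and keep identical counters in this model, they reach the same decision at every step; how the real apparatus arbitrates between the two units is not something this sketch attempts to represent.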
Specification