Multi-CPU failure detection/recovery system and method for the same
First Claim
1. A multi-processor system including processors, comprising:
a failure state detection unit to detect a failure in an operating program and an interruption failure; and
a recovery unit to determine, when the failure state detection unit has detected a failure, whether or not recovery of data involved in the failure is possible on the basis of content of the detected failure, and to recover the data when recovery is determined to be possible, wherein
when the failure is detected, a processor in which the operating program is running requests a processor other than a processor in which the failure is detected to perform a recovery process; and
an interruption failure is detected by comparing a time that has elapsed since exclusive control was obtained with a measured maximum value of the interruption prohibition time obtained and by comparing a time that has elapsed since obtainment of the exclusive control failed first with the measured maximum value of the interruption prohibition time obtained.
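The timing comparison recited in the last clause can be illustrated with a minimal sketch. It is not taken from the specification: the `LockTiming` structure, the `interruption_failure` function, and the rule that a failure is reported when an elapsed time exceeds the measured maximum are all assumptions made for illustration. The claim itself only states that the two elapsed times are compared with the measured maximum interruption prohibition time.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LockTiming:
    """Hypothetical per-lock timestamps, in seconds."""
    acquired_at: Optional[float] = None      # when exclusive control was obtained
    first_failed_at: Optional[float] = None  # when obtaining exclusive control first failed


def interruption_failure(timing: LockTiming, now: float,
                         max_prohibition_time: float) -> bool:
    """Compare both elapsed times with the measured maximum prohibition time.

    Assumption: exceeding the measured maximum counts as an interruption failure.
    """
    held_too_long = (
        timing.acquired_at is not None
        and now - timing.acquired_at > max_prohibition_time
    )
    waited_too_long = (
        timing.first_failed_at is not None
        and now - timing.first_failed_at > max_prohibition_time
    )
    return held_too_long or waited_too_long
```

In this sketch the first comparison covers a processor that holds exclusive control for too long, and the second covers a processor that has been unable to obtain it for too long.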
Abstract
A multi-CPU system including plural CPUs, comprising a failure state detection unit for detecting a failure in an operating program, and a recovery unit for determining, when the failure state detection unit has detected a failure, whether or not recovery of data involved in the failure is possible on the basis of content of the detected failure, and for recovering the data when recovery is determined to be possible.
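The detect, decide, and recover flow described in the abstract, together with the hand-off to another processor recited in claim 1, might be sketched as follows. Everything named here is hypothetical: the `Failure` record, the failure kinds in `RECOVERABLE_KINDS`, and the choice of the first available other CPU are illustrative assumptions, not details from the patent.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Failure:
    cpu_id: int  # CPU on which the failure was detected
    kind: str    # content of the detected failure (hypothetical categories)


# Assumption: which failure contents allow data recovery is decided by a fixed set.
RECOVERABLE_KINDS = {"lock_timeout", "queue_corruption"}


def is_recoverable(failure: Failure) -> bool:
    """Decide recoverability from the content of the detected failure."""
    return failure.kind in RECOVERABLE_KINDS


def pick_recovery_cpu(failure: Failure, cpu_ids: Sequence[int]) -> Optional[int]:
    """Pick a CPU other than the one on which the failure was detected."""
    for cpu in cpu_ids:
        if cpu != failure.cpu_id:
            return cpu
    return None


def handle_failure(failure: Failure, cpu_ids: Sequence[int]) -> Tuple[str, Optional[int]]:
    """Return ("recover", target_cpu) or ("no_recovery", None)."""
    if not is_recoverable(failure):
        return ("no_recovery", None)
    target = pick_recovery_cpu(failure, cpu_ids)
    if target is None:
        return ("no_recovery", None)
    return ("recover", target)
```

For example, under these assumptions `handle_failure(Failure(cpu_id=0, kind="lock_timeout"), [0, 1, 2])` selects CPU 1 to perform the recovery.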
18 Claims
1. A multi-processor system including processors, comprising:
a failure state detection unit to detect a failure in an operating program and an interruption failure; and
a recovery unit to determine, when the failure state detection unit has detected a failure, whether or not recovery of data involved in the failure is possible on the basis of content of the detected failure, and to recover the data when recovery is determined to be possible, wherein
when the failure is detected, a processor in which the operating program is running requests a processor other than a processor in which the failure is detected to perform a recovery process; and
an interruption failure is detected by comparing a time that has elapsed since exclusive control was obtained with a measured maximum value of the interruption prohibition time obtained and by comparing a time that has elapsed since obtainment of the exclusive control failed first with the measured maximum value of the interruption prohibition time obtained.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A method of detecting a failure and automatic recovery in a multi-processor system including processors, comprising:
detecting a failure in an operating program and an interruption failure;
determining, when the failure state detection step has detected a failure, whether or not recovery of data involved in the failure is possible on the basis of content of the detected failure, and recovering the data when recovery is determined to be possible, wherein
when the failure is detected, a processor in which the operating program is running requests a processor other than a processor in which the failure is detected to perform a recovery process; and
an interruption failure is detected by comparing a time that has elapsed since exclusive control was obtained with a measured maximum value of the interruption prohibition time obtained and by comparing a time that has elapsed since obtainment of the exclusive control failed first with the measured maximum value of the interruption prohibition time obtained.
10. A computer-readable storage medium storing a program executed in a multi-processor system including processors, causing the multi-processor system to implement:
a function of detecting a failure in an operating program and an interruption failure; and
a function of determining, when the failure state detection function has detected a failure, whether or not recovery of data involved in the failure is possible on the basis of content of the detected failure, and for recovering the data when recovery is determined to be possible, wherein
when the failure is detected, a processor in which the operating program is running requests a processor other than a processor in which the failure is detected to perform a recovery process; and
an interruption failure is detected by comparing a time that has elapsed since exclusive control was obtained with a measured maximum value of the interruption prohibition time obtained and by comparing a time that has elapsed since obtainment of the exclusive control failed first with the measured maximum value of the interruption prohibition time obtained.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
Specification