Apparatus and method of repairing a processor array for a failure detected at runtime
Abstract
An apparatus and method of repairing a processor array for a failure detected at runtime in a system supporting persistent component deallocation are provided. The apparatus and method of the present invention allow redundant array bits to be used for recoverable faults detected in arrays during runtime, instead of only at system boot, while still maintaining the dynamic and persistent processor deallocation features of the computing system. With the apparatus and method of the present invention, a failure of a cache array is detected during runtime and a determination is made as to whether a repairable failure threshold is exceeded. If this threshold is exceeded, a determination is made as to whether cache array redundancy may be applied to correct the failure, i.e., a bit error. If so, the cache array redundancy is applied without marking the processor as unavailable. At some later time, the system undergoes a re-initial program load (re-IPL), at which time it is determined whether a second failure of the processor occurs. If a second failure occurs, a determination is made as to whether any status bits are set for arrays other than the cache array that experienced the present failure. If so, the processor is marked unavailable. If not, a determination is made as to whether cache redundancy can be applied to correct the failure. If so, the failure is corrected using the cache redundancy. If not, the processor is marked unavailable.
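The decision flow described in the abstract can be sketched in Python as follows. This is only an illustrative model, not the patented implementation: the class names, the `THRESHOLD` value, the `spare_bits` counter, and the rule that a status bit is set whenever a repair is applied are all assumptions introduced for the example. The "deferred" outcome at runtime follows the wording of claim 1 rather than the abstract, which does not say what happens when redundancy cannot be applied at runtime.

```python
from dataclasses import dataclass

THRESHOLD = 3  # hypothetical repairable-failure threshold


@dataclass
class CacheArray:
    spare_bits: int            # assumed count of unused redundant bits
    error_count: int = 0       # recoverable errors seen since the last repair
    status_bit: bool = False   # assumed to be set once a repair has been applied

    def try_repair(self) -> bool:
        """Apply redundancy if a spare bit is left; return True on success."""
        if self.spare_bits == 0:
            return False
        self.spare_bits -= 1
        self.status_bit = True
        self.error_count = 0
        return True


@dataclass
class Processor:
    arrays: dict               # array name -> CacheArray
    available: bool = True     # False once the processor is marked unavailable


def on_runtime_failure(cpu: Processor, name: str) -> str:
    """Runtime path: repair in place, never mark the processor unavailable."""
    arr = cpu.arrays[name]
    arr.error_count += 1
    if arr.error_count <= THRESHOLD:
        return "below-threshold"
    if arr.try_repair():
        return "repaired"
    return "deferred"          # assumption: decision is left to the next re-IPL


def on_reipl_failure(cpu: Processor, name: str) -> str:
    """Re-IPL path for a second failure on the array called `name`."""
    others_flagged = any(
        a.status_bit for n, a in cpu.arrays.items() if n != name
    )
    if others_flagged or not cpu.arrays[name].try_repair():
        cpu.available = False  # mark the processor unavailable
        return "deallocated"
    return "repaired"
```

In this sketch the processor stays available throughout the runtime path, and it is marked unavailable only at re-IPL, either because another array already carries a status bit or because the failing array has no redundancy left.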
36 Claims
1. A method of correcting an error in a processor of a computing system, comprising:
identifying a failure of a cache array of a processor;
deferring deallocation of the processor until reboot of the computing system;
determining, at reboot of the computing system, if redundancy in the cache array may be used to correct the failure of the cache array; and
deallocating the processor, at reboot of the computing system, if it is determined that redundancy in the cache array cannot be used to correct the failure of the cache array.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. An apparatus for correcting an error in a processor of a computing system, comprising:
means for identifying a failure of a cache array of a processor;
means for deferring deallocation of the processor until reboot of the computing system;
means for determining, at reboot of the computing system, if redundancy in the cache array may be used to correct the failure of the cache array; and
means for deallocating the processor, at reboot of the computing system, if it is determined that redundancy in the cache array cannot be used to correct the failure of the cache array.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24
25. A computer program product in a computer readable medium for correcting an error in a processor of a computing system, comprising:
first instructions for identifying a failure of a cache array of a processor;
second instructions for deferring deallocation of the processor until reboot of the computing system;
third instructions for determining, at reboot of the computing system, if redundancy in the cache array may be used to correct the failure of the cache array; and
fourth instructions for deallocating the processor, at reboot of the computing system, if it is determined that redundancy in the cache array cannot be used to correct the failure of the cache array.
Dependent claims: 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36
Specification