Method and apparatus for isolating failing hardware in a PCI recoverable error
Abstract
A method, apparatus, and computer implemented instructions for isolating failing hardware in a data processing system. In response to detecting a recovery attempt from an error, an indication of the attempt is stored. A hardware component associated with the error is placed in an unavailable state in response to the error exceeding a threshold for errors.
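The mechanism summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The names `Component`, `State`, and `record_recovery_attempt`, and the choice of a simple integer counter as the stored "indication", are assumptions made for illustration only.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    AVAILABLE = "available"
    UNAVAILABLE = "unavailable"


@dataclass
class Component:
    """Hypothetical record for one hardware component."""
    name: str
    state: State = State.AVAILABLE
    recovery_attempts: int = 0  # assumed form of the stored "indication"


def record_recovery_attempt(component: Component, threshold: int) -> State:
    """Store an indication of a recovery attempt and isolate the component
    once the number of attempts exceeds the threshold."""
    component.recovery_attempts += 1
    if component.recovery_attempts > threshold:
        component.state = State.UNAVAILABLE
    return component.state


# Usage: with a threshold of 3, the fourth recovery attempt isolates the part.
adapter = Component("adapter0")
for _ in range(4):
    record_recovery_attempt(adapter, threshold=3)
```

In this sketch the comparison is strict ("exceeding" the threshold), so a component remains available while its count is equal to the threshold.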
81 Citations
35 Claims
1. A method in a data processing system for isolating failing hardware in the data processing system, the method comprising:
responsive to detecting a recovery attempt from an error for an operation involving a hardware component, storing an indication of the attempt; and
responsive to the error exceeding a threshold, placing the hardware component in an unavailable state.
10. A method in a data processing system for handling errors, the method comprising:
responsive to an occurrence of an error, determining whether the error is a recoverable error;
responsive to a determination that the error is a recoverable error, identifying slots on the bus indicating an error state;
incrementing an error counter for each identified slot; and
responsive to the error counter exceeding a threshold, placing the slot into a permanently unavailable state.
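The four steps of claim 10 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patent's implementation: `Slot`, `SlotState`, and `handle_bus_error` are hypothetical names, the recoverable/non-recoverable determination is passed in as a boolean rather than derived from hardware, and the slot state is a plain field rather than a register read.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List


class SlotState(Enum):
    OK = auto()
    ERROR = auto()
    PERMANENTLY_UNAVAILABLE = auto()


@dataclass
class Slot:
    """Hypothetical model of one bus slot."""
    slot_id: int
    state: SlotState = SlotState.OK
    error_count: int = 0


def handle_bus_error(slots: List[Slot], recoverable: bool, threshold: int) -> List[int]:
    """Return the ids of slots newly placed in the permanently unavailable state."""
    # Step 1: act only on errors determined to be recoverable (assumed input).
    if not recoverable:
        return []
    isolated = []
    for slot in slots:
        # Step 2: identify slots on the bus that indicate an error state.
        if slot.state is not SlotState.ERROR:
            continue
        # Step 3: increment the error counter for each identified slot.
        slot.error_count += 1
        # Step 4: isolate the slot once its counter exceeds the threshold.
        if slot.error_count > threshold:
            slot.state = SlotState.PERMANENTLY_UNAVAILABLE
            isolated.append(slot.slot_id)
    return isolated
```

Slots that are not in the error state are left untouched, so one misbehaving adapter does not affect the counters of its neighbours. How a slot's error state is cleared after a successful recovery is outside the scope of this sketch.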
12. A data processing system comprising:
a bus system;
a communications unit connected to the bus system;
a memory connected to the bus system, wherein the memory includes a set of instructions; and
a processing unit connected to the bus system, wherein the processing unit executes the set of instructions to store an indication of a recovery attempt from an error in response to detecting the recovery attempt; and
place the hardware component in an unavailable state in response to the error exceeding a threshold.
13. A data processing system comprising:
a bus system;
a communications unit connected to the bus system;
a memory connected to the bus system, wherein the memory includes a set of instructions; and
a processing unit connected to the bus system, wherein the processing unit executes the set of instructions to determine whether the error is a recoverable error in response to an occurrence of an error;
identify slots on the bus indicating an error state in response to a determination that the error is a recoverable error;
increment an error counter for each identified slot; and
place the slot into a permanently unavailable state in response to the error counter exceeding a threshold.
14. A data processing system for isolating failing hardware in the data processing system, the data processing system comprising:
storing means, responsive to detecting a recovery attempt from an error, for storing an indication of the attempt; and
placing means, responsive to the error occurring in the more than a threshold for a hardware component, for placing the hardware component in an unavailable state.
23. A data processing system for handling errors, the data processing system comprising:
determining means, responsive to an occurrence of an error, for determining whether the error is a recoverable error;
identifying means, responsive to a determination that the error is a recoverable error, for identifying slots on the bus indicating an error state;
incrementing means for incrementing an error counter for each identified slot; and
placing means, responsive to the error counter exceeding a threshold, for placing the slot into a permanently unavailable state.
25. A computer program product in a computer readable medium for isolating failing hardware in the data processing system, the computer program product comprising:
first instructions, responsive to detecting a recovery attempt from an error, for storing an indication of the attempt; and
second instructions, responsive to the error occurring in the more than a threshold for a hardware component, for placing the hardware component in an unavailable state.
34. A computer program product in a computer readable medium for handling errors, the computer program product comprising:
first instructions, responsive to an occurrence of an error, for determining whether the error is a recoverable error;
second instructions, responsive to a determination that the error is a recoverable error, for identifying slots on the bus indicating an error state;
third instructions for incrementing an error counter for each identified slot; and
fourth instructions, responsive to the error counter exceeding a threshold, for placing the slot into a permanently unavailable state.
Specification