In-situ computing system failure avoidance
Abstract
A remaining time to replace can be updated taking into account time variation of a failure mechanism of a device. Starting with an initial remaining time to replace, an effective operating time can be determined periodically based on an operating parameter measured at a tracking interval, and remaining time to replace can be updated by subtracting the effective operating time. The technique can be applied to multiple failure mechanisms and to multiple devices and/or components each having multiple failure mechanisms.
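The update described in the abstract can be sketched as follows. This is a minimal illustration, not the patent's implementation: the text does not say how the effective operating time is derived from the operating parameter, so the sketch assumes temperature as the parameter and an Arrhenius acceleration factor relative to a reference temperature. The function names, the reference temperature, and the activation energy are all hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def acceleration_factor(temp_k, ref_temp_k, activation_energy_ev):
    """Arrhenius acceleration relative to a reference temperature (assumed model)."""
    return math.exp(
        activation_energy_ev / BOLTZMANN_EV_PER_K * (1.0 / ref_temp_k - 1.0 / temp_k)
    )


def effective_operating_time(temp_k, interval_h,
                             ref_temp_k=328.15, activation_energy_ev=0.7):
    """Hours of wear accrued in one tracking interval at the measured temperature.

    The default reference temperature and activation energy are placeholders.
    """
    return interval_h * acceleration_factor(temp_k, ref_temp_k, activation_energy_ev)


def update_remaining_time(remaining_h, temp_k, interval_h):
    """Subtract the effective operating time from the remaining time to replace."""
    return remaining_h - effective_operating_time(temp_k, interval_h)


# Usage: start from an assigned time to replace and update once per interval.
remaining_h = 50_000.0                      # hypothetical initial time to replace
for measured_temp_k in (328.15, 338.15, 348.15):
    remaining_h = update_remaining_time(remaining_h, measured_temp_k, interval_h=1.0)
```

Under this assumed model, an interval spent at the reference temperature consumes exactly one interval of remaining time, while a hotter interval consumes more.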
20 Claims
1. A computing system device failure avoidance method for a computing system including at least one device, the method comprising:
  identifying at least one failure mechanism of each device affected by time variation of an operating parameter of the device;
  assigning a respective time to replace for each failure mechanism;
  assigning a respective remaining time to replace initially equal to the respective time to replace for each failure mechanism,
  tracking the operating parameter periodically at a tracking interval;
  tracking a respective remaining time to replace for each failure mechanism periodically at the tracking interval, including,
    determining an effective operating time of the respective device during each tracking interval based on at least one value of the operating parameter tracked during a respective tracking interval, and
    subtracting the effective operating time from the respective remaining time to replace; and
  replacing a respective device responsive to one of the respective estimated time to replace reaching a respective threshold value.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
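One way to picture claim 1 with several failure mechanisms on a single device is sketched below. It is an illustration under stated assumptions, not the claimed implementation: the `FailureMechanism` class, the `track_interval` function, and the two example effective-time models (wear proportional to load, and plain calendar aging) are hypothetical names and models chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class FailureMechanism:
    name: str
    time_to_replace_h: float
    threshold_h: float
    # Maps (tracked parameter value, interval length) to effective hours.
    # The claim does not fix this mapping; it is supplied by the caller here.
    effective_time: Callable[[float, float], float]
    remaining_h: Optional[float] = None

    def __post_init__(self):
        # Remaining time to replace starts equal to the assigned time to replace.
        if self.remaining_h is None:
            self.remaining_h = self.time_to_replace_h


def track_interval(mechanisms: List[FailureMechanism],
                   parameter_value: float, interval_h: float) -> bool:
    """Update every mechanism for one tracking interval.

    Returns True when at least one remaining time has reached its threshold,
    which is the condition for replacing the device.
    """
    for m in mechanisms:
        m.remaining_h -= m.effective_time(parameter_value, interval_h)
    return any(m.remaining_h <= m.threshold_h for m in mechanisms)


# Usage with two hypothetical mechanisms tracked against one load parameter.
mechanisms = [
    FailureMechanism("wear", 10.0, 0.0, lambda load, dt: dt * load),
    FailureMechanism("aging", 100.0, 0.0, lambda _load, dt: dt),
]
replace_now = False
intervals = 0
while not replace_now:
    replace_now = track_interval(mechanisms, parameter_value=2.0, interval_h=1.0)
    intervals += 1
```

Keeping a separate remaining time per mechanism means the device is flagged by whichever mechanism wears out first under the observed operating conditions.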
12. An in-situ computing system device failure avoidance method for a computing system including at least one device, the method comprising:
  assigning a respective time to replace for each device of the at least one device;
  assigning a respective remaining time to replace initially equal to the respective time to replace;
  tracking at least one operating parameter of each device, including measuring an operating parameter of the device at least once for each device during a tracking interval;
  determining a respective effective operation time for each device based on a respective measured value of the operating parameter and the tracking interval;
  subtracting the respective effective operation time from the respective remaining time to replace;
  monitoring for a failure of the at least one device;
  responsive to a failure of the at least one device, maintaining the respective device and continuing operation in response to the failure being recoverable, and notifying a user in response to the failure not being recoverable; and
  responsive to a respective remaining time to replace having reached a threshold value, replacing the respective device.
Dependent claims: 13, 14, 15
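The control flow of claim 12, which adds failure monitoring to the remaining-time update, might be sketched as below. This is a hedged illustration only: the `Device` class, the `Failure` enum, the `tracking_step` function, and the action strings are hypothetical, and the effective-time model is passed in by the caller because the claim does not specify one.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional


class Failure(Enum):
    RECOVERABLE = "recoverable"
    UNRECOVERABLE = "unrecoverable"


@dataclass
class Device:
    name: str
    remaining_h: float          # starts equal to the assigned time to replace
    threshold_h: float = 0.0


def tracking_step(device: Device,
                  measured_value: float,
                  interval_h: float,
                  effective_time: Callable[[float, float], float],
                  failure: Optional[Failure] = None) -> List[str]:
    """Run one tracking interval and return the actions it calls for."""
    actions: List[str] = []

    # Subtract the effective operation time for this interval.
    device.remaining_h -= effective_time(measured_value, interval_h)

    # Failure handling: recoverable failures are maintained and operation
    # continues; unrecoverable failures are reported to the user.
    if failure is Failure.RECOVERABLE:
        actions.append("maintain_and_continue")
    elif failure is Failure.UNRECOVERABLE:
        actions.append("notify_user")

    # Replacement is driven by the remaining time reaching its threshold.
    if device.remaining_h <= device.threshold_h:
        actions.append("replace")
    return actions


# Usage: a hypothetical fan whose wear scales with the measured value.
fan = Device("fan", remaining_h=2.0)
wear = lambda value, dt: value * dt
first = tracking_step(fan, 1.0, 1.0, wear)
second = tracking_step(fan, 1.0, 1.0, wear, Failure.RECOVERABLE)
```

Returning a list of actions, rather than acting directly, keeps the sketch free of any assumption about how maintenance, notification, or replacement is actually carried out.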
16. A computing system failure avoidance computer program product for a computing system including at least one device and at least one processing unit in communication with at least one non-transitory computer readable storage medium, the computer program product being stored on the at least one non-transitory computer readable storage medium and including instructions in the form of computer executable code that when loaded and executed by the processing unit cause the processing unit to perform a method comprising:
  identifying at least one failure mechanism of each of the at least one device affected by time variation of an operating parameter of the respective device;
  assigning a respective time to replace for each failure mechanism;
  assigning a respective remaining time to replace for each failure mechanism initially equal to the respective time to replace,
  tracking the operating parameter periodically at a tracking interval;
  tracking the respective remaining time to replace for each failure mechanism, including,
    determining an effective operating time of each device during each tracking interval based on at least one value of the operating parameter tracked during a respective tracking interval, and
    subtracting the effective operating time from the respective remaining time to replace; and
  replacing a respective device in response to one of the respective remaining time to replace reaching a respective threshold value.
Dependent claims: 17, 18, 19, 20
Specification