Failure Analysis Based on Time-Varying Failure Rates
2 Assignments
0 Petitions
Abstract
A failure analysis method and apparatus use failure rate data in coordination with component power-on hours to resolve computer system failures more efficiently, without occupying system memory or processor bandwidth. In response to a failure of the computer system, a baseboard management controller (BMC) notes the time of failure and the elapsed operating time of the system components. The BMC then accesses industry standard failure rate data correlating the elapsed operating time with the probability of failure for each component. By cross-referencing the time of failure with the failure rate data, the BMC automatically determines the probability of failure of each component at the time of failure of the computer system. The BMC generates a component replacement list identifying the component that currently has the highest probability of failure.
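The flow described in the abstract (track elapsed operating time, look up each component's probability of failure when the system fails, and rank the components) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `FAILURE_RATE_DATA` table and its numbers are invented, the step-wise lookup is only one plausible way to read tabulated failure rate data, and `ComponentTracker` is a hypothetical stand-in for the BMC's bookkeeping, not the patented implementation.

```python
from bisect import bisect_right
from dataclasses import dataclass, field

# Hypothetical failure rate table. For each component, a list of
# (elapsed_hours, probability_of_failure) points sorted by hours.
# The values are invented for illustration, not real reliability data.
FAILURE_RATE_DATA = {
    "fan":        [(0, 0.02), (10_000, 0.05), (30_000, 0.20)],
    "hard_drive": [(0, 0.04), (10_000, 0.03), (30_000, 0.12)],
    "dimm":       [(0, 0.01), (10_000, 0.01), (30_000, 0.02)],
}


def failure_probability(component, elapsed_hours):
    """Step lookup: use the last table point at or below elapsed_hours."""
    table = FAILURE_RATE_DATA[component]
    thresholds = [hours for hours, _ in table]
    index = bisect_right(thresholds, elapsed_hours) - 1
    return table[max(index, 0)][1]


@dataclass
class ComponentTracker:
    """Hypothetical record of elapsed operating (power-on) hours per component."""
    power_on_hours: dict = field(default_factory=dict)

    def add_hours(self, component, hours):
        """Accumulate operating time for one component."""
        self.power_on_hours[component] = self.power_on_hours.get(component, 0) + hours

    def replacement_list(self):
        """On a system failure, rank components from most to least likely to have failed."""
        ranked = sorted(
            ((failure_probability(name, hours), name)
             for name, hours in self.power_on_hours.items()),
            reverse=True,
        )
        return [(name, probability) for probability, name in ranked]


tracker = ComponentTracker()
tracker.add_hours("fan", 32_000)
tracker.add_hours("hard_drive", 12_000)
tracker.add_hours("dimm", 32_000)
print(tracker.replacement_list())
```

With these made-up numbers the fan heads the list, because its tabulated probability at 32,000 hours is the highest of the three. Because the lookup depends on each component's own elapsed hours, a component replaced recently can rank lower than an older component of a generally more reliable type.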
78 Citations
15 Claims
1. A method comprising:
obtaining failure rate data correlating the elapsed operating time with the probability of failure for each of a plurality of components of a computer system;
tracking the elapsed operating time of each component of the computer system; and
in response to a failure of the computer system, automatically determining the probability of failure of each component at the time of failure of the computer system from the failure rate data, and generating a component replacement list indicating the component having the highest probability of failure at the time of failure.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A computer program product including computer usable program code embodied on a baseboard management controller of a computer system, the computer usable program code for performing failure analysis on components of the computer system, the computer usable program code including:
computer usable program code for obtaining failure rate data correlating the elapsed operating time with the probability of failure for each component;
computer usable program code for tracking the elapsed operating time of each component; and
computer usable program code for, in response to a failure of the computer system, automatically determining the probability of failure of each component at the time of failure of the computer system from the failure rate data, and generating a component replacement list indicating the component having the highest probability of failure at the time of failure.
Dependent claims: 10, 11, 12, 13, 14
15. A computer system comprising:
a plurality of components; and
a baseboard management controller containing computer usable program code for performing failure analysis, the computer usable program code including:
computer usable program code for obtaining failure rate data correlating the elapsed operating time with the probability of failure for each component,
computer usable program code for tracking the elapsed operating time of each component, and
computer usable program code for, in response to a failure of the computer system, automatically determining the probability of failure of each component at the time of failure of the computer system from the failure rate data, and generating a component replacement list indicating the component having the highest probability of failure at the time of failure.
Specification