Detecting and correcting a failure sequence in a computer system before a failure occurs

  • US 7,181,651 B2
  • Filed: 02/11/2004
  • Issued: 02/20/2007
  • Est. Priority Date: 02/11/2004
  • Status: Active Grant
First Claim

1. A method for detecting a failure sequence or other undesirable system behavior in a computer system and subsequently taking a corresponding remedial action, comprising:

  • receiving instrumentation signals from the computer system while the computer system is operating;

    determining from the instrumentation signals if the computer system is in a failure sequence that is likely to lead to undesirable system behavior, such as a system crash, wherein determining if the computer system is in a failure sequence involves:

      determining correlations between instrumentation signals in the computer system, wherein determining the correlations involves using a non-linear, non-parametric regression technique to determine the correlations, whereby the correlations can subsequently be used to generate estimated signals,

      deriving estimated signals for a number of instrumentation signals, wherein each estimated signal is derived from correlations with other instrumentation signals, and

      comparing an actual signal with an estimated signal for a number of instrumentation signals to determine whether the computer system is in a failure sequence;

    wherein the determination involves considering predetermined multivariate correlations between multiple instrumentation signals and a failure sequence that is likely to lead to undesirable system behavior; and

    if the computer system is in a failure sequence that is likely to lead to undesirable system behavior, taking a remedial action.
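
The estimate-and-compare steps of the claim can be illustrated with a minimal sketch. The claim does not name a specific regression technique, so Nadaraya-Watson kernel regression is swapped in here purely as one example of a non-linear, non-parametric method. All function names, the bandwidth, the residual threshold, and the sample data are hypothetical and do not come from the patent. The sketch also omits the claim's further step of matching against predetermined multivariate correlations for known failure sequences.

```python
import math


def kernel_estimate(train_inputs, train_targets, query, bandwidth):
    """Nadaraya-Watson estimate (an assumed stand-in technique):
    a Gaussian-weighted average of the training targets."""
    weights = []
    for row in train_inputs:
        dist2 = sum((a - b) ** 2 for a, b in zip(row, query))
        weights.append(math.exp(-dist2 / (2.0 * bandwidth ** 2)))
    total = sum(weights)
    if total == 0.0:
        # Query is far from every training point; fall back to the mean.
        return sum(train_targets) / len(train_targets)
    return sum(w * t for w, t in zip(weights, train_targets)) / total


def estimate_signals(training, observation, bandwidth):
    """Estimate each signal from the *other* signals in the observation."""
    estimates = []
    for i in range(len(observation)):
        inputs = [row[:i] + row[i + 1:] for row in training]
        targets = [row[i] for row in training]
        query = observation[:i] + observation[i + 1:]
        estimates.append(kernel_estimate(inputs, targets, query, bandwidth))
    return estimates


def monitor(training, observation, threshold, remedial_action, bandwidth=0.5):
    """Compare actual and estimated signals; call the remedial action
    if any residual exceeds the (hypothetical) threshold."""
    estimates = estimate_signals(training, observation, bandwidth)
    residuals = [abs(a - e) for a, e in zip(observation, estimates)]
    failing = any(r > threshold for r in residuals)
    if failing:
        remedial_action(residuals)
    return failing


# Hypothetical training data from normal operation: two correlated
# signals, where the second is the square of the first.
training = [[x / 2.0, (x / 2.0) ** 2] for x in range(21)]

actions = []  # records each remedial action that was triggered
healthy = monitor(training, [3.0, 9.0], 1.0, actions.append)
faulty = monitor(training, [3.0, 20.0], 1.0, actions.append)
```

In this sketch the first observation agrees with the learned relationship, so no action is taken, while the second deviates from it and triggers the callback. A real system would need a more careful decision rule than a fixed residual threshold.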
