Systems and methods for predictive failure management

  • US 7,730,364 B2
  • Filed: 04/05/2007
  • Issued: 06/01/2010
  • Est. Priority Date: 04/05/2007
  • Status: Active Grant
First Claim

1. A system for using continuous failure predictions for proactive failure management in distributed cluster systems, comprising:

  • a sampling subsystem configured to continuously monitor and collect operation states of different system components;

  • an analysis subsystem configured to build classification models to perform on-line failure predictions; and

  • a failure prevention subsystem configured to take preventive actions on failing components based on failure warnings generated by the analysis subsystem, wherein a pre-failure state of a component is dynamically decided based on a reward function that denotes an optimal trade-off between failure impact and prediction error cost.
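
The final limitation of claim 1 ties the pre-failure decision to a reward function that balances failure impact against prediction error cost. The claim text reproduced here does not give the form of that function, so the following is only an illustrative sketch under stated assumptions. It assumes each component has a numeric risk score from a classifier, that a component is treated as pre-failure when its score reaches a threshold, and that the reward is the negative of a weighted sum of missed failures and false alarms. The names choose_alarm_threshold, failure_cost, and false_alarm_cost are hypothetical and do not come from the patent.

```python
from typing import List, Sequence, Tuple


def reward(
    threshold: float,
    scores: Sequence[float],
    will_fail: Sequence[bool],
    failure_cost: float,
    false_alarm_cost: float,
) -> float:
    """Assumed reward: negative total cost of missed failures and false alarms.

    This form is a stand-in chosen for illustration, not the patent's definition.
    """
    # A missed failure: the component fails but its score stayed below the threshold.
    missed = sum(1 for s, f in zip(scores, will_fail) if f and s < threshold)
    # A false alarm: the component is flagged as pre-failure but does not fail.
    false_alarms = sum(1 for s, f in zip(scores, will_fail) if not f and s >= threshold)
    return -(failure_cost * missed + false_alarm_cost * false_alarms)


def choose_alarm_threshold(
    scores: Sequence[float],
    will_fail: Sequence[bool],
    failure_cost: float,
    false_alarm_cost: float,
) -> Tuple[float, float]:
    """Return the (threshold, reward) pair with the highest assumed reward."""
    # Candidate thresholds: every observed score, plus "never raise an alarm".
    candidates: List[float] = sorted(set(scores)) + [float("inf")]
    best = max(
        candidates,
        key=lambda t: reward(t, scores, will_fail, failure_cost, false_alarm_cost),
    )
    return best, reward(best, scores, will_fail, failure_cost, false_alarm_cost)


if __name__ == "__main__":
    # Hypothetical labeled history: classifier scores and whether a failure followed.
    history_scores = [0.2, 0.5, 0.7]
    history_failed = [True, False, True]
    # Expensive failures favor a low threshold (more warnings, fewer misses).
    print(choose_alarm_threshold(history_scores, history_failed, 10.0, 1.0))
    # Expensive false alarms favor a high threshold (fewer warnings).
    print(choose_alarm_threshold(history_scores, history_failed, 1.0, 10.0))
```

Re-running the selection as new labeled observations arrive is one way to make the decision "dynamic" in the sense the claim uses; a real system would likely replace the simple counts with measured failure impact and action cost.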
