Failure recognition, notification, and prevention for learning and self-healing capabilities in a monitored system
Abstract
The present invention provides failure recognition, notification, and prevention for learning and self-healing capabilities in a monitored system. A system is monitored to collect monitoring data. A failure of the system is detected. A failure point for the detected failure is identified in a data space defined by the monitoring data, and at least one predefined action is associated with the identified failure point. This process is repeated for a plurality of system failures. When a state of the system is determined to be approaching an identified failure point, the at least one predefined action associated with that identified failure point is performed.
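The prevention step described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: it assumes that the data space is a tuple of numeric metrics, that "approaching" a failure point means coming within a fixed Euclidean distance of it, and that actions are plain callables. The names `distance`, `actions_for_state`, `radius`, and the sample metric values are all hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical types: a point in the data space is a tuple of metric values
# (for example CPU load and memory use); an action is a zero-argument callable.
Point = Tuple[float, ...]
Action = Callable[[], str]


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two points (an assumed choice of metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def actions_for_state(
    state: Point,
    failure_points: Dict[Point, List[Action]],
    radius: float,
) -> List[str]:
    """Perform the actions of every known failure point within `radius` of `state`."""
    results: List[str] = []
    for point, actions in failure_points.items():
        if distance(state, point) <= radius:
            results.extend(action() for action in actions)
    return results


# Two previously identified failure points, each with its predefined actions.
known = {
    (0.95, 0.90): [lambda: "notify operator", lambda: "restart service"],
    (0.10, 0.99): [lambda: "free memory"],
}

near = actions_for_state((0.93, 0.88), known, radius=0.05)  # close to the first point
far = actions_for_state((0.50, 0.50), known, radius=0.05)   # close to neither point
```

A fixed radius is the simplest possible proximity test. A fuller treatment might also consider the direction in which the state is moving, which this sketch does not attempt.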
29 Claims
1. A method for failure recognition, comprising:
monitoring a system to collect monitoring data;
detecting a failure of the system;
identifying a failure point for the detected failure in a data space defined by the monitoring data; and
associating at least one predefined action with the identified failure point.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
16. A method for deploying computing infrastructure, comprising integrating computer usable code into a computing system, wherein the computer usable code, in combination with the computing system, is capable of performing the following:
monitoring a system to collect monitoring data;
detecting a failure of the system;
identifying a failure point for the detected failure in a data space defined by the monitoring data; and
associating at least one predefined action with the identified failure point.
17. A system for failure recognition, comprising:
means for monitoring a system to collect monitoring data;
means for detecting at least one failure of the monitored system;
means for identifying a failure point for each detected failure in a data space defined by the monitoring data; and
means for associating at least one predefined action with each identified failure point.
Dependent claims: 18, 19, 20, 21, 22, 23, 24, 25, 26
27. A computer program product for failure recognition, the computer program product comprising:
a computer usable medium having computer usable program code embodied therein, the computer usable program code comprising:
computer usable program code configured to monitor a system to collect monitoring data;
computer usable program code configured to detect a failure of the system;
computer usable program code configured to identify a failure point for the detected failure in a data space defined by the monitoring data; and
computer usable program code configured to associate at least one predefined action with the identified failure point.
28. A failure learning process, comprising:
monitoring a system to collect monitoring data;
detecting a failure of the system;
identifying a failure point for the detected failure in a data space defined by the monitoring data;
associating at least one predefined action with the identified failure point;
storing information regarding the identified failure point and each predefined action associated therewith; and
repeating each of the above steps to learn additional failure points.
Dependent claims: 29
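The learning loop of claim 28 might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the failure point is taken to be the monitoring sample at which the failure is detected, the detector and the action lookup are caller-supplied functions, and storage is an in-memory dictionary. `FailureLearner`, `is_failure`, `choose_actions`, and the sample stream are hypothetical names and values.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical type: one monitoring sample is a tuple of metric values.
Sample = Tuple[float, ...]


@dataclass
class FailureLearner:
    """Hypothetical store mapping failure points to names of predefined actions."""

    is_failure: Callable[[Sample], bool]            # assumed failure detector
    choose_actions: Callable[[Sample], List[str]]   # assumed action lookup
    learned: Dict[Sample, List[str]] = field(default_factory=dict)

    def observe(self, sample: Sample) -> bool:
        """Process one sample; return True if a new failure point was stored."""
        if not self.is_failure(sample):
            return False
        if sample in self.learned:
            return False  # this failure point is already known
        # Assumption: the failure point is the sample observed at failure time.
        self.learned[sample] = self.choose_actions(sample)
        return True

    def run(self, samples: Iterable[Sample]) -> int:
        """Repeat the steps over a stream; return the number of new failure points."""
        return sum(self.observe(s) for s in samples)


learner = FailureLearner(
    is_failure=lambda s: s[0] > 0.9,                # e.g. first metric above 90%
    choose_actions=lambda s: ["notify operator"],
)
stream = [(0.2, 0.3), (0.95, 0.4), (0.5, 0.5), (0.97, 0.8), (0.95, 0.4)]
new_points = learner.run(stream)
```

In this sketch, the repeated sample `(0.95, 0.4)` is stored only once, so `new_points` counts distinct failure points rather than failure events.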
Specification