Monitoring and resolving deadlocks, contention, runaway CPU and other virtual machine production issues
First Claim
1. A computer program product, comprising:
- a computer-readable storage medium having computer-readable program code embodied therewith, the computer-readable program code comprising;
computer-readable program code configured to monitor a set of health status metrics of a virtual machine at a first level;
computer-readable program code configured to analyze data of the monitored health status metrics to determine that an instability has occurred when the data exceeds defined bounds for the computing system health status metrics;
computer-readable program code configured to respond to the instability by monitoring additional health status metrics, whereby a level of monitoring of the system is increased from the first level to a second level, greater than the first level;
computer-readable program code configured to repair the system by taking corrective action based on the instability; and
computer-readable program code configured to stop monitoring at least one of the set of monitored health status metrics to reduce the level of monitoring to a third level once the instability has been resolved, wherein the third level is less than the second level.
Abstract
Resolving virtual machine (VM) issues, by executing VM and operating system (OS) diagnostic monitors, including, monitoring a set of VM and OS health status metrics of a system at a first level, analyzing data of the monitored health status metrics to determine that an instability has occurred when the data exceeds defined bounds for the health status metrics, responding to the instability by monitoring additional VM and OS health status metrics, whereby a level of monitoring of the system is increased from the first level to a second level, greater than the first level, identifying the instability, repairing the system by taking corrective action based on the identified instability; and removing at least one of the set of monitoring and profiling tools to reduce the level of monitoring to a third level once the instability has been resolved, wherein the third level is less than the second level.
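The escalate, repair, and de-escalate cycle described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation. The class name `AdaptiveMonitor`, the metric names, the bounds, and the repair callback are all hypothetical, and the "level" of monitoring is assumed here to be simply the number of metrics currently being sampled.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Reader = Callable[[], float]          # hypothetical: returns one metric sample
Bounds = Tuple[float, float]          # (low, high) limits for a metric


@dataclass
class AdaptiveMonitor:
    base_metrics: Dict[str, Reader]   # sampled at the first (lowest) level
    extra_metrics: Dict[str, Reader]  # added only while an instability is open
    bounds: Dict[str, Bounds]         # defined bounds per metric
    repair: Callable[[Dict[str, float]], None]  # corrective action
    active: Dict[str, Reader] = field(init=False)
    escalated: bool = field(default=False, init=False)

    def __post_init__(self) -> None:
        self.active = dict(self.base_metrics)

    def level(self) -> int:
        # Assumption: monitoring level = number of metrics being sampled.
        return len(self.active)

    def violations(self, sample: Dict[str, float]) -> List[str]:
        # A metric is unstable when its value falls outside its bounds.
        return [
            name for name, value in sample.items()
            if name in self.bounds
            and not (self.bounds[name][0] <= value <= self.bounds[name][1])
        ]

    def step(self) -> str:
        sample = {name: read() for name, read in self.active.items()}
        unstable = self.violations(sample)
        if unstable and not self.escalated:
            # First level -> second level: start sampling additional metrics.
            self.active.update(self.extra_metrics)
            self.escalated = True
            return "escalated"
        if unstable:
            # Already escalated: take corrective action using the richer sample.
            self.repair(sample)
            return "repaired"
        if self.escalated:
            # Instability resolved: drop the extra metrics (third level).
            for name in self.extra_metrics:
                self.active.pop(name, None)
            self.escalated = False
            return "de-escalated"
        return "stable"


if __name__ == "__main__":
    state = {"cpu": 0.30}             # simulated VM state, not a real probe

    def fix(sample: Dict[str, float]) -> None:
        state["cpu"] = 0.30           # stand-in for a real corrective action

    monitor = AdaptiveMonitor(
        base_metrics={"cpu": lambda: state["cpu"]},
        extra_metrics={"blocked_threads": lambda: 0.0},
        bounds={"cpu": (0.0, 0.90)},
        repair=fix,
    )
    print(monitor.step(), monitor.level())
    state["cpu"] = 0.99               # simulate runaway CPU
    for _ in range(3):
        print(monitor.step(), monitor.level())
```

In this sketch the third level equals the first level, which satisfies the requirement that the third level be less than the second. A real system could instead keep some of the additional metrics active after the repair.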
17 Claims
1. A computer program product, comprising:
- a computer-readable storage medium having computer-readable program code embodied therewith, the computer-readable program code comprising;
computer-readable program code configured to monitor a set of health status metrics of a virtual machine at a first level;
computer-readable program code configured to analyze data of the monitored health status metrics to determine that an instability has occurred when the data exceeds defined bounds for the computing system health status metrics;
computer-readable program code configured to respond to the instability by monitoring additional health status metrics, whereby a level of monitoring of the system is increased from the first level to a second level, greater than the first level;
computer-readable program code configured to repair the system by taking corrective action based on the instability; and
computer-readable program code configured to stop monitoring at least one of the set of monitored health status metrics to reduce the level of monitoring to a third level once the instability has been resolved, wherein the third level is less than the second level.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A system, comprising:
- one or more physical computer processors; and
- a memory containing a program, which when executed by the one or more physical computer processors is configured to perform an operation comprising;
monitoring a set of health status metrics of a system at a first level;
analyzing data of the monitored health status metrics to determine that an instability has occurred when the data exceeds defined bounds for the computing system health status metrics;
responding to the instability by monitoring additional health status metrics, whereby a level of monitoring of the system is increased from the first level to a second level, greater than the first level;
repairing the system by taking corrective action based on the instability; and
refraining from monitoring at least one of the set of monitored health status metrics to reduce the level of monitoring to a third level once the instability has been resolved, wherein the third level is less than the second level.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17
Specification