Problem determination and diagnosis in shared dynamic clouds
Abstract
An apparatus and an article of manufacture for problem determination and diagnosis in a shared dynamic cloud environment include monitoring each virtual machine and physical server in the shared dynamic cloud environment for at least one metric, identifying a symptom of a problem and generating an event based on said monitoring, analyzing the event to determine a deviation from normal behavior, and classifying the event as a cloud-based anomaly or an application fault based on existing knowledge.
11 Claims
1. An article of manufacture comprising a computer readable storage device having computer readable instructions for problem determination and diagnosis in a shared dynamic cloud environment during live virtual machine migration tangibly embodied thereon which, when implemented, cause a computer to carry out a plurality of method steps comprising:
monitoring each virtual machine and each physical server in a shared dynamic cloud environment for at least one metric;
identifying a symptom of a problem within the shared dynamic cloud environment and generating an event based on said monitoring, wherein said event corresponds to said symptom;
analyzing the event to determine a deviation from normal behavior, wherein said normal behavior is determined based on said monitoring; and
classifying the event as a cloud-based anomaly or an application fault based on a comparison of the event with multiple fault signatures, wherein said multiple fault signatures capture a set of deviations from normal behavior associated with (i) one or more cloud-based anomalies and (ii) one or more application faults, and wherein said set of deviations comprises information pertaining to a given virtual machine and a physical server hosting the given virtual machine in the shared dynamic cloud environment, and wherein said classifying comprises:
matching the deviation from normal behavior associated with the event with one of the one or more fault signatures based on the at least one monitored metric monitored for each virtual machine and for each physical server corresponding to said event.
Dependent claims: 2, 3, 4, 5, 6.
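The classifying step of claim 1 can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the claim text: the encoding of a deviation as a per-metric direction (-1, 0, +1), the metric names `vm.cpu`, `vm.latency`, and `host.cpu`, the two example signatures, and the exact-match rule. The only points drawn from the claim are that each signature spans both a virtual machine and its hosting physical server, and that each signature is tied to either a cloud-based anomaly or an application fault.

```python
from typing import Dict, List, NamedTuple, Optional

# Hypothetical encoding: metric name -> -1 (below normal), 0 (normal), +1 (above normal).
Deviation = Dict[str, int]


class FaultSignature(NamedTuple):
    name: str
    category: str       # "cloud-based anomaly" or "application fault"
    pattern: Deviation  # expected deviation for VM metrics and host metrics


# Illustrative signatures only; real signatures would come from prior knowledge.
SIGNATURES: List[FaultSignature] = [
    FaultSignature("host_contention", "cloud-based anomaly",
                   {"vm.cpu": 0, "vm.latency": 1, "host.cpu": 1}),
    FaultSignature("app_cpu_hog", "application fault",
                   {"vm.cpu": 1, "vm.latency": 1, "host.cpu": 0}),
]


def classify(deviation: Deviation,
             signatures: List[FaultSignature] = SIGNATURES) -> Optional[FaultSignature]:
    """Return the first signature whose pattern agrees with the observed deviation.

    Metrics missing from `deviation` are treated as normal (0).
    Returns None when no signature matches.
    """
    for sig in signatures:
        if all(deviation.get(metric, 0) == expected
               for metric, expected in sig.pattern.items()):
            return sig
    return None
```

In this sketch, a latency rise that coincides with high host CPU but normal VM CPU is labelled a cloud-based anomaly, while the same latency rise with high VM CPU and a normal host is labelled an application fault. Comparing the two signatures shows why the claim requires information about both the virtual machine and its host.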
7. A system for problem determination and diagnosis in a shared dynamic cloud environment during live virtual machine migration, comprising:
at least one distinct software module, each distinct software module being embodied on a tangible computer-readable medium;
a memory; and
at least one processor coupled to the memory and operative for:
monitoring each virtual machine and each physical server in a shared dynamic cloud environment for at least one metric;
identifying a symptom of a problem within the shared dynamic cloud environment and generating an event based on said monitoring, wherein said event corresponds to said symptom;
analyzing the event to determine a deviation from normal behavior, wherein said normal behavior is determined based on said monitoring; and
classifying the event as a cloud-based anomaly or an application fault based on a comparison of the event with multiple fault signatures, wherein said multiple fault signatures capture a set of deviations from normal behavior associated with (i) one or more cloud-based anomalies and (ii) one or more application faults, and wherein said set of deviations comprises information pertaining to a given virtual machine and a physical server hosting the given virtual machine in the shared dynamic cloud environment, and wherein said classifying comprises:
matching the deviation from normal behavior associated with the event with one of the one or more fault signatures based on the at least one monitored metric monitored for each virtual machine and for each physical server corresponding to said event.
Dependent claims: 8, 9, 10.
11. A system for problem determination and diagnosis in a shared dynamic cloud environment during live virtual machine migration, comprising:
a memory;
at least one processor coupled to the memory; and
at least one distinct software module, each distinct software module being embodied on a tangible computer-readable medium, the at least one distinct software module comprising:
a monitoring engine module, executing on the processor, for monitoring each virtual machine and each physical server in a shared dynamic cloud environment for at least one metric and outputting a monitoring data time-series corresponding to each metric;
an event generation engine module, executing on the processor, for identifying a symptom of a problem within the shared dynamic cloud environment and generating an event based on said monitoring, wherein said event corresponds to said symptom;
a problem determination engine module, executing on the processor, for analyzing the event to determine and locate a deviation from normal behavior, wherein said normal behavior is determined based on said monitoring; and
a diagnosis engine module, executing on the processor, for classifying the event as a cloud-based anomaly or an application fault based on a comparison of the event with multiple fault signatures, wherein said multiple fault signatures capture a set of deviations from normal behavior associated with (i) one or more cloud-based anomalies and (ii) one or more application faults, and wherein said set of deviations comprises information pertaining to a given virtual machine and a physical server hosting the given virtual machine in the shared dynamic cloud environment, and wherein said classifying comprises:
matching the deviation from normal behavior associated with the event with one of the one or more fault signatures based on the at least one monitored metric monitored for each virtual machine and for each physical server corresponding to said event.
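As a rough illustration of how the monitoring engine's per-metric time series could feed the "determine and locate a deviation" step of claim 11, the Python sketch below keeps one series per (entity, metric) pair and flags the series whose latest sample leaves its own baseline. The claim does not say how normal behavior is modelled; the mean and standard deviation baseline, the threshold of 3.0, and the names `MonitoringEngine`, `locate_deviations`, `vm-1`, and `host-a` are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (entity name, metric name); entity is a VM or a physical server


class MonitoringEngine:
    """Collects one time series per (entity, metric) pair."""

    def __init__(self) -> None:
        self.series: Dict[Key, List[float]] = defaultdict(list)

    def record(self, entity: str, metric: str, value: float) -> None:
        self.series[(entity, metric)].append(value)


def locate_deviations(series: Dict[Key, List[float]],
                      threshold: float = 3.0,
                      min_history: int = 3) -> Dict[Key, int]:
    """Map each deviating series to +1 (above baseline) or -1 (below baseline).

    The baseline is the mean of all samples before the latest one (an assumed model).
    """
    deviations: Dict[Key, int] = {}
    for key, values in series.items():
        if len(values) <= min_history:
            continue  # not enough history to estimate normal behavior
        history, latest = values[:-1], values[-1]
        spread = pstdev(history) or 1e-9  # guard against a perfectly flat history
        z = (latest - mean(history)) / spread
        if abs(z) >= threshold:
            deviations[key] = 1 if z > 0 else -1
    return deviations


engine = MonitoringEngine()
for sample in [40, 42, 41, 39, 95]:   # VM CPU jumps on the last sample
    engine.record("vm-1", "cpu", sample)
for sample in [30, 31, 29, 30, 30]:   # host CPU stays at its baseline
    engine.record("host-a", "cpu", sample)

print(locate_deviations(engine.series))  # {('vm-1', 'cpu'): 1}
```

Keying the series by entity as well as metric is what lets the result say where the deviation occurred (the virtual machine, its host, or both), which is the information a later diagnosis step would compare against fault signatures.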
Specification