Self-healing and dynamic optimization of VM server cluster management in multi-cloud platform
Abstract
Virtual machine server clusters are managed using self-healing and dynamic optimization to achieve closed-loop automation. The technique uses adaptive thresholding to develop actionable quality metrics for benchmarking and anomaly detection. Real-time analytics are used to determine the root cause of KPI violations and to locate impact areas. Self-healing and dynamic optimization rules are able to automatically correct common issues via no-touch automation in which finger-pointing between operations staff is prevalent, resulting in consolidation, flexibility and reduced deployment time.
18 Claims
1. A method for managing a virtual machine server cluster in a multi-cloud platform, comprising:
supporting a group of statistics for selecting metric statistics, the group of statistics including each of average, maximum value, minimum value, last value, standard, sum of historical values, sum of squares of historical values, and count of values;
classifying a first quality metric into a load metric class, wherein members of the load metric class indicate a load on a virtual machine;
selecting a load metric statistic based on the classifying the first quality metric into the load metric class, the load metric statistic being selected from the group of statistics;
accumulating values for one or more load metric partial sums from performance monitoring data relating to the first quality metric, the load metric partial sums being selected to calculate a value of the load metric statistic;
calculating the value of the load metric statistic from the load metric partial sums accumulated from performance monitoring data relating to the first quality metric;
classifying a second quality metric into a utilization metric class, wherein members of the utilization metric class indicate a utilization of hardware by the virtual machine;
selecting a utilization metric statistic based on the classifying the second quality metric into the utilization metric class, the utilization metric statistic being selected from the group of statistics and being different from the load metric statistic;
accumulating values for one or more utilization metric partial sums from performance monitoring data relating to the second quality metric, the utilization metric partial sums being selected to calculate a value of the utilization metric statistic;
calculating the value of the utilization metric statistic from the load metric partial sums accumulated from performance monitoring data relating to the second quality metric;
determining an adaptive threshold range for the first quality metric based on the value of the load metric statistic and based on the classifying the first quality metric into the load metric class;
determining an adaptive threshold range for the second quality metric based on the value of the utilization metric statistic and based on the classifying the second quality metric into the utilization metric class;
determining that a monitoring value for one of the first quality metric and the second quality metric is outside the adaptive threshold range for the one quality metric;
performing a self-healing and dynamic optimization task based on the determining that the monitoring value is outside the adaptive threshold range;
determining that a value of one of the partial sums accumulated from performance monitoring data relating to one of the quality metrics exceeds a limit imposed to prevent arithmetic overflow of a value storage; and
dividing values of each partial sum accumulated from performance monitoring data relating to the one of the quality metrics by two.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
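The partial-sum steps of claim 1 can be illustrated with a minimal Python sketch. It keeps a count, a sum, and a sum of squares for one quality metric, derives the average and a standard deviation from them, and halves every partial sum once any of them passes a limit. The class name `PartialSums`, the field names, and the default `limit` are assumptions made for illustration, as is reading the claim's "standard" as standard deviation; none of these come from the patent text.

```python
import math
from dataclasses import dataclass


@dataclass
class PartialSums:
    """Running partial sums for one quality metric (illustrative names)."""
    count: float = 0.0      # count of values
    total: float = 0.0      # sum of historical values
    total_sq: float = 0.0   # sum of squares of historical values
    limit: float = 1e12     # hypothetical overflow guard, not from the patent

    def add(self, value: float) -> None:
        """Accumulate one monitoring value into every partial sum."""
        self.count += 1
        self.total += value
        self.total_sq += value * value
        # If any partial sum passes the limit, divide each of them by two.
        if max(self.count, abs(self.total), self.total_sq) > self.limit:
            self.count /= 2
            self.total /= 2
            self.total_sq /= 2

    def average(self) -> float:
        return self.total / self.count

    def std_dev(self) -> float:
        """Population standard deviation computed from the partial sums."""
        mean = self.average()
        variance = max(self.total_sq / self.count - mean * mean, 0.0)
        return math.sqrt(variance)


sums = PartialSums()
for sample in [2, 4, 4, 4, 5, 5, 7, 9]:
    sums.add(sample)
print(sums.average(), sums.std_dev())
```

Because all partial sums are divided by the same factor, the average and variance computed at the moment of halving are unchanged. Samples added afterwards carry more weight than the halved history.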
9. A computer-readable storage device having stored thereon computer readable instructions for managing a virtual machine server cluster in a multi-cloud platform, wherein execution of the computer readable instructions by a processor causes the processor to perform operations comprising:
supporting a group of statistics for selecting metric statistics, the group of statistics including each of average, maximum value, minimum value, last value, standard, sum of historical values, sum of squares of historical values, and count of values;
classifying a first quality metric into a load metric class, wherein members of the load metric class indicate a load on a virtual machine;
selecting a load metric statistic based on the classifying the first quality metric into the load metric class, the load metric statistic being selected from the group of statistics;
accumulating values for one or more load metric partial sums from performance monitoring data relating to the first quality metric, the load metric partial sums being selected to calculate a value of the load metric statistic;
calculating the value of the load metric statistic from the load metric partial sums accumulated from performance monitoring data relating to the first quality metric;
classifying a second quality metric into a utilization metric class, wherein members of the utilization metric class indicate a utilization of hardware by the virtual machine;
selecting a utilization metric statistic based on the classifying the second quality metric into the utilization metric class, the utilization metric statistic being selected from the group of statistics and being different from the load metric statistic;
accumulating values for one or more utilization metric partial sums from performance monitoring data relating to the second quality metric, the utilization metric partial sums being selected to calculate a value of the utilization metric statistic;
calculating the value of the utilization metric statistic from the load metric partial sums accumulated from performance monitoring data relating to of the second quality metric;
determining an adaptive threshold range for the first quality metric based on the value of the load metric statistic and based on the classifying the first quality metric into the load metric class;
determining an adaptive threshold range for the second quality metric based on the value of the utilization metric statistic and based on the classifying the second quality metric into the utilization metric class;
determining that a monitoring value for one of the first quality metric and the second quality metric is outside the adaptive threshold range for the one quality metric; and
performing a self-healing and dynamic optimization task based on the so determining that the monitoring value is outside the adaptive threshold range;
determining that a value of one of the partial sums accumulated from performance monitoring data relating to one of the quality metrics exceeds a limit imposed to prevent arithmetic overflow of a value storage; and
dividing values of each partial sum accumulated from performance monitoring data relating to the one of the quality metrics by two.
Dependent claims: 10, 11, 12, 13, 14, 15
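The claim ties both the choice of statistic and the adaptive threshold range to the class of the metric, but it does not give a formula for the range. The Python sketch below shows one possible shape for that dependency. The rule table, the choice of "average" for load metrics and "maximum" for utilization metrics, the margin values, and all function names are hypothetical and chosen only to make the example concrete.

```python
from enum import Enum


class MetricClass(Enum):
    LOAD = "load"                # indicates a load on a virtual machine
    UTILIZATION = "utilization"  # indicates hardware utilization by the VM


# Hypothetical per-class rules: which statistic anchors the range and how
# wide the range is, as a fraction of that statistic. Not from the patent.
CLASS_RULES = {
    MetricClass.LOAD: {"statistic": "average", "margin": 0.25},
    MetricClass.UTILIZATION: {"statistic": "maximum", "margin": 0.125},
}


def threshold_range(metric_class, stats):
    """Return (low, high) from the class-selected statistic value."""
    rule = CLASS_RULES[metric_class]
    center = stats[rule["statistic"]]
    spread = abs(center) * rule["margin"]
    return (center - spread, center + spread)


def is_outside(value, bounds):
    """True when a monitoring value falls outside the adaptive range."""
    low, high = bounds
    return value < low or value > high


load_bounds = threshold_range(MetricClass.LOAD, {"average": 4.0, "maximum": 9.0})
if is_outside(6.5, load_bounds):
    print("trigger self-healing / dynamic optimization task")
```

The two classes deliberately use different statistics, mirroring the claim's requirement that the utilization metric statistic differ from the load metric statistic.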
16. A system for managing a virtual machine server cluster in a multi-cloud platform, comprising:
a processor resource;
a performance measurement interface connecting the processor resource to the virtual machine server cluster; and
a computer-readable storage device having stored thereon computer readable instructions, wherein execution of the computer readable instructions by the processor resource causes the processor resource to perform operations comprising;
supporting a group of statistics for selecting metric statistics, the group of statistics including each of average, maximum value, minimum value, last value, standard, sum of historical values, sum of squares of historical values, and count of values;
receiving, by the performance measurement interface, performance monitoring data of a first quality metric and a second quality metric;
classifying, by the processor, the first quality metric into a load metric class, wherein members of the load metric class indicate a load on a virtual machine;
selecting a load metric statistic based on the classifying the first quality metric into the load metric class, the load metric statistic being selected from the group of statistics;
accumulating values for one or more load metric partial sums from the performance monitoring data of the first quality metric, the load metric partial sums being selected to calculate a value of the load metric statistic;
calculating, by the processor, the value of the load metric statistic from the load metric partial sums accumulated from the performance monitoring data of the first quality metric;
classifying, by the processor, the second quality metric into a utilization metric class, wherein members of the utilization metric class indicate a utilization of hardware by the virtual machine;
selecting a utilization metric statistic based on the classifying the second quality metric into the utilization metric class, the utilization metric statistic being selected from the group of statistics and being different from the load metric statistic;
accumulating values for one or more utilization metric partial sums from performance monitoring data relating to the second quality metric, the utilization metric partial sums being selected to calculate a value of the utilization metric statistic;
calculating, by the processor, the value of the utilization metric statistic from the load metric partial sums accumulated from performance monitoring data relating to the second quality metric;
determining, by the processor, an adaptive threshold range for the first quality metric based on the value of the load metric statistic and based on the classifying the first quality metric into the load metric class;
determining, by the processor, an adaptive threshold range for the second quality metric based on the value of the utilization metric statistic and based on the classifying the second quality metric into the utilization metric class;
determining, by the processor, that a monitoring value for one of the first quality metric and the second quality metric is outside the adaptive threshold range for the one quality metric; and
performing, by the processor, a self-healing and dynamic optimization task based on the determining that the monitoring value for the second quality metric is outside the adaptive threshold range, the self-healing and dynamic optimization task comprising adding a computing resource if the statistical value is above an upper threshold and removing a computing resource if the statistical value is below a lower threshold;
determining that a value of one of the partial sums accumulated from performance monitoring data relating to one of the quality metrics exceeds a limit imposed to prevent arithmetic overflow of a value storage; and
dividing values of each partial sum accumulated from performance monitoring data relating to the one of the quality metrics by two.
Dependent claims: 17, 18
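Claim 16 narrows the self-healing and dynamic optimization task to adding a computing resource above an upper threshold and removing one below a lower threshold. A minimal Python sketch of that decision follows. The function name `rescale`, the step size of one resource, and the `min_count` floor are assumptions added for illustration and are not stated in the claim.

```python
def rescale(resource_count, statistical_value, lower, upper, min_count=1):
    """Return the new resource count for one threshold check.

    Adds one resource when the value is above the upper threshold,
    removes one when it is below the lower threshold, and otherwise
    leaves the count unchanged. The min_count floor is an assumption.
    """
    if statistical_value > upper:
        return resource_count + 1
    if statistical_value < lower:
        return max(min_count, resource_count - 1)
    return resource_count


# Example with a hypothetical utilization range of 70.0 to 90.0.
count = 4
count = rescale(count, statistical_value=95.0, lower=70.0, upper=90.0)
print(count)
```

Keeping the decision in a pure function, separate from whatever mechanism actually provisions the resource, makes the rule easy to test against recorded monitoring values.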
Specification