Distributed storage resource scheduler and load balancer
First Claim
1. A method of managing distributed storage resources including at least a first storage unit and a second storage unit, comprising:
while the first storage unit and the second storage unit are online, monitoring workloads associated with objects stored in the first storage unit and the second storage unit at multiple points in time over a time interval, and monitoring performance of the first storage unit and the second storage unit, each of the monitored workloads being a function of time-correlated samples of measured data including the number of outstanding input output requests to an associated object and an average size of input output requests to the associated object;
computing normalized load metrics for the first storage unit based on time-correlated sums of the workloads monitored on the first storage unit over the time interval, each of the workloads monitored on the first storage unit over the time interval being associated with a respective one of the objects stored in the first storage unit, and the monitored performance of the first storage unit, wherein each of the time-correlated sums of the workloads monitored on the first storage unit is computed at a respective point in time as a summation of the workloads monitored on the first storage unit at the respective point in time;
computing normalized load metrics for the second storage unit based on time-correlated sums of the workloads monitored on the second storage unit over the time interval, each of the workloads monitored on the second storage unit over the time interval being associated with a respective one of the objects stored in the second storage unit, and the monitored performance of the second storage unit, wherein each of the time-correlated sums of the workloads monitored on the second storage unit is computed at a respective point in time as a summation of the workloads monitored on the second storage unit at the respective point in time; and
identifying one or more of the objects as candidates for migration between the first storage unit and the second storage unit based on the computed normalized load metrics of the first storage unit and the second storage unit.
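The steps recited in claim 1 can be illustrated with a minimal Python sketch. The claim says only that each workload is "a function of" the outstanding I/O count and the average I/O size, so the product used here is a placeholder assumption, as are the names `Sample`, `workload`, `time_correlated_sums`, `normalized_load`, and `migration_candidates`, the scalar `performance` figure, and the choice to compare units by peak normalized load.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Sample:
    outstanding_ios: float  # outstanding I/O requests to the object at one time point
    avg_io_size_kb: float   # average I/O request size at the same time point


def workload(s: Sample) -> float:
    # ASSUMPTION: the claim does not specify the function; a product is a stand-in.
    return s.outstanding_ios * s.avg_io_size_kb


def time_correlated_sums(objects: dict[str, list[Sample]]) -> list[float]:
    # `objects` maps an object name to samples aligned on the same time points.
    # At each time point, sum the workloads of every object on the unit.
    if not objects:
        return []
    n = len(next(iter(objects.values())))
    return [sum(workload(samples[t]) for samples in objects.values()) for t in range(n)]


def normalized_load(objects: dict[str, list[Sample]], performance: float) -> list[float]:
    # ASSUMPTION: normalization is a division by a scalar performance figure.
    return [total / performance for total in time_correlated_sums(objects)]


def migration_candidates(unit_a, perf_a, unit_b, perf_b) -> list[str]:
    load_a = normalized_load(unit_a, perf_a)
    load_b = normalized_load(unit_b, perf_b)
    # ASSUMPTION: the unit with the higher peak normalized load is the source.
    source = unit_a if max(load_a, default=0.0) >= max(load_b, default=0.0) else unit_b
    totals = {name: sum(workload(s) for s in samples) for name, samples in source.items()}
    # Heaviest objects on the busier unit are listed first.
    return sorted(totals, key=totals.get, reverse=True)


unit_a = {"vm1": [Sample(2, 4), Sample(4, 8)], "vm2": [Sample(1, 4), Sample(1, 16)]}
unit_b = {"vm3": [Sample(1, 4), Sample(1, 4)]}
print(normalized_load(unit_a, 4.0))
print(migration_candidates(unit_a, 4.0, unit_b, 2.0))
```

Summing per time point before normalizing keeps coincident peaks visible; summing each object's average instead would hide the case where two objects are busy at the same moment.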
Abstract
Distributed storage resources having multiple storage units are managed based on data collected from online monitoring of workloads on the storage units and performance characteristics of the storage units. The collected data is sampled at discrete time intervals over a time period of interest, such as a congested time period. Normalized load metrics are computed for each storage unit based on time-correlated sums of the workloads running on the storage unit over the time period of interest and the performance characteristic of the storage unit. Workloads that are migration candidates and storage units that are migration destinations are determined from a representative value of the computed normalized load metrics, which may be the 90th percentile value or a weighted sum of two or more different percentile values.
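As a rough illustration of the representative value mentioned in the abstract, the following Python sketch reduces a unit's series of normalized load metrics to either its 90th percentile or a weighted sum of several percentiles. The nearest-rank percentile definition, the function names `percentile`, `representative_load`, and `pick_source_and_destination`, and the rule of moving from the highest-valued unit to the lowest-valued unit are assumptions made for the example, not details taken from the patent.

```python
import math


def percentile(values, p):
    # Nearest-rank percentile (ASSUMPTION: the source does not name a definition).
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def representative_load(metrics, weights=None):
    # `weights` maps a percentile to its weight; the default is the 90th percentile alone.
    weights = weights or {90: 1.0}
    return sum(w * percentile(metrics, p) for p, w in weights.items())


def pick_source_and_destination(units, weights=None):
    # `units` maps a storage unit name to its normalized load metrics over time.
    # ASSUMPTION: the most loaded unit is the source, the least loaded the destination.
    scores = {name: representative_load(m, weights) for name, m in units.items()}
    return max(scores, key=scores.get), min(scores, key=scores.get)


loads = {"A": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], "B": [1] * 10}
print(representative_load(loads["A"]))
print(representative_load(loads["A"], {90: 0.5, 50: 0.5}))
print(pick_source_and_destination(loads))
```

A high percentile tracks the congested periods the abstract singles out, while remaining less sensitive to a single outlier sample than the maximum would be.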
27 Claims
1. A method of managing distributed storage resources including at least a first storage unit and a second storage unit, comprising:
while the first storage unit and the second storage unit are online, monitoring workloads associated with objects stored in the first storage unit and the second storage unit at multiple points in time over a time interval, and monitoring performance of the first storage unit and the second storage unit, each of the monitored workloads being a function of time-correlated samples of measured data including the number of outstanding input output requests to an associated object and an average size of input output requests to the associated object;
computing normalized load metrics for the first storage unit based on time-correlated sums of the workloads monitored on the first storage unit over the time interval, each of the workloads monitored on the first storage unit over the time interval being associated with a respective one of the objects stored in the first storage unit, and the monitored performance of the first storage unit, wherein each of the time-correlated sums of the workloads monitored on the first storage unit is computed at a respective point in time as a summation of the workloads monitored on the first storage unit at the respective point in time;
computing normalized load metrics for the second storage unit based on time-correlated sums of the workloads monitored on the second storage unit over the time interval, each of the workloads monitored on the second storage unit over the time interval being associated with a respective one of the objects stored in the second storage unit, and the monitored performance of the second storage unit, wherein each of the time-correlated sums of the workloads monitored on the second storage unit is computed at a respective point in time as a summation of the workloads monitored on the second storage unit at the respective point in time; and
identifying one or more of the objects as candidates for migration between the first storage unit and the second storage unit based on the computed normalized load metrics of the first storage unit and the second storage unit.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17
18. A method of migrating workloads between a first storage unit and a second storage unit of a shared storage system that includes physically separate storage arrays, comprising:
while the first storage unit and the second storage unit are online, monitoring workloads associated with objects stored in the first storage unit and the second storage unit at multiple points in time over a time interval, each of the monitored workloads being a function of time-correlated samples of measured data including the number of outstanding input output requests to an associated object and an average size of input output requests to the associated object;
computing normalized load metrics for the first storage unit based on time-correlated sums of the workloads monitored on the first storage unit over the time interval, each of the workloads monitored on the first storage unit over the time interval being associated with a respective one of the objects stored in the first storage unit, and a monitored performance of the first storage unit, wherein each of the time-correlated sums of the workloads monitored on the first storage unit is computed at a respective point in time as a summation of the workloads monitored on the first storage unit at the respective point in time;
computing normalized load metrics for the second storage unit based on time-correlated sums of the workloads monitored on the second storage unit over the time interval, each of the workloads monitored on the second storage unit over the time interval being associated with a respective one of the objects stored in the second storage unit, and a monitored performance of the second storage unit, wherein each of the time-correlated sums of the workloads monitored on the second storage unit is computed at a respective point in time as a summation of the workloads monitored on the second storage unit at the respective point in time; and
migrating one of the workloads between the first storage unit and the second storage unit based on the computed normalized load metrics of the first storage unit and the second storage unit.
Dependent claims: 19, 20, 21, 22, 23
24. A non-transitory computer-readable storage medium comprising instructions which, when executed in a computing device coupled to distributed storage resources including at least a first storage unit and a second storage unit, causes the computing device to carry out the steps of:
while the first storage unit and the second storage unit are online, monitoring workloads associated with objects stored in the first storage unit and the second storage unit at multiple points in time over a time interval, and monitoring performance of the first storage unit and the second storage unit, each of the monitored workloads being a function of time-correlated samples of measured data including the number of outstanding input output requests to an associated object and an average size of input output requests to the associated object;
computing normalized load metrics for the first storage unit based on time-correlated sums of the workloads monitored on the first storage unit over the time interval, each of the workloads monitored on the first storage unit over the time interval being associated with a respective one of the objects stored in the first storage unit, and the monitored performance of the first storage unit, wherein each of the time-correlated sums of the workloads monitored on the first storage unit is computed at a respective point in time as a summation of the workloads monitored on the first storage unit at the respective point in time;
computing normalized load metrics for the second storage unit based on time-correlated sums of the workloads monitored on the second storage unit over the time interval, each of the workloads monitored on the second storage unit over the time interval being associated with a respective one of the objects stored in the second storage unit, and the monitored performance of the second storage unit, wherein each of the time-correlated sums of the workloads monitored on the second storage unit is computed at a respective point in time as a summation of the workloads monitored on the second storage unit at the respective point in time; and
identifying one or more of the objects as candidates for migration between the first storage unit and the second storage unit based on the computed normalized load metrics of the first storage unit and the second storage unit.
Dependent claims: 25
26. A non-transitory computer-readable storage medium comprising instructions which, when executed in a computing device coupled to distributed storage resources including at least a first storage unit and a second storage unit, causes the computing device to carry out the steps of:
while the first storage unit and the second storage unit are online, monitoring workloads on the first storage unit and the second storage unit at multiple points in time over a time interval, each of the monitored workloads being a function of time-correlated samples of measured data including the number of outstanding input output requests to an associated object and an average size of input output requests to the associated object;
computing normalized load metrics for the first storage unit based on time-correlated sums of the workloads monitored on the first storage unit over the time interval;
computing normalized load metrics for the second storage unit based on time-correlated sums of the workloads monitored on the second storage unit over the time interval; and
migrating one of the workloads between the first storage unit and the second storage unit based on the computed normalized load metrics of the first storage unit and the second storage unit.
Dependent claims: 27
Specification