Apparatus and method for efficient scheduling of tasks
First Claim
1. A method for controlling a data center comprising a plurality of servers, each server having an activated state available to receive processing tasks, a non-activated state incapable of receiving processing tasks, an energy consumption when in a non-activated state, and a higher energy consumption when in an activated state, wherein an energy efficiency for a given processing task load, as estimated from performance delivered per unit of expended energy, is increased by placing a portion of the servers in the non-activated state and allocating the given processing task load between a subset of the plurality of servers in the activated state, wherein a transition from the non-activated state to the activated state of a respective server incurs a latency, each server having a peak load capacity, comprising:
defining a desired service level for the datacenter comprising a minimum processing rate for completing respective processing tasks which is shorter than the latency;
receiving a plurality of processing tasks;
predicting a future load of processing tasks;
determining a predicted minimum number of servers which must be in the activated state and within the peak load capacity sufficient for handling the received plurality of processing tasks, and the predicted future load of processing tasks, while achieving the desired service level, the predicted minimum number of servers accounting for the latency to transition a respective server from the non-activated state to the activated state in the event that the subset of the plurality of servers in the activated state is insufficient; and
processing the plurality of processing tasks with the determined minimum number of servers in the activated state within their respective peak load capacity.
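The determining step above rests on a capacity calculation: enough servers must be kept activated to absorb not only the tasks in hand but also the load that may arrive while a non-activated server is still waking up. A minimal Python sketch of that calculation, with all names, the linear arrival-growth model, and the parameter values being illustrative assumptions rather than anything taken from the patent:

```python
import math

def min_active_servers(current_load, predicted_load, peak_capacity,
                       activation_latency_s, arrival_rate_growth):
    """Illustrative sketch of the claim's determining step.

    Because a non-activated server needs activation_latency_s to become
    available, provision for the load expected at the end of that
    window, not just the load already received.
    """
    # Load that may accumulate while a sleeping server is still waking up.
    load_at_horizon = max(
        current_load + predicted_load,
        current_load + arrival_rate_growth * activation_latency_s,
    )
    # Smallest server count keeping every server within its peak capacity.
    return math.ceil(load_at_horizon / peak_capacity)
```

With, say, 100 units of load in hand, 50 predicted, a 40-unit peak capacity per server, a 10 s wake latency, and 8 units/s of possible growth, the horizon load is 180 units and five servers must stay activated.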
1 Assignment
0 Petitions
Abstract
A system and method of scheduling tasks, comprising receiving activity and performance data from registers or storage locations maintained by hardware and an operating system; storing calibration coefficients associated with the activity and performance data; computing an energy dissipation rate based on at least the activity and performance data; and scheduling tasks under the operating system based on the computed energy dissipation rate.
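The abstract describes a linear flow: read hardware/OS activity counters, weight them by stored calibration coefficients to estimate an energy dissipation rate, and schedule against that estimate. A hedged sketch of one way such a linear power model and greedy scheduler could look; the counter names, dictionary layout, and greedy policy are assumptions for illustration, not the patent's method:

```python
def energy_dissipation_rate(activity, coefficients, idle_power_w):
    """Linear power model: idle power plus a weighted sum of event rates.

    activity maps counter names (e.g. instructions retired, cache
    misses) to observed rates; coefficients holds per-event energy
    costs obtained by offline calibration.
    """
    dynamic_w = sum(rate * coefficients[name]
                    for name, rate in activity.items())
    return idle_power_w + dynamic_w

def schedule(tasks, cores):
    """Greedy sketch: place each task on the core whose current
    estimated dissipation rate is lowest, then update the estimate."""
    for task in tasks:
        target = min(cores, key=lambda c: c["power_w"])
        target["tasks"].append(task)
        target["power_w"] += task["est_power_w"]
    return cores
```

The calibration coefficients are what tie raw counter rates to joules; the scheduler itself only ever sees the resulting rate estimate.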
127 Citations
20 Claims
1. (Set forth in full above as the First Claim.) Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A method for controlling a data center comprising a plurality of servers, each server being adapted to process requests for processing tasks from at least one load distribution switch, the data center having a desired service level representing a minimum task processing rate, each server having at least a state in which it is available to process tasks consuming a first amount of power, a state in which it is unavailable to process tasks consuming a second amount of power lower than the first amount of power, and a latency to switch from the state in which it is unavailable to process tasks to the state in which it is available to process tasks which precludes activation of a server in the state in which it is unavailable to process tasks to the state in which it is available to process tasks after receipt of a respective task while achieving the desired service level, comprising:
receiving, by the at least one load distribution switch, a plurality of requests for processing of tasks;
predicting a future load of requests for processing tasks;
determining a minimum number of the plurality of servers that need to be available to process the plurality of requests for processing tasks and the predicted future load of requests for processing tasks, without exceeding at least one criterion for any of the plurality of servers available to process the load associated with a contingent predicted ability to achieve the desired service level; and
allocating the requests for processing the load by the load distribution switch to the determined minimum number of the plurality of servers.
Dependent claims: 10, 11, 12, 13, 14, 15
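The allocating step of claim 9 above can be pictured as the load distribution switch spreading requests round-robin over only the determined minimum set of available servers. A minimal sketch, assuming a simple per-server request limit as the "at least one criterion" (the function name, limit semantics, and round-robin policy are illustrative, not the patent's):

```python
import itertools

def allocate(requests, servers, per_server_limit):
    """Spread requests over the minimum number of servers that can
    hold them without any server exceeding per_server_limit."""
    # Ceiling division: smallest server count within the criterion.
    needed = -(-len(requests) // per_server_limit)
    active = servers[:needed]
    assignment = {s: [] for s in active}
    # Round-robin the requests across the active set.
    for req, srv in zip(requests, itertools.cycle(active)):
        assignment[srv].append(req)
    return assignment
```

Servers outside the active slice receive nothing and could remain in, or be returned to, the lower-power unavailable state.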
16. A method for controlling a data center comprising a plurality of servers, each server having:
a first state having a first power consumption and being unavailable for processing tasks,
a second state having a second power consumption and being available for receiving requests for processing tasks without processing tasks, and
a third state being available for receiving requests for processing tasks and for processing tasks, having a range of power consumptions which increase with increasing processing of tasks,
a lower end of the range of power consumptions of the third state being greater than or equal to the second power consumption, and the second power consumption being greater than the first power consumption,
a latency being incurred for changing a server from the first state to the second state,
the method comprising:
receiving a plurality of tasks for processing, the tasks having a service requirement, the service requirement requiring a processing latency less than the latency incurred for changing a server from the first state to the second state; and
optimally allocating the plurality of tasks to a number of servers in the third state, the number being proactively selected to:
minimize an aggregate power consumption of the servers in the first, second and third states,
statistically remain within at least one acceptable load processing criterion for each server, the load processing criterion limiting a maximum number of tasks that can be processed by the respective server, and
statistically remain within the service requirement for each of the plurality of tasks.
Dependent claims: 17, 18, 19, 20
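The proactive selection in claim 16 is a small optimization: choose how many servers to run in the processing (third) state so that aggregate power is minimized while no server exceeds its load criterion. A brute-force sketch under an assumed linear busy-power model; the power figures, utilization model, and function name are illustrative assumptions, not taken from the patent:

```python
def optimal_active_count(load, peak_capacity, n_servers,
                         p_off, p_ready, p_busy_max):
    """Try each count of third-state servers; return the (count, power)
    pair with the lowest aggregate power that respects the per-server
    load criterion. Servers not processing stay in the first state."""
    best = None
    for n_busy in range(1, n_servers + 1):
        per_server = load / n_busy
        if per_server > peak_capacity:
            continue  # violates the per-server load criterion
        util = per_server / peak_capacity
        # Third-state power rises linearly with utilization from p_ready
        # (its lower end) toward p_busy_max at peak load.
        power = (n_busy * (p_ready + util * (p_busy_max - p_ready))
                 + (n_servers - n_busy) * p_off)
        if best is None or power < best[1]:
            best = (n_busy, power)
    return best
```

Concentrating load on fewer fully utilized servers often wins because the first-state power is far below the third-state range, which is the efficiency premise the claim preamble states.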
Specification