Apparatus and method for efficient scheduling of tasks
First Claim
1. A method for controlling a data center comprising a plurality of servers, each server having an activated state available to receive processing tasks, a non-activated state incapable of receiving processing tasks, an energy consumption when in a non-activated state, and a higher energy consumption when in an activated state, wherein an energy efficiency for a given processing task load, as estimated from performance delivered per unit of expended energy, is increased by placing a portion of the servers in the non-activated state and allocating the given processing task load between a subset of the plurality of servers in the activated state, wherein a transition from the non-activated state to the activated state of a respective server incurs a latency, each server having a peak load capacity, comprising:
defining a desired service level for the datacenter comprising a minimum processing rate for completing respective processing tasks which is shorter than the latency;
receiving a plurality of processing tasks;
predicting a future load of processing tasks;
determining a predicted minimum number of servers which must be in the activated state and within the peak load capacity sufficient for handling the received plurality of processing tasks, and the predicted future load of processing tasks, while achieving the desired service level, the predicted minimum number of servers accounting for the latency to transition a respective server from the non-activated state to the activated state in the event that the subset of the plurality of servers in the activated state is insufficient; and
processing the plurality of processing tasks with the determined minimum number of servers in the activated state within their respective peak load capacity.
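The determining step above rests on a capacity calculation: enough servers must be kept activated to absorb not only the tasks in hand but also the load that may arrive while a non-activated server is still waking up. A minimal Python sketch of that calculation, with all names, the linear arrival-growth model, and the parameter values being illustrative assumptions rather than anything taken from the patent:

```python
import math

def min_active_servers(current_load, predicted_load, peak_capacity,
                       activation_latency_s, arrival_rate_growth):
    """Illustrative sketch of the claim's determining step.

    Because a non-activated server needs activation_latency_s to become
    available, provision for the load expected at the end of that
    window, not just the load already received.
    """
    # Load that may accumulate while a sleeping server is still waking up.
    load_at_horizon = max(
        current_load + predicted_load,
        current_load + arrival_rate_growth * activation_latency_s,
    )
    # Smallest server count keeping every server within its peak capacity.
    return math.ceil(load_at_horizon / peak_capacity)
```

With, say, 100 units of load in hand, 50 predicted, a 40-unit peak capacity per server, a 10 s wake latency, and 8 units/s of possible growth, the horizon load is 180 units and five servers must stay activated.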
1 Assignment
0 Petitions
Abstract
A system and method of scheduling tasks, comprising receiving activity and performance data from registers or storage locations maintained by hardware and an operating system; storing calibration coefficients associated with the activity and performance data; computing an energy dissipation rate based on at least the activity and performance data; and scheduling tasks under the operating system based on the computed energy dissipation rate.
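The abstract describes a linear flow: read hardware/OS activity counters, weight them by stored calibration coefficients to estimate an energy dissipation rate, and schedule against that estimate. A hedged sketch of one way such a linear power model and greedy scheduler could look; the counter names, dictionary layout, and greedy policy are assumptions for illustration, not the patent's method:

```python
def energy_dissipation_rate(activity, coefficients, idle_power_w):
    """Linear power model: idle power plus a weighted sum of event rates.

    activity maps counter names (e.g. instructions retired, cache
    misses) to observed rates; coefficients holds per-event energy
    costs obtained by offline calibration.
    """
    dynamic_w = sum(rate * coefficients[name]
                    for name, rate in activity.items())
    return idle_power_w + dynamic_w

def schedule(tasks, cores):
    """Greedy sketch: place each task on the core whose current
    estimated dissipation rate is lowest, then update the estimate."""
    for task in tasks:
        target = min(cores, key=lambda c: c["power_w"])
        target["tasks"].append(task)
        target["power_w"] += task["est_power_w"]
    return cores
```

The calibration coefficients are what tie raw counter rates to joules; the scheduler itself only ever sees the resulting rate estimate.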
127 Citations
20 Claims
1. (Set forth in full above as the First Claim.) Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A method for controlling a data center comprising a plurality of servers, each server being adapted to process requests for processing tasks from at least one load distribution switch, the data center having a desired service level representing a minimum task processing rate, each server having at least a state in which it is available to process tasks consuming a first amount of power, a state in which it is unavailable to process tasks consuming a second amount of power lower than the first amount of power, and a latency to switch from the state in which it is unavailable to process tasks to the state in which it is available to process tasks which precludes activation of a server in the state in which it is unavailable to process tasks to the state in which it is available to process tasks after receipt of a respective task while achieving the desired service level, comprising:
receiving, by the at least one load distribution switch, a plurality of requests for processing of tasks;
predicting a future load of requests for processing tasks;
determining a minimum number of the plurality of servers that need to be available to process the plurality of requests for processing tasks and the predicted future load of requests for processing tasks, without exceeding at least one criterion for any of the plurality of servers available to process the load associated with a contingent predicted ability to achieve the desired service level; and
allocating the requests for processing the load by the load distribution switch to the determined minimum number of the plurality of servers.
Dependent claims: 10, 11, 12, 13, 14, 15
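The allocating step of claim 9 above can be pictured as the load distribution switch spreading requests round-robin over only the determined minimum set of available servers. A minimal sketch, assuming a simple per-server request limit as the "at least one criterion" (the function name, limit semantics, and round-robin policy are illustrative, not the patent's):

```python
import itertools

def allocate(requests, servers, per_server_limit):
    """Spread requests over the minimum number of servers that can
    hold them without any server exceeding per_server_limit."""
    # Ceiling division: smallest server count within the criterion.
    needed = -(-len(requests) // per_server_limit)
    active = servers[:needed]
    assignment = {s: [] for s in active}
    # Round-robin the requests across the active set.
    for req, srv in zip(requests, itertools.cycle(active)):
        assignment[srv].append(req)
    return assignment
```

Servers outside the active slice receive nothing and could remain in, or be returned to, the lower-power unavailable state.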
16. A method for controlling a data center comprising a plurality of servers, each server having:
a first state having a first power consumption and being unavailable for processing tasks,
a second state having a second power consumption and being available for receiving requests for processing tasks without processing tasks, and
a third state being available for receiving requests for processing tasks and for processing tasks, having a range of power consumptions which increase with increasing processing of tasks,
a lower end of the range of power consumptions of the third state being greater than or equal to the second power consumption, and the second power consumption being greater than the first power consumption,
a latency being incurred for changing a server from the first state to the second state,
the method comprising:
receiving a plurality of tasks for processing, the tasks having a service requirement, the service requirement requiring a processing latency less than the latency incurred for changing a server from the first state to the second state; and
optimally allocating the plurality of tasks to a number of servers in the third state, the number being proactively selected to:
minimize an aggregate power consumption of the servers in the first, second and third states,
statistically remain within at least one acceptable load processing criterion for each server, the load processing criterion limiting a maximum number of tasks that can be processed by the respective server, and
statistically remain within the service requirement for each of the plurality of tasks.
Dependent claims: 17, 18, 19, 20
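The proactive selection in claim 16 is a small optimization: choose how many servers to run in the processing (third) state so that aggregate power is minimized while no server exceeds its load criterion. A brute-force sketch under an assumed linear busy-power model; the power figures, utilization model, and function name are illustrative assumptions, not taken from the patent:

```python
def optimal_active_count(load, peak_capacity, n_servers,
                         p_off, p_ready, p_busy_max):
    """Try each count of third-state servers; return the (count, power)
    pair with the lowest aggregate power that respects the per-server
    load criterion. Servers not processing stay in the first state."""
    best = None
    for n_busy in range(1, n_servers + 1):
        per_server = load / n_busy
        if per_server > peak_capacity:
            continue  # violates the per-server load criterion
        util = per_server / peak_capacity
        # Third-state power rises linearly with utilization from p_ready
        # (its lower end) toward p_busy_max at peak load.
        power = (n_busy * (p_ready + util * (p_busy_max - p_ready))
                 + (n_servers - n_busy) * p_off)
        if best is None or power < best[1]:
            best = (n_busy, power)
    return best
```

Concentrating load on fewer fully utilized servers often wins because the first-state power is far below the third-state range, which is the efficiency premise the claim preamble states.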
Specification