SCHEDULING MAPREDUCE TASKS BASED ON ESTIMATED WORKLOAD DISTRIBUTION
First Claim
1. A method comprising:
receiving a set of task statistics corresponding to task execution within a MapReduce job;
estimating a completion time for a set of tasks to be executed to provide an estimated completion time;
calculating a soft decision point based on a convergence of a workload distribution corresponding to a set of executed tasks;
calculating a hard decision point based on the estimated completion time for the set of tasks to be executed;
determining a selected decision point based on the soft decision point and the hard decision point; and
scheduling upcoming tasks for execution based on the selected decision point.
Abstract
A method for scheduling MapReduce tasks includes receiving a set of task statistics corresponding to task execution within a MapReduce job, estimating a completion time for a set of tasks to be executed to provide an estimated completion time, calculating a soft decision point based on a convergence of a workload distribution corresponding to a set of executed tasks, calculating a hard decision point based on the estimated completion time for the set of tasks to be executed, determining a selected decision point based on the soft decision point and the hard decision point, and scheduling upcoming tasks for execution based on the selected decision point. The method may also include estimating a map task completion time and estimating a shuffle operation completion time. A computer program product and computer system corresponding to the method are also disclosed.
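The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and the specification may define these quantities differently. All names here (`soft_decision_point`, `hard_decision_point`, `select_decision_point`, `tolerance`, `lead_time`) are hypothetical. The sketch assumes decision points are expressed as seconds since job start, that convergence means the per-partition workload shares stop changing by more than a tolerance between consecutive snapshots, and that the selected decision point is the earlier of the two.

```python
from typing import List, Optional, Sequence, Tuple


def normalize(counts: Sequence[float]) -> List[float]:
    """Turn raw per-partition workload counts into shares that sum to 1."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


def soft_decision_point(
    snapshots: Sequence[Tuple[float, Sequence[float]]],
    tolerance: float = 0.05,
) -> Optional[float]:
    """Return the timestamp of the first snapshot whose workload distribution
    differs from the previous one by at most `tolerance`, or None if the
    distribution has not converged yet.

    Assumption: convergence is measured with total variation distance.
    """
    previous = None
    for timestamp, counts in snapshots:
        current = normalize(counts)
        if previous is not None:
            shift = 0.5 * sum(abs(a - b) for a, b in zip(previous, current))
            if shift <= tolerance:
                return timestamp
        previous = current
    return None


def hard_decision_point(
    elapsed: float,
    remaining_tasks: int,
    seconds_per_task: float,
    lead_time: float,
) -> float:
    """Latest time at which scheduling should happen.

    Assumption: the estimated completion time is remaining_tasks times the
    observed seconds_per_task, and upcoming tasks need `lead_time` seconds
    of head start before that work finishes.
    """
    estimated_completion = remaining_tasks * seconds_per_task
    return max(elapsed, elapsed + estimated_completion - lead_time)


def select_decision_point(soft: Optional[float], hard: float) -> float:
    """Assumption: use the soft point when it arrives before the hard one,
    otherwise fall back to the hard point."""
    return hard if soft is None else min(soft, hard)


if __name__ == "__main__":
    # (seconds since job start, records observed per reduce partition)
    snapshots = [
        (10.0, [5, 1, 1]),
        (20.0, [8, 4, 2]),
        (30.0, [12, 6, 3]),
        (40.0, [16, 8, 4]),
    ]
    soft = soft_decision_point(snapshots)
    hard = hard_decision_point(elapsed=30.0, remaining_tasks=10,
                               seconds_per_task=4.0, lead_time=15.0)
    print(select_decision_point(soft, hard))
```

Under these assumptions, the hard decision point acts as a deadline: if the workload distribution never converges, scheduling still proceeds at the hard point rather than waiting indefinitely.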
20 Claims
1. A method comprising:
receiving a set of task statistics corresponding to task execution within a MapReduce job;
estimating a completion time for a set of tasks to be executed to provide an estimated completion time;
calculating a soft decision point based on a convergence of a workload distribution corresponding to a set of executed tasks;
calculating a hard decision point based on the estimated completion time for the set of tasks to be executed;
determining a selected decision point based on the soft decision point and the hard decision point; and
scheduling upcoming tasks for execution based on the selected decision point.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A computer program product comprising:
one or more computer readable storage media and program instructions stored on the one or more computer readable storage media, the program instructions comprising instructions to:
receive a set of task statistics corresponding to task execution within a MapReduce job;
estimate a completion time for a set of tasks to be executed to provide an estimated completion time;
calculate a soft decision point based on a convergence of a workload distribution corresponding to a set of executed tasks;
calculate a hard decision point based on the estimated completion time for the set of tasks to be executed;
determine a selected decision point based on the soft decision point and the hard decision point; and
schedule upcoming tasks for execution based on the selected decision point.
Dependent claims: 10, 11, 12, 13, 14
15. A system comprising:
one or more computer processors;
one or more computer-readable storage media;
program instructions stored on the computer-readable storage media for execution by at least one of the one or more processors, the program instructions comprising instructions to:
receive a set of task statistics corresponding to task execution within a MapReduce job;
estimate a completion time for a set of tasks to be executed to provide an estimated completion time;
calculate a soft decision point based on a convergence of a workload distribution corresponding to a set of executed tasks;
calculate a hard decision point based on the estimated completion time for the set of tasks to be executed;
determine a selected decision point based on the soft decision point and the hard decision point; and
schedule upcoming tasks for execution based on the selected decision point.
Dependent claims: 16, 17, 18, 19, 20
Specification