Estimating a performance parameter of a job having map and reduce tasks after a failure
Abstract
A job profile includes characteristics of a job to be executed, where the characteristics of the job profile relate to map tasks and reduce tasks of the job, and where the map tasks produce intermediate results based on input data, and the reduce tasks produce an output based on the intermediate results. In response to a failure in a system, numbers of failed map tasks and reduce tasks of the job based on a time of the failure are computed, and numbers of remaining map tasks and reduce tasks are computed. A performance model is provided, and a performance parameter of the job is estimated using the performance model.
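The flow summarized in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patented method: the `JobProfile` fields, the function names, the even spread of tasks across nodes, the linear stage progress, and the makespan-bound formulas used as the "performance model" are all hypothetical choices made for the example, and the shuffle phase is ignored.

```python
from dataclasses import dataclass


@dataclass
class JobProfile:
    """Hypothetical job profile: task counts and per-task durations in seconds."""
    num_map: int
    num_reduce: int
    avg_map: float
    max_map: float
    avg_reduce: float
    max_reduce: float


def stage_bounds(n_tasks, slots, avg, longest):
    """Lower and upper makespan bounds for n_tasks greedily assigned to slots.

    Assumed model, not taken from the text above.
    """
    if n_tasks <= 0:
        return 0.0, 0.0
    return n_tasks * avg / slots, (n_tasks - 1) * avg / slots + longest


def remaining_after_failure(p, t_fail, nodes, map_slots, reduce_slots):
    """Estimate (remaining_map, remaining_reduce) after one node fails at t_fail.

    Assumes tasks are spread evenly over nodes and each stage progresses
    linearly; slot counts are the pre-failure allocation.
    """
    map_end = p.num_map * p.avg_map / map_slots  # assumed end of the map stage
    if t_fail < map_end:
        # Failure during the map stage: map work done on the failed node is lost.
        done_map = int(p.num_map * t_fail / map_end)
        failed_map = done_map // nodes
        done_reduce = 0
        failed_reduce = 0
    else:
        # Failure during the reduce stage: all map output held on the failed
        # node is lost, and the node's share of reduce work done so far is
        # counted as failed (a simplifying assumption).
        done_map = p.num_map
        failed_map = p.num_map // nodes
        reduce_len = p.num_reduce * p.avg_reduce / reduce_slots
        done_reduce = min(p.num_reduce,
                          int(p.num_reduce * (t_fail - map_end) / reduce_len))
        failed_reduce = done_reduce // nodes
    remaining_map = p.num_map - done_map + failed_map
    remaining_reduce = p.num_reduce - done_reduce + failed_reduce
    return remaining_map, remaining_reduce


def estimate_remaining_time(p, remaining_map, remaining_reduce,
                            map_slots, reduce_slots):
    """Return (lower, upper) bounds on the time left, given the current slots."""
    m_low, m_up = stage_bounds(remaining_map, map_slots, p.avg_map, p.max_map)
    r_low, r_up = stage_bounds(remaining_reduce, reduce_slots,
                               p.avg_reduce, p.max_reduce)
    return m_low + r_low, m_up + r_up
```

For a made-up profile `JobProfile(100, 20, 10.0, 20.0, 30.0, 60.0)` on 5 nodes with 10 map and 10 reduce slots, a failure at 50 seconds gives `remaining_after_failure(...) == (60, 20)`. Passing the reduced allocation of 8 slots per stage to `estimate_remaining_time` models the case where the failed resources are not replenished; passing the original 10 models replenishment.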
16 Claims
1. A method comprising:
receiving, by a system comprising a processor, a job profile that includes characteristics of a job to be executed, wherein the characteristics of the job profile relate to map tasks and reduce tasks of the job, wherein the map tasks produce intermediate results based on input data, and the reduce tasks produce an output based on the intermediate results;
in response to an indication of a failure in the system associated with execution of the job, computing, by the system, numbers of failed map tasks and failed reduce tasks of the job based on a time of the failure, wherein the indication of the failure comprises an indication that a computing node of plural computing nodes has failed;
computing, by the system, numbers of remaining map tasks and remaining reduce tasks based on the numbers of failed map tasks and failed reduce tasks;
providing, by the system, a performance model based on the job profile, the computed numbers of remaining map tasks and remaining reduce tasks, and an allocated amount of resources for the job, wherein providing the performance model is based on the allocated amount of resources that has been reduced from a previous allocation of resources due to the failure; and
estimating, by the system, a performance parameter of the job using the performance model.
Dependent claims: 2, 3, 4, 5, 6.
7. An article comprising at least one non-transitory machine-readable storage medium storing instructions that upon execution cause a system to:
receive a job profile that includes characteristics of a job to be executed, wherein the characteristics of the job profile relate to map tasks and reduce tasks of the job, wherein the map tasks produce intermediate results based on input data, and the reduce tasks produce an output based on the intermediate results;
in response to an indication of a failure in the system associated with execution of the job, compute numbers of failed map tasks and failed reduce tasks of the job based on a time of the failure;
compute numbers of remaining map tasks and remaining reduce tasks based on the numbers of failed map tasks and failed reduce tasks;
determine whether resources of the system are to be replenished after the failure;
provide a performance model based on the job profile, the computed numbers of remaining map tasks and remaining reduce tasks, and an allocated amount of resources for the job, wherein providing the performance model is based on the allocated amount of resources after the replenishing of resources; and
estimate a performance parameter of the job using the performance model.
Dependent claims: 8, 9, 10.
11. A system comprising:
a storage medium to store a job profile that includes characteristics of a job to be executed, wherein the characteristics of the job profile relate to map tasks and reduce tasks of the job, wherein the map tasks produce intermediate results based on input data, and the reduce tasks produce an output based on the intermediate results; and
at least one processor to:
detect occurrence of a failure in the system;
in response to detecting the failure, determine whether the failure occurred during a map stage or a reduce stage, wherein the map tasks are to be performed during the map stage, and the reduce tasks are to be performed during the reduce stage;
compute a number of failed map tasks and a number of failed reduce tasks dependent upon whether the failure occurred during the map stage or the reduce stage;
compute a number of remaining map tasks and a number of remaining reduce tasks based on the numbers of failed map tasks and failed reduce tasks;
update a performance model using the numbers of remaining map tasks and remaining reduce tasks; and
estimate a performance parameter of the job using the updated performance model.
Dependent claims: 12, 13, 14.
15. A system comprising:
a storage medium to store a job profile that includes characteristics of a job to be executed, wherein the characteristics of the job profile relate to map tasks and reduce tasks of the job, wherein the map tasks produce intermediate results based on input data, and the reduce tasks produce an output based on the intermediate results; and
at least one processor to:
detect occurrence of a failure in the system;
in response to detecting the failure, compute a number of failed map tasks and a number of failed reduce tasks based on a time of the failure;
compute a number of remaining map tasks and a number of remaining reduce tasks based on the numbers of failed map tasks and failed reduce tasks;
update a performance model using the numbers of remaining map tasks and remaining reduce tasks and in response to an indication that failed resources are not to be replenished, wherein the update of the performance model includes reducing allocations of resources; and
estimate a performance parameter of the job using the updated performance model.
Dependent claims: 16.
Specification