ADAPTIVE DATACENTER TOPOLOGY FOR DISTRIBUTED FRAMEWORKS JOB CONTROL THROUGH NETWORK AWARENESS
First Claim
1. A method, comprising:
- receiving a priority of a distributed computing job, an intermediate traffic type of the distributed computing job, and a set of candidate compute nodes available to process the distributed computing job, the candidate compute nodes each available to process at least one input split of the distributed computing job; and
- selecting a mapper node from the candidate compute nodes, for one of the input splits, wherein the mapper node is selected based on the priority and the intermediate traffic type of the distributed computing job, wherein the mapper compute node is further selected upon determining that:
  - the mapper node is not experiencing an error; and
  - a resource utilization score for the mapper node does not exceed a utilization threshold.
Abstract
Systems, methods, and computer program products to perform an operation comprising receiving a priority of a distributed computing job, an intermediate traffic type of the distributed computing job, and a set of candidate compute nodes available to process the distributed computing job, the candidate compute nodes each available to process at least one input split of the distributed computing job, and selecting a mapper node from the candidate compute nodes, for one of the input splits, wherein the mapper node is selected based on the priority and the intermediate traffic type of the distributed computing job, wherein the mapper compute node is further selected upon determining that the mapper node is not affected by an error, and a resource utilization score for the mapper node does not exceed a utilization threshold.
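The selection step described above can be illustrated with a minimal Python sketch. It is not the patented implementation. The two eligibility checks (no error, utilization score not above the threshold) come from the claim language. Everything else is an assumption made for illustration: the `ComputeNode` fields, the 0.0 to 1.0 utilization scale, the `hops_to_reducers` metric, the traffic-type label `"shuffle-heavy"`, the priority cutoff of 5, and the ranking rule itself, since the text does not say how priority and intermediate traffic type influence the choice.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ComputeNode:
    """Hypothetical view of a candidate compute node."""
    name: str
    has_error: bool          # True if the node is currently experiencing an error
    utilization: float       # resource utilization score (assumed scale: 0.0 to 1.0)
    hops_to_reducers: int    # hypothetical network-distance metric


def is_eligible(node: ComputeNode, threshold: float) -> bool:
    """Apply the two checks from the claim: no error, score not above threshold."""
    return not node.has_error and node.utilization <= threshold


def select_mapper(
    priority: int,
    traffic_type: str,
    candidates: Sequence[ComputeNode],
    threshold: float = 0.8,
) -> Optional[ComputeNode]:
    """Pick a mapper node for one input split, or None if no node qualifies."""
    pool = [n for n in candidates if is_eligible(n, threshold)]
    if not pool:
        return None

    # Assumed ranking rule (not from the source): high-priority or
    # shuffle-heavy jobs prefer network proximity, others prefer the
    # least-loaded node.
    if priority >= 5 or traffic_type == "shuffle-heavy":
        return min(pool, key=lambda n: (n.hops_to_reducers, n.utilization))
    return min(pool, key=lambda n: (n.utilization, n.hops_to_reducers))


nodes = [
    ComputeNode("a", has_error=True, utilization=0.1, hops_to_reducers=1),
    ComputeNode("b", has_error=False, utilization=0.9, hops_to_reducers=1),
    ComputeNode("c", has_error=False, utilization=0.5, hops_to_reducers=3),
    ComputeNode("d", has_error=False, utilization=0.7, hops_to_reducers=2),
]
chosen = select_mapper(priority=9, traffic_type="shuffle-heavy", candidates=nodes)
```

In this example, node `a` is excluded because it reports an error and node `b` because its score exceeds the threshold, so only `c` and `d` are ranked. Filtering before ranking keeps the claim's two hard conditions separate from whatever priority and traffic-type policy a scheduler applies.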
20 Claims
1. A method, comprising:
- receiving a priority of a distributed computing job, an intermediate traffic type of the distributed computing job, and a set of candidate compute nodes available to process the distributed computing job, the candidate compute nodes each available to process at least one input split of the distributed computing job; and
- selecting a mapper node from the candidate compute nodes, for one of the input splits, wherein the mapper node is selected based on the priority and the intermediate traffic type of the distributed computing job, wherein the mapper compute node is further selected upon determining that:
  - the mapper node is not experiencing an error; and
  - a resource utilization score for the mapper node does not exceed a utilization threshold.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system, comprising:
- a computer processor; and
- a memory containing a program, which when executed by the processor, performs an operation comprising:
  - receiving a priority of a distributed computing job, an intermediate traffic type of the distributed computing job, and a set of candidate compute nodes available to process the distributed computing job, the candidate compute nodes each available to process at least one input split of the distributed computing job; and
  - selecting a mapper node from the candidate compute nodes, for one of the input splits, wherein the mapper node is selected based on the priority and the intermediate traffic type of the distributed computing job, wherein the mapper compute node is further selected upon determining that:
    - the mapper node is not experiencing an error; and
    - a resource utilization score for the mapper node does not exceed a utilization threshold.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer program product, comprising:
- computer-readable code that when executed by a processor, performs an operation to optimize placement of a distributed computing job on a plurality of compute nodes, the operation comprising:
  - receiving a priority of a distributed computing job, an intermediate traffic type of the distributed computing job, and a set of candidate compute nodes available to process the distributed computing job, the candidate compute nodes each available to process at least one input split of the distributed computing job; and
  - selecting a mapper node from the candidate compute nodes, for one of the input splits, wherein the mapper node is selected based on the priority and the intermediate traffic type of the distributed computing job, wherein the mapper compute node is further selected upon determining that:
    - the mapper node is not experiencing an error; and
    - a resource utilization score for the mapper node does not exceed a utilization threshold.
Dependent claims: 16, 17, 18, 19, 20
Specification