COST-BASED OPTIMIZATION OF CONFIGURATION PARAMETERS AND CLUSTER SIZING FOR HADOOP
Abstract
Cost-based optimization of configuration parameters and cluster sizing for distributed data processing systems is disclosed. According to an aspect, a method includes receiving at least one job profile of a MapReduce job. The method also includes using the at least one job profile to predict execution of the MapReduce job within a plurality of different predetermined settings of a distributed data processing system. Further, the method includes determining one of the predetermined settings that optimizes performance of the MapReduce job. The method may also include automatically adjusting the distributed data processing system to the determined predetermined setting.
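The flow summarized in the abstract (measure a job, derive a profile, predict execution under several settings, pick the best one) can be illustrated with a short Python sketch. This is an illustration under stated assumptions, not the method of the specification: the names JobProfile, ClusterSetting, profile_from_measurements, predict_runtime and best_setting are hypothetical, the profile holds only task counts and mean task durations, and the cost model is a deliberately simple "waves of tasks" estimate.

```python
from dataclasses import dataclass
from math import ceil
from statistics import mean


@dataclass(frozen=True)
class JobProfile:
    """Hypothetical job profile: task counts and mean per-task durations."""
    num_map_tasks: int
    num_reduce_tasks: int
    map_task_secs: float
    reduce_task_secs: float


@dataclass(frozen=True)
class ClusterSetting:
    """Hypothetical predetermined setting: cluster size and slots per node."""
    name: str
    nodes: int
    map_slots_per_node: int
    reduce_slots_per_node: int


def profile_from_measurements(map_secs, reduce_secs):
    """Derive a profile from measured task durations of one observed run."""
    return JobProfile(
        num_map_tasks=len(map_secs),
        num_reduce_tasks=len(reduce_secs),
        map_task_secs=mean(map_secs) if map_secs else 0.0,
        reduce_task_secs=mean(reduce_secs) if reduce_secs else 0.0,
    )


def predict_runtime(profile, setting):
    """Toy cost model (an assumption): tasks run in waves bounded by slots."""
    map_slots = setting.nodes * setting.map_slots_per_node
    reduce_slots = setting.nodes * setting.reduce_slots_per_node
    map_waves = ceil(profile.num_map_tasks / map_slots)
    reduce_waves = ceil(profile.num_reduce_tasks / reduce_slots)
    return (map_waves * profile.map_task_secs
            + reduce_waves * profile.reduce_task_secs)


def best_setting(profile, settings):
    """Return the candidate setting with the lowest predicted runtime."""
    return min(settings, key=lambda s: predict_runtime(profile, s))


if __name__ == "__main__":
    profile = profile_from_measurements([20.0, 40.0, 30.0, 30.0], [50.0, 70.0])
    candidates = [
        ClusterSetting("small", nodes=1, map_slots_per_node=2, reduce_slots_per_node=1),
        ClusterSetting("large", nodes=2, map_slots_per_node=2, reduce_slots_per_node=1),
    ]
    for s in candidates:
        print(s.name, predict_runtime(profile, s))
    print("chosen:", best_setting(profile, candidates).name)
```

The sketch enumerates a fixed list of candidate settings and minimizes predicted runtime only; a real optimizer could search a much larger configuration space or weigh other objectives such as monetary cost.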
36 Claims
1. A method comprising:
at a processor and memory:
receiving at least one measure of performance of a MapReduce job;
determining a job profile based on the at least one measure of performance; and
providing the job profile for one of user interface and prediction processes.
(Dependent claims: 2-12)
13. A method comprising:
at a processor and memory:
receiving at least one job profile of a MapReduce job;
using the at least one job profile to predict execution of the MapReduce job within a predetermined setting of a distributed data processing system; and
determining performance of the MapReduce job based on the prediction.
(Dependent claims: 14-20)
21. A method comprising:
at a processor and memory:
receiving at least one job profile of a MapReduce job;
using the at least one job profile to predict execution of the MapReduce job within a plurality of different predetermined settings of a distributed data processing system; and
determining one of the predetermined settings that optimizes performance of the MapReduce job.
(Dependent claims: 22-31)
32. A method comprising:
at a processor and memory:
receiving at least one job profile of a MapReduce job; and
using the at least one job profile to determine one of the predetermined settings that optimizes performance of the MapReduce job within a plurality of different predetermined settings of a distributed data processing system.
(Dependent claim: 33)
34. A computing device comprising:
a processor and memory configured to:
receive at least one measure of performance of a MapReduce job;
determine a job profile based on the at least one measure of performance; and
provide the job profile for one of user interface and prediction processes.
35. A computing device comprising:
a processor and memory configured to:
receive at least one job profile of a MapReduce job;
use the at least one job profile to predict execution of the MapReduce job within a predetermined setting of a distributed data processing system; and
determine performance of the MapReduce job based on the prediction.
36. A computing device comprising:
a processor and memory configured to:
receive at least one job profile of a MapReduce job;
use the at least one job profile to predict execution of the MapReduce job within a plurality of different predetermined settings of a distributed data processing system; and
determine one of the predetermined settings that optimizes performance of the MapReduce job.
Specification