TRANSPARENT EFFICIENCY FOR IN-MEMORY EXECUTION OF MAP REDUCE JOB SEQUENCES
Abstract
Executing a map reduce sequence may comprise executing all jobs in the sequence by a collection of a plurality of processes with each process running zero or more mappers, combiners, partitioners and reducers for each job, and transparently sharing heap state between the jobs to improve metrics associated with the job. Processes may communicate among themselves to coordinate completion of map, shuffle and reduce phases, and completion of said all jobs in the sequence.
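The idea of keeping intermediate results on the heap between consecutive jobs can be illustrated with a minimal sketch. It is a single-process simplification, not the claimed multi-process system: the names `run_job` and `run_sequence` and the two example jobs are hypothetical, and combiners, partitioners and inter-process coordination are omitted.

```python
from collections import defaultdict

def run_job(records, mapper, reducer):
    """Run one map, shuffle and reduce pass over an in-memory key-value sequence."""
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)  # shuffle: group values by key
    output = []
    for out_key in sorted(groups):
        output.extend(reducer(out_key, groups[out_key]))
    return output

def run_sequence(records, jobs):
    """Run the jobs in order; each job's output stays in memory for the next job."""
    for mapper, reducer in jobs:
        records = run_job(records, mapper, reducer)
    return records

# Job 1 (hypothetical): count occurrences of each word.
def wc_map(_, line):
    for word in line.split():
        yield word, 1

def wc_reduce(word, counts):
    yield word, sum(counts)

# Job 2 (hypothetical): count how many words share each occurrence count.
def hist_map(_, count):
    yield count, 1

def hist_reduce(count, ones):
    yield count, sum(ones)

lines = [(0, "a b a"), (1, "b c a")]
result = run_sequence(lines, [(wc_map, wc_reduce), (hist_map, hist_reduce)])
print(result)
```

In this sketch the second job consumes the first job's output list directly, so nothing is serialized to or re-read from storage between the two jobs.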
25 Claims
1. A method for executing a map reduce sequence, comprising:
- executing, by one or more processors, all jobs in the sequence by a collection of a plurality of processes with each process running zero or more mappers, combiners, partitioners and reducers for each job, and transparently sharing heap state between the jobs to improve metrics associated with the job; and
- communicating among the processes to coordinate completion of map, shuffle and reduce phases, and completion of said all jobs in the sequence.
Dependent claims: 2-20.
21. A method for executing a map reduce sequence, comprising:
- executing, by one or more processors, all jobs in the sequence by a collection of a plurality of processes with each process running zero or more mappers, combiners, partitioners and reducers for each job;
- employing a cache memory comprising an association of input descriptors to an in-memory representation of a key value sequence obtained by running a corresponding input format descriptor on the input descriptor, and an association of output descriptors with the in-memory representation of the key value sequence consumed by the corresponding output format descriptor to produce the data associated with the output descriptor, to enable transparently sharing heap state between the jobs; and
- communicating among the processes to coordinate completion of map, shuffle and reduce phases, and completion of said all jobs in the sequence.
Dependent claims: 22-25.
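The cache recited in claim 21 can be pictured with a short sketch. It is an illustration under stated assumptions, not the patented implementation: the class `KeyValueCache`, its `read` and `write` methods, the `storage` dictionary standing in for a file system, and the descriptor strings are all hypothetical.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Tuple

KV = Tuple[Hashable, object]

class KeyValueCache:
    """Hypothetical per-process cache from descriptors to in-memory key-value sequences."""

    def __init__(self) -> None:
        self._entries: Dict[str, List[KV]] = {}
        self.loads = 0  # number of times an input format actually ran

    def read(self, descriptor: str,
             input_format: Callable[[str], Iterable[KV]]) -> List[KV]:
        # Run the input format only on a miss; later reads reuse the heap copy.
        if descriptor not in self._entries:
            self.loads += 1
            self._entries[descriptor] = list(input_format(descriptor))
        return self._entries[descriptor]

    def write(self, descriptor: str, records: Iterable[KV],
              output_format: Callable[[str, List[KV]], None]) -> None:
        # Remember the sequence the output format consumes, then produce the data.
        kept = list(records)
        self._entries[descriptor] = kept
        output_format(descriptor, kept)

# A dictionary stands in for persistent storage (assumption for illustration).
storage: Dict[str, List[KV]] = {"in/data": [("k1", 1), ("k2", 2)]}

def read_storage(descriptor: str) -> List[KV]:
    return storage[descriptor]

def write_storage(descriptor: str, records: List[KV]) -> None:
    storage[descriptor] = list(records)

cache = KeyValueCache()
first = cache.read("in/data", read_storage)     # miss: the input format runs
doubled = [(k, v * 2) for k, v in first]        # stands in for job 1
cache.write("out/job1", doubled, write_storage)
second = cache.read("out/job1", read_storage)   # hit: served from memory
```

Here the second job's read of `out/job1` is answered from the cached sequence, so the input format runs only once across both jobs while the output is still written to storage.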
Specification