General Distributed Reduction For Data Parallel Computing
Abstract
General-purpose distributed data-parallel computing using high-level computing languages is described. Data parallel portions of a sequential program written in a high-level language are automatically translated into a distributed execution plan. Map and reduction computations are automatically added to the plan. Patterns in the sequential program can be automatically identified to trigger map and reduction processing. Direct invocation of map and reduction processing is also provided. One or more portions of the reduce computation are pushed to the map stage and dynamic aggregation is inserted when possible. The system automatically identifies opportunities for partial reductions and aggregation, but also provides a set of extensions in a high-level computing language for the generation and optimization of the distributed execution plan. The extensions include annotations to declare functions suitable for these optimizations.
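The key optimization in the abstract, pushing part of the reduction into the map stage, can be sketched as follows. This is an illustrative example rather than the patented implementation, and the function names are hypothetical: decomposing an average into per-partition (sum, count) partials lets each map node collapse its partition to a single pair before any data crosses the network.

```python
# Hypothetical sketch (not the patented implementation): decompose an
# average so a partial reduction runs in the map phase and only compact
# (sum, count) pairs reach the reduce phase.

def map_phase(partition):
    """Map-side partial reduction: one (sum, count) pair per partition."""
    total, count = 0, 0
    for value in partition:
        total += value
        count += 1
    return (total, count)

def reduce_phase(partials):
    """Final reduction: combine the per-partition partial results."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count

# Each partition would normally live on a different cluster node.
partitions = [[1, 2, 3], [4, 5], [6]]
partials = [map_phase(p) for p in partitions]
print(reduce_phase(partials))  # 3.5
```

Because the average itself is not associative, the decomposition into (sum, count) is what makes the map-side partial reduction legal; the annotations mentioned in the abstract are the mechanism by which a function declares that such a decomposition exists.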
Claims (20)
1. A machine implemented method for distributed parallel processing, comprising:
receiving an expression from a sequential program that is executing at a first machine;
automatically generating an execution plan including a map phase and a reduction phase for executing the expression in parallel at nodes of a compute cluster; and
providing the execution plan to an execution engine that controls parallel execution of the expression in the compute cluster.
(Dependent claims 2-9 not shown.)
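The flow of claim 1 (receive an expression, generate a two-phase plan, hand the plan to an engine) can be sketched roughly as follows. All names here are hypothetical, and the single-process "engine" stands in for the compute cluster the claim describes.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class ExecutionPlan:
    """Two-phase plan: a map function applied per input partition,
    then a reduce function over the collected partial results."""
    map_fn: Callable
    reduce_fn: Callable

def generate_plan(expression: dict) -> ExecutionPlan:
    # A real provider would analyze the expression tree captured from
    # the sequential program; here the expression is a plain dict.
    return ExecutionPlan(map_fn=expression["map"], reduce_fn=expression["reduce"])

def execution_engine(plan: ExecutionPlan, partitions: List[list]):
    """Stand-in engine: each map task would run on its own cluster node."""
    partials = [plan.map_fn(part) for part in partitions]
    return plan.reduce_fn(partials)

expression = {"map": sum, "reduce": sum}  # total over all partitions
plan = generate_plan(expression)
print(execution_engine(plan, [[1, 2], [3, 4], [5]]))  # 15
```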
10. One or more processor readable storage devices having processor readable code stored thereon, the processor readable code programs one or more processors to perform a method comprising:
accessing an expression from a sequential program that is executing at a first machine, the expression including related select and groupby functions;
automatically generating an execution plan for parallel processing of the expression by a distributed execution engine, the execution plan including a map phase corresponding to the select function and a reduce phase corresponding to the groupby function, the map phase specifying a partition of an input dataset for distribution to a plurality of nodes in the distributed execution engine;
optimizing the execution plan by decomposing the groupby function and pushing at least a portion of the decomposed groupby function from the reduce phase to the map phase, the at least a portion of the decomposed groupby function including a partial reduction to reduce a size of the input dataset prior to the partition; and
providing the execution plan to the distributed execution engine for controlling parallel execution of the expression.
(Dependent claims 11-14 not shown.)
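Claim 10's optimization, decomposing the groupby and running a partial reduction before the input is partitioned, might look like the following sketch. The names are hypothetical, and a per-key count stands in for the decomposed groupby aggregation.

```python
from collections import defaultdict

def map_phase(records, select):
    """Apply the select projection, then the pushed-down partial
    reduction: per-key counts computed locally, shrinking the data
    before it is partitioned across nodes."""
    partial = defaultdict(int)
    for r in records:
        partial[select(r)] += 1
    return dict(partial)

def partition(partial, n_nodes):
    """Hash-partition the already-reduced key/count pairs by key."""
    shards = [defaultdict(int) for _ in range(n_nodes)]
    for key, count in partial.items():
        shards[hash(key) % n_nodes][key] += count
    return shards

def reduce_phase(shards_for_node):
    """Final reduction: merge partial counts arriving from every mapper."""
    merged = defaultdict(int)
    for shard in shards_for_node:
        for key, count in shard.items():
            merged[key] += count
    return dict(merged)

# Two input partitions; each mapper reduces locally before partitioning.
inputs = [["a", "b", "a"], ["b", "b", "c"]]
mapped = [map_phase(p, select=str.upper) for p in inputs]
shards = [partition(m, n_nodes=2) for m in mapped]
# Node i merges shard i from every mapper.
result = {}
for i in range(2):
    result.update(reduce_phase([s[i] for s in shards]))
print(sorted(result.items()))  # [('A', 2), ('B', 3), ('C', 1)]
```

The point of the claimed ordering is visible in the data volumes: each mapper ships at most one record per distinct key, rather than one record per input element, so the partition step moves less data.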
15. A distributed parallel processing system, comprising:
a compute cluster including a plurality of nodes, each node including at least one processor;
an execution provider that accesses expressions from a sequential program that is running at a first machine, the execution provider determines that at least one expression from the sequential program can be expressed using map and reduction computations, the execution provider automatically generates an execution plan graph and code for vertices in the execution plan graph for parallel processing of the at least one expression, the execution provider generates at least one map phase and at least one reduce phase for the execution plan graph, the code for the vertices in the execution plan graph implements the map and reduction computations; and
an execution engine that receives the execution plan graph and the code from the execution provider and manages parallel execution of the expression in the compute cluster based on the execution plan graph and the code.
(Dependent claims 16-20 not shown.)
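Claim 15's execution plan graph, with vertices that carry the map or reduction code, could be represented roughly as below. This is a hypothetical sketch: the toy engine evaluates the graph in-process, whereas the claimed engine would schedule each vertex on a cluster node.

```python
class Vertex:
    """A node in the execution plan graph: a name, the code it runs,
    and the vertices whose outputs it consumes."""
    def __init__(self, name, fn, inputs=()):
        self.name, self.fn, self.inputs = name, fn, list(inputs)

def build_plan_graph(map_fn, reduce_fn, n_partitions):
    """One map vertex per input partition, all feeding one reduce vertex."""
    maps = [Vertex(f"map-{i}", map_fn) for i in range(n_partitions)]
    return Vertex("reduce", reduce_fn, inputs=maps)

def run(root, partitions):
    """Toy engine: evaluate each input vertex on its partition, then
    the root vertex on the collected partial results."""
    partials = [v.fn(part) for v, part in zip(root.inputs, partitions)]
    return root.fn(partials)

parts = [[2, 4], [6], [8, 10]]
graph = build_plan_graph(map_fn=max, reduce_fn=max, n_partitions=len(parts))
print(run(graph, parts))  # 10
```

Separating the graph (structure) from the vertex code (behavior) mirrors the claim's split between the execution plan graph and the code for its vertices, which the provider hands to the engine together.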
Specification