OPTIMIZING DATA PARTITIONING FOR DATA-PARALLEL COMPUTING

  • US 20130152057A1
  • Filed: 12/13/2011
  • Published: 06/13/2013
  • Est. Priority Date: 12/13/2011
  • Status: Active Grant
First Claim

1. A system for optimizing data partitioning for a distributed execution engine, the system comprising:

    a code/EPG analysis module for deriving properties of a data-parallel program code in each vertex in a corresponding execution plan graph (EPG) compiled from the data-parallel program code;

    a complexity module for at least deriving the computational complexity of each vertex in the EPG;

    a data analysis module for generating a plurality of compact data representations corresponding to an input data for processing by the data-parallel program code;

    a statistics and samples module for determining the relationship between input data size versus computational and input-output (I/O) costs;

    a cost modeling and estimation module for estimating the runtime cost of each vertex in the EPG and the overall runtime cost represented by the EPG; and

    a cost optimization module for determining a data partitioning plan.
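
As an informal illustration only, and not the claimed implementation, the interplay of the cost modeling and cost optimization modules might be sketched as follows. The sketch assumes a greatly simplified EPG in which vertices run one after another, data splits evenly across partitions, and each partition adds a fixed scheduling overhead. The names `Vertex`, `vertex_cost`, `plan_cost`, and `best_partition_count` are hypothetical and do not appear in the patent.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Vertex:
    """One EPG vertex (hypothetical model, not the patent's data structure)."""
    name: str
    # Maps records handled by ONE partition to CPU cost units,
    # standing in for the derived computational complexity.
    complexity: Callable[[float], float]
    # I/O cost units per record read or written by this vertex.
    io_cost_per_record: float


def vertex_cost(v: Vertex, records: float, partitions: int,
                overhead_per_partition: float) -> float:
    """Estimate the runtime cost of one vertex for a given partition count.

    Assumes an even split, so every partition finishes at the same time,
    plus a fixed overhead that grows with the number of partitions.
    """
    per_partition = records / partitions
    compute = v.complexity(per_partition)
    io = v.io_cost_per_record * per_partition
    return compute + io + overhead_per_partition * partitions


def plan_cost(vertices: Sequence[Vertex], records: float, partitions: int,
              overhead_per_partition: float) -> float:
    """Overall cost of the plan, assuming vertices execute sequentially."""
    return sum(
        vertex_cost(v, records, partitions, overhead_per_partition)
        for v in vertices
    )


def best_partition_count(vertices: Sequence[Vertex], records: float,
                         candidates: Sequence[int],
                         overhead_per_partition: float) -> int:
    """Pick the candidate partition count with the lowest estimated cost."""
    return min(
        candidates,
        key=lambda p: plan_cost(vertices, records, p, overhead_per_partition),
    )
```

For example, `best_partition_count([Vertex("scan", lambda n: n, 0.0)], 100, [1, 2, 5, 10, 20, 50], 1.0)` compares the estimated cost at each candidate count. In this toy model, more partitions reduce per-partition work but increase overhead, which is the trade-off a partitioning plan has to balance.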
