×

Optimal operator placement for distributed query processing

  • US 9,892,163 B2
  • Filed: 08/01/2014
  • Issued: 02/13/2018
  • Est. Priority Date: 11/26/2013
  • Status: Active Grant
First Claim
Patent Images

1. A computer program product comprising a non-transitory machine-readable storage medium storing instructions that, when executed by at least one processor, cause the at least one processor to perform operations comprising:

  • generating, for a multi-operation database process to be performed in a distributed database management system comprising a plurality of computing nodes comprising respective data processors and having respective node locations, a plurality of sub-plans, each of the plurality of sub-plans comprising a different distribution of node locations of a plurality of operators among the plurality of computing nodes, the plurality of operators being those necessary to complete the multi-operation database process;

    calculating a total minimum global cost for each sub-plan of the plurality of sub-plans; and

    selecting an optimal plan from the plurality of sub-plans, the optimal plan having a lowest total minimum global cost, the selecting including pruning at least one sub-plan from the plurality of sub-plans by at least eliminating at least one sub-plan having a cost greater than a best sub-tree cost for a location of a node location;

    wherein the plurality of computing nodes includes a first node and a second node, and a first sub-plan of the plurality of sub-plans includes a first join operation and a second join operation, wherein;

    the first join operation comprises transferring a first number of rows of a first table from the second node to the first node, and joining the first number of rows with a second number of rows of a second table at the first node, the first number of rows being greater than the second number of rows, andthe second join operation comprises joining a result of the first join operation with a third number of rows of a third table at the first node, the third number of rows greater than the first number of rows.

View all claims
  • 1 Assignment
Timeline View
Assignment View
    ×
    ×