×

System, method, and computer program product for progressive query processing

  • US 7,716,215 B2
  • Filed: 11/14/2007
  • Issued: 05/11/2010
  • Est. Priority Date: 10/31/2003
  • Status: Expired due to Fees
First Claim
Patent Images

1. A computer-implemented method for accelerating database query processing, the method comprising:

  • placing a number of checkpoints in a particular query execution plan generated for a database query, the particular query execution plan being based on estimated optimization parameter values derived from historic database statistics,wherein a first subset of the number of checkpoints are placed at a first set of points in the particular query execution plan, the first set of points comprising points where an entire intermediate result is materialized before proceeding with further operators in the particular query execution plan, andwherein an explicit materialization is added to the particular query execution plan just before a second subset of the number of checkpoints to cause materialization of an intermediate result, the second subset of the number of checkpoints comprising checkpoints placed on the outer side of nested-loop joins;

    executing the particular query execution plan, wherein the executing selectively retains temporarily materialized views by storing partial query results until a lazy removal condition occurs, the lazy removal conditions comprising (a) updating of at least one table contributing to a materialized view, and (b) determining that new storage space is needed for other materialized views;

    computing, during the executing the particular execution plan upon encountering a given checkpoint within the number of checkpoints, a difference between the estimated optimization parameter values and actual optimization parameter values observed during the executing the particular query execution plan to determine a significance of parameter estimation errors, wherein the actual optimization parameter values comprise cardinality, memory, communication costs, and I/O operations;

    determining, at the given checkpoint and based on the computing, that an amount of query execution time for the particular query execution plan is above a remaining execution time threshold and that the significance of parameter estimation errors is above an error threshold;

    suspending, in response to determining that the amount of query execution time is above the remaining execution time threshold and that the significance of parameter estimation errors is above an error threshold, execution of the particular query execution plan;

    re-optimizing, in response to determining that the amount of query execution time is above the remaining execution time threshold and that the significance of parameter estimation errors is above an error threshold, the particular query, wherein the re-optimizing comprises;

    generating a number of alternative query execution plans, including query execution plans using the intermediate result and query execution plans that do not use the intermediate result;

    assigning a respective execution cost to each alternative query execution plan, the respective execution cost being based upon database execution statistics determined during the executing; and

    choosing, based on the respective execution cost of each alternative query execution plan, an optimal alternative query execution plan as a re-optimized query plan;

    restarting query execution with the re-optimized query plan, wherein execution of the re-optimized query plan selectively reuses the intermediate results;

    buffering rows produced by the executing until the checkpoint is evaluated so as to enable pipelining with some delay, wherein exhaustion of temporary space used for the buffering triggers the re-optimization;

    transferring, prior to the restarting, each result row produced by the executing the particular query execution plan to a parent operator in a pipelined manner, the transferring comprising storing respective identifiers for the each result row into a side table using an insert query execution operator just below a return query plan operator;

    executing, after the restarting, an anti join between the side table and a new result stream to compensate for the rows returned, wherein executing the anti-join comprises identifying records in the side table by a respective unique derived record identifier assigned to respective records within the side table during the executing the particular query execution plan;

    selectively returning records at each execution cycle when the records have not previously been returned; and

    outputting query results.

View all claims
  • 1 Assignment
Timeline View
Assignment View
    ×
    ×