×

Progressive refinement of a federated query plan during query execution

  • US 7,877,381 B2
  • Filed: 03/24/2006
  • Issued: 01/25/2011
  • Est. Priority Date: 03/24/2006
  • Status: Expired due to Fees
First Claim
Patent Images

1. A computer implemented method for progressively refining a federated query execution plan in a federated data system, the computer implemented method comprising:

  • executing a federated query according to a federated query execution plan for the federated data system, wherein the federated data system has a federated server, a plurality of data sources including a federated database and a plurality of remote databases, and further wherein the federated data system sends distributed requests to local databases and remote databases for processing, and wherein the remote databases do not store statistical data so that the federated query execution plan is initially compiled without knowledge about the remote databases; and

    refining the federated query execution plan during execution of the federated query, wherein refining the federated query execution plan during execution of the federated query comprises;

    placing at least one re-optimization constraint, in the federated query execution plan, at a remote database where remote data, including a statistical data, is received and further processed by the federated data system during compilation of the federated query;

    responsive to the at least one re-optimization constraint being violated during execution of the federated query execution plan, optimizing the federated query execution plan by refining a model of the federated query execution plan based on the statistical data received from the remote databases during a partial execution of the federated query plan in order to compensate for a lack of data availability, accuracy, or completeness of federated statistics in regard to the remote databases;

    recompiling the federated query; and

    re-executing the federated query execution plan; and

    wherein a plurality of checkpoints are placed in the query execution plan after a first round of re-optimization so that so that all other queries to any remote database are configured to also trigger re-optimizations and all intermediate results for all rounds of re-optimization are mapped back into the query execution plan; and

    wherein a CHECK operator identifies whether an actual cardinality is within a validity range and triggers at least one re-optimization when the actual cardinality is not within the validity range, and further comprising;

    repeating the placing, optimizing, recompiling, and re-executing steps until a query result is achieved;

    wherein in a plurality of multiple re-optimizations of a query with a limited initial knowledge about a plurality of data, each re-optimization adds an actual knowledge about a single table only, and where |σ

    (p6)T|>



    (p4,5)S|>



    (p1,2,3)R| and an initial query execution plan chooses a plurality of wrong physical join operators, but chooses a correct join order, and further assuming that |σ

    (p1,2,3)R| is larger than a default estimate for a plurality of accesses to S and T, a re-optimization places a partial result from an access to a table R in a last join so that a re-optimized plan using a more efficient join operator for a second join results in a worse overall query performance; and

    wherein compensation for the worse overall query performance is made by introducing several rounds of re-optimization, each round adding an additional knowledge about additional parts of the plan until a whole plan is covered with an actual knowledge and a final access plan is developed in accordance with the following sequence;




    (p1,2,3)R|<



    (p4,5)S|<



    (p6)T|=>

    ((

    S


    T) 



    1)


    (p4,5)S|<



    (p6)T|<

    |Tempσ

    (p1,2,3)R|=>

    ((

    T


    R) 



    2)


    (p6)T|<

    |Tempσ

    (p1,2,3)R|<

    |Tempσ

    (p4,5)S|=>

    ((

    R


    S) 



    3)
    |Tempσ

    (p1,2,3)R|<

    |Tempσ

    (p4,5)S|<

    |Tempσ

    (p6)T|=>

    ((

    S


    T) 



    4),so that the join order in

         4) is identical to the join order in

         1).

View all claims
  • 1 Assignment
Timeline View
Assignment View
    ×
    ×