Iterative dynamic programming system for query optimization with bounded complexity
First Claim
1. A method for selecting an execution plan P for a N-way join query in a computer-implemented database system having a plurality of stored database relations, wherein said N-way join query is represented by a join graph G connecting each Ri of a plurality N+1 of relation nodes by a predicate edgeEij to another Rj of said relation node plurality to form a plurality N of connected node pairs (Ri, Ri) and wherein 1≦
- i,j≦
N+1and N>
1 are positive integers, said method comprising the steps of;
(a) selecting from said join graph G for a threshold value TL a subgraph GL connecting a plurality NL ≦
TL of said relation nodes;
(b) selecting the optimal execution plan PL from among all feasible execution plans for joining said subgraph GL of said relation nodes;
(c) replacing said relation node subgraph GL in said join graph G with a relation node RL representing the relation produced by said optimal execution plan PL ; and
(d) repeating said selecting steps (a)-(b) and said replacing step (c) until said execution plan P is selected for said join graph G.
1 Assignment
0 Petitions
Accused Products
Abstract
A query optimizer for optimizing join queries in a relational database system by iterative application of dynamic programming (DP) to select optimal subgraph join execution plans. Unlike traditional DP optimization methods, bounds on search space time and space complexity can be established and adjusted by imposing a subgraph threshold. Each bounded subgraph is selected using a greedy heuristic (GH) hill-climbing procedure or other similarly useful technique to build a low-cost execution plan. The low-cost GH subgraph execution plan is then discarded in favor of an optimal DP subgraph execution plan selected by a dynamic programming optimizer for each subgraph identified by the bounded GH optimization process. The complexity bound may be dynamically tuned to improve execution plan quality responsive to changes in query complexity.
161 Citations
6 Claims
-
1. A method for selecting an execution plan P for a N-way join query in a computer-implemented database system having a plurality of stored database relations, wherein said N-way join query is represented by a join graph G connecting each Ri of a plurality N+1 of relation nodes by a predicate edgeEij to another Rj of said relation node plurality to form a plurality N of connected node pairs (Ri, Ri) and wherein 1≦
- i,j≦
N+1and N>
1 are positive integers, said method comprising the steps of;(a) selecting from said join graph G for a threshold value TL a subgraph GL connecting a plurality NL ≦
TL of said relation nodes;(b) selecting the optimal execution plan PL from among all feasible execution plans for joining said subgraph GL of said relation nodes; (c) replacing said relation node subgraph GL in said join graph G with a relation node RL representing the relation produced by said optimal execution plan PL ; and (d) repeating said selecting steps (a)-(b) and said replacing step (c) until said execution plan P is selected for said join graph G. - View Dependent Claims (2)
- i,j≦
-
3. A method for selecting an execution plan P for a N-way join query in a computer-implemented database system having a plurality of stored database relations, wherein said N-way join query is represented by a join graph G connecting each R1 of a plurality N+1 of relation nodes by a predicate edge Eij to another Rj of said relation node plurality to form a plurality N of connected node pairs (Ri, Rj), wherein each said relation node Ri has a joinder number Ji corresponding to the number of said relations embraced thereby and wherein 1≦
- i, j≦
N+1 and N>
1 are positive integers, said method comprising the steps of;(a) selecting from said join graph G the connected node pair (Ri, Rj) having the lowest join execution cost of all said connected node pairs in said join graph G; (b) comparing to a threshold TL the sum Jk =Ji +Jj of the joinder numbers associated with said selected node pair (Ri, Rj), wherein Ji ≧
Jj ;(c) if Jk >
TL selecting the optimal execution plan Pi from among all feasible execution plans for joining the plurality Ji of said relations embraced by said relation node Ri and resetting said joinder number Ji =1 therefor to indicate that said relation node Ri now embraces only the relation produced by said plan Pi, wherein 1≦
k≦
N+1 is a positive integer;
otherwise(d) replacing said connected node pair (Ri, Rj) with a replacement node Rk having a joinder number Jk =Ji +Jj in said join graph G; and (e) if at least one said connected node pair (Ri, Rj) remains in said join graph G, repeating said steps (a)-(d);
otherwise(f) repeating said selecting step (c) for said replacement node Rk to select said execution plan P for said join graph G.
- i, j≦
-
4. A query optimizer subsystem for selecting an execution plan P for a N-way join query in a computer-implemented database system having a data processor and a plurality of stored database relations, said subsystem comprising:
-
input means for accepting said N-way join query represented as a join graph G connecting each Ri of a plurality N+1 of relation nodes by a predicate edge Eij to another Rj of said relation node plurality to form a plurality N of connected node pairs (Ri, Rj) and wherein 1<
i, j<
N+1 and N>
1 are positive integers;greedy heuristic means coupled to said input means for selecting from said join graph G for a threshold value TL a subgraph GL connecting a plurality NL ≦
TL of said relation nodes according to a greedy heuristic procedure for minimizing the join execution cost for said subgraph GL ;dynamic programming means coupled to said greedy heuristic means for selecting the optimal execution plan PL from among all feasible execution plans for joining said subset GL of said relation nodes; merger means coupled to said greedy heuristic means for replacing said relation node subgraph GL in said join graph G with a relation node RL representing the relation produced by said optimal execution plan PL ; and
iteration means in said data processor for causing said greedy heuristic means, said dynamic programming means and said merger means to repeatedly select and replace said subgraph GL until said optimal execution plan P is selected for said join graph G.
-
-
5. A query optimizer subsystem for selecting an execution plan P for a N-way join query in a computer-implemented database system having a data processor and a plurality of stored database relations, said subsystem comprising:
-
input means in said data processor for accepting said N-way join query represented as a join graph G connecting each Ri of a plurality N+1 of relation nodes by a predicate edge Eij to another Ri of said relation node plurality to form a plurality N of connected node pairs (Ri, Rx), wherein each said relation node Ri has a joinder number Ji corresponding to the number of said relations embraced thereby and wherein 1≦
i, j≦
N+1 and N>
1 are positive integers;two-way plan selection means coupled to said input means for selecting from said join graph G the connected node pair (Ri, Rj) having the lowest join execution cost of all said connected node pairs in said join graph G; threshold means coupled to said two-way plan selection means for comparing to a threshold TL the sum Jk =Ji +Jj of the joinder numbers associated with said selected node pair (Ri, Rx), wherein Ji ≧
Jj ;dynamic programming means coupled to said thresholding means for selecting, when Jk >
TL, the optimal execution plan Pi from among all feasible execution plans for joining the plurality Ji of said relations embraced by said relation node Ri and for resetting said joinder number Ji =1 therefor to indicate that said relation node Ri now embraces only the relation produced by said plan Pi, wherein 1≦
k≦
N+l is a positive integer;
first merger means coupled to said thresholding means for replacing, when Jk ≦
TL, said connected node pair (R1, Rj) with a replacement node Rk having a joinder number Jk =Ji +Jj in said join graph G; anditeration means for causing said two-way plan selection means, said thresholding means, said dynamic programming means and said merger means to repeatedly select, compare, select and replace said connected node pairs until said execution plan P is selected for said join graph G.
-
-
6. A computer program product for use with a query optimizer system for selecting an execution plan P for a N-way join query in a computer-implemented database system having a plurality of stored database relations, wherein said N-way join query is represented by a join graph G connecting each Ri of a plurality N+1 of relation nodes by a predicate edge Eij to another Ri of said relation node plurality to form a plurality N of connected node pairs (Ri, Rl) and wherein 1<
- i, j<
N+1 and N>
1 are positive integers, said computer program product comprising;a recording medium; first means, recorded on said recording medium, for directing said system to select from said join graph G for a threshold value TL a subgraph GF connecting a plurality NL ≦
TL of said relation node;second means, recorded on said recording medium, for directing said system to select the optimal execution plan PL from among all feasible execution plans for joining said subgraph GL of said relation nodes; means, recorded on said recording medium, for directing said system to replace said relation node subgraph GL in said join graph G with a relation node RL representing the relation produced by said optimal execution plan PL ; and
means, recorded on said recording medium, for directing said system to repeat operation of said first means for directing to select, said second means for directing to select and said means for directing to replace until said execution plan P is selected for said join graph G.
- i, j<
Specification