Reducing query response time using tree balancing
First Claim
1. In a multidatabase system (MDBS) for accessing data in a plurality of relational computer databases on a distributed network of database machines, a method of structuring a database query to reduce query response time using a two phase optimization, comprising the steps of:
- inputting the database query;
as a first phase, optimization,transforming the database query into a left deep join query tree having a root query, a plurality of subordinate (descendant) query nodes and a plurality of table nodes, each subordinate query node having a left child subtree and a right child subtree;
as a second phase optimization.estimating a response time for the root query and for each of the plurality of subordinate query nodes;
measuring a response time to access each of the plurality of table nodes and a cost to access each subtree;
balancing the left deep join query tree into a bushy query tree, wherein the cost for each left child subtree is substantially equal to the cost for the right child subtree; and
executing the database query in one or more of said relational databases in accordance with an execution plan operating according to the balanced bushy query tree.
3 Assignments
0 Petitions
Accused Products
Abstract
A method for optimizing data retrieval from a multidatabase system by restructuring a database query tree to optimize query response time in a two step optimization process. First, the query tree is transformed into a left deep join tree having a root query, a plurality of subordinate (descendant) query nodes and a plurality of table nodes, each subordinate query node having a left child subtree and a right child subtree. This transformation is usually the result of a first optimization scheme such as System-R. A response time for the root query and for each of the plurality of subordinate query nodes is estimated and access response times to each table node and subtree are estimated. Then, this data is utilized in the balancing of the left deep join query tree so that the cost for access to each left child subtree is substantially equal to the cost for the right child subtree. This balancing step encompasses the second phase of the query tree optimization process and includes using transformation processes such top-down, bottom-up, and a hybrid of the first two. Finally, the query is executed in a relational database to retrieve data responsive to the query in accordance with an execution plan operating according to the balanced query tree.
-
Citations
11 Claims
-
1. In a multidatabase system (MDBS) for accessing data in a plurality of relational computer databases on a distributed network of database machines, a method of structuring a database query to reduce query response time using a two phase optimization, comprising the steps of:
-
inputting the database query; as a first phase, optimization, transforming the database query into a left deep join query tree having a root query, a plurality of subordinate (descendant) query nodes and a plurality of table nodes, each subordinate query node having a left child subtree and a right child subtree; as a second phase optimization. estimating a response time for the root query and for each of the plurality of subordinate query nodes; measuring a response time to access each of the plurality of table nodes and a cost to access each subtree; balancing the left deep join query tree into a bushy query tree, wherein the cost for each left child subtree is substantially equal to the cost for the right child subtree; and executing the database query in one or more of said relational databases in accordance with an execution plan operating according to the balanced bushy query tree. - View Dependent Claims (2, 3, 4, 5)
-
-
6. An apparatus, comprising:
-
(A) a storage medium; (B) a computer executable program stored in the storage medium that structures a database query using a two phase optimization to reduce query response time for accessing data in a plurality of relational databases on a distributed network, comprising (a) a first set of instructions that input the database query; as a first phase optimization, (b) a second set of instructions that transform the database query into a left deep join query tree having a root query, a plurality of subordinate (descendant) query nodes and a plurality of table nodes, each subordinate query node having a left child subtree and a right child subtree; as a second phase optimization, (c) a third set of instructions that estimate a response time for the root query and for each of the plurality of subordinate query nodes; (d) a fourth set of instructions that measure a response time to access each of the plurality of table nodes and a cost to access each subtree; (e) a fifth set of instructions that balance the left deep join query tree into a bushy query tree, wherein the cost for each left child subtree is substantially equal to the cost for the right child subtree; and (f) a sixth set of instructions that execute the database query in one or more of said relational databases in accordance with an execution plan operating according to the balanced bushy query tree. - View Dependent Claims (7, 8, 9, 10, 11)
-
Specification