Techniques for bushy tree execution plans for snowstorm schema
First Claim
1. A computer-implemented method, comprising steps of:
- in response to determining that a particular query refers to one or more fact tables and one or more dimension tables, generating a transformed query based on the particular query to force a bushy tree execution to compute the results for the particular query;
wherein the particular query specifies at least three join predicates;
wherein the at least three join predicates include a first join predicate, a second join predicate, and a third join predicate;
wherein a table referenced in the first join predicate is also referenced in the second join predicate;
wherein a table referenced in the second join predicate is also referenced in the third join predicate;
wherein generating the transformed query includes;
enclosing, in an unmergeable view, a join operation based on the third join predicate;
modifying the second join predicate to reference the unmergeable view;
determining a first execution cost of the transformed query, wherein determining the first execution cost includes generating and evaluating at least one execution plan for computing the transformed query;
determining a second execution cost of at least one version of the particular query, said version of the particular query being either (1) the particular query or (2) another transformed query based on the particular query, wherein determining the second execution cost includes generating and evaluating at least one execution plan of said at least one version of the particular query;
performing a comparison of the first execution cost of the transformed query and the second execution cost of said at least one version of the particular query;
based on the comparison, selecting the transformed query as an optimized version of the particular query; and
wherein the steps are performed by one or more computing devices in response to executing the code of a query optimizer.
1 Assignment
0 Petitions
Accused Products
Abstract
Methods for transforming a query to simulate a bushy tree execution plan for queries containing joins in series are provided. Left deep tree execution plans are supported by most relational database systems but are inefficient at processing queries directed to databases with snowstorm schema. A snowstorm schema contains several large fact tables and many smaller dimension tables, which make reference to one another. Bushy tree execution plans can be much more efficient for processing queries to snowstorm schema. The decision to choose between left-deep and bushy tree execution plans are based on the relative costs of the two execution plans. The methods provided transform queries which are otherwise executed with left deep tree execution plans into queries which are executed with simulated bushy tree execution plans.
-
Citations
14 Claims
-
1. A computer-implemented method, comprising steps of:
-
in response to determining that a particular query refers to one or more fact tables and one or more dimension tables, generating a transformed query based on the particular query to force a bushy tree execution to compute the results for the particular query; wherein the particular query specifies at least three join predicates; wherein the at least three join predicates include a first join predicate, a second join predicate, and a third join predicate; wherein a table referenced in the first join predicate is also referenced in the second join predicate; wherein a table referenced in the second join predicate is also referenced in the third join predicate; wherein generating the transformed query includes; enclosing, in an unmergeable view, a join operation based on the third join predicate; modifying the second join predicate to reference the unmergeable view; determining a first execution cost of the transformed query, wherein determining the first execution cost includes generating and evaluating at least one execution plan for computing the transformed query; determining a second execution cost of at least one version of the particular query, said version of the particular query being either (1) the particular query or (2) another transformed query based on the particular query, wherein determining the second execution cost includes generating and evaluating at least one execution plan of said at least one version of the particular query; performing a comparison of the first execution cost of the transformed query and the second execution cost of said at least one version of the particular query; based on the comparison, selecting the transformed query as an optimized version of the particular query; and wherein the steps are performed by one or more computing devices in response to executing the code of a query optimizer. - View Dependent Claims (2, 3, 4, 5, 6, 7)
-
-
8. A non-transitory computer-readable storage medium storing instructions, wherein the instructions include instructions which, when executed by one or more processors, cause the one or more processors to perform steps comprising:
-
in response to determining that a particular query refers to one or more fact tables and one or more dimension tables, generating a transformed query based on a particular query to force a bushy tree execution to compute the results for the particular query; wherein the particular query specifies at least three join predicates; wherein the at least three join predicates include a first join predicate, a second join predicate, and a third join predicate; wherein a table referenced in the first join predicate is also referenced in the second join predicate; wherein a table referenced in the second join predicate is also referenced in the third join predicate; wherein generating the transformed query includes; enclosing, in an unmergeable view, a join operation based on the third join predicate; modifying the second join predicate to reference the unmergeable view; determining a first execution cost of the transformed query, wherein determining the first execution cost includes generating and evaluating at least one execution plan for computing the transformed query; determining a second execution cost of at least one version of the particular query, said version of the particular query being either (1) the particular query or (2) another transformed query based on the particular query, wherein determining the second execution cost includes generating and evaluating at least one execution plan of said at least one version of the particular query; performing a comparison of the first execution cost of the transformed query and the second execution cost of said at least one version of the particular query; based on the comparison, selecting the transformed query as an optimized version of the particular query; and wherein the steps are performed by one or more computing devices in response to executing the code of a query optimizer. - View Dependent Claims (9, 10, 11, 12, 13, 14)
-
Specification