Compile-time optimizations of queries with SQL spreadsheet
First Claim
1. A method for determining orders for evaluating a set of formulas defined by a queries, the method comprising the steps of:
- examining a query;
wherein said query evaluates to a relation;
wherein the query defines an array and a set of formulas that reference the array;
wherein the array has one or more dimensions that correspond to columns of the relation;
wherein a subset of formulas of the set of formulas require cells corresponding to a range of values of said dimension;
grouping, based one or more criteria, the set of formulas into groups of formulas, wherein the groups are associated with an order, and wherein said one or more criteria include that, for each group of said groups of formulas, evaluation of any formula in said each group does not depend;
on another formula in said each group, and does not depend on another formula in a group higher in said order than said each group; and
evaluating in said order each group of said groups of formula.
1 Assignment
0 Petitions
Accused Products
Abstract
Described herein are optimizations and execution strategies for spreadsheet extensions to SQL. The partitioning of data, as specified in a spreadsheet clause, provides a way to parallelize the computation of spreadsheet and to provide and improve scalability. Even if the partitioning is not explicitly specified in the spreadsheet clause, the database optimizer can automatically infer the partitioning in some cases. Efficient hash based access structures on relations can be used for symbolic array addressing, enabling fast computation of formulas. When rewriting SQL statements, formulas whose results are not referenced in outer blocks can be removed from the spreadsheet clause, thus removing unnecessary computations. The predicates from other query blocks can be moved inside query blocks with spreadsheets clauses, thus considerably reducing the amount of data to be processed. Conditions for validity of this transformation are given.
59 Citations
22 Claims
-
1. A method for determining orders for evaluating a set of formulas defined by a queries, the method comprising the steps of:
-
examining a query;
wherein said query evaluates to a relation;
wherein the query defines an array and a set of formulas that reference the array;
wherein the array has one or more dimensions that correspond to columns of the relation;
wherein a subset of formulas of the set of formulas require cells corresponding to a range of values of said dimension;
grouping, based one or more criteria, the set of formulas into groups of formulas, wherein the groups are associated with an order, and wherein said one or more criteria include that, for each group of said groups of formulas, evaluation of any formula in said each group does not depend;
on another formula in said each group, and does not depend on another formula in a group higher in said order than said each group; and
evaluating in said order each group of said groups of formula. - View Dependent Claims (2, 3)
-
-
4. A method for rewriting queries, the method comprising the computer implemented steps of:
-
examining a first query that references a relation to which a second query evaluates;
wherein the second query defines an array and a first set of formulas that reference the array;
wherein the array has one or more dimensions;
wherein the first query includes one or more predicates;
wherein the second query does not include the one or more predicates;
determining whether one or more criteria for rewriting said first query or second query are satisfied; and
if said one or more criteria for rewriting said first query or second query are satisfied, then making, based on said one or more predicates included in said first query, modifications to said second query involving one or more predicate conditions; and
generating a rewritten query based on the modifications to said second query. - View Dependent Claims (5, 6, 7, 8, 9, 10, 11, 12, 13, 14)
-
-
15. A method for rewriting queries, the method comprising the computer implemented steps of:
-
examining a first query that defines an array and a formula that references the array;
wherein the formula has a right side and a left side;
determining whether one or more criteria for rewriting said first query are satisfied;
wherein said one or more criteria include that;
said left side and said right side do not depend on each other, and all aggregate functions referenced on the right side can be rewritten using a window function; and
if said one or more criteria for rewriting said first query are satisfied, then generating a rewritten formula that uses a window function.
-
-
16. A method for rewriting queries, the method comprising the computer implemented steps of:
-
examining a first query that references a relation to which a second query evaluates;
wherein the second query defines an array and a first set of formulas that reference the array;
wherein the first query includes one or more predicates;
wherein the second query does not include the one or more predicates;
generating a rewritten set of formulas by rewriting, based on said one or more predicates, said first set of formulas to produce said rewritten set of formulas;
and generating a rewritten query based on the rewritten set of formulas. - View Dependent Claims (17, 18, 19, 20, 21, 22)
-
Specification