Rewriting a query in terms of a summary based on one-to-one and one-to-many losslessness of joins
First Claim
1. A method for processing queries, the method comprising the steps of:
- receiving a query that does not reference a particular materialized view;
determining whether the particular materialized view satisfies each condition in a set of conditions;
the set of conditions at least including a condition that said materialized view reflects all rows that exist in a common section;
wherein said common section is common to both said materialized view and said query; and
if said materialized view satisfies each condition in said set of conditions, then rewriting said query to produce a rewritten query that references said materialized view.
2 Assignments
0 Petitions
Accused Products
Abstract
A method and system are provided for processing queries. According to one aspect of the invention, a query that does not reference a particular materialized view is rewritten to reference the materialized view. In particular, upon receiving the query, it is determined whether the particular materialized view satisfies each condition in a set of conditions, where the set of conditions at least includes a condition that the materialized view reflects all rows that exist in a common section. The common section is a section of the query that is common to both the materialized view and the query. If the materialized view satisfies each condition in the set of conditions, then the query is rewritten to produce a rewritten query that references the materialized view. The materialized view may be a summary table that includes a summary column. The summary column contains values generated by aggregating values contained in rows produced by a one-to-many lossless join. The one-to-many lossless join is not in the common section. The query includes a cumulative aggregate function. Under these conditions, the method includes generating results of the cumulative aggregate function in the query by dividing values from the summary column by scaling factors.
105 Citations
26 Claims
-
1. A method for processing queries, the method comprising the steps of:
-
receiving a query that does not reference a particular materialized view;
determining whether the particular materialized view satisfies each condition in a set of conditions;
the set of conditions at least including a condition that said materialized view reflects all rows that exist in a common section;
wherein said common section is common to both said materialized view and said query; and
if said materialized view satisfies each condition in said set of conditions, then rewriting said query to produce a rewritten query that references said materialized view. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13)
the set of conditions includes a condition that each join in a set of non-matching joins is lossless; and
the step of determining whether the particular materialized view satisfies each condition in a set of conditions includes the steps of;
determining the set of non-matching joins, wherein said set of non-matching joins is established to be all joins a) used to create said particular materialized view, but b) not referenced by said query; and
determining whether each join in said set of non-matching joins is lossless.
-
-
3. The method of claim 1 further including determining whether said materialized view is lossless relative to a common section with said query by performing the steps of:
-
generating a join graph of said particular materialized view;
wherein the join graph includes a portion that corresponds to said common section and one or more subtrees that extend from said common section;
determining whether, within each subtree of said one or more subtrees, any joins that are not lossless are preceded by outer joins; and
determining that said particular materialized view is lossless relative to said common section if, within each subtree of said one or more subtrees, all joins that are not lossless are preceded by outer joins.
-
-
4. The method of claim 1 wherein:
-
the common section includes a join that corresponds to a first join in the materialized view and a second join in the query;
the join type of the first join is different than the join type of the second join; and
the set of conditions includes a condition that the join type of the second join is derivable from the join type of the first join.
-
-
5. The method of claim 1 wherein the step of determining whether said materialized view reflects all rows that exist in a common section includes the steps of:
-
prior to receiving said query, empirically testing whether a particular join in said materialized view is lossless, wherein said particular join does not belong to said common section;
storing metadata that indicates whether the particular join is lossless; and
in response to receiving said query, inspecting said metadata to determine whether said particular join is lossless.
-
-
6. The method of claim 5 wherein the step of empirically testing whether a particular join in said materialized view is lossless includes the steps of:
-
empirically testing whether the particular join is lossless when the materialized view is initially built; and
verifying that the particular join remains lossless when the materialized view is refreshed.
-
-
7. The method of claim 1 wherein:
-
the materialized view is a summary table that includes a summary column;
the summary column contains values generated by aggregating values contained in rows produced by a one-to-many lossless join;
the one-to-many lossless join is not in the common section;
the query includes a cumulative aggregate function; and
the method includes generating results of said cumulative aggregate function in said query by dividing values from said summary column by scaling factors.
-
-
8. The method of claim 7 wherein:
-
the cumulative aggregate function in the query has a first argument;
the step of rewriting said query to produce a rewritten query that references said materialized view includes replacing the first argument with a second argument; and
the second argument specifies that said values from said summary column are to be divided by said scaling factors.
-
-
9. The method of claim 7 further comprising the steps of:
-
generating said scaling factors prior to receiving said query; and
storing said scaling factors persistently.
-
-
10. The method of claim 9 wherein the step of storing said scaling factors persistently includes storing said scaling factors in a column of said summary table.
-
11. The method of claim 7 wherein the step of rewriting said query includes modifying said query to cause said scaling factors to be generated on-the-fly during execution of said rewritten query.
-
12. The method of claim 7 wherein:
-
said one-to-many lossless join is performed by combining child-side rows with parent side rows; and
the method further comprises the step of generating said scaling factors by counting how many parent side rows combine with each join value from said child-side rows.
-
-
13. The method of claim 7 wherein:
-
the summary column contains values generated by aggregating values contained in rows produced by a plurality of one-to-many lossless joins;
the plurality of one-to-many lossless joins are not in the common section; and
the method includes generating results of said cumulative aggregate function in said query by dividing values from said summary column by values that are generated by multiplying together scaling factors that correspond to each one-to-many lossless join of said plurality of one-to-many lossless joins.
-
-
14. A computer-readable medium carrying one or more sequences of instructions for processing queries, wherein execution of the one or more sequences of instructions by one or more processors causes the one or more processors to perform the steps of:
-
receiving a query that does not reference a particular materialized view;
determining whether the particular materialized view satisfies each condition in a set of conditions;
the set of conditions at least including a condition that said materialized view reflects all rows that exist in a common section;
wherein said common section is common to both said materialized view and said query; and
if said materialized view satisfies each condition in said set of conditions, then rewriting said query to produce a rewritten query that references said materialized view. - View Dependent Claims (15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26)
the set of conditions includes a condition that each join in a set of non-matching joins is lossless; and
the step of determining whether the particular materialized view satisfies each condition in a set of conditions includes the steps of;
determining the set of non-matching joins, wherein said set of non-matching joins is established to be all joins a) used to create said particular materialized view, but b) not referenced by said query; and
determining whether each join in said set of non-matching joins is lossless.
-
-
16. The computer-readable medium of claim 14, the steps further including determining whether said materialized view is lossless relative to a common section with said query by performing the steps of:
-
generating a join graph of said particular materialized view;
wherein the join graph includes a portion that corresponds to said common section and one or more subtrees that extend from said common section;
determining whether, within each subtree of said one or more subtrees, any joins that are not lossless are preceded by outer joins; and
determining that said particular materialized view is lossless relative to said common section if, within each subtree of said one or more subtrees, all joins that are not lossless are preceded by outer joins.
-
-
17. The computer-readable medium of claim 14 wherein:
-
the common section includes a join that corresponds to a first join in the materialized view and a second join in the query;
the join type of the first join is different than the join type of the second join; and
the set of conditions includes a condition that the join type of the second join is derivable from the join type of the first join.
-
-
18. The computer-readable medium of claim 14 wherein the step of determining whether said materialized view reflects all rows that exist in a common section includes the steps of:
-
prior to receiving said query, empirically testing whether a particular join in said materialized view is lossless, wherein said particular join does not belong to said common section;
storing metadata that indicates whether the particular join is lossless; and
in response to receiving said query, inspecting said metadata to determine whether said particular join is lossless.
-
-
19. The computer-readable medium of claim 18 wherein the step of empirically testing whether a particular join in said materialized view is lossless includes the steps of:
-
empirically testing whether the particular join is lossless when the materialized view is initially built; and
verifying that the particular join remains lossless when the materialized view is refreshed.
-
-
20. The computer-readable medium of claim 14 wherein:
-
the materialized view is a summary table that includes a summary column;
the summary column contains values generated by aggregating values contained in rows produced by a one-to-many lossless join;
the one-to-many lossless join is not in the common section;
the query includes a cumulative aggregate function; and
the steps further include generating results of said cumulative aggregate function in said query by dividing values from said summary column by scaling factors.
-
-
21. The computer-readable medium of claim 20 wherein:
-
the cumulative aggregate function in the query has a first argument;
the step of rewriting said query to produce a rewritten query that references said materialized view includes replacing the first argument with a second argument; and
the second argument specifies that said values from said summary column are to be divided by said scaling factors.
-
-
22. The computer-readable medium of claim 20, wherein the steps further comprise of:
-
generating said scaling factors prior to receiving said query; and
storing said scaling factors persistently.
-
-
23. The computer-readable medium of claim 22 wherein the step of storing said scaling factors persistently includes storing said scaling factors in a column of said summary table.
-
24. The computer-readable medium of claim 20 wherein the step of rewriting said query includes modifying said query to cause said scaling factors to be generated on-the-fly during execution of said rewritten query.
-
25. The computer-readable medium of claim 20 wherein:
-
said one-to-many lossless join is performed by combining child-side rows with parent side rows; and
the steps further comprise the step of generating said scaling factors by counting how many parent side rows combine with each join value from said child-side rows.
-
-
26. The computer-readable medium of claim 20 wherein:
-
the summary column contains values generated by aggregating values contained in rows produced by a plurality of one-to-many lossless joins;
the plurality of one-to-many lossless joins are not in the common section; and
the steps further include generating results of said cumulative aggregate function in said query by dividing values from said summary column by values that are generated by multiplying together scaling factors that correspond to each one-to-many lossless join of said plurality of one-to-many lossless joins.
-
Specification