Optimizing parallel queries using interesting distributions
First Claim
1. A computer system, the computer system comprising:
one or more hardware processors;
system memory coupled to the one or more hardware processors, the system memory storing instructions that are executable by the one or more hardware processors; and
the one or more hardware processors executing the instructions stored in the system memory to optimize a query, including the following:
access a query plan search space for a query of a distributed database, the query plan search space including a hierarchical structure of a root group of logical operators, one or more intermediate groups of logical operators, and one or more leaf groups of logical operators; and
formulate an annotated query plan search space for the query, including:
identify a distribution property for a child group of at least one other group, the at least one other group selected from among: the root group and the one or more intermediate groups, the distribution property indicating type of distribution relevant to the child group, the distribution property identifying a column that data for a parent group is distributed on, the parent group being above the child group in the hierarchical structure; and
annotate the child group with the type of distribution by attaching an indication of the identified column to the child group to propagate the identified type of distribution to the child group for use in query plan pruning.
Abstract
The present invention extends to methods, systems, and computer program products for optimizing parallel queries using interesting distributions. For each logical operator in a SQL server MEMO, in a top-down manner from a root operator to the leaf operators, interesting distributions for the operators can be identified based on the properties of the operators. Identified interesting distributions can be propagated down to lower operators by annotating the lower operators with the interesting distributions. Thus, a SQL server MEMO can be annotated with interesting distributions propagated top-down from root to leaf logical operators to generate an annotated SQL server MEMO. Parallel query plans can then be generated from the annotated SQL server MEMO in a bottom-up manner from leaf operators to a root operator. Annotated interesting properties can be used to prune operators, thereby facilitating a more tractable search space for a parallel query plan.
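The two passes described in the abstract can be illustrated with a small Python sketch. This is not the patented implementation or any SQL Server API; the `Group` class, its `columns`, `keys`, and `interesting` fields, and the `annotate` and `prune` functions are all hypothetical names chosen for illustration. The sketch assumes each group holds a single operator, that a group's interesting distributions are the columns its ancestors would like data distributed on (restricted to columns the group actually produces), and that pruning keeps the cheapest candidate per interesting distribution plus the cheapest of the remaining candidates.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """Simplified MEMO group (hypothetical): one logical operator per group."""
    name: str
    columns: frozenset = frozenset()   # columns this group outputs
    keys: frozenset = frozenset()      # columns the operator joins/groups on
    children: list = field(default_factory=list)
    interesting: set = field(default_factory=set)  # filled by annotate()


def annotate(group, inherited=frozenset()):
    """Top-down pass: attach interesting distribution columns to each group."""
    # Only columns the group actually produces can be distribution columns.
    group.interesting |= inherited & group.columns
    # Children may benefit from this group's own keys and from what it inherited.
    passed_down = group.interesting | group.keys
    for child in group.children:
        annotate(child, passed_down)


def prune(group, candidates):
    """Bottom-up helper: thin out (distribution_column, cost) candidates.

    Keeps the cheapest candidate for each interesting distribution and the
    cheapest candidate among all other distributions.
    """
    best = {}
    for dist, cost in candidates:
        bucket = dist if dist in group.interesting else None
        if bucket not in best or cost < best[bucket][1]:
            best[bucket] = (dist, cost)
    return sorted(best.values(), key=lambda plan: plan[1])


# Example: GROUP BY region over orders JOIN customers ON cust_id.
scan_orders = Group("Scan(orders)", columns=frozenset({"o_id", "cust_id"}))
scan_customers = Group("Scan(customers)", columns=frozenset({"cust_id", "region"}))
join = Group(
    "Join(cust_id)",
    columns=frozenset({"o_id", "cust_id", "region"}),
    keys=frozenset({"cust_id"}),
    children=[scan_orders, scan_customers],
)
root = Group(
    "GroupBy(region)",
    columns=frozenset({"region"}),
    keys=frozenset({"region"}),
    children=[join],
)

annotate(root)
# Made-up costs for four ways of producing the orders scan.
kept = prune(scan_orders, [("o_id", 10), ("cust_id", 12), (None, 9), ("o_id", 8)])
```

In this example the orders scan is annotated with `cust_id` only, because `region` is not one of its columns. The plan distributed on `cust_id` survives pruning even though it is not the cheapest, since a parent join on `cust_id` could use it without moving data.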
18 Claims
1. A computer system as recited under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method for use at a computer system, the computer system including one or more processors and system memory, the method for optimizing a query, the method comprising:
accessing a query plan search space for a query of a distributed database, the query plan search space including a hierarchical structure of a root group of logical operators, one or more intermediate groups of logical operators, and one or more leaf groups of logical operators; and
formulating an annotated query plan search space for the query by:
identifying a distribution property for a child group of at least one other group, the at least one other group selected from among: the root group and the one or more intermediate groups, the distribution property indicating type of distribution relevant to the child group, the distribution property identifying a column that data for a parent group is distributed on, the parent group being above the child group in the hierarchical structure; and
annotating the child group with the type of distribution by attaching an indication of the identified column to the child group to propagate the identified type of distribution to the child group for use in query plan pruning.
Dependent claims: 12, 13, 14, 15, 16, 17, 18
Specification