Dynamic partition selection
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for dynamic partition selection. One of the methods includes receiving a representation of a query plan generated for a query, wherein the query plan includes a dynamic scan operator that represents a first computing node obtaining tuples of one or more partitions of a table from storage and transferring the tuples to a second computing node that executes a parent operator of the dynamic scan operator. A partition selector operator is generated corresponding to the dynamic scan operator. A location in the query plan is determined for the partition selector operator. A modified query plan is generated having the partition selector operator at the determined location.
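The runtime relationship between the two operators named in the abstract can be illustrated with a minimal sketch. It assumes a table held in memory as a mapping from partition identifier to tuples, and a selector that filters partition identifiers with a caller-supplied predicate. The class and method names (`PartitionedTable`, `DynamicScan`, `PartitionSelector`, `receive_partition_ids`, `run`) are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class PartitionedTable:
    # Assumption: partition id -> tuples stored in that partition.
    partitions: Dict[int, List[tuple]]


@dataclass
class DynamicScan:
    table: PartitionedTable
    selected_ids: List[int] = field(default_factory=list)

    def receive_partition_ids(self, ids: List[int]) -> None:
        # The scan only learns which partitions to read at run time.
        self.selected_ids = list(ids)

    def tuples(self):
        # Read tuples from the selected partitions only.
        for pid in self.selected_ids:
            yield from self.table.partitions[pid]


@dataclass
class PartitionSelector:
    scan: DynamicScan

    def run(self, keep: Optional[Callable[[int], bool]] = None) -> List[int]:
        # Determine partition identifiers, then transfer them to the scan.
        ids = [pid for pid in self.scan.table.partitions
               if keep is None or keep(pid)]
        self.scan.receive_partition_ids(ids)
        return ids


table = PartitionedTable({1: [("a", 1)], 2: [("b", 2)], 3: [("c", 3)]})
scan = DynamicScan(table)
selector = PartitionSelector(scan)
selected = selector.run(lambda pid: pid >= 2)
rows = list(scan.tuples())
```

In this sketch the scan never touches partition 1, because the selector did not transfer its identifier.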
8 Citations
20 Claims
1. A system comprising one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to perform operations comprising:
receiving a representation of a query plan generated for a query, the query plan comprising a first plurality of operators that, when executed by one or more computing nodes, cause the one or more computing nodes to compute a result for the query, wherein the query plan includes a join operator and a dynamic scan operator, wherein the dynamic scan operator represents a first computing node obtaining tuples from a table that is partitioned into a plurality of partitions by a partitioning key and transferring the tuples to a second computing node that executes a parent operator of the dynamic scan operator, and the join operator represents a third computing node computing a join operation by comparing first tuples generated by an outer child operator of the join operator to second tuples generated by an inner child operator of the join operator to determine pairs of first tuples and second tuples that have matching attribute values;
generating a partition selector operator corresponding to the dynamic scan operator, wherein the partition selector operator represents a fourth computing node that executes the partition selector operator including determining one or more partition identifiers of partitions of the table and transferring the one or more partition identifiers to the dynamic scan operator of the first computing node;
determining a location in the query plan for the partition selector operator relative to the join operator, including:
determining that the dynamic scan operator is defined in a subtree of the outer child operator of the join operator; and
pushing the partition selector operator to the outer child operator of the join operator in response to determining that the dynamic scan operator is defined in a subtree of the outer child operator of the join operator; and
generating a modified query plan having the partition selector operator at the determined location, wherein the modified query plan includes a second plurality of operators that, when executed by one or more computing nodes, cause the one or more computing nodes to compute a result for the query.
Dependent claims: 2, 3, 4, 5, 6, 7
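The placement step recited in claim 1 can be sketched as a small tree rewrite. This is an illustration under stated assumptions, not the claimed implementation: the `Op` node type, the `scan_id` link between a selector and its scan, and the fallback of placing the selector directly above the current operator once no further join applies are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Op:
    kind: str                       # "Join", "DynamicScan", "Scan", "PartitionSelector"
    children: List["Op"] = field(default_factory=list)
    scan_id: Optional[int] = None   # assumption: ties a selector to its dynamic scan


def contains_scan(op: Op, scan_id: int) -> bool:
    # True if the dynamic scan with this id is defined in the subtree rooted at op.
    if op.kind == "DynamicScan" and op.scan_id == scan_id:
        return True
    return any(contains_scan(child, scan_id) for child in op.children)


def place_selector(op: Op, scan_id: int) -> Op:
    # Returns a modified copy of the plan; the input plan is not mutated.
    if op.kind == "Join":
        outer, inner = op.children
        if contains_scan(outer, scan_id):
            # Dynamic scan is in the outer subtree: push the selector to the outer child.
            return Op("Join", [place_selector(outer, scan_id), inner])
    # Assumed fallback: place the selector directly above the current operator.
    return Op("PartitionSelector", [op], scan_id)


plan = Op("Join", [Op("DynamicScan", scan_id=1), Op("Scan")])
modified = place_selector(plan, scan_id=1)
```

Here `modified` is a join whose outer child is the partition selector, which in turn sits above the dynamic scan, while the inner child is left as it was.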
8. A computer-implemented method comprising:
receiving a representation of a query plan generated for a query, the query plan comprising a first plurality of operators that, when executed by one or more computing nodes, cause the one or more computing nodes to compute a result for the query, wherein the query plan includes a join operator and a dynamic scan operator, wherein the dynamic scan operator represents a first computing node obtaining tuples from a table that is partitioned into a plurality of partitions by a partitioning key and transferring the tuples to a second computing node that executes a parent operator of the dynamic scan operator, and the join operator represents a third computing node computing a join operation by comparing first tuples generated by an outer child operator of the join operator to second tuples generated by an inner child operator of the join operator to determine pairs of first tuples and second tuples that have matching attribute values;
generating a partition selector operator corresponding to the dynamic scan operator, wherein the partition selector operator represents a third computing node that executes the partition selector operator including determining one or more partition identifiers of partitions of the table and transferring the one or more partition identifiers to the dynamic scan operator of the first computing node;
determining a location in the query plan for the partition selector operator relative to the join operator, including:
determining that the join operator includes a predicate expression on a partitioning key;
in response to determining that the join operator includes a predicate expression on a partitioning key, annotating the partition selector operator with the predicate expression from the join operator; and
pushing the partition selector operator to a subtree of an outer child operator of the join operator; and
generating a modified query plan having the annotated partition selector operator at the determined location, wherein the modified query plan includes a second plurality of operators that, when executed by one or more computing nodes, cause the one or more computing nodes to compute a result for the query.
Dependent claims: 9, 10, 11, 12, 13
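The annotation step recited in claim 8 can be sketched in the same spirit. The sketch assumes the join predicate is a simple equality stored as a pair of column names, and that a join whose predicate does not mention the partitioning key is left unchanged; both are assumptions made for the example, as are the names `Op` and `place_annotated_selector`.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Op:
    kind: str                                     # "Join", "Scan", "DynamicScan", "PartitionSelector"
    children: List["Op"] = field(default_factory=list)
    predicate: Optional[Tuple[str, str]] = None   # assumption: (left column, right column) equality


def place_annotated_selector(join: Op, part_key: str) -> Op:
    outer, inner = join.children
    # Step 1: does the join predicate reference the partitioning key?
    if join.predicate is None or part_key not in join.predicate:
        return join  # assumed behaviour: leave the plan unchanged
    # Step 2: annotate the selector with the join's predicate expression.
    selector = Op("PartitionSelector", [outer], predicate=join.predicate)
    # Step 3: push the annotated selector to the outer side of the join.
    return Op("Join", [selector, inner], predicate=join.predicate)


join = Op(
    "Join",
    [Op("Scan"), Op("DynamicScan")],
    predicate=("dates.id", "sales.date_id"),
)
modified = place_annotated_selector(join, part_key="sales.date_id")

unrelated = Op("Join", [Op("Scan"), Op("DynamicScan")], predicate=("a.x", "b.y"))
untouched = place_annotated_selector(unrelated, part_key="sales.date_id")
```

With the predicate attached, a selector on the outer side could in principle use the outer tuples' values to decide which partitions the dynamic scan needs; the sketch only shows the plan rewrite, not that evaluation.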
14. A computer program product, encoded on one or more non-transitory computer storage media, comprising instructions that when executed by one or more computers cause the one or more computers to perform operations comprising:
receiving a representation of a query plan generated for a query, the query plan comprising a first plurality of operators that, when executed by one or more computing nodes, cause the one or more computing nodes to compute a result for the query, wherein the query plan includes a join operator and a dynamic scan operator, wherein the dynamic scan operator represents a first computing node obtaining tuples from a table that is partitioned into a plurality of partitions by a partitioning key and transferring the tuples to a second computing node that executes a parent operator of the dynamic scan operator, and the join operator represents a third computing node computing a join operation by comparing first tuples generated by an outer child operator of the join operator to second tuples generated by an inner child operator of the join operator to determine pairs of first tuples and second tuples that have matching attribute values;
generating a partition selector operator corresponding to the dynamic scan operator, wherein the partition selector operator represents a fourth computing node that executes the partition selector operator including determining one or more partition identifiers of partitions of the table and transferring the one or more partition identifiers to the dynamic scan operator of the first computing node;
determining a location in the query plan for the partition selector operator relative to the join operator, including:
determining that the dynamic scan operator is defined in a subtree of the outer child operator of the join operator; and
pushing the partition selector operator to the outer child operator of the join operator in response to determining that the dynamic scan operator is defined in a subtree of the outer child operator of the join operator; and
generating a modified query plan having the partition selector operator at the determined location, wherein the modified query plan includes a second plurality of operators that, when executed by one or more computing nodes, cause the one or more computing nodes to compute a result for the query.
Dependent claims: 15, 16, 17, 18, 19, 20
Specification