Data-Aware Scalable Parallel Execution of Rollup Operations
First Claim
1. A method comprising:
- monitoring a plurality of numbers of distinct values (NDVs) for a plurality of candidate distribution keys, wherein each candidate distribution key of said plurality of candidate distribution keys comprises one or more group-by keys of an ordered list of group-by columns specified by a database statement that requests performing one or more rollup operations relating to the ordered list of group-by keys;
- selecting a distribution key from said plurality of candidate distribution keys based at least in part on results of monitoring the plurality of NDVs;
- distributing a set of rows, based at least in part on the selected distribution key, between first parallel executing processes and second parallel executing processes, wherein the one or more rollup operations are performed by the first parallel executing processes and the second parallel executing processes against the set of rows;
- wherein the method is performed by one or more computing devices.
Abstract
According to one aspect of the invention, for a database statement that specifies rollup operations, a data distribution key may be selected from among a plurality of candidate keys. Numbers of distinct values of the candidate keys may be monitored with respect to a particular set of rows. Hash values may also be generated from column values in the candidate keys. The data distribution key may be determined based on results of monitoring the numbers of distinct values of the candidate keys as well as the frequencies of hash values computed from column values of the candidate keys. Rollup operations may be shared between different stages of parallel executing processes, and data may be distributed between the different stages of parallel executing processes based on the selected data distribution key.
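The monitoring and selection steps described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names (`candidate_keys`, `monitor`, `select_key`), the use of group-by prefixes as the candidates, the CRC32 bucket hash, and in particular the selection rule (shortest candidate with at least as many distinct values as the degree of parallelism `dop` and no hash bucket above `max_share` of the rows). The abstract says only that NDVs and hash-value frequencies inform the choice, not how.

```python
import zlib
from collections import Counter


def candidate_keys(group_by_cols):
    """Assumption: each non-empty prefix of the ordered group-by list is a candidate."""
    return [tuple(group_by_cols[:i]) for i in range(1, len(group_by_cols) + 1)]


def monitor(rows, candidates, buckets=8):
    """Track, per candidate key, the distinct values seen and hash-bucket frequencies."""
    distinct = {key: set() for key in candidates}
    freq = {key: Counter() for key in candidates}
    for row in rows:
        for key in candidates:
            values = tuple(row[c] for c in key)
            distinct[key].add(values)
            # CRC32 of the repr gives a hash that is stable across runs.
            freq[key][zlib.crc32(repr(values).encode()) % buckets] += 1
    ndv = {key: len(vals) for key, vals in distinct.items()}
    return ndv, freq


def select_key(ndv, freq, total_rows, dop, max_share=0.5):
    """Hypothetical rule: shortest key with NDV >= dop and no dominant hash bucket."""
    for key in sorted(ndv, key=len):
        heaviest = max(freq[key].values(), default=0)
        if ndv[key] >= dop and heaviest <= max_share * total_rows:
            return key
    return max(ndv, key=len)  # fall back to the full group-by list


if __name__ == "__main__":
    rows = [
        {"year": y, "quarter": q, "month": m}
        for y in (2023, 2024) for q in range(1, 5) for m in range(1, 4)
    ]
    cands = candidate_keys(["year", "quarter", "month"])
    ndv, freq = monitor(rows, cands)
    print(ndv)
    print(select_key(ndv, freq, len(rows), dop=4, max_share=1.0))
```

With only two distinct years, `("year",)` cannot keep four processes busy, so the sketch moves on to the next longer candidate. A real system would apply its own thresholds and skew tests.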
15 Citations
20 Claims
1. A method comprising:
- monitoring a plurality of numbers of distinct values (NDVs) for a plurality of candidate distribution keys, wherein each candidate distribution key of said plurality of candidate distribution keys comprises one or more group-by keys of an ordered list of group-by columns specified by a database statement that requests performing one or more rollup operations relating to the ordered list of group-by keys;
- selecting a distribution key from said plurality of candidate distribution keys based at least in part on results of monitoring the plurality of NDVs;
- distributing a set of rows, based at least in part on the selected distribution key, between first parallel executing processes and second parallel executing processes, wherein the one or more rollup operations are performed by the first parallel executing processes and the second parallel executing processes against the set of rows;
- wherein the method is performed by one or more computing devices.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
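The distributing step of claim 1 can be sketched in Python as follows. This is a hedged illustration, not the claimed implementation: the helper names (`route`, `distribute`, `rollup`), the CRC32 routing hash, the `sales` measure, and the choice to sum a single measure are all assumptions. The sketch relies on one property of hash distribution: rows that share the same distribution-key values always reach the same second-stage process, so rollup levels at least as fine as the key are complete within one process, while coarser levels are partial sums that must be merged.

```python
import zlib
from collections import defaultdict


def route(row, key, n_consumers):
    """Pick a second-stage process by hashing the row's distribution-key values."""
    values = repr(tuple(row[c] for c in key)).encode()
    return zlib.crc32(values) % n_consumers


def distribute(rows, key, n_consumers):
    """Split rows into one bucket per (hypothetical) second-stage process."""
    buckets = [[] for _ in range(n_consumers)]
    for row in rows:
        buckets[route(row, key, n_consumers)].append(row)
    return buckets


def rollup(rows, group_by_cols, measure):
    """Sum `measure` for every prefix of group_by_cols, including the grand total ()."""
    totals = defaultdict(int)
    for row in rows:
        for level in range(len(group_by_cols) + 1):
            prefix = tuple(row[c] for c in group_by_cols[:level])
            totals[prefix] += row[measure]
    return dict(totals)


def merge(partials):
    """Combine per-process results; only levels coarser than the key need real merging."""
    merged = defaultdict(int)
    for part in partials:
        for group, value in part.items():
            merged[group] += value
    return dict(merged)


if __name__ == "__main__":
    cols = ["year", "quarter", "month"]
    rows = [
        {"year": y, "quarter": q, "month": m, "sales": m}
        for y in (2023, 2024) for q in range(1, 5) for m in range(1, 4)
    ]
    buckets = distribute(rows, ("year", "quarter"), n_consumers=3)
    partials = [rollup(b, cols, "sales") for b in buckets]
    print(merge(partials) == rollup(rows, cols, "sales"))
```

In this sketch every process computes every level for simplicity. How the rollup levels are actually divided between the first and second parallel executing processes is not stated in this section.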
11. One or more non-transitory computer-readable media storing instructions which, when executed by one or more processors, cause performance of a method for evaluating reporting window functions, the method comprising:
- monitoring a plurality of numbers of distinct values (NDVs) for a plurality of candidate distribution keys, wherein each candidate distribution key of said plurality of candidate distribution keys comprises one or more group-by keys of an ordered list of group-by columns specified by a database statement that requests performing one or more rollup operations relating to the ordered list of group-by keys;
- selecting a distribution key from said plurality of candidate distribution keys based at least in part on results of monitoring the plurality of NDVs;
- distributing a set of rows, based at least in part on the selected distribution key, between first parallel executing processes and second parallel executing processes, wherein the one or more rollup operations are performed by the first parallel executing processes and the second parallel executing processes against the set of rows.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
Specification