Query processing system that computes GROUPING SETS, ROLLUP, and CUBE with a reduced number of GROUP BYs in a query graph model
Abstract
Method and apparatus for detecting and stacking grouping sets to support GROUP BY operations with GROUPING SETS, ROLLUP and CUBE extensions in relational database management systems, with greatly reduced numbers of grouping sets. A first GROUP BY (element-list1) is input to a second GROUP BY (element-list2), resulting in the GROUP BY of the intersection of the two lists. This intersection property is then useable to reduce the number of GROUP BYs required to implement the grouping by GROUPING SETS, ROLLUPs, and CUBEs required for the online analytical processing of data contained in the database.
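The intersection property described in the abstract can be illustrated with a small sketch. It is not the patented implementation: the column names `a`, `b`, `c`, the `amount` measure, the helper `group_by_sum`, and the choice of SUM as the aggregate are all assumptions made for illustration, and columns that a GROUP BY does not group on are modeled as `None` (so the sketch assumes the data itself contains no NULLs).

```python
from collections import defaultdict

COLUMNS = ("a", "b", "c")  # hypothetical grouping columns


def group_by_sum(rows, element_list):
    """GROUP BY element_list with SUM(amount); other columns come out as None."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[col] if col in element_list else None for col in COLUMNS)
        totals[key] += row["amount"]
    return [dict(zip(COLUMNS, key), amount=total) for key, total in totals.items()]


def as_set(rows):
    """Order-independent view of a result, used only for comparison."""
    return {tuple(row[col] for col in COLUMNS) + (row["amount"],) for row in rows}


# Made-up sample data.
rows = [
    {"a": "x", "b": "p", "c": "m", "amount": 1},
    {"a": "x", "b": "q", "c": "m", "amount": 2},
    {"a": "y", "b": "p", "c": "n", "amount": 4},
    {"a": "y", "b": "p", "c": "m", "amount": 8},
]

first = group_by_sum(rows, {"a", "b"})      # GROUP BY (a, b)
stacked = group_by_sum(first, {"b", "c"})   # its output fed into GROUP BY (b, c)
direct = group_by_sum(rows, {"b"})          # GROUP BY (b), the intersection

print(as_set(stacked) == as_set(direct))
```

Because column `c` is already `None` in every row produced by the first GROUP BY, grouping that output on `(b, c)` distinguishes rows by `b` alone, which is why the stacked result matches the direct GROUP BY on the intersection.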
14 Claims
1. In a relational database management system utilizing a data processor for storing data in the form of at least one table comprised of tuples having one or more columns, wherein the data contained in the relational database management is retrievable by means of query language aggregation queries, a data processor implemented method for increasing the computational efficiency of the calculation of an aggregation query by reducing the number of GROUP BY operations required to specify the aggregation query, the method comprising:
receiving an aggregation query that includes two or more element lists;
forming the intersection of a first element list of the aggregation query and a second element list of the aggregation query by inputting a first GROUP BY of the first element list to a second GROUP BY of the second element list; and
wherein forming includes stacking a first GROUP BY operation with respect to a second GROUP BY operation to represent results of the GROUP BY of the first element list, the second GROUP BY of the second element list, and at least a third GROUP BY of a third element list produced by the stacking of the first and second GROUP BY operations.
Dependent claims: 2, 3, 4, 5
6. In a relational database management system utilizing a data processor for storing data in the form of at least one table comprised of tuples having one or more columns, wherein the data contained in the relational database management is retrievable by means of query language aggregation queries, a data processor implemented method for increasing the computational efficiency of the calculation of an aggregation query by reducing the number of GROUP BY operations required to specify the aggregation query, the method comprising:
receiving an aggregation query that includes two or more element lists, wherein the aggregation query includes a non-holistic aggregation operation; and
forming the intersection of a first element list of the aggregation query and a second element list of the aggregation query by inputting a first GROUP BY of the first element list to a second GROUP BY of the second element list.
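Claim 6 restricts the query to a non-holistic aggregation operation. The claim itself does not define the term; the sketch below assumes the common data-cube classification, in which SUM can be recomputed from partial results while MEDIAN (a holistic aggregate) cannot. The partition names and values are made up.

```python
from statistics import median

# Made-up partitions of one numeric column, as a first GROUP BY might produce.
groups = {"east": [1, 2, 9], "west": [3, 4]}
all_values = [v for values in groups.values() for v in values]

# SUM: aggregating the partial sums gives the same answer as one pass over all rows.
sum_of_partial_sums = sum(sum(values) for values in groups.values())

# MEDIAN: aggregating the partial medians does not, in general.
median_of_partial_medians = median(median(values) for values in groups.values())

print(sum_of_partial_sums, sum(all_values))
print(median_of_partial_medians, median(all_values))
```

Feeding one GROUP BY into another only gives correct results when the second can work from the first's partial aggregates, which is one plausible reason for the restriction in this claim.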
7. In a relational database management system utilizing a data processor for storing data in the form of at least one table comprised of tuples having one or more columns, wherein the data contained in the relational database management is retrievable by means of SQL aggregation queries, a data processor implemented method for producing a model of an SQL query having a GROUP BY clause limited by a set of grouping-sets, in which nodes of the model represent query operations and nodes are identified by labels, each label representing a GROUP BY operation that results from the application of a GROUP BY operation to all inputs of a represented node, the method comprising:
placing a base table T in the model;
defining a variable that represents the GROUP BY of a maximal grouping set in the set of grouping sets;
finding a first node in the model representing an operation that is a superset of the GROUP BY operation represented by the variable;
adding the variable to the model with the first node as its input;
removing the maximal set from the set of grouping sets; and
for each node in the model other than the first node:
determining if the label of the first node with the operation of a second node produces any remaining GROUP BYs in the set of grouping sets and only remaining GROUP BYs in the set of grouping sets;
responsive to determining that the label of the first node with the operation of the second node produces remaining GROUP BYs in the set of grouping sets and only remaining GROUP BYs in the set of grouping sets, adding a data flow arc from the first node to the second node, the data flow arc defining an intersection;
adding new GROUP BYs produced by the intersection to the label of the second node; and
removing new GROUP BYs produced by the intersection from the set of grouping sets.
Dependent claims: 8, 9
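The first five steps of claim 7 can be sketched as follows. This is a partial illustration only: it omits the labels, the data flow arcs, and the intersection test from the remaining steps, so every grouping set here gets its own node. The function name `build_group_by_tree`, the choice of the largest remaining set as the maximal one, and the choice of the smallest superset node as the input are all assumptions, not details taken from the claim.

```python
def build_group_by_tree(base_columns, grouping_sets):
    """Map each grouping set to the node it reads from; base table T is the root.

    Assumes every grouping set is a subset of base_columns.
    """
    base = frozenset(base_columns)
    remaining = {frozenset(g) for g in grouping_sets}
    inputs = {}        # node -> the node used as its input
    nodes = [base]     # step 1: place the base table T in the model
    while remaining:
        # Step 2: a largest remaining set is maximal; the tie-break is arbitrary.
        g = max(remaining, key=lambda s: (len(s), sorted(s)))
        # Step 3: find a node whose columns are a superset of g (smallest one here).
        source = min((n for n in nodes if n >= g), key=len)
        # Step 4: add the GROUP BY of g with that node as its input.
        inputs[g] = source
        nodes.append(g)
        # Step 5: remove the maximal set from the set of grouping sets.
        remaining.remove(g)
    return inputs


tree = build_group_by_tree(("a", "b", "c"), [("a", "b"), ("a",), ("b",), ()])
for node, source in tree.items():
    print(sorted(node), "<-", sorted(source))
```

In this sketch the GROUP BY on `(a, b)` reads from T, while `(a)`, `(b)`, and the empty grouping set read from already-aggregated nodes rather than from the base table.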
10. In a relational database management system utilizing a data processor for storing data in the form of at least one table comprised of tuples and having one or more columns, wherein the data contained in the relational database management system is retrievable by means of SQL aggregation queries, a data processor implemented method for increasing the computational efficiency of the calculation of an aggregation query including a concatenation of ROLLUP operations by reducing the number of GROUP BY operations required to specify the concatenated ROLLUP operations, the method comprising:
receiving an aggregation query including a plurality p of concatenated ROLLUP operations, where the i-th ROLLUP operation is specified by a list of elements n_i;
representing the p concatenated ROLLUP operations with a stack of GROUP BY operations in which the number of GROUP BY operations is less than (n_1 + 1) × (n_2 + 1) × ... × (n_p + 1); and
executing the concatenated ROLLUP operations according to the stack of GROUP BY operations.
Dependent claims: 11, 12, 13, 14
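The product in claim 10 matches the number of grouping sets that concatenated ROLLUPs denote under standard SQL semantics, reading n_i as the number of elements in the i-th list. The sketch below enumerates those grouping sets and checks the count; the helper names and the column names are hypothetical, and the element lists are assumed not to share columns.

```python
from itertools import product
from math import prod


def rollup(elements):
    """Grouping sets of ROLLUP(e1, ..., en): every prefix, down to the empty set."""
    return [tuple(elements[:k]) for k in range(len(elements), -1, -1)]


def concatenated_rollups(*element_lists):
    """Grouping sets of concatenated ROLLUPs: one prefix from each list, combined."""
    combos = product(*(rollup(elements) for elements in element_lists))
    return [sum(combo, ()) for combo in combos]


lists = [("a", "b"), ("c", "d")]   # ROLLUP(a, b), ROLLUP(c, d)
sets = concatenated_rollups(*lists)
bound = prod(len(elements) + 1 for elements in lists)

print(len(sets), bound)
for grouping_set in sets:
    print(grouping_set)
```

A plan that issues one GROUP BY per grouping set would need all 9 operations in this example; the claim covers representations that use fewer.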
Specification