Query processing system that computes GROUPING SETS, ROLLUP, and CUBE with a reduced number of GROUP BYs in a query graph model

US 5,963,936 A
Filed: 06/30/1997
Issued: 10/05/1999
Est. Priority Date: 06/30/1997
Status: Expired due to Term

Abstract

Method and apparatus for detecting and stacking grouping sets to support GROUP BY operations with GROUPING SETS, ROLLUP and CUBE extensions in relational database management systems, with greatly reduced numbers of grouping sets. A first GROUP BY (element-list1) is input to a second GROUP BY (element-list2), resulting in the GROUP BY of the intersection of the two lists. This intersection property is then useable to reduce the number of GROUP BYs required to implement the grouping by GROUPING SETS, ROLLUPs, and CUBEs required for the online analytical processing of data contained in the database.
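The intersection property described in the abstract can be illustrated with a small sketch. It assumes, as an illustration only, that a GROUP BY pads every non-grouped column with NULL (here `None`) and that the aggregate is SUM. The column names `a`, `b`, `c`, the value column `v`, and the helpers `group_by` and `canonical` are hypothetical and do not come from the patent.

```python
from collections import defaultdict

COLUMNS = ("a", "b", "c")  # hypothetical grouping columns

def group_by(rows, keys):
    """Sum v per distinct value combination of `keys`; pad other columns with None."""
    totals = defaultdict(int)
    for row in rows:
        # Columns outside `keys` become None, standing in for SQL NULL (assumption).
        group = tuple(row[col] if col in keys else None for col in COLUMNS)
        totals[group] += row["v"]
    return [dict(zip(COLUMNS, group), v=total) for group, total in totals.items()]

def canonical(rows):
    """Order-independent view of a result, used only for comparison."""
    return {(tuple(row[col] for col in COLUMNS), row["v"]) for row in rows}

table = [
    {"a": "x", "b": "p", "c": "m", "v": 1},
    {"a": "x", "b": "q", "c": "m", "v": 2},
    {"a": "y", "b": "p", "c": "n", "v": 4},
    {"a": "x", "b": "p", "c": "n", "v": 8},
]

first = group_by(table, {"a", "b"})    # GROUP BY (a, b) over the base table
stacked = group_by(first, {"a", "c"})  # GROUP BY (a, c) over that result
direct = group_by(table, {"a"})        # GROUP BY of the intersection {a}

print(canonical(stacked) == canonical(direct))  # True
```

Because `c` is already `None` in every row produced by the first GROUP BY, grouping that result on `(a, c)` collapses to grouping on `a` alone, which is the intersection of the two lists.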

14 Claims

1. In a relational database management system utilizing a data processor for storing data in the form of at least one table comprised of tuples having one or more columns, wherein the data contained in the relational database management is retrievable by means of query language aggregation queries, a data processor implemented method for increasing the computational efficiency of the calculation of an aggregation query by reducing the number of GROUP BY operations required to specify the aggregation query, the method comprising:
- receiving an aggregation query that includes two or more element lists;
  
  forming the intersection of a first element list of the aggregation query and a second element list of the aggregation query by inputting a first GROUP BY of the first element list to a second GROUP BY of the second element list; and
  
  wherein forming includes stacking a first GROUP BY operation with respect to a second GROUP BY operation to represent results of the GROUP BY of the first element list, the second GROUP BY of the second element list, and at least a third GROUP BY of a third element list produced by the stacking of the first and second GROUP BY operations.
- Dependent claims (2, 3, 4, 5):
  - 2. The method of claim 1, wherein stacking represents a GROUP BY clause with multiple GROUPING SETS in the aggregation query.
  - 3. The method of claim 1, wherein stacking represents a GROUP BY clause with at least one ROLLUP in the aggregation query.
  - 4. The method of claim 1, wherein stacking represents a GROUP BY clause with at least two concatenated ROLLUPs in the aggregation query.
  - 5. The method of claim 1, wherein stacking represents a GROUP BY clause with a CUBE in the aggregation query.
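As a rough illustration of stacking for a single ROLLUP, the sketch below computes every level of a rollup by feeding each GROUP BY result into the next, coarser GROUP BY instead of rescanning the base table. It is a minimal sketch under assumptions: the aggregate is SUM, rows are plain tuples, and the names `regroup`, `rollup_by_stacking`, and `sales` are hypothetical, not taken from the patent.

```python
from collections import defaultdict

def regroup(partials, positions):
    """Re-aggregate {key_tuple: sum} keeping only the given key positions."""
    out = defaultdict(int)
    for key, total in partials.items():
        out[tuple(key[i] for i in positions)] += total
    return dict(out)

def rollup_by_stacking(rows, n_keys):
    """rows are (k1, ..., kn, value) tuples; returns {prefix_length: {key: sum}}."""
    base = defaultdict(int)
    for *key, value in rows:
        base[tuple(key)] += value
    levels = {n_keys: dict(base)}
    for length in range(n_keys - 1, -1, -1):
        # Each level reads the level directly above it, not the base table.
        levels[length] = regroup(levels[length + 1], range(length))
    return levels

# Hypothetical data: (region, city, amount)
sales = [("east", "nyc", 10), ("east", "bos", 5), ("west", "sf", 7), ("east", "nyc", 3)]
levels = rollup_by_stacking(sales, 2)

print(levels[1])  # {('east',): 18, ('west',): 7}
print(levels[0])  # {(): 25}
```

The base table is scanned once; the `(region)` totals and the grand total are derived from already aggregated rows.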

6. In a relational database management system utilizing a data processor for storing data in the form of at least one table comprised of tuples having one or more columns, wherein the data contained in the relational database management is retrievable by means of query language aggregation queries, a data processor implemented method for increasing the computational efficiency of the calculation of an aggregation query by reducing the number of GROUP BY operations required to specify the aggregation query, the method comprising:
- receiving an aggregation query that includes two or more element lists, wherein the aggregation query includes a non-holistic aggregation operation; and
  
  forming the intersection of a first element list of the aggregation query and a second element list of the aggregation query by inputting a first GROUP BY of the first element list to a second GROUP BY of the second element list.
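The restriction to non-holistic aggregates in claim 6 can be motivated with a short sketch. It assumes the usual OLAP classification, which the claim itself does not spell out: an aggregate such as AVG can be rebuilt from fixed-size partial results (a sum and a count per group), whereas a holistic aggregate such as MEDIAN cannot. The data and the helper `avg_from_partials` are hypothetical.

```python
from statistics import median

def avg_from_partials(partials):
    """Combine per-group (sum, count) pairs into one overall average."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count

# Hypothetical finer-grained groups produced by a first GROUP BY
groups = {"g1": [1, 2, 3], "g2": [10, 20]}
partials = [(sum(vals), len(vals)) for vals in groups.values()]
all_values = [v for vals in groups.values() for v in vals]

# AVG survives stacking: partial results are enough to get the exact answer.
stacked_avg = avg_from_partials(partials)
direct_avg = sum(all_values) / len(all_values)

# MEDIAN does not: the median of per-group medians differs from the true median.
median_of_medians = median([median(vals) for vals in groups.values()])
true_median = median(all_values)

print(stacked_avg == direct_avg)        # True
print(median_of_medians, true_median)   # 8.5 3
```

A second GROUP BY stacked on the first only sees the first one's output rows, so it can give exact answers only for aggregates that can be recombined this way.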

7. In a relational database management system utilizing a data processor for storing data in the form of at least one table comprised of tuples having one or more columns, wherein the data contained in the relational database management is retrievable by means of SQL aggregation queries, a data processor implemented method for producing a model of an SQL query having a GROUP BY clause limited by a set of grouping-sets, in which nodes of the model represent query operations and nodes are identified by labels, each label representing a GROUP BY operation that results from the application of a GROUP BY operation to all inputs of a represented node, the method comprising:
- placing a base table T in the model;
  
  defining a variable that represents the GROUP BY of a maximal grouping set in the set of grouping sets;
  
  finding a first node in the model representing an operation that is a superset of the GROUP BY operation represented by the variable;
  
  adding the variable to the model with the first node as its input;
  
  removing the maximal set from the set of grouping sets; and
  
  for each node in the model other than the first node;
  
  determining if the label of the first node with the operation of a second node produces any remaining GROUP BYs in the set of grouping sets and only remaining GROUP BYs in the set of grouping sets;
  
  responsive to determining that the label of the first node with the operation of the second node produces remaining GROUP BYs in the set of grouping sets and only remaining GROUP BYs in the set of grouping sets, adding a data flow arc from the first node to the second node, the data flow arc defining an intersection;
  
  adding new GROUP BYs produced by the intersection to the label of the second node; and
  
  removing new GROUP BYs produced by the intersection from the set of grouping sets.
- Dependent claims (8, 9):
  - 8. The method of claim 7, further including iteratively performing all steps following placement of the base table, until the set of grouping sets is empty.
  - 9. The method of claim 7, wherein the model comprises a graph.
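The sketch below covers only part of claim 7: repeatedly taking a maximal remaining grouping set and attaching it beneath an existing node whose grouping columns are a superset, as iterated in claim 8. It is a deliberately simplified illustration, not the claimed method. The data flow arcs, node labels, and intersection bookkeeping are not modeled, and preferring the smallest superset is an assumption made here. The function name `build_parent_map` and the marker `"T"` for the base table are hypothetical.

```python
def build_parent_map(grouping_sets):
    """Map each grouping set to the node that feeds it ('T' is the base table)."""
    remaining = {frozenset(g) for g in grouping_sets}
    parents = {}

    def size_then_name(g):
        return (len(g), sorted(g))

    while remaining:
        # A largest remaining set is necessarily maximal among those remaining.
        current = max(remaining, key=size_then_name)
        # Candidate inputs: nodes already in the model that contain `current`.
        supersets = [node for node in parents if current < node]
        # Assumption: prefer the smallest superset; otherwise read the base table.
        parents[current] = min(supersets, key=size_then_name, default="T")
        remaining.remove(current)
    return parents

# Hypothetical query: GROUPING SETS ((a, b, c), (a, b), (a), (c))
parents = build_parent_map([("a", "b", "c"), ("a", "b"), ("a",), ("c",)])
for node, source in parents.items():
    print(sorted(node), "<-", source if source == "T" else sorted(source))
```

In this example only `(a, b, c)` reads the base table; `(a, b)` and `(c)` read from `(a, b, c)`, and `(a)` reads from `(a, b)`.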

10. In a relational database management system utilizing a data processor for storing data in the form of at least one table comprised of tuples and having one or more columns, wherein the data contained in the relational database management system is retrievable by means of SQL aggregation queries, a data processor implemented method for increasing the computational efficiency of the calculation of an aggregation query including a concatenation of ROLLUP operations by reducing the number of GROUP BY operations required to specify the concatenated ROLLUP operations, the method comprising:
- receiving an aggregation query including a plurality p of concatenated ROLLUP operations, where the ith ROLLUP operation is specified by a list of elements n_i;
  
  representing the p concatenated ROLLUP operations with a stack of GROUP BY operations in which the number of GROUP BY operations is less than ((n_1+1) × (n_2+1) × ... × (n_p+1)); and
  
  executing the concatenated ROLLUP operations according to the stack of GROUP BY operations.
- Dependent claims (11, 12, 13, 14):
  - 11. The method of claim 10, wherein representing includes:
    - constructing a base GROUP BY operation including all elements from each of the ROLLUP operations;
      
      constructing a stack of GROUP BY operations for one ROLLUP operation of the p ROLLUP operations by, for each prefix in a list of elements for the one ROLLUP operation, taking a GROUP BY operation of the prefix and all elements of the remaining ROLLUP operations, excluding the one ROLLUP operation;
      
      creating a base group by inputting the base GROUP BY operation into the stack of GROUP BY operations for the one ROLLUP operation;
      
      creating a second group by constructing a stack of GROUP BY operations for a second ROLLUP operation by, for each prefix in a list of elements for the second ROLLUP operation, taking a GROUP BY operation of the prefix and all elements of the remaining ROLLUP operations, excluding the second ROLLUP operation;
      
      inputting the results of the base group into the second group; and
      
      unioning the results of the base group with the results of the second group.
  - 12. The method of claim 11, wherein the cardinality of the one ROLLUP operation is the highest cardinality of all the ROLLUP operations.
  - 13. The method of claim 12, wherein the cardinality of the second ROLLUP operation is the next highest cardinality of all ROLLUP operations.
  - 14. The method of claim 13, wherein the aggregation query includes a CUBE operation, the concatenation of ROLLUP operations representing the CUBE operation, the method further including:
    - creating a third group by constructing a stack of GROUP BY operations for a third ROLLUP operation by, for each prefix in a list of elements for the third ROLLUP operation, taking a GROUP BY operation of the prefix and all elements of the remaining ROLLUP operations, excluding the third ROLLUP operation;
      
      inputting the results of the base group and the second group into the third group; and
      
      unioning the results of the base group, the second group, and the third group.
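The counting in claim 10 and the stack construction in claim 11 can be checked on a small example. The sketch assumes that the ROLLUP lists share no columns and treats each GROUP BY as a set of column names. It verifies only the set identity behind the construction (intersecting one entry from each stack reproduces every grouping set of the concatenated ROLLUPs), not the execution plan or the patent's own count of operations. The helper names are hypothetical.

```python
from itertools import product

def prefixes(elements):
    """All prefixes of a ROLLUP list, longest first, ending with the empty one."""
    return [tuple(elements[:k]) for k in range(len(elements), -1, -1)]

def expand(rollups):
    """Grouping sets denoted by concatenated ROLLUPs: one prefix from each list."""
    return {frozenset(col for part in combo for col in part)
            for combo in product(*(prefixes(r) for r in rollups))}

def stack_for(rollups, i):
    """GROUP BY lists for ROLLUP i: each prefix plus all elements of the others."""
    others = frozenset(col for j, r in enumerate(rollups) if j != i for col in r)
    return [frozenset(p) | others for p in prefixes(rollups[i])]

def stacked_result(rollups):
    """Intersect one entry from each stack, in every combination."""
    stacks = [stack_for(rollups, i) for i in range(len(rollups))]
    return {frozenset.intersection(*combo) for combo in product(*stacks)}

# Hypothetical query: GROUP BY ROLLUP(a, b), ROLLUP(c, d)
rollups = [("a", "b"), ("c", "d")]
full = expand(rollups)
stack_sizes = [len(stack_for(rollups, i)) for i in range(len(rollups))]

print(len(full))                        # 9, i.e. (2+1) x (2+1)
print(sum(stack_sizes))                 # 6 GROUP BY lists across the two stacks
print(stacked_result(rollups) == full)  # True

# Three single-column ROLLUPs expand to the same grouping sets as CUBE(a, b, c).
cube = [("a",), ("b",), ("c",)]
print(len(expand(cube)), stacked_result(cube) == expand(cube))  # 8 True
```

In this sketch the number of stacked GROUP BY lists grows with the sum of the (n_i+1) terms, while the number of grouping sets they represent grows with the product.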

Current Assignee: International Business Machines Corporation
Original Assignee: International Business Machines Corporation
Inventors: Jou, Michelle Meichiou; Pirahesh, Mir Hamid; Lapis, George; Cochrane, Roberta Jo
Primary Examiner(s): Fetting, Anton
Assistant Examiner(s): Corrielus, Jean M.
Application Number: US08/885,485
Time in Patent Office: 827 Days
Field of Search: 707/5, 707/2, 707/3
US Class Current: 1/1
CPC Class Codes:
G06F 16/24537   of operators
Y10S 707/99932   Access augmentation or opti...
Y10S 707/99933   Query processing, i.e. sear...
Y10S 707/99935   Query augmenting and refini...
