Optimization of SQL queries using hash star join operations
First Claim
1. A method of optimizing a query in a computer having a memory, the query being performed by the computer to retrieve data from a relational database stored in one or more electronic storage devices coupled to the computer, the method comprising the steps of:
- (a) examining the query in the memory of the computer to determine whether the query includes a star join operation on a base table and a plurality of dimension tables; and
(b) performing, in the memory of the computer, a hash star join operation when the query includes the star join operation on the base table and the plurality of dimension tables, further comprising the steps of;
(1) hash partitioning the base table into a plurality of distinct and separate sub-portions thereof;
(2) selecting and projecting rows from all of the dimension tables to separate temporary tables;
(3) splitting the query into a plurality of sub-queries, wherein each of the sub-queries comprises a hash star join operation on the separate temporary tables and one or more of the distinct sub-portions of the base table;
(4) performing each of the sub-queries independently of one another to create intermediate result tables; and
(5) merging the intermediate result tables from each of the performed sub-queries to create an output table.
3 Assignments
0 Petitions
Accused Products
Abstract
A method, apparatus, and article of manufacture for optimizing SQL queries in a relational database management system uses hash star join operations. A hash star join operation is performed in place of the star join operation when the query includes the star join operation on a base table and the plurality of dimension tables. The base table is hash partitioned into a plurality of distinct and separate sub-portions thereof. Selecting and projecting rows from all of the dimension tables to separate temporary dimension tables. The query is split into a plurality of sub-queries, wherein each of the sub-queries comprises a hash star join operation on the temporary dimension tables and one or more of the distinct sub-portions of the base table. Each of the sub-queries are then performed independently of one another to create intermediate result tables. The intermediate result tables from each of the sub-queries are then merged together to create an output table.
172 Citations
21 Claims
-
1. A method of optimizing a query in a computer having a memory, the query being performed by the computer to retrieve data from a relational database stored in one or more electronic storage devices coupled to the computer, the method comprising the steps of:
-
(a) examining the query in the memory of the computer to determine whether the query includes a star join operation on a base table and a plurality of dimension tables; and (b) performing, in the memory of the computer, a hash star join operation when the query includes the star join operation on the base table and the plurality of dimension tables, further comprising the steps of; (1) hash partitioning the base table into a plurality of distinct and separate sub-portions thereof; (2) selecting and projecting rows from all of the dimension tables to separate temporary tables; (3) splitting the query into a plurality of sub-queries, wherein each of the sub-queries comprises a hash star join operation on the separate temporary tables and one or more of the distinct sub-portions of the base table; (4) performing each of the sub-queries independently of one another to create intermediate result tables; and (5) merging the intermediate result tables from each of the performed sub-queries to create an output table. - View Dependent Claims (2, 3, 4, 13, 14, 15)
-
-
5. An apparatus for optimizing a query, comprising:
-
(a) a computer having a memory and one or more electronic storage devices coupled thereto, the data storage devices storing a relational database; (b) means, performed by the computer, for accepting the SQL query into the memory of the computer, the query being performed by the computer to retrieve data from a relational database stored in the electronic storage devices; (c) means, performed by the computer, for examining the query in the memory of the computer to determine whether the query includes a star join operation on a base table and a plurality of dimension tables; and (d) means, performed by the computer, for performing a hash star join operation in the memory of the computer when the query includes the star join operation on the base table and the plurality of dimension tables, further comprising; (1) means for hash partitioning the base table into a plurality of distinct and separate sub-portions thereof; (2) means for selecting and projecting rows from all of the dimension tables to separate temporary tables; (3) means for splitting the query into a plurality of sub-queries, wherein each of the sub-queries comprises a hash star join operation on the separate temporary tables and one or more of the distinct sub-portions of the base table; (4) means for performing each of the sub-queries independently of one another to create intermediate result tables; and (5) means for merging the intermediate result tables from each of the performed sub-queries to create an output table. - View Dependent Claims (6, 7, 8, 16, 17, 18)
-
-
9. A program storage device, readable by a computer, tangibly embodying a program of instructions executable by the computer to perform method steps for executing a query in a computer having a memory, the query being performed by the computer to retrieve data from a relational database stored in one or more electronic storage devices coupled to the computer, the method comprising the steps of:
-
(a) examining the query in the memory of the computer to determine whether the query includes a star join operation on a base table and a plurality of dimension tables; and (b) performing, in the memory of the computer, a hash star join operation when the query includes the star join operation on the base table and the plurality of dimension tables, further comprising the steps of; (1) hash partitioning the base table into a plurality of distinct and separate sub-portions thereof; (2) selecting and projecting rows from all of the dimension tables to separate temporary tables; (3) splitting the query into a plurality of sub-queries, wherein each of the sub-queries comprises a hash star join operation on the separate temporary tables and one or more of the distinct sub-portions of the base table; (4) performing each of the sub-queries independently of one another to create intermediate result tables; and (5) merging the intermediate result tables from each of the performed sub-queries to create an output table. - View Dependent Claims (10, 11, 12, 19, 20, 21)
-
Specification