Star/join query optimization
First Claim
1. A method for planning a database query, comprising:
- receiving a star/join query joining a fact table with multiple dimension tables, said fact table including a plurality of rows each row identified by a row identification code (RID);
forming a first query plan for access of each dimension table to identify resultant rows of each dimension table;
forming a second query plan for access of the fact table, comprising the steps of;
for the resultant rows of each dimension table, planning a fact table index scan over join columns of that dimension table to produce a set of resultant RIDs, said fact table having indices for join columns with each of the dimension tables;
applying staged comparative RID filtering to each dimension table'"'"'s set of resultant RIDs to produce a set of filtered RIDs; and
planning a fetch of all fact table rows corresponding to the set of filtered RIDs to produce a distilled fact table; and
forming a final query plan comprising planning execution of the query using the distilled fact table.
1 Assignment
0 Petitions
Accused Products
Abstract
Unwieldy star/join queries are performed more efficiently using a filtered fact table. Suitable queries include star/join queries with a large fact table joined with multiple subsidiary dimension tables, where indices exist over fact table join columns. The query is analyzed to prepare a query plan for the dimension table accesses. This plan is supplemented by adding nested loop join operations, where the inner table is a dimension table plan and the outer table is an index scan performed over a fact table index of the join column with the dimension table. The plan is also supplemented by filtering records resulting from the nested loop joins using a sequence of dynamic bit vectors, ultimately yielding a list of probable fact table records. The plan is further supplemented by fetching these records to construct a distilled fact which is used, instead of the large original table, to execute the query in considerably less time. If desired, the supplemented query plan and other competing approaches may studied to provide cost estimates, with the least costly approach being actually implemented.
120 Citations
75 Claims
-
1. A method for planning a database query, comprising:
-
receiving a star/join query joining a fact table with multiple dimension tables, said fact table including a plurality of rows each row identified by a row identification code (RID); forming a first query plan for access of each dimension table to identify resultant rows of each dimension table; forming a second query plan for access of the fact table, comprising the steps of; for the resultant rows of each dimension table, planning a fact table index scan over join columns of that dimension table to produce a set of resultant RIDs, said fact table having indices for join columns with each of the dimension tables; applying staged comparative RID filtering to each dimension table'"'"'s set of resultant RIDs to produce a set of filtered RIDs; and planning a fetch of all fact table rows corresponding to the set of filtered RIDs to produce a distilled fact table; and forming a final query plan comprising planning execution of the query using the distilled fact table. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
-
-
16. A method for performing a database query, comprising:
-
receiving a star/join query joining a fact table with multiple dimension tables, said fact table including a plurality of rows each row identified by a row identification code (RID); performing access of each dimension table according to the star/join query to identify resultant rows of each dimension table; and for the resultant rows of each dimension table, performing a fact table index scan over join columns of that dimension table to produce a set of resultant RIDs, said fact table having indices for join columns with each of the dimension tables; applying staged comparative RID filtering to each dimension table'"'"'s set of resultant RIDs to produce a set of filtered RIDs; fetching all fact table rows corresponding to the set of filtered RIDs to produce a distilled fact table; and executing the star/join query using the distilled fact table. - View Dependent Claims (17, 18, 19, 20, 21, 22, 23, 24, 25)
-
-
26. A signal-bearing medium tangibly embodying a program of machine-readable instructions executable by a digital processing apparatus to perform a method for planning a database query, said method comprising:
-
receiving a star/join query joining a fact table with multiple dimension tables, said fact table including a plurality of rows each row identified by a row identification code (RID); forming a first query plan for access of each dimension table to identify resultant rows of each dimension table; forming a second query plan for access of the fact table, comprising the steps of; for the resultant rows of each dimension table, planning a fact table index scan over join columns of that dimension table to produce a set of resultant RIDs, said fact table having indices for join columns with each of the dimension tables; applying staged comparative RID filtering to each dimension table'"'"'s set of resultant RIDs to produce a set of filtered RIDs; and planning a fetch of all fact table rows corresponding to the set of filtered RIDs to produce a distilled fact table; and forming a final query plan comprising planning execution of the query using the distilled fact table. - View Dependent Claims (27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40)
-
-
41. A signal-bearing medium tangibly embodying a program of machine-readable instructions executable by a digital processing apparatus to perform a database query, said method comprising:
-
receiving a star/join query joining a fact table with multiple dimension tables, said fact table including a plurality of rows each row identified by a row identification code (RID); performing access of each dimension table according to the star/join query to identify resultant rows of each dimension table; and for the resultant rows of each dimension table, performing a fact table index scan over join columns of that dimension table to produce a set of resultant RIDs, said fact table having indices for join columns with each of the dimension tables; applying staged comparative RID filtering to each dimension table'"'"'s set of resultant RIDs to produce a set of filtered RIDs; fetching all fact table rows corresponding to the set of filtered RIDs to produce a distilled fact table; and executing the star/join query using the distilled fact table. - View Dependent Claims (42, 43, 44, 45, 46, 47, 48, 49, 50)
-
-
51. A digital data processing apparatus, comprising:
-
a storage unit containing a program of machine-readable instructions; and a processing unit coupled to the storage unit and being programmed to perform steps to evaluate a query by executing the machine-readable instructions, said steps comprising; receiving a star/join query joining a fact table with multiple dimension tables, said fact table including a plurality of rows each row identified by a row identification code (RID); forming a first query plan for access of each dimension table to identify resultant rows of each dimension table; forming a second query plan for access of the fact table, comprising the steps of; for the resultant rows of each dimension table, planning a fact table index scan over join columns of that dimension table to produce a set of resultant RIDs, said fact table having indices for join columns with each of the dimension tables; applying staged comparative RID filtering to each dimension table'"'"'s set of resultant RIDs to produce a set of filtered RIDs; and planning a fetch of all fact table rows corresponding to the set of filtered RIDs to produce a distilled fact table; and forming a final query plan comprising planning execution of the query using the distilled fact table. - View Dependent Claims (52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65)
-
-
66. A digital data processing apparatus, comprising:
-
a storage unit containing a program of machine-readable instructions; and a processing unit coupled to the storage unit and being programed to perform a database a query by executing the machine-readable instructions, said steps comprising; receiving a star/join query joining a fact table with multiple dimension tables, said fact table including a plurality of rows each row identified by a row identification code (RID); performing access of each dimension table according to the star/join query to identify resultant rows of each dimension table; and for the resultant rows of each dimension table, performing a fact table index scan over join columns of that dimension table to produce a set of resultant RIDs, said fact table having indices for join columns with each of the dimension tables; applying staged comparative RID filtering to each dimension table'"'"'s set of resultant RIDs to produce a set of filtered RIDs; fetching all fact table rows corresponding to the set of filtered RIDs to produce a distilled fact table; and executing the star/join query using the distilled fact table. - View Dependent Claims (67, 68, 69, 70, 71, 72, 73, 74, 75)
-
Specification