Use of dynamic dictionary encoding with an associated hash table to support many-to-many joins and aggregations
First Claim
1. A method comprising:
- receiving a query at a database server executing on a computing device;
wherein the query specifies;
a grouping operation that groups rows from a second table based on values from a grouped-by column of a first table;
a join operation that joins rows from the second table based on values from a joined-by column of the first table;
creating, within storage of the computing device, a dense grouping key value-to-join key (DGK-JK) value mapping data structure that comprises a first dense grouping key (DGK) column;
wherein the first DGK column includes a first plurality of dense grouping key (DGK) values and one or more instances of a flag value;
wherein each entry in the DGK-JK value mapping data structure includes either;
one of the first plurality of DGK values at a location within the DGK-JK value mapping data structure, wherein the location corresponds to a particular join key value from a plurality of join key values from the joined-by column of the first table, oran instance of the flag value at the location within the DGK-JK value mapping data structure, wherein the location corresponds to the particular join key value from the plurality of join key values from the joined-by column of the first table;
creating, within the storage of the computing device, a dense grouping key value-to-grouping key (DGK-GK) value mapping data structure that comprises a grouping key column and a second dense grouping key (DGK) column;
wherein the grouping key column includes a plurality of group-by key values from the grouped-by column of the first table;
wherein the second DGK column includes a second plurality of DGK values;
wherein each entry in the DGK-GK value mapping data structure includes one of the plurality of group-by key values, and a corresponding one of the second plurality of DGK values;
creating, within the storage of the computing device, a multiple dense grouping key values-to-join key (multi-DGK-JK) value mapping data structure that is a hash table that includes a hash bucket for each join key value that corresponds to a respective instance of the flag value;
wherein applying a hash function to a given join key value that corresponds to a given instance of the flag value indicates a location of the hash bucket that corresponds to the given join key value;
within the hash bucket for the given join key value, storing an indication of a given plurality of DGK values with which the given join key value is associated;
using the multi-DGK-JK mapping structure, in the grouping operation, as a join-key value hash-based index to the given plurality of DGK values;
determining a result set for the query by performing the grouping operation and the join operation using the DGK-JK value mapping data structure and the multi-DGK-JK value mapping data structure, the DGK-JK value mapping data structure and the multi-DGK-JK value mapping data structure matching the plurality of join key values from the joined-by column of the first table with the second plurality of DGK values;
wherein the method is performed by one or more computing devices.
1 Assignment
0 Petitions
Accused Products
Abstract
Techniques are described herein for using a dynamic dictionary encoding with an associated hash table to support many-to-many join and aggregation operations. In an embodiment, within a first storage of a computing device, a first data structure that comprises a first dense grouping key column is created. The dense grouping key column includes a first plurality of dense grouping key values and one or more instances of a flag value. Within the first storage of the computing device, a second data structure is created that comprises a group-by column and a second dense grouping key column. The group-by column includes a plurality of group-by key values and the second dense grouping key column includes a second plurality of dense grouping key values. Within the first storage of the computing device, a third data structure, a hash table, is created that includes a hash bucket for each join key value that corresponds to an instance of the flag value. A result set for a query is determined using a combination of the first data structure, second data structure and/or the third data structure.
-
Citations
16 Claims
-
1. A method comprising:
-
receiving a query at a database server executing on a computing device; wherein the query specifies; a grouping operation that groups rows from a second table based on values from a grouped-by column of a first table; a join operation that joins rows from the second table based on values from a joined-by column of the first table; creating, within storage of the computing device, a dense grouping key value-to-join key (DGK-JK) value mapping data structure that comprises a first dense grouping key (DGK) column; wherein the first DGK column includes a first plurality of dense grouping key (DGK) values and one or more instances of a flag value; wherein each entry in the DGK-JK value mapping data structure includes either; one of the first plurality of DGK values at a location within the DGK-JK value mapping data structure, wherein the location corresponds to a particular join key value from a plurality of join key values from the joined-by column of the first table, or an instance of the flag value at the location within the DGK-JK value mapping data structure, wherein the location corresponds to the particular join key value from the plurality of join key values from the joined-by column of the first table; creating, within the storage of the computing device, a dense grouping key value-to-grouping key (DGK-GK) value mapping data structure that comprises a grouping key column and a second dense grouping key (DGK) column; wherein the grouping key column includes a plurality of group-by key values from the grouped-by column of the first table; wherein the second DGK column includes a second plurality of DGK values; wherein each entry in the DGK-GK value mapping data structure includes one of the plurality of group-by key values, and a corresponding one of the second plurality of DGK values; creating, within the storage of the computing device, a multiple dense grouping key values-to-join key (multi-DGK-JK) value mapping data structure that is a hash table that includes a hash bucket for each join key value that corresponds to a respective instance of the flag value; wherein applying a hash function to a given join key value that corresponds to a given instance of the flag value indicates a location of the hash bucket that corresponds to the given join key value; within the hash bucket for the given join key value, storing an indication of a given plurality of DGK values with which the given join key value is associated; using the multi-DGK-JK mapping structure, in the grouping operation, as a join-key value hash-based index to the given plurality of DGK values; determining a result set for the query by performing the grouping operation and the join operation using the DGK-JK value mapping data structure and the multi-DGK-JK value mapping data structure, the DGK-JK value mapping data structure and the multi-DGK-JK value mapping data structure matching the plurality of join key values from the joined-by column of the first table with the second plurality of DGK values; wherein the method is performed by one or more computing devices. - View Dependent Claims (2, 3, 4, 5, 6)
-
-
7. A method comprising:
-
creating, within storage of a computing device, a first data structure that comprises a first dense grouping key column; wherein the first dense grouping key column includes a first plurality of dense grouping key values and one or more instances of a flag value; wherein each entry in the first data structure includes either; one of the first plurality of dense grouping key values at a location within the first data structure, wherein the location corresponds to a particular join key value from a plurality of join key values from a joined-by column of a first table, or an instance of the flag value at the location within the first data structure, wherein the location corresponds to the particular join key value from plurality of join key values from the joined-by column of the first table; creating, within the storage of the computing device, a second data structure that comprises a grouping key column and a second dense grouping key column; wherein the grouping key column includes a plurality of group-by key values from a grouped-by column of the first table; wherein the second dense grouping key column includes a second plurality of dense grouping key values; wherein each entry in the second data structure includes one of the plurality of group-by key values, and a corresponding one of the second plurality of dense grouping key values; creating, within the storage of the computing device, a third data structure that is a hash table that includes a hash bucket for each join key value that corresponds to a respective instance of the flag value; wherein applying a hash function to a given join key value that corresponds to a given instance of the flag value indicates a location of the hash bucket that corresponds to the given join key value; within the hash bucket for the given join key value, storing an indication of a given plurality of dense grouping key values with which the given join key value is associated; determining whether a particular dense grouping key value is stored at the location within the first data structure that corresponds to the particular join key value; in response to determining that the particular dense grouping key value is stored at the location within the first data structure that corresponds to the particular join key value; modifying the entry at the location within the first data structure that corresponds to the particular join key value to a particular instance of the flag value; wherein the method is performed by one or more computing devices.
-
-
8. A method comprising:
-
receiving a query at a database server executing on one or more computing devices; wherein the query specifies; a grouping operation that groups rows from a second table based on values from a grouped-by column of a first table; a join operation that joins rows from the second table based on values from a joined-by column of the first table; creating, within storage of a computing device, a first data structure that comprises a first dense grouping key column; wherein the first dense grouping key column includes a first plurality of dense grouping key values and one or more instances of a flag value; wherein each entry in the first data structure includes either; one of the first plurality of dense grouping key values at a location within the first data structure, wherein the location corresponds to a particular join key value from a plurality of join key values from the joined-by column of the first table, or an instance of the flag value at the location within the first data structure, wherein the location corresponds to the particular join key value from plurality of join key values from the joined-by column of the first table; creating, within the storage of the computing device, a second data structure that comprises a grouping key column and a second dense grouping key column; wherein the grouping key column includes a plurality of group-by key values from the grouped-by column of the first table; wherein the second dense grouping key column includes a second plurality of dense grouping key values; wherein each entry in the second data structure includes one of the plurality of group-by key values, and a corresponding one of the second plurality of dense grouping key values; creating, within the storage of the computing device, a third data structure that is a hash table that includes a hash bucket for each join key value that corresponds to a respective instance of the flag value; wherein applying a hash function to a given join key value that corresponds to a given instance of the flag value indicates a location of the hash bucket that corresponds to the given join key value; within the hash bucket for the given join key value, storing an indication of a given plurality of dense grouping key values with which the given join key value is associated; determining whether a particular entry within the first data structure includes a particular dense grouping key value from the first plurality of dense grouping key values or a particular instance of the flag value; in response to determining that the first data structure includes the particular dense grouping key value from the first plurality of dense grouping key values; determining, using the first data structure and the second data structure, a result set for the query; in response to determining that the first data structure includes the particular instance of the flag value; determining, using the third data structure and the second data structure, a result set for the query; wherein the method is performed by the one or more computing devices.
-
-
9. One or more non-transitory storage media storing one or more sequences of instructions which, when executed by one or more computing devices, cause:
-
receiving a query at a database server executing on a computing device; wherein the query specifies; a grouping operation that groups rows from a second table based on values from a grouped-by column of a first table; a join operation that joins rows from the second table based on values from a second column of the first table; creating, within storage of a computing device, a dense grouping key value-to-join key (DGK-JK) value mapping data structure that comprises a first dense grouping key (DGK) column; wherein the first DGK column includes a first plurality of dense grouping key (DGK) values and one or more instances of a flag value; wherein each entry in the DGK-JK value mapping data structure includes either; one of the first plurality of DGK values at a location within the DGK-JK value mapping data structure, wherein the location corresponds to a particular join key value from a plurality of join key values from the first column of the first table, or an instance of the flag value at the location within the DGK-JK value mapping data structure, wherein the location corresponds to the particular join key value from plurality of join key values from a first column of the first table; creating, within the storage of the computing device, a dense grouping key value-to-grouping key (DGK-GK) value mapping data structure that comprises a group-by column and a second dense grouping key (DGK) column; wherein the group-by column includes a plurality of group-by key values from a second column of the first table; wherein the second DGK column includes a second plurality of DGK values; wherein each entry in the DGK-GK value mapping data structure includes one of the plurality of group-by key values, and a corresponding one of the second plurality of DGK values; creating, within the storage of the computing device, a multiple dense grouping key values-to-join key (multi-DGK-JK) value mapping data structure that is a hash table that includes a hash bucket for each join key value that corresponds to an instance of the flag value; wherein applying a hash function to a join key value that corresponds to an instance of the flag value indicates a location of the hash bucket that corresponds to the join key value; within the hash bucket for a given join key value, storing an indication of a plurality of DGK values with which the given join key value is associated; using the multi-DGK-JK mapping structure, in the grouping operation, as a join-key value hash-based index to the plurality of DGK values; determining a result set for the query by performing the grouping operation and the join operation using the DGK-JK value mapping data structure and the multi-DGK-JK value mapping data structure to match the plurality of join key values from the first column of the first table with the second plurality of DGK values. - View Dependent Claims (10, 11, 12, 13, 14)
-
-
15. One or more non-transitory storage media storing one or more sequences of instructions which, when executed by one or more computing devices, cause:
-
creating, within storage of a computing device, a first data structure that comprises a first dense grouping key column; wherein the first dense grouping key column includes a first plurality of dense grouping key values and one or more instances of a flag value; wherein each entry in the first data structure includes either; one of the first plurality of dense grouping key values at a location within the first data structure, wherein the location corresponds to a particular join key value from a plurality of join key values from a joined-by column of a first table, or an instance of the flag value at the location within the first data structure, wherein the location corresponds to the particular join key value from plurality of join key values from the joined-by column of the first table; creating, within the storage of the computing device, a second data structure that comprises a grouping key column and a second dense grouping key column; wherein the grouping key column includes a plurality of group-by key values from a grouped-by column of the first table; wherein the second dense grouping key column includes a second plurality of dense grouping key values; wherein each entry in the second data structure includes one of the plurality of group-by key values, and a corresponding one of the second plurality of dense grouping key values; creating, within the storage of the computing device, a third data structure that is a hash table that includes a hash bucket for each join key value that corresponds to a respective instance of the flag value; wherein applying a hash function to a given join key value that corresponds to a given instance of the flag value indicates a location of the hash bucket that corresponds to the given join key value; within the hash bucket for the given join key value, storing an indication of a given plurality of dense grouping key values with which the given join key value is associated; determining whether a particular dense grouping key value is stored at the location within the first data structure that corresponds to the particular join key value; in response to determining that the particular dense grouping key value is stored at the location within the first data structure that corresponds to the particular join key value; modifying the entry at the location within the first data structure that corresponds to the particular join key value to a particular instance of the flag value.
-
-
16. One or more non-transitory storage media storing one or more sequences of instructions which, when executed by one or more computing devices, cause:
-
receiving a query at a database server executing on a computing device; wherein the query specifies; a grouping operation that groups rows from a second table based on values from a grouped-by column of a first table; a join operation that joins rows from the second table based on values from a joined-by column of the first table; creating, within storage of the computing device, a first data structure that comprises a first dense grouping key column; wherein the first dense grouping key column includes a first plurality of dense grouping key values and one or more instances of a flag value; wherein each entry in the first data structure includes either; one of the first plurality of dense grouping key values at a location within the first data structure, wherein the location corresponds to a particular join key value from a plurality of join key values from the joined-by column of the first table, or an instance of the flag value at the location within the first data structure, wherein the location corresponds to the particular join key value from plurality of join key values from the joined-by column of the first table; creating, within the storage of the computing device, a second data structure that comprises a grouping key column and a second dense grouping key column; wherein the grouping key column includes a plurality of group-by key values from the grouped-by column of the first table; wherein the second dense grouping key column includes a second plurality of dense grouping key values; wherein each entry in the second data structure includes one of the plurality of group-by key values, and a corresponding one of the second plurality of dense grouping key values; creating, within the storage of the computing device, a third data structure that is a hash table that includes a hash bucket for each join key value that corresponds to a respective instance of the flag value; wherein applying a hash function to a given join key value that corresponds to a given instance of the flag value indicates a location of the hash bucket that corresponds to the given join key value; within the hash bucket for the given join key value, storing an indication of a given plurality of dense grouping key values with which the given join key value is associated; determining whether a particular entry within the first data structure includes a particular dense grouping key value from the first plurality of dense grouping key values or a particular instance of the flag value; in response to determining that the first data structure includes the particular dense grouping key value from the first plurality of dense grouping key values; determining, using the first data structure and the second data structure, a result set for the query; in response to determining that the first data structure includes the particular instance of the flag value; determining, using the third data structure and the second data structure, a result set for the query.
-
Specification