Applying various hash methods used in conjunction with a query with a group by clause
First Claim
Patent Images
1. A computer-based method to apply a plurality of hash methods in a query with a Group By clause comprising:
- identifying a plurality of cells, each of said cells representing a combination of disparate attribute partitions, said disparate attribute partitions formed based on partitioning a set of attribute values of each attribute among a plurality of attributes of a database, and said disparate partitions belonging to different attributes of said database;
identifying a plurality of drawers, each of said drawers comprising a collection of cells from a single partition of a Group By column, and each of said drawers being defined for a specific query;
independently computing a separate hash table for each of said drawers; and
independently applying a hashing scheme picked from among a plurality of hashing schemes for each of said drawers, wherein, said hashing scheme is picked according to the number of distinct groups in said given drawer, said number of distinct groups estimated from a dictionary, said hashing scheme is picked from said plurality of hashing schemes as follows;
for single column Group Bys, using a hashing scheme that uses group codes as an index to said given drawer'"'"'s hash table;
for correlated Group Bys, using a pre-computed minimal perfect hash function, said pre-computed minimal perfect hash function having as many buckets as number of groups in said given drawer; and
otherwise, using a linear probing hashing scheme using multiplicative hashing as a hashing function,wherein when said linear probing hashing scheme is picked, said method comprising the step of filling a hash table associated with said give drawer up to a predetermined load factor.
1 Assignment
0 Petitions
Accused Products
Abstract
A novel method is described for applying various hash methods used in conjunction with a query with a Group By clause. A plurality of drawers are identified, wherein each of the drawers is made up of a collection of cells from a single partition of a Group By column and each of the drawers being defined for a specific query. A separate hash table is independently computed for each of the drawers and a hashing scheme (picked from among a plurality of hashing schemes) is independently applied for each of the drawers.
-
Citations
15 Claims
-
1. A computer-based method to apply a plurality of hash methods in a query with a Group By clause comprising:
-
identifying a plurality of cells, each of said cells representing a combination of disparate attribute partitions, said disparate attribute partitions formed based on partitioning a set of attribute values of each attribute among a plurality of attributes of a database, and said disparate partitions belonging to different attributes of said database; identifying a plurality of drawers, each of said drawers comprising a collection of cells from a single partition of a Group By column, and each of said drawers being defined for a specific query; independently computing a separate hash table for each of said drawers; and independently applying a hashing scheme picked from among a plurality of hashing schemes for each of said drawers, wherein, said hashing scheme is picked according to the number of distinct groups in said given drawer, said number of distinct groups estimated from a dictionary, said hashing scheme is picked from said plurality of hashing schemes as follows; for single column Group Bys, using a hashing scheme that uses group codes as an index to said given drawer'"'"'s hash table; for correlated Group Bys, using a pre-computed minimal perfect hash function, said pre-computed minimal perfect hash function having as many buckets as number of groups in said given drawer; and otherwise, using a linear probing hashing scheme using multiplicative hashing as a hashing function, wherein when said linear probing hashing scheme is picked, said method comprising the step of filling a hash table associated with said give drawer up to a predetermined load factor. - View Dependent Claims (2, 3, 4, 5)
-
-
6. An article of manufacture comprising a non-transitory computer usable storage medium having computer readable program code embodied therein which implements a computer-based method for applying a plurality of hash methods in query with a Group By clause, said computer usable medium comprising:
computer readable program code identifying a plurality of cells, each of said cells representing a combination of disparate attribute partitions, said disparate attribute partitions formed based on partitioning a set of attribute values of each attribute among a plurality of attributes of a database, and said disparate partitions belonging to different attributes of said database;
computer readable program code identifying a plurality of drawers, each of said drawers comprising a collection of cells from a single partition of a Group By column, and each of said drawers being defined for a specific query;
computer readable program code independently computing a separate hash table for each of said drawers; andcomputer readable program code independently applying a hashing scheme picked from among a plurality of hashing schemes for each of said drawers wherein, said hashing scheme is picked according to the number of distinct groups in said given drawer, said number of distinct groups estimated from a dictionary, said hashing scheme is picked from said plurality of hashing schemes as follows; for single column Group Bys, using a hashing scheme that uses group codes as an index to said given drawer'"'"'s hash table; for correlated Group Bys, using a pre-computed minimal perfect hash function, said pre-computed minimal perfect hash function having as many buckets as number of groups in said given drawer; and otherwise, using a linear probing hashing scheme using multiplicative hashing as a hashing function, wherein when said linear probing hashing scheme is picked, said method comprising the step of filling a hash table associated with said give drawer up to a predetermined load factor. - View Dependent Claims (7, 8, 9, 10)
-
11. A computer-based method to apply a plurality of hash methods in a query with a Group By clause comprising:
-
identifying a plurality of cells, each of said cells representing a combination of disparate attribute partitions, said disparate attribute partitions formed based on partitioning a set of attribute values of each attribute among a plurality of attributes of a database, and said disparate partitions belonging to different attributes of said database; identifying a plurality of drawers, each of said drawers comprising a collection of cells from a single partition of a Group By column, and each of said drawers being defined for a specific query; independently computing a separate hash table for each of said drawers; independently applying any of the following hashing schemes; for single column Group Bys, using a hashing scheme that uses group codes as an index to said given drawer'"'"'s hash table; for correlated Group Bys, using a pre-computed minimal perfect hash function, said pre-computed minimal perfect hash function having as many buckets as number of groups in said given drawer; and otherwise, using a linear probing hashing scheme using multiplicative hashing as a hashing function, wherein when said linear probing hashing scheme is picked, said method comprising the step of filling a hash table associated with said give drawer up to a predetermined load factor; and wherein, for a given drawer, a hashing scheme is picked according to the number of distinct groups in said given drawer, said number of distinct groups estimated from a dictionary. - View Dependent Claims (12, 13, 14, 15)
-
Specification