OPTIMIZING PROCESSING OF GROUP-BY QUERIES FEATURING MAXIMUM OR MINIMUM EQUALITY CONDITIONS IN A PARALLEL PROCESSING SYSTEM
Abstract
A system, method, and computer-readable medium are provided for optimized processing of queries that feature maximum or minimum equality conditions. The disclosed mechanisms require only a single scan of the table on which the group-by query is applied. As the table is scanned, each processing module dynamically keeps track of the row(s) whose value of the attribute on which the equality condition is applied equals or exceeds the maximum attribute value previously encountered by that processing module (assuming a maximum equality condition is applied). A global aggregation process is then performed to compute the query's result without rescanning the table. Queries featuring a minimum equality condition are processed similarly in accordance with the disclosed embodiments.
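The single-scan tracking described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `local_scan`, the dictionary-based rows, and the parameter names `group_attr`, `cond_attr`, and `use_max` are all assumptions introduced here for illustration.

```python
from typing import Any, Dict, Hashable, Iterable, List, Tuple

Row = Dict[str, Any]
# Maps a selected (group-by) attribute value to (best value seen, rows tied at it).
LocalTable = Dict[Hashable, Tuple[Any, List[Row]]]


def local_scan(rows: Iterable[Row], group_attr: str, cond_attr: str,
               use_max: bool = True) -> LocalTable:
    """Scan one module's rows once, keeping per group the rows at the running extreme."""
    table: LocalTable = {}
    for row in rows:
        key, value = row[group_attr], row[cond_attr]
        entry = table.get(key)
        if entry is None:
            # First row seen for this group: it is the extreme so far.
            table[key] = (value, [row])
            continue
        best, kept = entry
        better = value > best if use_max else value < best
        if better:
            # A new extreme replaces everything kept for this group.
            table[key] = (value, [row])
        elif value == best:
            # A tie with the current extreme is kept alongside earlier rows.
            kept.append(row)
    return table
```

Rows whose value falls short of the running extreme are simply skipped, which is why no second pass over the module's rows is needed in this sketch.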
23 Citations
20 Claims
1. A method of optimizing processing of a query specifying an equality condition on an attribute of a table in a parallel processing system, comprising:
receiving, by a processing module of a plurality of processing modules deployed in the parallel processing system, the query, wherein the processing module has a subset of rows of the table allocated thereto;
initializing, by the processing module, a hash table including a first field for a selected attribute of the query and at least one second field for the attribute on which the equality condition is applied and a row of the subset of rows;
identifying, by the processing module, each row of the subset of rows that satisfies the equality condition;
storing, by the processing module, the selected attribute of each row of the subset of rows identified as satisfying the equality condition in the first field of a respective row of the hash table, and the value of attribute on which the equality condition is applied and the row identified as satisfying the equality condition in the at least one second field of the respective row of the hash table;
redistributing each row of the hash table to a respective one of the plurality of processing modules based on a hash value of the selected attribute; and
receiving, by the processing module, a global value of each attribute of the table on which the equality condition is applied that respectively specifies a maximum value of the attribute on which the equality condition is applied for a corresponding selected attribute in the event the equality condition comprises a maximum equality condition and that respectively specifies a minimum value of the attribute on which the equality condition is applied for a corresponding selected attribute in the event the equality condition comprises a minimum equality condition.
Dependent claims: 2, 3, 4, 5, 6, 7
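The redistribution and global-value steps recited in claim 1 can be sketched roughly as follows. This is a single-process simulation under stated assumptions, not the claimed system: the names `target_module`, `redistribute`, and `global_aggregate` are hypothetical, CRC32 stands in for whatever hash function a real system would use, and each local hash table is modeled as a dictionary from the selected attribute to a (value, rows) pair.

```python
import zlib
from typing import Any, Dict, Hashable, List, Tuple

Entry = Tuple[Any, List[Any]]        # (local extreme value, rows tied at that value)
LocalTable = Dict[Hashable, Entry]   # selected attribute -> entry
Message = Tuple[Hashable, Any, List[Any]]


def target_module(key: Hashable, n_modules: int) -> int:
    """Pick the receiving module from a hash of the selected attribute (CRC32 is an assumption)."""
    return zlib.crc32(repr(key).encode("utf-8")) % n_modules


def redistribute(local_tables: List[LocalTable], n_modules: int) -> List[List[Message]]:
    """Send every hash-table row to the module that owns its selected attribute."""
    inboxes: List[List[Message]] = [[] for _ in range(n_modules)]
    for table in local_tables:
        for key, (value, rows) in table.items():
            inboxes[target_module(key, n_modules)].append((key, value, rows))
    return inboxes


def global_aggregate(inbox: List[Message], use_max: bool = True) -> Dict[Hashable, Entry]:
    """Merge the local extremes received by one module into global extremes per group."""
    result: Dict[Hashable, Entry] = {}
    for key, value, rows in inbox:
        if key not in result:
            result[key] = (value, list(rows))
            continue
        best, kept = result[key]
        better = value > best if use_max else value < best
        if better:
            result[key] = (value, list(rows))   # a new global extreme discards earlier rows
        elif value == best:
            kept.extend(rows)                   # ties across modules are all retained
    return result
```

Because every entry for a given selected attribute hashes to the same module, each group's global maximum or minimum can be decided by a single module without consulting the others.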
8. A computer-readable medium having computer-executable instructions for execution by a processing system, the computer-executable instructions for optimizing processing of a query specifying an equality condition on an attribute of a table in a parallel processing system, the computer-executable instructions, when executed, cause the processing system to:
receive, by a processing module of a plurality of processing modules deployed in the parallel processing system, the query, wherein the processing module has a subset of rows of the table allocated thereto;
initialize, by the processing module, a hash table including a first field for a selected attribute of the query and at least one second field for the attribute on which the equality condition is applied and a row of the subset of rows;
identify, by the processing module, each row of the subset of rows that satisfies the equality condition;
store, by the processing module, the selected attribute of each row of the subset of rows identified as satisfying the equality condition in the first field of a respective row of the hash table, and the value of attribute on which the equality condition is applied and the row identified as satisfying the equality condition in the at least one second field of the respective row of the hash table;
redistribute each row of the hash table to a respective one of the plurality of processing modules based on a hash value of the selected attribute; and
receive, by the processing module, a global value of each attribute of the table on which the equality condition is applied that respectively specifies a maximum value of the attribute on which the equality condition is applied for a corresponding selected attribute in the event the equality condition comprises a maximum equality condition and that respectively specifies a minimum value of the attribute on which the equality condition is applied for a corresponding selected attribute in the event the equality condition comprises a minimum equality condition.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A parallel processing system configured to optimize processing of a query specifying an equality condition on an attribute of a table in the parallel processing system, comprising:
at least one storage facility on which a database table is stored; and
a plurality of processing modules each allocated a respective subset of rows of the table, wherein a processing module of the plurality of processing modules receives the query, initializes a hash table including a first field for a selected attribute of the query and at least one second field for the attribute on which the equality condition is applied and a row of the subset of rows, identifies each row of the subset of rows that satisfies the equality condition, stores the selected attribute of each row of the subset of rows identified as satisfying the equality condition in the first field of a respective row of the hash table, the value of the attribute on which the equality condition is applied and the row identified as satisfying the equality condition in the at least one second field of the respective row of the hash table, redistributes each row of the hash table to a respective one of the plurality of processing modules based on a hash value of the selected attribute, and receives a global value of each attribute of the table on which the equality condition is applied that respectively specifies a maximum value of the attribute on which the equality condition is applied for a corresponding selected attribute in the event the equality condition comprises a maximum equality condition and that respectively specifies a minimum value of the attribute on which the equality condition is applied for a corresponding selected attribute in the event the equality condition comprises a minimum equality condition.
Dependent claims: 16, 17, 18, 19, 20
Specification