System, method, and computer-readable medium for optimizing processing of queries featuring maximum or minimum equality conditions in a parallel processing system
Abstract
A system, method, and computer-readable medium for optimized processing of queries that feature maximum or minimum equality conditions are provided. A table on which the query is applied is scanned a single time. Rows of the table distributed to respective processing modules are scanned by the processing modules. Each processing module maintains identification of any rows distributed to the respective processing module that have attribute values that equal the maximum or minimum attribute value locally identified by the processing module. Subsequently, a global aggregation mechanism is invoked to compute the query result without requiring an additional rescan of the table. Further, the disclosed mechanisms may be extended to compute top N queries featuring maximum or minimum equality conditions.
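The single-scan flow summarized in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the query is assumed to have the shape `SELECT * FROM t WHERE a = (SELECT MAX(a) FROM t)`, the processing modules are modeled as plain lists of rows, and the names `scan_partition` and `max_equality_query` are hypothetical. Resetting the spool when a strictly larger value is seen is also an assumption; the text here only states that each module tracks the rows matching its local maximum.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def scan_partition(rows: List[Row], attr: str) -> Tuple[float, List[Row]]:
    """Scan one module's rows once; return (local maximum, result spool)."""
    local_max = float("-inf")  # assumed sentinel: below any storable value
    spool: List[Row] = []
    for row in rows:
        value = row[attr]
        if value > local_max:
            # Strictly larger value: previously spooled rows no longer qualify
            # (assumed behavior, see lead-in).
            local_max = value
            spool = [row]
        elif value == local_max:
            spool.append(row)
    return local_max, spool


def max_equality_query(partitions: List[List[Row]], attr: str) -> List[Row]:
    """Combine the per-module scans without rescanning the table."""
    local_results = [scan_partition(rows, attr) for rows in partitions]
    # Central node: global maximum over the transmitted local values.
    global_max = max((m for m, _ in local_results), default=float("-inf"))
    result: List[Row] = []
    for local_max, spool in local_results:
        if local_max == global_max:
            result.extend(spool)  # spool is maintained for the final result
        # otherwise the module's spool is simply discarded
    return result


partitions = [
    [{"id": 1, "price": 10}, {"id": 2, "price": 30}],
    [{"id": 3, "price": 30}, {"id": 4, "price": 5}],
    [{"id": 5, "price": 20}],
]
matching_ids = [row["id"] for row in max_equality_query(partitions, "price")]
```

Each row is read exactly once in `scan_partition`; the only data exchanged afterwards is one value per module, which is what lets the final result be assembled from the spools without a second pass.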
22 Claims
1. A method of optimizing processing of a query specifying a maximum equality condition on an attribute of a table in a parallel processing system, comprising:
receiving, by a processing module of a plurality of processing modules deployed in the parallel processing system, the query, wherein the processing module has a subset of rows of the table allocated thereto;
initializing, by the processing module, a variable to maintain a local value of the attribute encountered by the processing module to a value that is less than a value that may be stored by the attribute;
initializing, by the processing module, a result spool to maintain at least one row of the table having an attribute value that at least equals the variable;
iteratively evaluating attribute values of the rows allocated to the processing module;
determining to add a first row to the result spool responsive to determining the attribute value of the first row equals or exceeds the local value of the variable;
completing evaluation of all rows allocated to the processing module;
transmitting the local value of the variable to a central node;
receiving, from the central node, a global maximum attribute value;
comparing the global maximum value to the local value;
determining the global maximum attribute value exceeds the local value; and
deleting all rows from the result spool.
Dependent claims: 2, 3, 4, 5
6. A method of optimizing processing of a query specifying a maximum equality condition on an attribute of a table in a parallel processing system, comprising:
receiving, by a processing module of a plurality of processing modules deployed in the parallel processing system, the query, wherein the processing module has a subset of rows of the table allocated thereto;
initializing, by the processing module, a variable to maintain a local value of the attribute encountered by the processing module to a value that is less than a value that may be stored by the attribute;
initializing, by the processing module, a result spool to maintain at least one row of the table having an attribute value that at least equals the variable;
iteratively evaluating attribute values of the rows allocated to the processing module;
determining to add a first row to the result spool responsive to determining the attribute value of the first row equals or exceeds the local value of the variable;
completing evaluation of all rows allocated to the processing module;
transmitting the local value of the variable to a central node;
receiving, from the central node, a global maximum attribute value;
comparing the global maximum value to the local value;
determining the global maximum attribute value equals the local value; and
maintaining rows of the result spool for final results of processing of the query.
7. A method of optimizing processing of a query specifying a minimum equality condition on an attribute of a table in a parallel processing system, comprising:
receiving, by a processing module of a plurality of processing modules deployed in the parallel processing system, the query, wherein the processing module has a subset of rows of the table allocated thereto;
initializing, by the processing module, a variable to maintain a local value of the attribute encountered by the processing module to a maximum value that may be stored by the attribute;
initializing, by the processing module, a result spool to maintain at least one row of the table having an attribute value that is less than or equals the local value;
iteratively evaluating attribute values of the rows allocated to the processing module;
determining to add a first row to the result spool responsive to determining the attribute value of the first row is less than or equals the local value;
completing evaluation of all rows allocated to the processing module;
transmitting the local value of the variable to a central node;
receiving, from the central node, a global minimum attribute value;
comparing the global minimum value to the local value;
determining the global minimum attribute value is less than the local value; and
deleting all rows from the result spool.
Dependent claims: 8, 9, 10
11. A method of optimizing processing of a query specifying a minimum equality condition on an attribute of a table in a parallel processing system, comprising:
receiving, by a processing module of a plurality of processing modules deployed in the parallel processing system, the query, wherein the processing module has a subset of rows of the table allocated thereto;
initializing, by the processing module, a variable to maintain a local value of the attribute encountered by the processing module to a maximum value that may be stored by the attribute;
initializing, by the processing module, a result spool to maintain at least one row of the table having an attribute value that is less than or equals the local value;
iteratively evaluating attribute values of the rows allocated to the processing module;
determining to add a first row to the result spool responsive to determining the attribute value of the first row is less than or equals the local value;
completing evaluation of all rows allocated to the processing module;
transmitting the local value of the variable to a central node;
receiving, from the central node, a global minimum attribute value;
comparing the global minimum value to the local value;
determining the global minimum attribute value is less than the local value; and
deleting all rows from the result spool.
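The minimum-condition method claims above (7 and 11) start the local variable at the largest value the attribute can store and add rows on a "less than or equals" test. The sketch below illustrates why the "equals" half matters: a row that actually holds the largest storable value is still spooled. It assumes a 32-bit signed integer attribute; `INT32_MAX`, `scan_partition_min`, and `reconcile_min` are hypothetical names, and clearing the spool when a strictly smaller value appears is an assumption not spelled out in the claims shown here.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]

# Assumption: the attribute is a 32-bit signed integer column.
INT32_MAX = 2**31 - 1


def scan_partition_min(rows: List[Row], attr: str) -> Tuple[int, List[Row]]:
    """Scan one module's rows once; return (local minimum, result spool)."""
    local_min = INT32_MAX  # largest value the attribute may store
    spool: List[Row] = []
    for row in rows:
        value = row[attr]
        if value < local_min:
            # Strictly smaller value: restart the spool (assumed behavior).
            local_min = value
            spool = [row]
        elif value == local_min:
            # Also covers rows equal to the initial sentinel value.
            spool.append(row)
    return local_min, spool


def reconcile_min(local_min: int, spool: List[Row], global_min: int) -> List[Row]:
    """Keep the spool only if this module holds the global minimum."""
    return spool if local_min == global_min else []


module_a = [{"id": 1, "qty": INT32_MAX}]
module_b = [{"id": 2, "qty": 7}, {"id": 3, "qty": 7}, {"id": 4, "qty": 9}]

min_a, spool_a = scan_partition_min(module_a, "qty")
min_b, spool_b = scan_partition_min(module_b, "qty")
global_min = min(min_a, min_b)  # computed by the central node

kept_a = reconcile_min(min_a, spool_a, global_min)
kept_b = reconcile_min(min_b, spool_b, global_min)
```

Here `module_a` spools its only row even though the value equals the sentinel, then drops it once the central node reports a smaller global minimum; `module_b` keeps both rows with `qty` 7.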
12. A non-transitory computer-readable medium having computer-executable instructions for execution by a processing system, the computer-executable instructions for optimizing processing of a query specifying a maximum equality condition on an attribute of a table in a parallel processing system, the computer-executable instructions, when executed, cause the processing system to:
receive, by a processing module of a plurality of processing modules deployed in the parallel processing system, the query, wherein the processing module has a subset of rows of the table allocated thereto;
initialize, by the processing module, a variable to maintain a local value of the attribute encountered by the processing module, wherein the variable is initialized to a value that is less than a minimum value that may be stored by the attribute;
initialize, by the processing module, a result spool to maintain at least one row of the table having an attribute value that at least equals the variable;
iteratively evaluate attribute values of the rows allocated to the processing module;
determine to add a first row to the result spool responsive to determining the attribute value of the first row equals or exceeds a value of the variable;
complete evaluation of all rows allocated to the processing module;
transmit the local value of the variable to a central node;
receive, from the central node, a global maximum attribute value;
compare the global maximum value to the local value;
determine the global maximum attribute value exceeds the local value; and
delete all rows from the result spool.
Dependent claims: 13, 14, 15, 16
17. A non-transitory computer-readable medium having computer-executable instructions for execution by a processing system, the computer-executable instructions for optimizing processing of a query specifying a maximum equality condition on an attribute of a table in a parallel processing system, the computer-executable instructions, when executed, cause the processing system to:
receive, by a processing module of a plurality of processing modules deployed in the parallel processing system, the query, wherein the processing module has a subset of rows of the table allocated thereto;
initialize, by the processing module, a variable to maintain a local value of the attribute encountered by the processing module, wherein the variable is initialized to a value that is less than a minimum value that may be stored by the attribute;
initialize, by the processing module, a result spool to maintain at least one row of the table having an attribute value that at least equals the variable;
iteratively evaluate attribute values of the rows allocated to the processing module;
determine to add a first row to the result spool responsive to determining the attribute value of the first row equals or exceeds a value of the variable;
complete evaluation of all rows allocated to the processing module;
transmit the local value of the variable to a central node;
receive, from the central node, a global maximum attribute value;
compare the global maximum value to the local value;
determine the global maximum attribute value equals the local value; and
maintain rows of the result spool for final results of processing of the query.
18. A non-transitory computer-readable medium having computer-executable instructions for execution by a processing system, the computer-executable instructions for optimizing processing of a query specifying a minimum equality condition on an attribute of a table in a parallel processing system, the computer-executable instructions, when executed, cause the processing system to:
receive, by a processing module of a plurality of processing modules deployed in the parallel processing system, the query, wherein the processing module has a subset of rows of the table allocated thereto;
initialize, by the processing module, a variable to maintain a local value of the attribute encountered by the processing module, wherein the variable is initialized to a maximum value that may be stored by the attribute;
initialize, by the processing module, a result spool to maintain at least one row of the table having an attribute value that is less than or equals the local value;
iteratively evaluate attribute values of the rows allocated to the processing module;
determine to add a first row to the result spool responsive to determining the attribute value of the first row is less than or equals the local value;
complete evaluation of all rows allocated to the processing module;
transmit the local value of the variable to a central node;
receive, from the central node, a global minimum attribute value;
compare the global minimum value to the local value;
determine the global minimum attribute value is less than the local value; and
delete all rows from the result spool.
Dependent claims: 19, 20, 21
22. A non-transitory computer-readable medium having computer-executable instructions for execution by a processing system, the computer-executable instructions for optimizing processing of a query specifying a minimum equality condition on an attribute of a table in a parallel processing system, the computer-executable instructions, when executed, cause the processing system to:
receive, by a processing module of a plurality of processing modules deployed in the parallel processing system, the query, wherein the processing module has a subset of rows of the table allocated thereto;
initialize, by the processing module, a variable to maintain a local value of the attribute encountered by the processing module, wherein the variable is initialized to a maximum value that may be stored by the attribute;
initialize, by the processing module, a result spool to maintain at least one row of the table having an attribute value that is less than or equals the local value;
iteratively evaluate attribute values of the rows allocated to the processing module;
determine to add a first row to the result spool responsive to determining the attribute value of the first row is less than or equals the local value;
complete evaluation of all rows allocated to the processing module;
transmit the local value of the variable to a central node;
receive, from the central node, a global minimum attribute value;
compare the global minimum value to the local value;
determine the global minimum attribute value equals the local value; and
maintain rows of the result spool for final results of processing of the query.
Specification