Split processing paths for a database calculation engine
Abstract
A dynamic split node defined within a calculation model can receive data being operated on by a calculation plan generated based on the calculation model. A partition specification can be applied to one or more reference columns in a table containing at least some of the received data. The applying can cause the table to be split such that a plurality of records in the table are partitioned according to the partition specification. A separate processing path can be set for each partition, and execution of the calculation plan can continue using the separate processing paths, each of which can be assigned to a processing node of a plurality of available processing nodes.
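The split-and-union flow summarized above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the names `split_by_distinct_values`, `run_split_paths`, and `total_per_region` are hypothetical, the sample data is invented for illustration, and a local thread pool stands in for the plurality of processing nodes controlled by a master node.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_by_distinct_values(records, reference_columns):
    """Place records sharing the same reference-column value in one partition.

    The number of partitions equals the number of distinct values found.
    """
    partitions = defaultdict(list)
    for record in records:
        key = tuple(record[col] for col in reference_columns)
        partitions[key].append(record)
    return dict(partitions)


def run_split_paths(records, reference_columns, process_partition, max_workers=4):
    """Run one processing path per partition, then union the results."""
    partitions = split_by_distinct_values(records, reference_columns)
    # Assumption: a thread pool stands in for the separate processing nodes.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        intermediate = list(pool.map(process_partition, partitions.values()))
    # Union step: concatenate the per-partition intermediate results.
    return [row for result in intermediate for row in result]


def total_per_region(partition):
    """Hypothetical per-partition operation: sum the amounts in one partition."""
    region = partition[0]["region"]
    return [{"region": region, "total": sum(r["amount"] for r in partition)}]


# Invented sample table with "region" as the single reference column.
sales = [
    {"region": "EU", "amount": 10},
    {"region": "US", "amount": 5},
    {"region": "EU", "amount": 7},
]
result = run_split_paths(sales, ["region"], total_per_region)
```

Here the table has two distinct values in the reference column, so two partitions and two processing paths are created. The unioned result could then feed a later operation or be returned as the final query result.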
10 Claims
1. A computer program product comprising a non-transitory machine-readable storage medium storing instructions that, when executed by at least one programmable processor, cause the at least one programmable processor to perform operations comprising:
receiving, at a dynamic split node defined within a calculation model of a database management system in a multi-node data partitioning landscape that comprises a master node and a plurality of processing nodes, data as an intermediate result of an earlier operation in a calculation plan or data from one or more tables upon which the calculation plan is currently operating in response to a query, wherein the plurality of processing nodes are being controlled by the master node, wherein the calculation plan is generated by the calculation model with sequence of execution operations tailored to the query;
applying, at the dynamic split node, a partitioning specification to one or more reference columns in a table that includes a plurality of records containing the received data;
splitting, at the dynamic split node, the table for handling by a plurality of processing paths according to the partitioning specification, wherein the partitioning specification comprises examining the one or more reference columns, quantifying a number of distinct values in the one or more reference columns, and splitting the table into a number of two or more partitions based on the quantified number of distinct values, wherein the number of two or more partitions for the table is equal to the quantified number of distinct values, wherein each distinct value has a corresponding partition, wherein the partitioning specification further comprises placing the plurality of records into respective partitions such that records having same distinct value are placed together in a partition;
setting, at the dynamic split node, a separate processing path of the plurality of processing paths for a respective partition of the table, wherein the separate processing path is assigned to a respective processing node of the plurality of processing nodes to generate one or more intermediate results used in a following execution operation of the calculation plan; and
continuing execution of the following execution operation of the calculation plan using the plurality of processing paths to union the generated one or more intermediate results from each of the plurality of processing nodes and generate a union result, wherein the union result is used as an intermediate result by additional execution operations of the calculation plan if additional operations are required by the calculation plan, and wherein the union result is returned as a final result for the query if no additional execution operations are required by the calculation plan.
Dependent claims: 2, 3
4. A system comprising:
at least one programmable processor; and
a machine-readable medium storing instructions that, when executed by the at least one processor, cause the at least one programmable processor to perform operations comprising:
receiving, at a dynamic split node defined within a calculation model of a database management system in a multi-node data partitioning landscape that comprises a master node and a plurality of processing nodes, data as an intermediate result of an earlier operation in a calculation plan or data from one or more tables upon which the calculation plan is currently operating in response to a query, wherein the plurality of processing nodes are being controlled by the master node, wherein the calculation plan is generated by the calculation model with sequence of execution operations tailored to the query;
applying, at the dynamic split node, a partitioning specification to one or more reference columns in a table that includes a plurality of records containing the received data;
splitting the table for handling by a plurality of processing paths according to the partitioning specification, wherein the partitioning specification comprises examining the one or more reference columns, quantifying a number of distinct values in the one or more reference columns, and splitting the table into a number of two or more partitions based on the quantified number of distinct values, wherein the number of two or more partitions for the table is equal to the quantified number of distinct values, wherein each distinct value has a corresponding partition, wherein the partitioning specification further comprises placing the plurality of records into respective partitions such that records having same distinct value are placed together in a partition;
setting, at the dynamic split node, a separate processing path of the plurality of processing paths for a respective partition of the table, wherein the separate processing path is assigned to a respective processing node of the plurality of processing nodes to generate one or more intermediate results used in a following execution operation of the calculation plan; and
continuing execution of the following execution operation of the calculation plan using the plurality of processing paths to union the generated one or more intermediate results from each of the plurality of processing nodes and generate a union result, wherein the union result is used as an intermediate result by additional execution operations of the calculation plan if additional operations are required by the calculation plan, and wherein the union result is returned as a final result for the query if no additional execution operations are required by the calculation plan.
Dependent claims: 5, 6
7. A computer-implemented method comprising:
receiving, at a dynamic split node defined within a calculation model of a database management system in a multi-node data partitioning landscape that comprises a master node and a plurality of processing nodes, data as an intermediate result of an earlier operation in a calculation plan or data from one or more tables upon which the calculation plan is currently operating in response to a query, wherein the plurality of processing nodes are being controlled by the master node, wherein the calculation plan is generated by the calculation model with sequence of execution operations tailored to the query;
applying, at the dynamic split node, a partitioning specification to one or more reference columns in a table that includes a plurality of records containing the received data;
splitting the table for handling by a plurality of processing paths according to the partitioning specification wherein the partitioning specification comprises examining the one or more reference columns, quantifying a number of distinct values in the one or more reference columns, and splitting the table into a number of two or more partitions based on the quantified number of distinct values, wherein the number of two or more partitions for the table is equal to the quantified number of distinct values, wherein each distinct value has a corresponding partition, wherein the partitioning specification further comprises placing the plurality of records into respective partitions such that records having same distinct value are placed together in a partition;
setting, at the dynamic split node, a separate processing path of the plurality of processing paths for a respective partition of the table, wherein the separate processing path is assigned to a respective processing node of the plurality of processing nodes to generate one or more intermediate results used in a following execution operation of the calculation plan; and
continuing execution of the following execution operation of the calculation plan using the plurality of processing paths to union the generated one or more intermediate results from each of the plurality of processing nodes and generate a union result, wherein the union result is used as an intermediate result by additional execution operations of the calculation plan if additional operations are required by the calculation plan, and wherein the union result is returned as a final result for the query if no additional execution operations are required by the calculation plan.
Dependent claims: 8, 9, 10
Specification