Split processing paths for a database calculation engine
Abstract
A dynamic split node defined within a calculation model can receive data being operated on by a calculation plan generated based on the calculation model. A partition specification can be applied to one or more reference columns in a table containing at least some of the received data. The applying can cause the table to be split such that a plurality of records in the table are partitioned according to the partition specification. A separate processing path can be set for each partition, and execution of the calculation plan can continue using the separate processing paths, each of which can be assigned to a processing node of a plurality of available processing nodes.
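As a minimal sketch of the behavior the abstract describes, the following Python groups records by the values in one or more reference columns and then sets a processing path per partition. The function names, the sample columns (`region`, `year`, `sales`), the node labels, and the round-robin assignment policy are all illustrative assumptions; the abstract does not say how a processing node is chosen for a path.

```python
from collections import defaultdict
from itertools import cycle


def split_by_spec(records, reference_columns):
    """Partition records by the tuple of values found in the reference columns."""
    partitions = defaultdict(list)
    for record in records:
        key = tuple(record[column] for column in reference_columns)
        partitions[key].append(record)
    return dict(partitions)


def assign_paths(partitions, processing_nodes):
    """Map each partition key to a node (round-robin is an assumed policy)."""
    nodes = cycle(processing_nodes)
    return {key: next(nodes) for key in partitions}


# Hypothetical sample data, not taken from the patent.
records = [
    {"region": "EU", "year": 2012, "sales": 10},
    {"region": "US", "year": 2012, "sales": 7},
    {"region": "EU", "year": 2013, "sales": 4},
    {"region": "EU", "year": 2012, "sales": 1},
]

partitions = split_by_spec(records, ["region", "year"])
paths = assign_paths(partitions, ["node-a", "node-b"])
```

With two nodes and three partitions, one node ends up handling two processing paths, which is consistent with paths being assigned from a pool of available nodes rather than one node per partition.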
9 Claims
1. A computer program product comprising a non-transitory machine-readable storage medium storing instructions that, when executed by at least one programmable processor, cause the at least one programmable processor to perform operations comprising:
receiving, at a first node defined within a calculation model, first data, the calculation model including a master node and a plurality of processing nodes comprising the first node, the plurality of processing nodes being controlled by the master node, the first data comprising at least one of a first intermediate result of an earlier operation in a calculation plan generated from the calculation model or second data from a table upon which the calculation plan is currently operating in response to a query, the calculation plan generated at runtime and tailored to both the query and a response to the query;
applying, at the first node and in response to receiving the first data, a partition specification to a column in the first data, the applying comprising:
determining, at the first node and in response to receiving the first data, a number of unique values in a column in the first data;
splitting, at the first node and according to the partition specification, a first table into a number of partitions equal to the number of unique values in the column, wherein each unique value has a corresponding partition; and
placing records of the first table having the same unique value together in a partition;
assigning, at the first node, a separate processing path for each of the number of partitions, each separate processing path being assigned for generation, by a respective processing node of the plurality of processing nodes, of one or more intermediate results based on a respective partition of the number of partitions;
executing, at a second node of the plurality of processing nodes, a union operation to union the generated one or more intermediate results, wherein the unioned one or more intermediate results comprise a union result;
using the union result as a subsequent intermediate result by additional execution operations of the calculation plan if additional operations are required by the calculation plan; and
returning the union result as a final result for the query if no additional execution operations are required by the calculation plan.
Dependent claims: 2, 3
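The split, per-partition processing, and union steps recited in claim 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: worker threads stand in for the processing nodes, the table is a list of dictionaries, and the names `split_on_column`, `total_amount`, and `split_process_union`, along with the `customer` and `amount` columns, are hypothetical rather than taken from the patent.

```python
from concurrent.futures import ThreadPoolExecutor


def split_on_column(table, column):
    """Create one partition per unique value in `column` and fill it."""
    # Determine the unique values (first-seen order, values must be hashable).
    unique_values = list(dict.fromkeys(row[column] for row in table))
    # The number of partitions equals the number of unique values.
    partitions = {value: [] for value in unique_values}
    # Records sharing a value are placed together in that value's partition.
    for row in table:
        partitions[row[column]].append(row)
    return partitions


def total_amount(partition):
    """Hypothetical per-path operation: one intermediate row per partition."""
    customer = partition[0]["customer"]
    return [{"customer": customer, "total": sum(r["amount"] for r in partition)}]


def split_process_union(table, column, operation, max_workers=4):
    """Run `operation` on every partition in parallel, then union the results."""
    partitions = split_on_column(table, column)
    # Each partition gets its own processing path; threads are an assumption.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        intermediate_results = list(pool.map(operation, partitions.values()))
    # Union step: concatenate the intermediate results into one result set.
    return [row for result in intermediate_results for row in result]


table = [
    {"customer": "a", "amount": 2},
    {"customer": "b", "amount": 5},
    {"customer": "a", "amount": 3},
]
union_result = split_process_union(table, "customer", total_amount)
```

`ThreadPoolExecutor.map` returns results in the order the partitions were submitted, so the union here is deterministic; a distributed engine would not necessarily preserve that ordering.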
4. A system comprising:
at least one programmable processor; and
a machine-readable medium storing instructions that, when executed by the at least one processor, cause the at least one programmable processor to perform operations comprising:
receiving, at a first node defined within a calculation model, first data, the calculation model including a master node and a plurality of processing nodes comprising the first node, the plurality of processing nodes being controlled by the master node, the first data comprising at least one of a first intermediate result of an earlier operation in a calculation plan generated from the calculation model or second data from a table upon which the calculation plan is currently operating in response to a query, the calculation plan generated at runtime and tailored to both the query and a response to the query;
applying, at the first node and in response to receiving the first data, a partition specification to a column in the first data, the applying comprising:
determining, at the first node and in response to receiving the first data, a number of unique values in a column in the first data;
splitting, at the first node and according to the partition specification, a first table into a number of partitions equal to the number of unique values in the column, wherein each unique value has a corresponding partition; and
placing records of the first table having the same unique value together in a partition;
assigning, at the first node, a separate processing path for each of the number of partitions, each separate processing path being assigned for generation, by a respective processing node of the plurality of processing nodes, of one or more intermediate results based on a respective partition of the number of partitions;
executing, at a second node of the plurality of processing nodes, a union operation to union the generated one or more intermediate results, wherein the unioned one or more intermediate results comprise a union result;
using the union result as a subsequent intermediate result by additional execution operations of the calculation plan if additional operations are required by the calculation plan; and
returning the union result as a final result for the query if no additional execution operations are required by the calculation plan.
Dependent claims: 5, 6
7. A method comprising:
receiving, at a first node defined within a calculation model, first data, the calculation model including a master node and a plurality of processing nodes comprising the first node, the plurality of processing nodes being controlled by the master node, the first data comprising at least one of a first intermediate result of an earlier operation in a calculation plan generated from the calculation model or second data from a table upon which the calculation plan is currently operating in response to a query, the calculation plan generated at runtime and tailored to both the query and a response to the query;
applying, at the first node and in response to receiving the first data, a partition specification to a column in the first data, the applying comprising:
determining, at the first node and in response to receiving the first data, a number of unique values in a column in the first data;
splitting, at the first node and according to the partition specification, a first table into a number of partitions equal to the number of unique values in the column, wherein each unique value has a corresponding partition;
placing records of the first table having the same unique value together in a partition;
assigning, at the first node, a separate processing path for each of the number of partitions, each separate processing path being assigned for generation, by a respective processing node of the plurality of processing nodes, of one or more intermediate results based on a respective partition of the number of partitions; and
executing, at a second node of the plurality of processing nodes, a union operation to union the generated one or more intermediate results, wherein the unioned one or more intermediate results comprise a union result;
using the union result as a subsequent intermediate result by additional execution operations of the calculation plan if additional operations are required by the calculation plan; and
returning the union result as a final result for the query if no additional execution operations are required by the calculation plan.
Dependent claims: 8, 9
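The last two steps of the method describe a branch: the union result either feeds further operations of the calculation plan or is returned as the final result. A small Python sketch of that control flow is shown below, assuming the remaining plan can be modeled as an ordered list of callables; `finish_plan`, `keep_large_totals`, and the sample rows are hypothetical names and data used only for illustration.

```python
def finish_plan(union_result, remaining_operations):
    """Feed the union result through any remaining plan operations.

    With no remaining operations, the union result itself is the final result.
    """
    result = union_result
    for operation in remaining_operations:
        # The previous output serves as the next operation's intermediate input.
        result = operation(result)
    return result


def keep_large_totals(rows):
    """Hypothetical follow-up operation: keep rows whose total exceeds 4."""
    return [row for row in rows if row["total"] > 4]


union_result = [{"customer": "a", "total": 3}, {"customer": "b", "total": 9}]

final_without_extra_steps = finish_plan(union_result, [])
final_with_filter = finish_plan(union_result, [keep_large_totals])
```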
Specification