Multi-system query execution plan
First Claim
1. A method comprising:
- a first database management system (DBMS) storing a first copy of particular data, and a second DBMS storing a second copy of the same particular data;
receiving a query to execute at the first DBMS, the query referencing the particular data;
estimating an initial operation cost of an operation of a plurality of operations in an execution plan of the query on the first DBMS, wherein the execution plan is represented by a tree of nodes, each node representing a corresponding operation of the execution plan and each subtree of nodes, of the tree of nodes, representing particular operations on which a particular operation at a root node of said each subtree of nodes depends;
wherein estimating the initial operation cost of the operation for the first DBMS includes a first operation cost for a subtree of the operation in the first DBMS and is based at least on the operation accessing the first copy of the particular data already available on the first DBMS prior to the receiving of the query;
determining feasibility to execute the operation on the second DBMS at least by determining that the second copy of the particular data is already stored in the second DBMS, wherein the operation is considered feasible if a) the particular data that the operation accesses is stored on the second DBMS, and b) the operation is supported by the second DBMS;
based on determining the feasibility, estimating an operation cost to execute the operation of the execution plan of the query on the second DBMS;
wherein estimating the operation cost of the operation for the second DBMS includes a second operation cost for the subtree of the operation in the second DBMS and is based at least on the operation accessing the second copy of the particular data already available on the second DBMS prior to the receiving of the query;
wherein the operation cost includes communication cost for parent and child node operations of the subtree of the operation in the second DBMS;
wherein the second DBMS is heterogeneous from the first DBMS;
comparing the operation cost of executing the subtree of the operation on the second DBMS with the initial operation cost of executing the subtree of the operation on the first DBMS thereby identifying whether to offload the operation of the execution plan to the second DBMS;
executing the query at least in part by executing the operation on either the first DBMS or the second DBMS to generate result data;
wherein the method is performed by one or more computing devices.
1 Assignment
0 Petitions
Accused Products
Abstract
Techniques are described to evaluate an operation from an execution plan of a query to offload the operation to another database management system for less costly execution. In an embodiment, the execution plan is determined based on characteristics of the database management system that received the query for execution. One or more operations in the execution plan are then evaluated for offloading to another heterogeneous database management system. In a related embodiment, the offloading cost for each operation may also include communication cost between the database management systems. The operations that are estimated to be less costly to execute on the other database management system are then identified for offloading to the other database management system. In an alternative embodiment, the database management system generates permutations of execution plans for the same query, and similarly evaluates each permutation of the execution plans for offloading its one or more operations. Based on the total cost of each permutation, which may include offloading cost for one or more operations to another database management system, the least costly plan is selected for the query execution.
-
Citations
28 Claims
-
1. A method comprising:
-
a first database management system (DBMS) storing a first copy of particular data, and a second DBMS storing a second copy of the same particular data; receiving a query to execute at the first DBMS, the query referencing the particular data; estimating an initial operation cost of an operation of a plurality of operations in an execution plan of the query on the first DBMS, wherein the execution plan is represented by a tree of nodes, each node representing a corresponding operation of the execution plan and each subtree of nodes, of the tree of nodes, representing particular operations on which a particular operation at a root node of said each subtree of nodes depends; wherein estimating the initial operation cost of the operation for the first DBMS includes a first operation cost for a subtree of the operation in the first DBMS and is based at least on the operation accessing the first copy of the particular data already available on the first DBMS prior to the receiving of the query; determining feasibility to execute the operation on the second DBMS at least by determining that the second copy of the particular data is already stored in the second DBMS, wherein the operation is considered feasible if a) the particular data that the operation accesses is stored on the second DBMS, and b) the operation is supported by the second DBMS; based on determining the feasibility, estimating an operation cost to execute the operation of the execution plan of the query on the second DBMS; wherein estimating the operation cost of the operation for the second DBMS includes a second operation cost for the subtree of the operation in the second DBMS and is based at least on the operation accessing the second copy of the particular data already available on the second DBMS prior to the receiving of the query; wherein the operation cost includes communication cost for parent and child node operations of the subtree of the operation in the second DBMS; wherein the second DBMS is heterogeneous from the first DBMS; comparing the operation cost of executing the subtree of the operation on the second DBMS with the initial operation cost of executing the subtree of the operation on the first DBMS thereby identifying whether to offload the operation of the execution plan to the second DBMS; executing the query at least in part by executing the operation on either the first DBMS or the second DBMS to generate result data; wherein the method is performed by one or more computing devices. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)
-
-
15. One or more non-transitory storage media storing a set of instructions which, when executed by one or more processors, causes:
-
a first database management system (DBMS) storing a first copy of particular data, and a second DBMS storing a second copy of the same particular data; receiving a query to execute at the first DBMS, the query referencing the particular data; estimating an initial operation cost of an operation of a plurality of operations in an execution plan of the query on the first DBMS, wherein the execution plan is represented by a tree of nodes, each node representing a corresponding operation of the execution plan and each subtree of nodes, of the tree of nodes, representing particular operations on which a particular operation at a root node of said each subtree of nodes depends; wherein estimating the initial operation cost of the operation for the first DBMS includes a first operation cost for a subtree of the operation in the first DBMS and is based at least on the operation accessing the first copy of the particular data already available on the first DBMS prior to the receiving of the query; determining feasibility to execute the operation on the second database management system DBMS at least by determining that the second copy of the particular data is already stored in the second DBMS, wherein the operation is considered feasible if a) the particular data that the operation accesses is stored on the second DBMS, and b) the operation is supported by the second DBMS; based on determining the feasibility, estimating an operation cost to execute the operation of the execution plan of the query on the second DBMS; wherein estimating the operation cost of the operation for the second DBMS includes a second operation cost for the subtree of the operation in the second DBMS and is based at least on the operation accessing the second copy of the particular data already available on the second DBMS prior to the receiving of the query; wherein the operation cost includes communication cost for parent and child node operations of the subtree of the operation in the second DBMS; wherein the second DBMS is heterogeneous from the first DBMS; comparing the operation cost of executing the subtree of the operation on the second DBMS with the initial operation cost of executing the subtree of the operation on the first DBMS thereby identifying whether to offload the operation of the execution plan to the second DBMS; executing the query at least in part by executing the operation on either the first DBMS or the second DBMS to generate result data. - View Dependent Claims (16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28)
-
Specification