Processing a database query using a shared metadata store
First Claim
1. A method for executing queries in a parallel processing database system, comprising:
receiving a query at a master node, the master node comprising a database catalog including metadata defining database objects;
in response to receiving the query at the master node, initiating a catalog server session, taking a snapshot of the metadata, and associating the snapshot of the metadata with the catalog server session, wherein a separate catalog server session is initiated and a separate snapshot of the metadata is taken for each query;
transmitting a query plan and query metadata to a worker node, wherein the query plan is based on the query, wherein the query metadata includes metadata for executing the query plan, wherein the query metadata includes database table definitions that define database objects, and wherein the query metadata is retrieved from the snapshot of the metadata associated with the catalog server session;
receiving, by the master node, a request for additional metadata that is required for the worker node to execute the query plan, wherein in response to a determination that the worker node requires the additional metadata for the worker node to execute the query plan, the worker node queries a parent in a tree structure of a plurality of worker nodes in the parallel processing database system for the additional metadata, wherein the parent node is a node between the master node and the worker node in relation to the tree structure;
communicating, to the worker node, the additional metadata that is required for the worker node to execute the query plan, wherein the additional metadata is retrieved from a same session as the catalog server session corresponding to the query;
executing the query plan on the worker node; and
returning, to the master node, a result associated with the execution of the query plan on the worker node.
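The per-query session and snapshot recited in claim 1 can be illustrated with a minimal sketch. Everything named here (`MasterNode`, `CatalogSession`, `receive_query`, `additional_metadata`, the dictionary-based catalog) is a hypothetical stand-in chosen for illustration and is not taken from the patent; a deep copy is only one possible way to take a snapshot.

```python
import copy
import itertools
from dataclasses import dataclass, field


@dataclass
class CatalogSession:
    """One catalog server session tied to a single query (hypothetical type)."""
    session_id: int
    snapshot: dict  # copy of the catalog metadata taken when the query arrived


@dataclass
class MasterNode:
    catalog: dict  # live metadata: object name -> definition
    sessions: dict = field(default_factory=dict)
    _ids: itertools.count = field(default_factory=itertools.count)

    def receive_query(self, tables):
        """Open a new session with its own snapshot; return (session id, query metadata)."""
        session = CatalogSession(next(self._ids), copy.deepcopy(self.catalog))
        self.sessions[session.session_id] = session
        # Query metadata is read from the snapshot, not from the live catalog.
        query_metadata = {name: session.snapshot[name] for name in tables}
        return session.session_id, query_metadata

    def additional_metadata(self, session_id, name):
        """Answer a worker's follow-up request from the same session's snapshot."""
        return self.sessions[session_id].snapshot[name]
```

In this sketch, a catalog change made after `receive_query` returns is not visible to later `additional_metadata` calls for that session, which is the consistency property the "same session" limitation appears to be aimed at.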
Abstract
A method and system for executing a query in parallel is disclosed. A master node may receive a query from a client and develop query plans from that query. The query plans may be forwarded to worker nodes for execution, and each query plan may be accompanied by query metadata. The metadata may be stored in a catalog on the master node.
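As a rough illustration of the flow in the abstract, the sketch below pairs each forwarded plan with the catalog entries it needs. The names `PlanMessage` and `build_plan_messages` are hypothetical, the plan is a placeholder string rather than a real query plan, and the catalog is modeled as a plain dictionary; none of these details come from the patent.

```python
from dataclasses import dataclass


@dataclass
class PlanMessage:
    """What the master forwards to one worker (hypothetical message shape)."""
    worker: str
    plan: str
    metadata: dict


def build_plan_messages(query_tables, catalog, workers):
    """Build one message per worker, carrying the plan and only the metadata it needs."""
    # Select the catalog entries referenced by the query; unrelated entries stay on the master.
    metadata = {name: catalog[name] for name in query_tables}
    return [
        PlanMessage(worker=w, plan=f"scan {', '.join(query_tables)} on {w}", metadata=metadata)
        for w in workers
    ]
```

Sending only the referenced entries keeps each message small; anything a worker turns out to need beyond that would have to be requested separately.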
28 Claims
1. A method for executing queries in a parallel processing database system, comprising:
receiving a query at a master node, the master node comprising a database catalog including metadata defining database objects;
in response to receiving the query at the master node, initiating a catalog server session, taking a snapshot of the metadata, and associating the snapshot of the metadata with the catalog server session, wherein a separate catalog server session is initiated and a separate snapshot of the metadata is taken for each query;
transmitting a query plan and query metadata to a worker node, wherein the query plan is based on the query, wherein the query metadata includes metadata for executing the query plan, wherein the query metadata includes database table definitions that define database objects, and wherein the query metadata is retrieved from the snapshot of the metadata associated with the catalog server session;
receiving, by the master node, a request for additional metadata that is required for the worker node to execute the query plan, wherein in response to a determination that the worker node requires the additional metadata for the worker node to execute the query plan, the worker node queries a parent in a tree structure of a plurality of worker nodes in the parallel processing database system for the additional metadata, wherein the parent node is a node between the master node and the worker node in relation to the tree structure;
communicating, to the worker node, the additional metadata that is required for the worker node to execute the query plan, wherein the additional metadata is retrieved from a same session as the catalog server session corresponding to the query;
executing the query plan on the worker node; and
returning, to the master node, a result associated with the execution of the query plan on the worker node.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17
18. A computer program product for executing queries in a parallel processing database system, comprising a non-transitory computer readable medium having program instructions embodied therein for:
receiving a query at a master node, the master node comprising a database catalog including metadata defining database objects;
in response to receiving the query at the master node, initiating a catalog server session, taking a snapshot of the metadata, and associating the snapshot of the metadata with the catalog server session, wherein a separate catalog server session is initiated and a separate snapshot of the metadata is taken for each query;
transmitting a query plan and query metadata to a worker node, wherein the query plan is based on the query, wherein the query metadata includes metadata for executing the query plan, wherein the query metadata includes database table definitions that define database objects, and wherein the query metadata is retrieved from the snapshot of the metadata associated with the catalog server session;
receiving, by the master node, a request for additional metadata that is required for the worker node to execute the query plan, wherein in response to a determination that the worker node requires the additional metadata for the worker node to execute the query plan, the worker node queries a parent in a tree structure of a plurality of worker nodes in the parallel processing database system for the additional metadata, wherein the parent node is a node between the master node and the worker node in relation to the tree structure;
communicating, to the worker node, the additional metadata that is required for the worker node to execute the query plan, wherein the additional metadata is retrieved from a same session as the catalog server session corresponding to the query;
executing the query plan on the worker node; and
returning, to the master node, a result associated with the execution of the query plan on the worker node.
Dependent claims: 19, 20, 21, 22, 23
24. A system for executing queries in a parallel processing database, comprising a non-transitory computer readable medium and a processor configured to:
receive a query at a master node, the master node comprising a database catalog including metadata defining database objects;
in response to receiving the query at the master node, initiating a catalog server session, taking a snapshot of the metadata, and associating the snapshot of the metadata with the catalog server session, wherein a separate catalog server session is initiated and a separate snapshot of the metadata is taken for each query;
transmit a query plan and query metadata to a worker node, wherein the query plan is based on the query, wherein the query metadata includes metadata for executing the query plan, wherein the query metadata includes database table definitions that define database objects, and wherein the query metadata is retrieved from the snapshot of the metadata associated with the catalog server session;
receive, by the master node, a request for additional metadata that is required for the worker node to execute the query plan, wherein in response to a determination that the worker node requires the additional metadata for the worker node to execute the query plan, the worker node queries a parent in a tree structure of a plurality of worker nodes in the parallel processing database system for the additional metadata, wherein the parent node is a node between the master node and the worker node in relation to the tree structure;
communicate, to the worker node, the additional metadata that is required for the worker node to execute the query plan, wherein the additional metadata is retrieved from a same session as the catalog server session corresponding to the query;
execute the query plan on the worker node; and
return, to the master node, a result associated with the execution of the query plan on the worker node.
Dependent claims: 25, 26, 27, 28
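The parent lookup shared by the independent claims, in which a worker asks a node between itself and the master for missing metadata, might look roughly like the sketch below. The `Node` class and its `lookup` method are hypothetical, the root of the tree stands in for the master node, and caching the answer at each intermediate node is an added assumption that the claims do not state.

```python
class Node:
    """A node in the lookup tree; the root stands in for the master (hypothetical)."""

    def __init__(self, name, parent=None, metadata=None):
        self.name = name
        self.parent = parent
        self.metadata = dict(metadata or {})

    def lookup(self, key):
        """Return metadata for key, asking the parent when it is not held locally."""
        if key in self.metadata:
            return self.metadata[key]
        if self.parent is None:
            # The root has no one left to ask, so the object is unknown.
            raise KeyError(key)
        value = self.parent.lookup(key)
        # Assumption: keep a local copy so later requests stop at this node.
        self.metadata[key] = value
        return value
```

With a three-level tree (master, intermediate node, leaf worker), a `lookup` on the leaf travels up until some node holds the entry, so repeated requests from sibling workers can be answered without reaching the master each time.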
Specification