REAL TIME DATA REPLICATION FOR QUERY EXECUTION IN A MASSIVELY PARALLEL COMPUTER
Abstract
Embodiments of the invention may be used to increase query processing parallelism of an in-memory database stored on a parallel computing system. A group of compute nodes each store a portion of data as part of the in-memory database. Further, a pool of compute nodes may be reserved to create copies of data from the compute nodes of the in-memory database as part of query processing. When a query is received for execution, the query may be evaluated to determine whether portions of the in-memory database should be duplicated to allow multiple elements of the query (e.g., multiple query predicates) to be evaluated in parallel.
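The replication decision described in the abstract can be illustrated with a minimal Python sketch. Everything named here is a hypothetical stand-in and not taken from the patent: `plan_replication`, the mapping of each predicate to the node holding its data, and the list of reserved pool nodes. The sketch assumes one predicate stays on its home node and each additional predicate on the same node gets a copy on a pool node, falling back to the home node when the pool is empty.

```python
from collections import defaultdict


def plan_replication(predicate_nodes, spare_pool):
    """Decide where each query predicate runs (illustrative only).

    predicate_nodes: dict mapping a predicate label to the id of the
        compute node that stores the records it reads (assumed input).
    spare_pool: list of ids of reserved pool nodes (assumed input).
    Returns (assignments, copies): assignments maps each predicate to
    the node it will run on; copies lists (source, target) pairs.
    """
    # Group predicates by the node whose records they need.
    by_node = defaultdict(list)
    for predicate, node in predicate_nodes.items():
        by_node[node].append(predicate)

    pool = list(spare_pool)  # do not mutate the caller's list
    assignments = {}
    copies = []
    for node, predicates in by_node.items():
        # The first predicate always runs on the home node.
        assignments[predicates[0]] = node
        for predicate in predicates[1:]:
            if pool:
                # Duplicate the node's data so this predicate can run in parallel.
                target = pool.pop(0)
                copies.append((node, target))
                assignments[predicate] = target
            else:
                # No pool node left: evaluate on the home node as well.
                assignments[predicate] = node
    return assignments, copies
```

In this sketch, a query with two predicates over data on the same node and one free pool node yields a single copy operation and two distinct execution targets.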
21 Claims
1. A method for processing a database query, comprising:
receiving a query of a database, wherein the database is stored on a plurality of compute nodes provided by a parallel computing system;
identifying two or more portions of the query evaluated using data records stored on a first compute node of the plurality of compute nodes;
copying the data records stored on the first compute node to a second compute node;
transmitting a first portion of the query to the first compute node and a second portion of the query to the second compute node, wherein the first compute node and the second compute node execute the respective first query portion and second query portion in parallel, thereby producing respective query results; and
receiving the respective query results from the first compute node and the second compute node.
Dependent claims: 2, 3, 4, 5, 6, 7
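The copy, transmit, and receive steps recited in claim 1 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patented implementation: `ComputeNode` and `run_query` are hypothetical names, compute nodes are modeled as in-process objects, query portions are modeled as predicate functions, and worker threads stand in for the parallel hardware. The identifying step is assumed to have been done by the caller, who passes two portions that both read the first node's records.

```python
import copy
from concurrent.futures import ThreadPoolExecutor


class ComputeNode:
    """Hypothetical stand-in for a compute node holding in-memory records."""

    def __init__(self, records=None):
        self.records = list(records or [])

    def execute(self, predicate):
        # Evaluate one query portion against this node's local records.
        return [r for r in self.records if predicate(r)]


def run_query(portions, first, second):
    """Run two query portions that both need the records on `first`."""
    # Copy the first node's records to the second node.
    second.records = copy.deepcopy(first.records)

    # Send one portion to each node and evaluate them concurrently.
    with ThreadPoolExecutor(max_workers=2) as pool:
        f1 = pool.submit(first.execute, portions[0])
        f2 = pool.submit(second.execute, portions[1])
        # Receive the respective results from both nodes.
        return f1.result(), f2.result()
```

A caller would build `first` with the records, pass an empty `second`, and receive one result list per query portion. Threads only illustrate the dispatch pattern here; on a real parallel system each portion would run on separate hardware.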
8. A computer readable storage medium containing a program which, when executed, performs an operation, comprising:
receiving a query of a database, wherein the database is stored on a plurality of compute nodes provided by a parallel computing system;
identifying two or more portions of the query evaluated using data records stored on a first compute node of the plurality of compute nodes;
copying the data records stored on the first compute node to a second compute node;
transmitting a first portion of the query to the first compute node and a second portion of the query to the second compute node, wherein the first compute node and the second compute node execute the respective first query portion and second query portion in parallel, thereby producing respective query results; and
receiving the respective query results from the first compute node and the second compute node.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A parallel computing system, comprising:
a plurality of compute nodes, each having at least a processor and a memory, wherein each of the plurality of compute nodes stores a portion of an in-memory database; and
a master node having at least a processor and a memory and a database controller program configured to:
receive a query of the in-memory database,
identify two or more portions of the query evaluated using data records stored on a first compute node of the plurality of compute nodes,
copy the data records stored on the first compute node to a second compute node,
transmit a first portion of the query to the first compute node and a second portion of the query to the second compute node, wherein the first compute node and the second compute node execute the respective first query portion and second query portion in parallel, thereby producing respective query results; and
receive the respective query results from the first compute node and the second compute node.
Dependent claims: 16, 17, 18, 19, 20, 21
Specification