Method and system for parallel processing of database queries
Abstract
A system and methods for parallel processing of queries to one or more databases are described herein. One or more databases may be distributed among a subset of slave nodes of a global-results processing matrix. A query to the database may be generated using a query-based high-level programming language. The query-based source code then may be converted to intermediary source code in a common programming language and then compiled into a dynamic link library (DLL) or other type of executable. The DLL is then distributed among the slave nodes of the processing matrix, whereupon the slave nodes execute related portions of the DLL substantially in parallel to generate initial query results. The initial query results may then be provided to a master node of the global-results processing matrix for additional processing, whereby the master node is adapted to execute one or more associated portions of the DLL on the initial query results.
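As a rough illustration of the compile step summarized in the abstract, the sketch below translates a toy query into intermediary Python source and compiles it with the built-in `compile`. Everything named here is an assumption made for illustration: the query shape, `translate_query`, `build_executable`, `run_query`, and the generated `slave_portion` and `master_portion` functions are hypothetical, Python stands in for the common programming language, and an in-memory code object stands in for the DLL.

```python
from concurrent.futures import ThreadPoolExecutor


def translate_query(field, threshold):
    """Turn a toy query ("count rows where field > threshold") into
    intermediary Python source. The query shape is hypothetical."""
    return (
        "def slave_portion(rows):\n"
        f"    return sum(1 for r in rows if r[{field!r}] > {threshold!r})\n"
        "\n"
        "def master_portion(initial_results):\n"
        "    return sum(initial_results)\n"
    )


def build_executable(source):
    """Compile the intermediary source; the resulting namespace plays
    the role of the executable that is distributed to the nodes."""
    code = compile(source, "<query>", "exec")
    namespace = {}
    exec(code, namespace)
    return namespace


def run_query(executable, partitions):
    """Slaves run their portion in parallel, one per database partition;
    the master then runs its portion on the initial results."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        initial_results = list(pool.map(executable["slave_portion"], partitions))
    return executable["master_portion"](initial_results)


# Three hypothetical slave nodes, each holding a portion of the database.
partitions = [
    [{"age": 25}, {"age": 41}],
    [{"age": 67}, {"age": 18}],
    [{"age": 52}],
]
executable = build_executable(translate_query("age", 40))
total = run_query(executable, partitions)
```

Threads on one machine are used only to keep the sketch self-contained; in the described system the slave nodes are separate machines.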
126 Citations
104 Claims
1. A system for processing at least one query to a database, the system comprising:
a first processing matrix comprising:
a master node of a first type having a processor and receiving at least one executable compiled from a query; and
a first plurality of slave nodes of a first type each comprising a processor and means for storing data, each first type slave node storing a portion of the database, each first type slave node processor adapted to execute a first portion of the at least one executable on its stored database portion to generate query results, the plurality of first type slave nodes executing the first executable portion substantially in parallel to collectively generate a set of initial query results; and
a second processing matrix receiving the initial query results generated by the first processing matrix and comprising:
a second plurality of slave nodes of a second type each comprising a processor and means for storing a portion of the initial query results, each second type slave node processor adapted to execute a second portion of the at least one executable on its stored initial query results portion to generate a portion of intermediary query results, the plurality of second type slave nodes executing the second executable portion substantially in parallel to collectively generate a set of intermediary query results.
Dependent claims: 2-43.
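The two-matrix arrangement of claim 1 can be sketched as a two-stage pipeline. This is a minimal illustration, not the patented implementation: the record layout, the function names, and the key-based redistribution of initial results among second-type slave nodes are all assumptions (the claim does not say how the initial results are divided).

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def first_portion(db_portion):
    """First-matrix slave: select the state of each record whose
    amount is at least 100 (a hypothetical database operation)."""
    return [rec["state"] for rec in db_portion if rec["amount"] >= 100]


def redistribute(initial_results, n_nodes):
    """Assign each initial result to one second-matrix slave by key,
    so all results for a given state land on the same node (assumed)."""
    portions = [[] for _ in range(n_nodes)]
    for state in initial_results:
        portions[sum(state.encode()) % n_nodes].append(state)
    return portions


def second_portion(result_portion):
    """Second-matrix slave: count occurrences in its stored portion."""
    return dict(Counter(result_portion))


def run(db_portions, n_second):
    with ThreadPoolExecutor() as pool:
        # Stage 1: first-type slaves work in parallel on their database portions.
        initial = [s for part in pool.map(first_portion, db_portions) for s in part]
        # Stage 2: second-type slaves work in parallel on portions of the initial results.
        intermediary = list(pool.map(second_portion, redistribute(initial, n_second)))
    return intermediary


db_portions = [
    [{"state": "FL", "amount": 150}, {"state": "GA", "amount": 50}],
    [{"state": "FL", "amount": 300}, {"state": "TX", "amount": 120}],
    [{"state": "GA", "amount": 500}, {"state": "TX", "amount": 10}],
]
intermediary = run(db_portions, n_second=2)

# Combine the per-node intermediary results for inspection.
merged = {}
for node_result in intermediary:
    merged.update(node_result)
```

Because each key is routed to exactly one second-stage node in this sketch, the per-node counts can be combined without further aggregation.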
44. A system for processing at least one query to a database, the system comprising:
a general-purpose (GP) query processing matrix comprising:
a GP master node having a processor and receiving at least one executable compiled from a query; and
a plurality of GP slave nodes each comprising a processor and means for storing data, each GP slave node storing a portion of the database, each GP slave node processor adapted to execute a first portion of the at least one executable on its stored database portion to generate query results, the plurality of GP slave nodes executing the first executable portion substantially in parallel to collectively generate a set of initial query results; and
a global-results (GR) processing matrix receiving the initial query results generated by the general-purpose processing matrix and comprising:
a plurality of GR slave nodes each comprising a processor and means for storing a portion of the initial query results, each GR slave node processor adapted to execute a second portion of the at least one executable on its stored initial query results portion to generate a portion of intermediary query results, the plurality of GR slave nodes executing the second executable portion substantially in parallel to collectively generate a set of intermediary query results; and
a GR master node comprising a processor and means for storing data.
Dependent claims: 45-86.
87. A system for processing at least one query to a database, the system comprising:
a general-purpose (GP) query processing matrix comprising:
a GP master node comprising a processor, memory, disk storage, and a network interface, the GP master node receiving at least one executable compiled from a query; and
a plurality of GP slave nodes each comprising a processor, memory, disk storage, and a network interface, each GP slave node storing a portion of the database, each GP slave node processor adapted to execute a first portion of the at least one executable using the stored database portion to generate query results, the first portion of the at least one executable representing at least one database operation, the GP slave node executing the first executable portion substantially in parallel with other GP slave nodes to collectively generate a set of initial query results; and
a global-results (GR) processing matrix receiving the initial query results generated by the general-purpose processing matrix and comprising:
a plurality of GR slave nodes each comprising a processor, memory, disk storage, and a network interface, each GR slave node storing a portion of the initial query results, each GR slave node processor adapted to execute a second portion of the at least one executable on the stored initial query results portion to generate a portion of intermediary query results, the second portion of the at least one executable representing at least one database operation, the GR slave node executing the second executable portion substantially in parallel with other GR slave nodes to collectively generate a set of intermediary query results; and
a GR master node comprising a processor, memory, disk storage, and a network interface, the GR master node receiving and storing the intermediary query results generated by the GR slave nodes, the GR master node processor adapted to execute a third portion of the at least one executable on the intermediate query results to generate a refined set of query results, the third portion of the at least one executable representing at least one database operation.
Dependent claims: 88-90.
91. A system for processing at least one query to a database, the system comprising:
a query server for receiving and processing a query;
a general-purpose (GP) query processing matrix having a plurality of processing nodes comprising:
a GP master node having a processor and at least one data storage device, the GP master node being adapted to receive at least one executable compiled from a query;
a plurality of GP slave nodes operably connected to the GP master node, wherein each GP slave node is adapted to:
store a substantially different portion of the database in memory at the slave node; and
execute a first portion of at least one executable using the stored database portion to generate query results, the first portion of the at least one executable representing at least one database operation to be performed on the stored database portion; and
wherein the plurality of GP slave nodes collectively generates a set of initial query results; and
a global-results (GR) processing matrix operably connected to the query server and including:
a plurality of GR slave nodes, each GR slave node being adapted to:
store a substantially different portion of the initial query results on disk storage of the slave node; and
execute a second portion of the at least one executable using the stored initial query results portion to generate a portion of intermediary query results, wherein the plurality of GR slave nodes execute the second portion substantially in parallel to generate a set of intermediary results; and
a master node having a processor and at least one storage device and being operably connected to the plurality of slave nodes.
Dependent claims: 92-98.
99. A system for processing at least one query to at least one database, the system comprising:
a query server being adapted to:
generate intermediary source code from query source code, the query source code representing at least one database operation using the database and wherein the query source code is formatted based in part on a query-based programming language; and
compile the intermediary source code to generate at least one executable;
a general-purpose query processing matrix operably connected to the query server and including:
a plurality of slave nodes, each slave node storing in memory a different portion of the database and being adapted to execute a first portion of the at least one executable using the stored database portion to generate a portion of initial query results, the first portion of the at least one executable representing at least one database operation on the stored database portion, the first portion being executed by the slave node substantially in parallel with an execution of the first portion by other slave nodes;
at least one level of collator nodes, each collator node of each level storing in memory query results from a lower level;
a global-results processing matrix including:
a plurality of slave nodes, each slave node storing on disk storage a different portion of the database and being adapted to execute a first portion of the at least one executable using the stored database portion to generate a portion of initial query results, the first portion of the at least one executable representing at least one database operation on the stored database portion, the first portion being executed by the slave node substantially in parallel with an execution of the first portion by other slave nodes; and
a master node operably connected to the plurality of slave nodes and being adapted to:
store the initial query results on the disk storage of the master node; and
execute a second portion of the at least one executable on the stored initial query results to generate final query results, the second portion of the at least one executable representing at least one database operation on the initial query results.
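Claim 99 recites at least one level of collator nodes, each holding in memory the query results from the level below. One way to picture this is a tree-shaped merge, sketched below. The fan-in of two, the assumption that slave results arrive sorted, and the names `collate_level` and `collate` are hypothetical; the claim itself does not specify how collators combine results.

```python
import heapq


def collate_level(lower_results, fan_in):
    """One level of collators: each collator merges, in memory, the
    sorted results of up to `fan_in` nodes from the level below."""
    return [
        list(heapq.merge(*lower_results[i:i + fan_in]))
        for i in range(0, len(lower_results), fan_in)
    ]


def collate(slave_results, fan_in=2):
    """Apply collator levels until a single result set remains."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = slave_results
    while len(level) > 1:
        level = collate_level(level, fan_in)
    return level[0] if level else []


# Sorted partial results from five hypothetical slave nodes.
slave_results = [[1, 9], [3, 4], [2, 8], [5], [6, 7]]
collated = collate(slave_results)
```

With five slaves and a fan-in of two, the sketch uses three collator levels (three, two, then one collator).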
100. A method for processing at least one query to a database distributed among a plurality of slave nodes, each slave node storing a substantially distinct database portion on disk storage, the method comprising the steps of:
receiving a query in a query-based language source code and compiling at least one executable from the query source code;
executing, at each slave node, a first portion of the at least one executable using the stored database portion to generate a portion of initial query results, the first portion of the at least one executable representing at least one database operation on the stored database portion, the first portion being executed by the slave node substantially in parallel with an execution of the first portion by other slave nodes;
storing the initial query results to disk storage of a master node; and
executing, at the master node, a second portion of the at least one executable using the stored initial query results to generate resultant query results, the second portion of the at least one executable representing at least one database operation on the initial query results.
Dependent claims: 101.
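The execution steps of claim 100 can be sketched as follows. This is only an illustration under stated assumptions: the compile step is omitted, the functions `first_portion` and `second_portion` stand in for the two portions of the executable, the JSON file name is hypothetical, and a temporary directory stands in for the master node's disk storage.

```python
import json
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def first_portion(db_portion):
    """Slave step: keep the surnames that start with 'S'."""
    return [name for name in db_portion if name.startswith("S")]


def second_portion(path):
    """Master step: read the stored initial results back from disk,
    then remove duplicates and sort."""
    with open(path) as f:
        initial = json.load(f)
    return sorted(set(initial))


def process(db_portions, master_dir):
    # Each slave executes the first portion on its own database portion.
    with ThreadPoolExecutor() as pool:
        parts = list(pool.map(first_portion, db_portions))
    initial = [name for part in parts for name in part]
    # Store the initial query results on the master node's disk storage.
    path = os.path.join(master_dir, "initial_results.json")
    with open(path, "w") as f:
        json.dump(initial, f)
    # The master executes the second portion on the stored results.
    return second_portion(path)


db_portions = [["Smith", "Jones"], ["Stone", "Smith"], ["Adams", "Shaw"]]
with tempfile.TemporaryDirectory() as master_dir:
    final = process(db_portions, master_dir)
```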
102. In a global-results processing matrix including a master node and a plurality of slave nodes, each of the master node and slave nodes including a processor and disk storage, a computer readable medium, the computer readable medium comprising:
a first set of executable instructions being adapted to manipulate the processor of each slave node to execute a first portion of the at least one executable using a portion of a database stored at the disk storage of the slave node to generate a portion of initial query results, the first portion of the at least one executable representing at least one database operation on the stored database portion, the first portion being executed by the slave node substantially in parallel with an execution of the first portion by other slave nodes; and
a second set of executable instructions being adapted to manipulate the processor of the master node to execute a second portion of the at least one executable using the initial query results to generate final query results, the second portion of the at least one executable representing at least one database operation using the initial query results.
Dependent claims: 103, 104.
Specification