Distributed query engine pipeline method and system
Abstract
A distributed query engine pipeline architecture comprises cascaded analysis engines that accept an input query and each identifies a portion of the input query that it can pass on to an execution engine. Each stage rewrites the input query to remove the portion identified and replaces it with a placeholder. The rewritten query is forwarded to the next analysis engine in the cascade. Each engine compiles the portion it identified so that an execution engine may process that portion. Execution preferably proceeds from the portion of the query compiled by the last analysis engine. The execution engine corresponding to the last analysis engine executes the query and makes a call to the next higher execution engine in the cascade for data from the preceding portion. The process continues until the results from the input query are fully assembled.
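The execution order described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `ExecutionEngine` class, the `fetch` callback, and the sample rows are hypothetical names introduced here, and each "compiled portion" is simply assumed to be a Python callable that may ask the preceding engine for the data behind its placeholder.

```python
# Hypothetical sketch: execution starts at the last engine in the cascade,
# which calls up the cascade whenever it needs the data behind a placeholder.

trace = []  # records the order in which engines start executing


class ExecutionEngine:
    def __init__(self, name, compiled, upstream=None):
        self.name = name
        self.compiled = compiled    # callable taking a fetch() callback
        self.upstream = upstream    # engine that owns the placeholder's data

    def execute(self):
        trace.append(self.name)

        def fetch():
            # Resolve the placeholder by calling the preceding engine.
            if self.upstream is None:
                raise RuntimeError(f"{self.name}: no engine for placeholder")
            return self.upstream.execute()

        return self.compiled(fetch)


# Assumed sample data standing in for a data source.
rows = [{"id": 1, "total": 40}, {"id": 2, "total": 75}, {"id": 3, "total": 120}]

# The first engine's portion produces rows; it contains no placeholder.
first = ExecutionEngine("first", lambda fetch: list(rows))

# The second (last) engine's portion filters whatever the placeholder yields.
second = ExecutionEngine(
    "second",
    lambda fetch: [r for r in fetch() if r["total"] > 50],
    upstream=first,
)

result = second.execute()
print(result)
print(trace)
```

In this sketch the last engine drives execution and the earlier engine runs only on demand, which mirrors the abstract's description of results being assembled by calls back up the cascade.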
16 Claims
1. A method of distributing portions of a query over two or more execution engines, the method comprising:
receiving an input query;
identifying with a first analysis engine, a portion of the input query that can be processed by a first execution engine;
compiling the identified portion of the input query forming a first compiled portion;
rewriting the input query to form a first rewritten query wherein the identified portion of the input query is removed from the input query and replaced by a first placeholder;
passing the first rewritten query to a second analysis engine;
identifying with the second analysis engine, a portion of the first rewritten query that can be processed by a second execution engine; and
compiling the identified portion of the first rewritten query generating a second compiled portion wherein the input query is distributed over the first execution engine and the second execution engine.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
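The identify, compile, and rewrite steps of claim 1 can be sketched as below. This is an illustrative sketch under stated assumptions, not the claimed implementation: the query is assumed to be a nested tuple of the form `(operator, *operands)`, "compiling" is reduced to storing the claimed subtree, and the names `AnalysisEngine`, `Placeholder`, `distribute`, and the `xml_path`/`sql_join`/`sql_scan` operators are hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass(frozen=True)
class Placeholder:
    """Stands in for a query portion claimed by an analysis engine."""
    name: str


@dataclass
class AnalysisEngine:
    name: str
    supported: frozenset                           # operators its execution engine handles
    compiled: dict = field(default_factory=dict)   # placeholder name -> portion

    def can_process(self, node):
        # Literals and placeholders left by earlier stages are treated as inputs.
        if not isinstance(node, tuple):
            return True
        op, *args = node
        return op in self.supported and all(self.can_process(a) for a in args)

    def rewrite(self, node, ids):
        if not isinstance(node, tuple):
            return node
        if self.can_process(node):
            # Identify the portion, "compile" it, and replace it with a placeholder.
            name = f"{self.name}_{next(ids)}"
            self.compiled[name] = node
            return Placeholder(name)
        op, *args = node
        return (op, *(self.rewrite(a, ids) for a in args))


def distribute(query, engines):
    """Pass the query through each analysis engine in cascade order."""
    for engine in engines:
        query = engine.rewrite(query, count())
    return query


xml = AnalysisEngine("xml", frozenset({"xml_path"}))
sql = AnalysisEngine("sql", frozenset({"sql_join", "sql_scan"}))

query = ("sql_join", ("xml_path", "doc", "/a/b"), ("sql_scan", "orders"))
final = distribute(query, [xml, sql])

print(xml.compiled)
print(sql.compiled)
print(final)
```

Here the first engine claims only the `xml_path` subtree, and the second engine receives a rewritten query in which that subtree is already a placeholder, so its own compiled portion refers to the first engine's portion only by name.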
10. A system for distributive processing of an input query, the system comprising:
two or more analysis engines for separating out portions of the input query that can be compiled and executed;
two or more execution engines for operation on the input query; and
access to one or more data sources, wherein the two or more analysis engines operate to independently identify and compile one or more portions of the input query, wherein:
at least one of the two or more analysis engines rewrites the input query to remove the portion of the input query that corresponds to an execution engine, and the two or more execution engines process the one or more compiled portions of the input query such that partial query results from one execution engine are passed to a subsequent execution engine and combined to form overall input query results.
11. A computer-readable medium containing instructions which, when run on a computer, execute a method of distributing portions of a query over two or more execution engines, the method comprising:
receiving an input query;
identifying with a first analysis engine, a portion of the input query that can be processed by a first execution engine;
compiling the identified portion of the input query forming a first compiled portion;
rewriting the input query to form a first rewritten query wherein the identified portion of the input query is removed from the input query and replaced by a first placeholder;
passing the first rewritten query to a second analysis engine;
identifying with the second analysis engine, a portion of the first rewritten query that can be processed by a second execution engine; and
compiling the identified portion of the first rewritten query generating a second compiled portion wherein the input query is distributed over the first execution engine and the second execution engine.
Dependent claims: 12, 13, 14, 15, 16
Specification