Database system with methods for optimizing performance of correlated subqueries by reusing invariant results of operator tree
First Claim
1. In a computer system having a processor, a memory, and a storage device, said storage device storing a database comprising database tables, each table comprising rows of data records, each data record storing information in database columns, a method for executing a correlated database query for selecting particular ones of said data records, the method comprising:
- (a) receiving a database query specifying selection of particular ones of said data records, said database query comprising inner and outer query blocks, said inner query block comprising a subquery nested within the database query, wherein said at least one subquery references information from said outer query block;
(b) determining at least one correlated part and at least one uncorrelated part of said subquery, wherein only said correlated part is capable of being affected by changing values of the outer query block during execution of the database query;
(c) creating a cache in said memory for at least storing a result computed for said uncorrelated part after said subquery has been executed for the first time; and
(d) executing said database query, including evaluating said expression of the subquery by;
(i) computing a result for the correlated part of said subquery, (ii) retrieving the cached result for the uncorrelated part of said subquery, and (iii) computing a value for said subquery by combining said computed result with said retrieved cached result.
1 Assignment
0 Petitions
Accused Products
Abstract
Database system and methods are described for improving execution speed of database queries (e.g., for decision support) by optimizing execution of nested queries or “subqueries,” which are commonly used in client/server database environments. In particular, the basic approach employed is to recognize the part of the subquery that is not related to the outer references and cache the result of that part after its first execution. Later, the result can be reused and combined with the result of the rest of the subquery that is changing for each iteration. Methods are employed to recognize the invariant part of a data flow tree, and to restructure the evaluation plan to reuse the stored intermediate result. An efficient method is used to teach an existing join optimizer to understand the invariant feature and thus allow it to be able to generate better join plans in the new context. When query rewriting is not possible, therefore, the invariant technique provides significantly better performance than the traditional nested iteration method.
-
Citations
35 Claims
-
1. In a computer system having a processor, a memory, and a storage device, said storage device storing a database comprising database tables, each table comprising rows of data records, each data record storing information in database columns, a method for executing a correlated database query for selecting particular ones of said data records, the method comprising:
-
(a) receiving a database query specifying selection of particular ones of said data records, said database query comprising inner and outer query blocks, said inner query block comprising a subquery nested within the database query, wherein said at least one subquery references information from said outer query block;
(b) determining at least one correlated part and at least one uncorrelated part of said subquery, wherein only said correlated part is capable of being affected by changing values of the outer query block during execution of the database query;
(c) creating a cache in said memory for at least storing a result computed for said uncorrelated part after said subquery has been executed for the first time; and
(d) executing said database query, including evaluating said expression of the subquery by;
(i) computing a result for the correlated part of said subquery, (ii) retrieving the cached result for the uncorrelated part of said subquery, and (iii) computing a value for said subquery by combining said computed result with said retrieved cached result. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
-
-
16. In a database system comprising a processor, a memory, and a storage device, said storage device storing a database comprising database tables, each table comprising rows of data records, each data record storing information in database columns, an improved query execution system comprising:
-
means for receiving a database query having an expression that includes a subquery nested within the database query, said subquery itself comprising an expression specifying a subquery result which is determined before evaluation of the expression for the database query is completed, wherein said subquery includes a variant portion that is dependent upon at least one value computed from a portion of the database query outside said subquery;
a cache in said memory for storing a result computed for an invariant portion of the subquery, wherein said invariant portion is not dependent upon computation of values from a portion of the database query outside said subquery;
means for executing said database query, including means for evaluating said expression of the subquery by;
(i) computing a result for the variant portion of said subquery, (ii) retrieving the cached result for the invariant portion of said subquery, and (ii) computing a value for said expression of the subquery by combining said computed result with said retrieved cached result. - View Dependent Claims (17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30)
-
-
31. In a client/server database system, an improved method for executing a query submitted by a client to a database server, said query including a subquery having variant and invariant portions, the method comprising:
-
dividing the subquery into variant and invariant portions;
creating a cache for caching an intermediate result computed from the portion of the subquery that is invariant from one iteration of the query to another;
executing the query for providing a query result to the client; and
while executing the query, computing a query result at least in part by retrieving said intermediate result for the invariant portion from said cache and combining it with a result computed from said variant portion of said subquery. - View Dependent Claims (32, 33, 34, 35)
-
Specification