Database system with methodology providing improved cost estimates for query strategies
First Claim
1. In a computer system providing a database storing database objects, a method for improving determination of cost estimates associated with data access occurring during execution of a database query, the method comprising:
- receiving a database query specifying a database operation for at least one database object, said database query specifying a query condition for selecting particular rows based on multiple attributes of said at least one database object; and
determining an estimate for the cost associated with a query execution path for executing the database query by;
determining selectivity information providing a selectivity estimate for each of said multiple attributes, determining correlation information providing a measure of how well at least some of said multiple attributes are correlated, and combining said selectivity information together, based at least in part on said correlation information, for determining a multi-attribute selectivity estimate for the query condition.
1 Assignment
0 Petitions
Accused Products
Abstract
Database system and methods are described for improving execution speed of database queries (e.g., for decision support). A multi-attribute selectivity optimization methodology is described that provides a more accurate estimate of the cost of a query execution plan, so that the predicted performance of the final execution plan will be more accurate. The densities by how much the selectivity deviates from a single attribute density and by how much the multi-attribute densities differ from one another are used as a basis for multi-selectivity estimates. The multi-attribute densities are used to scale estimates between extremes of total independence and total dependence. By taking into account how well attributes are correlated, the approach is able to provide more accurate multi-selectivity estimates. As a result, the database system can formulate better query plans and, thus, provide better performance.
137 Citations
20 Claims
-
1. In a computer system providing a database storing database objects, a method for improving determination of cost estimates associated with data access occurring during execution of a database query, the method comprising:
-
receiving a database query specifying a database operation for at least one database object, said database query specifying a query condition for selecting particular rows based on multiple attributes of said at least one database object; and
determining an estimate for the cost associated with a query execution path for executing the database query by;
determining selectivity information providing a selectivity estimate for each of said multiple attributes, determining correlation information providing a measure of how well at least some of said multiple attributes are correlated, and combining said selectivity information together, based at least in part on said correlation information, for determining a multi-attribute selectivity estimate for the query condition. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
adding selectivity information for an attribute only if a particular value specified in the database query for the attribute is not totally dependent on a given value specified in the database query for another attribute whose selectivity information has been taken into account.
-
-
5. The method of claim 1, wherein said determined selectivity information provides a measure of the fraction of rows of said at least one database object that satisfy said query condition.
-
6. The method of claim 1, wherein each of said multiple attributes corresponds to a particular column of said at least one database object.
-
7. The method of claim 1, further comprising:
formulating a query execution plan for providing access to said a least one database object, said query execution plan including an access strategy based at least in part on the determined multi-attribute selectivity estimate.
-
8. The method of claim 1, wherein said formulating a query execution plan step includes determining a particular access path for use by access methods, for executing the database query.
-
9. The method of claim 1, wherein said at least one database object comprises a database table.
-
10. The method of claim 1, wherein said at least one database object comprises a database index.
-
11. A database system employing cost estimates comprising:
-
a database storing database objects;
an optimizer which employs cost estimates for formulating query access plans in response to receiving a database query specifying a database operation for at least one database object, said database query specifying a query condition for selecting particular rows based on multiple attributes of said at least one database object, wherein said optimizer determines an optimal execution path for executing the database query by;
determining selectivity information providing a selectivity estimate for each of said multiple attributes, determining correlation information providing a measure of how well at least some of said multiple attributes are correlated, and based at least in part on said correlation information, combining said selectivity information together for determining a multi-attribute selectivity estimate for the query condition; and
an execution unit for executing a query access plan for the database query, said query access plan including an access strategy based at least in part on the determined multi-attribute selectivity estimate. - View Dependent Claims (12, 13, 14, 15, 16, 17, 18, 19, 20)
-
Specification