Estimation of clustering for access planning
First Claim
1. A method for generating a clustering statistic for an attribute of a relation to be used in optimizing execution of a query directed to one or more attributes of said relation, comprising:
accessing records of said relation from a database in electronic storage;
determining clustered storage locations of records in said relation, said clustered storage locations being locations where said records would be found in the event that said records were clustered relative to said attribute;
computing a correlation between actual storage locations of records in said relation and said clustered storage locations of said records; and
generating said clustering statistic based upon said correlation;
utilizing said statistic in execution of a query and retrieval of said records from said electronic storage.
Abstract
A method for computing a clustering factor that is particularly suitable for use with existing indexes. The clustering factor is generated by first determining the clustered storage locations of records in a relation, i.e., the locations where the records would be found if they were clustered relative to the attribute (e.g., the locations the records would occupy if they were ordered in storage in accordance with the attribute). The actual storage locations of the records are then correlated with the clustered storage locations, and a clustering statistic is generated based upon the correlation.
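The steps summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes that each record's actual storage location is a single number (for example, a page number), that the "clustered" location is the record's rank when the relation is sorted by the attribute, and that the correlation is a Pearson coefficient. The function name `clustering_statistic` and the tie-breaking rule are hypothetical choices made for illustration only.

```python
from math import sqrt


def clustering_statistic(records):
    """Correlate actual storage locations with hypothetical clustered ones.

    records: list of (attribute_value, actual_location) pairs, where
    actual_location is assumed to be numeric (e.g., a page number).
    Returns a value in [-1, 1]; 1.0 means storage order follows the attribute.
    """
    n = len(records)
    if n < 2:
        # Assumption: a relation with fewer than two records is trivially clustered.
        return 1.0

    actual = [loc for _, loc in records]

    # Clustered location: the position each record would occupy if the
    # relation were stored in attribute order. Ties on the attribute are
    # broken by actual location (an assumption, not from the source).
    order = sorted(range(n), key=lambda i: (records[i][0], records[i][1]))
    clustered = [0] * n
    for rank, i in enumerate(order):
        clustered[i] = rank

    # Pearson correlation between the two location sequences.
    mean_a = sum(actual) / n
    mean_c = sum(clustered) / n
    cov = sum((a - mean_a) * (c - mean_c) for a, c in zip(actual, clustered))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_c = sum((c - mean_c) ** 2 for c in clustered)

    if var_a == 0 or var_c == 0:
        # Assumption: all records share one location, so treat as clustered.
        return 1.0
    return cov / sqrt(var_a * var_c)
```

Under these assumptions, `clustering_statistic([(10, 0), (20, 1), (30, 2), (40, 3)])` yields 1.0 (records stored in attribute order) and `clustering_statistic([(10, 3), (20, 2), (30, 1), (40, 0)])` yields -1.0 (records stored in reverse order). Values near zero would suggest that an index scan on the attribute touches storage locations in an essentially random order.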
15 Claims
1. A method for generating a clustering statistic for an attribute of a relation to be used in optimizing execution of a query directed to one or more attributes of said relation, comprising:
accessing records of said relation from a database in electronic storage;
determining clustered storage locations of records in said relation, said clustered storage locations being locations where said records would be found in the event that said records were clustered relative to said attribute;
computing a correlation between actual storage locations of records in said relation and said clustered storage locations of said records; and
generating said clustering statistic based upon said correlation;
utilizing said statistic in execution of a query and retrieval of said records from said electronic storage.
Dependent claims: 2, 3, 4, 5, 6
7. A computer system implementing a relational database system and generating a clustering statistic for an attribute of a relation of said relational database, to be used in optimizing execution of a query directed to one or more attributes of said relation, comprising:
electronic storage for said relational database, including a relation having a plurality of tuples including values for a plurality of attributes; and
computing circuitry performing query optimization and query execution upon said relational database, said query optimization including generating a clustering statistic for an attribute of said relation by determining clustered storage locations of records in said relation, said clustered storage locations being locations where said records would be found in the event that said records were clustered relative to said attribute, computing a correlation between actual storage locations of records in said relation and said clustered storage locations of said records, generating said clustering statistic based upon said correlation, and utilizing said statistic in execution of a query and retrieval of said records from said electronic storage.
Dependent claims: 8, 9, 10, 11, 12
13. A program product for implementing a relational database system and generating a clustering statistic for an attribute of a relation of said relational database, to be used in optimizing execution of a query directed to one or more attributes of said relation, comprising:
a relational database, including a relation that is electronically stored and accessed and has a plurality of tuples including values for a plurality of attributes; and
relational database software performing query optimization and query execution upon said relational database, said query optimization including generating a clustering statistic for an attribute of said relation by determining clustered storage locations of records in said relation, said clustered storage locations being locations where said records would be found in the event that said records were clustered relative to said attribute, computing a correlation between actual storage locations of records in said relation and said clustered storage locations of said records, and generating said clustering statistic based upon said correlation, and utilizing said statistic in execution of a query and retrieval of said records from electronic storage; and
a signal bearing media holding said relational database and relational database software.
Dependent claims: 14, 15
Specification