Cardinality estimation using spanning trees
First Claim
Patent Images
1. A computer-implemented method for generating a cardinality estimate, comprising:
- identifying a predicate in a query, wherein the predicate is split into a plurality of equivalence classes;
generating a plurality of undirected equivalence graphs from the plurality of equivalence classes, wherein the undirected equivalence graphs include a plurality of weighted edges representing a join predicate between two tables, and wherein the equivalence classes are determined based on sets of common attributes that are included in tables joined in the query;
identifying spanning trees in the plurality of undirected equivalence graphs;
determining a minimum spanning tree of the identified spanning trees;
calculating a cardinality estimate based on the minimum spanning tree based on multiplying each predicate, in a set of identified predicates in the spanning tress, by a selectivity associated with each edge corresponding to the predicate, wherein a quality of the selectivity indicates a relationship between two tables joined in the query, and wherein the relationship indicates at least one of a key or attribute relationship between the two tables; and
selecting a query plan corresponding to the cardinality estimate, wherein the cardinality estimate for the selected query plan is associated with a lower consumption of resources amongst a plurality of query plans in an execution of a query by a processor.
1 Assignment
0 Petitions
Accused Products
Abstract
A system, computer-implemented method, and computer-program product embodiments for determining a cardinality estimate for a query. A cardinality estimator identifies a predicate in a query, where the predicate is split into a plurality of equivalence classes. The cardinality estimator then generates a plurality of equivalence graphs from the plurality of equivalence classes, one equivalence graph for an equivalence class. Spanning trees are identified from the plurality of equivalence graphs, and the cardinality estimator then determines the cardinality estimate for the query from the spanning trees.
69 Citations
16 Claims
-
1. A computer-implemented method for generating a cardinality estimate, comprising:
-
identifying a predicate in a query, wherein the predicate is split into a plurality of equivalence classes; generating a plurality of undirected equivalence graphs from the plurality of equivalence classes, wherein the undirected equivalence graphs include a plurality of weighted edges representing a join predicate between two tables, and wherein the equivalence classes are determined based on sets of common attributes that are included in tables joined in the query; identifying spanning trees in the plurality of undirected equivalence graphs; determining a minimum spanning tree of the identified spanning trees; calculating a cardinality estimate based on the minimum spanning tree based on multiplying each predicate, in a set of identified predicates in the spanning tress, by a selectivity associated with each edge corresponding to the predicate, wherein a quality of the selectivity indicates a relationship between two tables joined in the query, and wherein the relationship indicates at least one of a key or attribute relationship between the two tables; and selecting a query plan corresponding to the cardinality estimate, wherein the cardinality estimate for the selected query plan is associated with a lower consumption of resources amongst a plurality of query plans in an execution of a query by a processor. - View Dependent Claims (2, 3, 4, 5, 6, 7)
-
-
8. A system for generating a cardinality estimate, comprising:
-
a memory; and a processor coupled to the memory and configured to; identify a predicate in a query, wherein the predicate is split into a plurality of equivalence classes; generate a plurality of undirected equivalence graphs from the plurality of equivalence classes, wherein the undirected equivalence graphs include a plurality of weighted edges representing a join predicate between two tables, and wherein the equivalence classes are determined based on sets of common attributes that are included in tables joined in the query; identify spanning trees in the plurality of undirected equivalence graphs; determine a minimum spanning tree of the identified spanning trees; calculate a cardinality estimate based on the minimum spanning tree based on multiplying each predicate, in a set of identified predicates in the spanning tress, by a selectivity associated with each edge corresponding to the predicate, wherein a quality of the selectivity indicates a relationship between two tables joined in the query, and wherein the relationship indicates at least one of a key or attribute relationship between the two tables; and select a query plan corresponding to the cardinality estimate wherein the cardinality estimate for the selected query plan is associated with a lower consumption of resources amongst a plurality of query plans in an execution of a query by the processor. - View Dependent Claims (9, 10, 11, 12)
-
-
13. A non-transitory computer-readable storage device having instructions stored thereon that, when executed by at least one computing device, causes the at least one computing device to perform operations that generate a
cardinality estimate, the operations comprising: -
identifying a predicate in a query, wherein the predicate is split into a plurality of equivalence classes, generating a plurality of undirected equivalence graphs from the plurality of equivalence classes, wherein the undirected equivalence graphs include a plurality of weighted edges representing a join predicate between two tables, and wherein the equivalence classes are determined based on sets of common attributes that are included in tables joined in the query; identifying spanning trees in the plurality of undirected equivalence graphs; determining a minimum spanning tree of the identified spanning trees; calculating a cardinality estimate based on the minimum spanning tree based on multiplying each predicate, in a set, of identified predicates in the spanning tress, by a selectivity associated with each edge corresponding to the predicate, wherein a quality of the selectivity indicates a relationship between two tables joined in the query, and wherein the relationship indicates at least one of a key or attribute relationship between the two tables; and selecting a query plan corresponding to the cardinality estimate wherein the cardinality estimate for the selected query plan is associated with a lower consumption of resources amongst a plurality of query plans in an execution of a query by a processor of the at least one computing device. - View Dependent Claims (14, 15, 16)
-
Specification