Cardinality estimation using spanning trees

US 9,922,088 B2
Filed: 12/31/2013
Issued: 03/20/2018
Est. Priority Date: 12/31/2013
Status: Active Grant

Abstract

System, computer-implemented method, and computer-program product embodiments determine a cardinality estimate for a query. A cardinality estimator identifies a predicate in a query, where the predicate is split into a plurality of equivalence classes. The cardinality estimator then generates a plurality of equivalence graphs from the plurality of equivalence classes, one equivalence graph per equivalence class. Spanning trees are identified from the plurality of equivalence graphs, and the cardinality estimator then determines the cardinality estimate for the query from the spanning trees.
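The first step of the abstract, splitting a predicate into equivalence classes, can be illustrated with a small sketch. It assumes the query predicate is a conjunction of column equalities and groups columns with a union-find structure; the function name `equivalence_classes`, the `(table, attribute)` tuple encoding, and the choice of union-find are illustrative assumptions, not details taken from the patent.

```python
from collections import defaultdict


def equivalence_classes(equalities):
    """Group columns linked by equality predicates into classes.

    `equalities` is a list of ((table, attr), (table, attr)) pairs,
    a hypothetical encoding of predicates such as R.a = S.a.
    """
    parent = {}

    def find(column):
        # Register unseen columns as their own root, then walk to the root.
        parent.setdefault(column, column)
        while parent[column] != column:
            parent[column] = parent[parent[column]]  # path halving
            column = parent[column]
        return column

    for left, right in equalities:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left  # merge the two classes

    groups = defaultdict(set)
    for column in parent:
        groups[find(column)].add(column)
    # Sort for a deterministic result.
    return sorted(sorted(group) for group in groups.values())


# Hypothetical predicate: R.a = S.a AND S.a = T.a AND R.b = T.b
predicates = [
    (("R", "a"), ("S", "a")),
    (("S", "a"), ("T", "a")),
    (("R", "b"), ("T", "b")),
]
classes = equivalence_classes(predicates)
```

Here the three `a` columns end up in one class and the two `b` columns in another, so this query would yield two equivalence graphs under the one-graph-per-class rule described above.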

16 Claims

1. A computer-implemented method for generating a cardinality estimate, comprising:
- identifying a predicate in a query, wherein the predicate is split into a plurality of equivalence classes;
  
  generating a plurality of undirected equivalence graphs from the plurality of equivalence classes, wherein the undirected equivalence graphs include a plurality of weighted edges representing a join predicate between two tables, and wherein the equivalence classes are determined based on sets of common attributes that are included in tables joined in the query;
  
  identifying spanning trees in the plurality of undirected equivalence graphs;
  
  determining a minimum spanning tree of the identified spanning trees;
  
  calculating a cardinality estimate based on the minimum spanning tree based on multiplying each predicate, in a set of identified predicates in the spanning trees, by a selectivity associated with each edge corresponding to the predicate, wherein a quality of the selectivity indicates a relationship between two tables joined in the query, and wherein the relationship indicates at least one of a key or attribute relationship between the two tables; and
  
  selecting a query plan corresponding to the cardinality estimate, wherein the cardinality estimate for the selected query plan is associated with a lower consumption of resources amongst a plurality of query plans in an execution of a query by a processor.
- Dependent claims (2, 3, 4, 5, 6, 7)
  - 2. The computer-implemented method of claim 1, wherein an equivalence class in the plurality of the equivalence classes shares a set of attributes common to the plurality of tables.
  - 3. The computer-implemented method of claim 1, wherein an equivalence class in the plurality of the equivalence classes comprises a constant vector.
  - 4. The computer-implemented method of claim 1, wherein generating an undirected equivalence graph in the plurality of undirected equivalence graphs, further comprises:
    - generating a first node and a second node, wherein the first node corresponds to a first table and a set of attributes and the second node corresponds to a second table and a set of attributes, wherein the first table, the second table and the attributes are included in an equivalence class from the plurality of the equivalence classes;
      
      generating an edge between the first node and the second node; and
      
      annotating the edge with a second predicate referencing the attributes between the first node and the second node, wherein the second predicate is a component of the query predicate.
  - 5. The computer-implemented method of claim 4, wherein generating the undirected equivalence graph further comprises:
    - generating a third node including a constant vector;
      
      generating a second edge between a first node and a third node; and
      
      annotating the second edge with a third predicate referencing the attributes between the second table and the constant vector, wherein the third predicate is a component of the query predicate.
  - 6. The computer-implemented method of claim 1, further comprising:
    - determining a confidence level of the selectivity of at least one of the edges, wherein the confidence level is the weight of at least one of the edges.
  - 7. The computer-implemented method of claim 1, wherein the minimum spanning tree comprises one or more trees that include all vertices in a join equivalence undirected graph, wherein nodes of the join equivalence undirected graph are connected using edges with the lowest weights.
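The minimum-spanning-tree and estimation steps of claims 1, 6, and 7 can be sketched as follows. This is one possible reading, not the patented implementation: it assumes a lower edge weight encodes a more trusted selectivity, uses Kruskal's algorithm to pick the tree, and computes the estimate as the product of the table row counts and the selectivities on the chosen edges. The row counts, weights, selectivities, and function names are all hypothetical.

```python
from math import prod


def minimum_spanning_tree(nodes, edges):
    """Kruskal's algorithm over edges shaped (weight, u, v, selectivity)."""
    parent = {node: node for node in nodes}

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    tree = []
    for edge in sorted(edges, key=lambda e: e[0]):  # lowest weight first
        _, u, v, _ = edge
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # edge joins two components, so keep it
            parent[root_v] = root_u
            tree.append(edge)
    return tree


def estimate_cardinality(table_rows, edges):
    """Multiply row counts by the selectivity of each tree edge (assumed formula)."""
    tree = minimum_spanning_tree(table_rows, edges)
    return prod(table_rows.values()) * prod(edge[3] for edge in tree)


# Hypothetical statistics for one equivalence graph over R, S, and T.
table_rows = {"R": 1000, "S": 200, "T": 50}
edges = [
    (1, "R", "S", 1 / 1000),  # key relationship: trusted, low weight
    (1, "S", "T", 1 / 200),   # key relationship: trusted, low weight
    (5, "R", "T", 1 / 10),    # attribute-only guess: less trusted
]
tree = minimum_spanning_tree(table_rows, edges)
estimate = estimate_cardinality(table_rows, edges)
```

With these numbers the less trusted R-T edge is left out of the tree, so its rough selectivity guess does not enter the estimate, which comes out at roughly 50 rows.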

8. A system for generating a cardinality estimate, comprising:
- a memory; and
  
  a processor coupled to the memory and configured to:
  
  identify a predicate in a query, wherein the predicate is split into a plurality of equivalence classes;
  
  generate a plurality of undirected equivalence graphs from the plurality of equivalence classes, wherein the undirected equivalence graphs include a plurality of weighted edges representing a join predicate between two tables, and wherein the equivalence classes are determined based on sets of common attributes that are included in tables joined in the query;
  
  identify spanning trees in the plurality of undirected equivalence graphs;
  
  determine a minimum spanning tree of the identified spanning trees;
  
  calculate a cardinality estimate based on the minimum spanning tree based on multiplying each predicate, in a set of identified predicates in the spanning trees, by a selectivity associated with each edge corresponding to the predicate, wherein a quality of the selectivity indicates a relationship between two tables joined in the query, and wherein the relationship indicates at least one of a key or attribute relationship between the two tables; and
  
  select a query plan corresponding to the cardinality estimate wherein the cardinality estimate for the selected query plan is associated with a lower consumption of resources amongst a plurality of query plans in an execution of a query by the processor.
- Dependent claims (9, 10, 11, 12)
  - 9. The system of claim 8, wherein an equivalence class in the plurality of equivalence classes shares a set of attributes common to the plurality of tables.
  - 10. The system of claim 8, wherein an equivalence class in the plurality of equivalence classes comprises a constant vector.
  - 11. The system of claim 8, wherein to generate an undirected equivalence graph in the plurality of the undirected equivalence graphs, the processor is further configured to:
    - generate a first node and a second node, wherein the first node corresponds to a first table and a set of attributes and the second node corresponds to a second table and a set of attributes, wherein the first table, the second table and the attributes are included in an equivalence class in the plurality of equivalence classes;
      
      generate an edge between the first node and the second node; and
      
      annotate the edge with a second predicate referencing the attribute between the first node and the second node;
      
      wherein the second predicate is a component of the query predicate.
  - 12. The system of claim 11, wherein to generate the undirected equivalence graph, the processor is further configured to:
    - generate a third node including a constant vector;
      
      generate a second edge between a first node and a third node; and
      
      annotate the second edge with a third predicate referencing the attributes between the second table and the constant vector, wherein the third predicate is a component of the query predicate.
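The graph construction in claims 11 and 12 (nodes for tables, an optional node for a constant vector, and edges annotated with component predicates) might be represented as in the sketch below. The class names `Node` and `EquivalenceGraph`, the `"<const>"` marker, and the predicate strings are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    table: str    # table name, or "<const>" for a constant vector (assumed marker)
    attrs: tuple  # attributes of the class, or the constant values


@dataclass
class EquivalenceGraph:
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)  # (node, node, predicate text)

    def add_edge(self, first, second, predicate):
        """Add an undirected edge annotated with a component predicate."""
        self.nodes.update((first, second))
        self.edges.append((first, second, predicate))


graph = EquivalenceGraph()
first = Node("R", ("a",))
second = Node("S", ("a",))
third = Node("<const>", (5,))  # constant vector node

graph.add_edge(first, second, "R.a = S.a")  # join predicate between two tables
graph.add_edge(first, third, "R.a = 5")     # predicate against the constant
```

Keeping the predicate text on each edge means that whichever edges a spanning tree retains, the corresponding components of the query predicate can be read straight off the tree.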

13. A non-transitory computer-readable storage device having instructions stored thereon that, when executed by at least one computing device, cause the at least one computing device to perform operations that generate a cardinality estimate, the operations comprising:
- identifying a predicate in a query, wherein the predicate is split into a plurality of equivalence classes;
  
  generating a plurality of undirected equivalence graphs from the plurality of equivalence classes, wherein the undirected equivalence graphs include a plurality of weighted edges representing a join predicate between two tables, and wherein the equivalence classes are determined based on sets of common attributes that are included in tables joined in the query;
  
  identifying spanning trees in the plurality of undirected equivalence graphs;
  
  determining a minimum spanning tree of the identified spanning trees;
  
  calculating a cardinality estimate based on the minimum spanning tree based on multiplying each predicate, in a set of identified predicates in the spanning trees, by a selectivity associated with each edge corresponding to the predicate, wherein a quality of the selectivity indicates a relationship between two tables joined in the query, and wherein the relationship indicates at least one of a key or attribute relationship between the two tables; and
  
  selecting a query plan corresponding to the cardinality estimate wherein the cardinality estimate for the selected query plan is associated with a lower consumption of resources amongst a plurality of query plans in an execution of a query by a processor of the at least one computing device.
- Dependent claims (14, 15, 16)
  - 14. The non-transitory computer-readable storage device of claim 13, wherein the query manipulates data in a plurality of tables, wherein a table includes a plurality of attributes.
  - 15. The tangible computer-readable device of claim 13, wherein an equivalence class in the plurality of equivalence classes shares a set of attributes common to the plurality of tables.
  - 16. The non-transitory tangible computer-readable storage device of claim 13, wherein generating an undirected equivalence graph in the plurality of undirected equivalence graphs, further comprises operations comprising:
    - generating a first node and a second node, wherein the first node corresponds to a first table and a set of attributes and the second node corresponds to a second table and a set of attributes, wherein the first table, the second table and the attributes are included in an equivalence class from the plurality of the equivalence classes;
      
      generating an edge between the first node and the second node; and
      
      annotating the edge with a second predicate referencing the attributes between the first node and the second node, wherein the second predicate is a component of the query predicate.

Current Assignee: Sybase Incorporated (SAP SE)
Original Assignee: Sybase Incorporated (SAP SE)
Inventors: Nica, Anisoara
Primary Examiner(s): Gebresenbet, Dinku
Application Number: US14/145,777
Publication Number: US 20150186461A1
Time in Patent Office: 1,540 Days
CPC Class Codes: G06F 16/24545 Selectivity estimation or d...
