×

AUTOMATICALLY AND ADAPTIVELY DETERMINING EXECUTION PLANS FOR QUERIES WITH PARAMETER MARKERS

  • US 20080195577A1
  • Filed: 02/09/2007
  • Published: 08/14/2008
  • Est. Priority Date: 02/09/2007
  • Status: Abandoned Application
First Claim
Patent Images

1. A computer-based method of automatically and adaptively determining query execution plans for queries having parameter markers, said method comprising:

  • generating, by a computing system, a first classifier trained by an initial set of training points;

    randomly generating a set of random decision trees (RDTs), said set of RDTs having a predetermined number of RDTs, wherein said randomly generating said set of RDTs includes defining a generation procedure for each RDT of said set of RDTs, wherein said defining said generation procedure includes;

    randomly choosing a selectivity of said plurality of selectivities for a first node of an RDT of said set of RDTs, said chosen selectivity not used in a higher node of said RDT, said higher node including said first node in a hierarchy of said RDT,selecting a decision threshold value for said chosen selectivity, said decision threshold value separating a set of query execution plans associated with said node into two disjoint subsets of said set of query execution plans, andrecursively using said generation procedure to expand said RDT for each subset of said two disjoint subsets until a number of query execution plans in a subset of said two disjoint subsets is fewer than a predefined minimum query execution plan threshold, a depth of said RDT reaches a depth threshold based on predefined criteria, or all query execution plans of said subset of said two disjoint subsets belong to a single type;

    dynamically updating, by a computing system at a first runtime thereof, at least one of a workload of queries processed by a database of said computing system and database statistics collected by said database for computing a plurality of selectivities;

    collecting, by a computing system in an off-line phase thereof, said off-line phase being subsequent to said first runtime, a new set of training points, said collecting responsive to a detection of said dynamically updating;

    modifying, by said computing system in said off-line phase, said first classifier into a second classifier, said modifying including utilizing said new set of training points;

    receiving, by said computing system at a second runtime thereof, said second runtime being subsequent to said off-line phase, a query for said database, said query including one or more predicates, each predicate including one or more parameter markers bound to one or more actual values, and said one or more predicates associated with one or more selectivities of said plurality of selectivities in a one-to-one correspondence; and

    automatically determining a query execution plan by said computing system, wherein said automatically determining includes;

    mapping, by said second classifier, said one or more selectivities into said query execution plan, wherein said query execution plan is included in an augmented set of training points, said augmented set including said initial set and said new set,traversing a plurality of decision paths for said query, said decision paths associated with said set of RDTs in a one-to-one correspondence, said traversing including obtaining a set of posterior probabilities across said set of RDTs, each posterior probability of said set of posterior probabilities associated with a first query execution plan of said set of query execution plans,computing an average of said posterior probabilities in said set of posterior probabilities,comparing said average of said posterior probabilities to one or more other averages of other sets of posterior probabilities, said one or more other averages associated with one or more other query execution plans in a one-to-one correspondence, said comparing including identifying an optimal average of said average of said posterior probabilities and said one or more other averages, said identifying said optimal average including utilizing a loss function,identifying said query execution plan to be determined by said automatically determining, said identifying said query execution plan including determining that said optimal average is associated with said query execution plan, andproviding said query execution plan as a prediction of an output of a query optimizer of said database without utilizing said query optimizer to provide said output.

View all claims
  • 1 Assignment
Timeline View
Assignment View
    ×
    ×