Gathering index statistics using sampling
First Claim
Patent Images
1. A method comprising:
- allocating, by one or more processors, a plurality of sample point identifiers to a root node included in an index tree corresponding to a database, the index tree including a plurality of leaf nodes;
identifying, by at least one of the one or more processors, a number of first nodes included in a plurality of first nodes that are child nodes of the root node;
computing, by at least one of the one or more processors, a distribution average based upon the number of first nodes and a number of sample point identifiers included in the plurality of sample point identifiers;
distributing, by at least one of the one or more processors, the plurality of sample point identifiers to the plurality of first nodes, wherein an amount of the plurality of sample point identifiers distributed to each of the plurality of first nodes does not exceed the distribution average;
recursively traversing, by at least one of the one or more processors, through a plurality of hierarchical index levels included in the index tree and distributing the plurality of sample point identifiers from the plurality of first nodes to a subset of the plurality of leaf nodes;
collecting sample data, by at least one of the one or more processors, from the subset of the plurality of leaf nodes corresponding to the distributed plurality of sample point identifiers; and
generating, by at least one of the one or more processors, a query plan corresponding to the database based upon the collected sample data.
2 Assignments
0 Petitions
Accused Products
Abstract
An approach is provided in which a sample point system allocates sample point identifiers to a root node included an index tree that includes multiple leaf nodes. The sample point system distributes the sample point identifiers to the root node'"'"'s child nodes, and recursively traverses through the index tree'"'"'s hierarchical index levels and distributes the sample point identifiers from the child nodes to a subset of the index tree'"'"'s leaf nodes. In turn, the sample point system collects sample data from the subset of the plurality of leaf nodes corresponding to the distributed sample point identifiers.
22 Citations
9 Claims
-
1. A method comprising:
-
allocating, by one or more processors, a plurality of sample point identifiers to a root node included in an index tree corresponding to a database, the index tree including a plurality of leaf nodes; identifying, by at least one of the one or more processors, a number of first nodes included in a plurality of first nodes that are child nodes of the root node; computing, by at least one of the one or more processors, a distribution average based upon the number of first nodes and a number of sample point identifiers included in the plurality of sample point identifiers; distributing, by at least one of the one or more processors, the plurality of sample point identifiers to the plurality of first nodes, wherein an amount of the plurality of sample point identifiers distributed to each of the plurality of first nodes does not exceed the distribution average; recursively traversing, by at least one of the one or more processors, through a plurality of hierarchical index levels included in the index tree and distributing the plurality of sample point identifiers from the plurality of first nodes to a subset of the plurality of leaf nodes; collecting sample data, by at least one of the one or more processors, from the subset of the plurality of leaf nodes corresponding to the distributed plurality of sample point identifiers; and generating, by at least one of the one or more processors, a query plan corresponding to the database based upon the collected sample data. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
-
Specification