Gathering index statistics using sampling
First Claim
Patent Images
1. An information handling system comprising:
- one or more processors;
a memory coupled to at least one of the one or more processors;
one or more non-volatile storage areas coupled to at least one of the one or more processors;
a set of computer program instructions stored in the memory and executed by at least one of the processors in order to perform actions of;
allocating a plurality of sample point identifiers to a root node included in an index tree corresponding to a database stored in one of the non-volatile storage areas, the index tree including a plurality of leaf nodes;
identifying, by at least one of the one or more processors, a number of first nodes included in a plurality of first nodes that are child nodes of the root node;
computing, by at least one of the one or more processors, a distribution average based upon the number of first nodes and a number of sample point identifiers included in the plurality of sample point identifiers;
distributing the plurality of sample point identifiers to the plurality of first nodes, wherein an amount of the plurality of sample point identifiers distributed to each of the plurality of first nodes does not exceed the distribution average;
recursively traversing through a plurality of hierarchical index levels included in the index tree and distributing the plurality of sample point identifiers from the plurality of first nodes to a subset of the plurality of leaf nodes;
collecting sample data from the subset of the plurality of leaf nodes corresponding to the distributed plurality of sample point identifiers; and
generating a query plan corresponding to the database based upon the collected sample data.
1 Assignment
0 Petitions
Accused Products
Abstract
An approach is provided in which a sample point system allocates sample point identifiers to a root node included an index tree that includes multiple leaf nodes. The sample point system distributes the sample point identifiers to the root node'"'"'s child nodes, and recursively traverses through the index tree'"'"'s hierarchical index levels and distributes the sample point identifiers from the child nodes to a subset of the index tree'"'"'s leaf nodes. In turn, the sample point system collects sample data from the subset of the plurality of leaf nodes corresponding to the distributed sample point identifiers.
-
Citations
14 Claims
-
1. An information handling system comprising:
-
one or more processors; a memory coupled to at least one of the one or more processors; one or more non-volatile storage areas coupled to at least one of the one or more processors; a set of computer program instructions stored in the memory and executed by at least one of the processors in order to perform actions of; allocating a plurality of sample point identifiers to a root node included in an index tree corresponding to a database stored in one of the non-volatile storage areas, the index tree including a plurality of leaf nodes; identifying, by at least one of the one or more processors, a number of first nodes included in a plurality of first nodes that are child nodes of the root node; computing, by at least one of the one or more processors, a distribution average based upon the number of first nodes and a number of sample point identifiers included in the plurality of sample point identifiers; distributing the plurality of sample point identifiers to the plurality of first nodes, wherein an amount of the plurality of sample point identifiers distributed to each of the plurality of first nodes does not exceed the distribution average; recursively traversing through a plurality of hierarchical index levels included in the index tree and distributing the plurality of sample point identifiers from the plurality of first nodes to a subset of the plurality of leaf nodes; collecting sample data from the subset of the plurality of leaf nodes corresponding to the distributed plurality of sample point identifiers; and generating a query plan corresponding to the database based upon the collected sample data. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
-
-
10. A computer program product stored in a non-transitory computer readable storage medium, comprising computer program code that, when executed by an information handling system, causes the information handling system to perform actions comprising:
-
allocating a plurality of sample point identifiers to a root node included in an index tree of a database, the index tree including a plurality of leaf nodes; identifying, by at least one of the one or more processors, a number of first nodes included in a plurality of first nodes that are child nodes of the root node; computing, by at least one of the one or more processors, a distribution average based upon the number of first nodes and a number of sample point identifiers included in the plurality of sample point identifiers; distributing the plurality of sample point identifiers to the plurality of first nodes, wherein an amount of the plurality of sample point identifiers distributed to each of the plurality of first nodes does not exceed the distribution average; recursively traversing through a plurality of hierarchical index levels included in the index tree and distributing the plurality of sample point identifiers from the plurality of first nodes to a subset of the plurality of leaf nodes; collecting sample data from the subset of the plurality of leaf nodes corresponding to the distributed plurality of sample point identifiers; and generating a query plan corresponding to the database based upon the collected sample data. - View Dependent Claims (11, 12, 13, 14)
-
Specification