Method and apparatus for querying a cube forest data structure
First Claim
1. A method for efficiently answering a query for a single aggregate of data stored on a computer and indexed by a cube forest comprising the steps of:
- a) inputting a query, including specifying the type of query by stating how far down a hierarchy a value of an attribute is specified; and
b) generating a query plan by finding a path in a cube forest template that starts at a root, and includes a node for every attribute value specified in the query;
wherein the path found in step b) represents the query plan;
said method further including the step of transforming the path into a query plan that contains detailed instructions for a generic query plan executer;
said method further including the steps of;
c) beginning at a top of the path and walking to an end of the path; and
d) creating, for every node, an entry that specifies what to do at that node, including (dimension, attribute, value, and operation), wherein if the attribute value is specified in the query, the instruction is (dimension, attribute, value, single), otherwise the instruction is (dimension, attribute, *, sum).
1 Assignment
0 Petitions
Accused Products
Abstract
A device and method is disclosed for using a data structure known as a cube forest for use in a batch-load-then-read-intensively system. The device and method significantly improve the time to execute a bit vector query. Hierarchically split cube forests provide a method for efficiently duplicating information, and can be optimized to reduce update and storage costs. Cube forests including hierarchically split cube forests are most appropriate for read-intensive, update-rarely-and-in-large-batches multidimensional applications in an off-the-shelf (low cost) hardware environment. A method and an apparatus for querying a cube forest for aggregates are disclosed herein.
78 Citations
14 Claims
-
1. A method for efficiently answering a query for a single aggregate of data stored on a computer and indexed by a cube forest comprising the steps of:
-
a) inputting a query, including specifying the type of query by stating how far down a hierarchy a value of an attribute is specified; and
b) generating a query plan by finding a path in a cube forest template that starts at a root, and includes a node for every attribute value specified in the query;
wherein the path found in step b) represents the query plan;
said method further including the step of transforming the path into a query plan that contains detailed instructions for a generic query plan executer;
said method further including the steps of;
c) beginning at a top of the path and walking to an end of the path; and
d) creating, for every node, an entry that specifies what to do at that node, including (dimension, attribute, value, and operation), wherein if the attribute value is specified in the query, the instruction is (dimension, attribute, value, single), otherwise the instruction is (dimension, attribute, *, sum). - View Dependent Claims (2, 3)
(i) matching a superkey of an index with a maximal segment of the query plan, such that the maximal segment starts with a first key in the index'"'"'s superkey; and
(ii) summing, for each effective leaf that matches the values specified by the query plan segment, a result and returning said result, wherein if the query plan segment is a suffix of the query plan, then returning a stored aggregate, else making a recursive search.
-
-
4. A method for querying a cube forest for a group-by aggregate comprising the steps of:
-
a) specifying a type of query by stating (i) how far down a hierarchy a value of an attribute is specified in a query; and
(ii) specifying how much further down a hierarchy an answer is grouped; and
b) generating a query plan by finding a path in a cube forest template that starts at a root and includes a node for every attribute specified in the query;
wherein the path found in step b) represents the query plan;
said method further including the step of transforming the path into a query plan that contains detailed instructions for a generic query plan executer;
said method further including the steps of;
c) beginning at a top of the path and walking to an end of the path; and
d) creating, for every node, an entry that specifies what to do at that node, including (dimension, attribute, value, and operation), wherein if the attribute is specified in the query, the instruction is (dimension, attribute, value, single), else if the attribute is specified as a grouping attribute, the instruction is (dimension, attribute, *, groupby), otherwise the instruction is (dimension, attribute, *, sum). - View Dependent Claims (5, 6, 7, 8, 9, 10)
(i) matching a superkey of an index with a maximal segment of the query plan, such that the maximal segment starts with a first key in the index'"'"'s superkey; and
(ii) summing, for each effective leaf that matches the values specified by the query plan segment, a result and returning said result, wherein if the query plan segment is a suffix of the query plan, then returning a stored aggregate, else making a recursive search.
-
-
7. The method according to claim 6, further comprising the step of retrieving values for a group-by query.
-
8. The method according to claim 7, wherein the step of retrieving values for a group-by query further comprises the step of creating a query plan according to the steps for creating a query plan for a single-value retrieval, except distinguishing between “
- single” and
group-by.
- single” and
-
9. The method according to claim 8, further comprising the step of upon reaching a terminal effective leaf, determining which result aggregate the answer should be added to.
-
10. The method according to claim 4, wherein range queries are performed according to the same steps above, except that high and low values are attached to attribute selections, and a range of the search is extended appropriately.
-
11. A device for querying a cube forest for a single aggregate comprising:
-
a) an input/output device specifying a type of query by stating how far down a hierarchy a value of an attribute is specified; and
b) a processor coupled to the input/output device and generating a query plan by finding a path in a cube forest template that starts at a root, and includes a node for every attribute specified in the query;
said processor to generate the query plan by finding a shortest path if there are more than one path that starts at a root and includes a node for every attribute specified in the query, wherein the path found represents the query plan said processor to transform the path into a query plan that contains detailed instructions for a generic query plan executer by;
c) beginning at a top of the path and walking to an end of the path; and
d) creating, for every node, an entry that specifies what to do at that node, including (dimension, attribute, value, and operation), wherein if the attribute value is specified in the query, the instruction is (dimension, attribute, value, single), otherwise the instruction is (dimension, attribute, *, sum). - View Dependent Claims (12)
a) matching a superkey of an index with a maximal segment of the query plan, such that the maximal segment starts with a first key in the index'"'"'s superkey; and
b) summing, for each effective leaf that matches the values specified by the query plan segment, a result and returning said result, wherein if the query plan segment is a suffix of the query plan, then returning a stored aggregate, else making a recursive search.
-
-
13. A device for querying a cube forest for a group-by aggregate comprising:
-
a) an input/output device specifying a type of query by stating how far down a hierarchy a value of an attribute is specified; and
b) a processor generating a query plan by finding a path in a cube forest template that starts at a root and includes a node for every attribute specified in the query, by finding a shortest path, if there are more than one path that starts at a root and includes a node for every attribute specified in the query, wherein the path found represents the query plan, and by transforming the path into a query plan that contains detailed instructions for a generic query plan executer according to the steps of;
(i) beginning at a top of the path and walking to an end of the path; and
(ii) creating, for every node, an entry that specifies what to do at that node, including (dimension, attribute, value, and operation), wherein if the attribute is specified in the query, the instruction is (dimension , attribute, value, single), otherwise the instruction is (dimension, attribute, *, sum). - View Dependent Claims (14)
a) matching a superkey of an index with a maximal segment of the query plan, such that the maximal segment starts with a first key in the index'"'"'s superkey;
b) summing, for each effective leaf that matches the values specified by the query plan segment, a result and returning said result, wherein if the query plan segment is a suffix of the query plan, then returning a stored aggregate, else making a recursive search;
c) retrieving values for a group-by query by creating a query plan according to the steps for creating a query plan for a single-value retrieval, except distinguishing between “
single” and
group-by; and
d) upon reaching a terminal effective leaf, determining which result aggregate an answer should be added to, wherein range queries are performed according to the same steps above, except that high and low values are attached to attribute selections, and a range of the search is extended appropriately.
-
Specification