Method of calculating tuples for data cubes
First Claim
1. A method of constructing a data cube data structure from a data set, comprising the steps of:
- partitioning the data set by a first attribute into a plurality of data fragments, and for each data fragment,determining whether the size of the data fragment exceeds a predetermined threshold, andwhen the size of the data fragment does not exceed the predetermined threshold, computing cuboid tuples from the data fragment according to;
sorting data of the data fragment by a first order of attributes, andcalculating tuples for all cuboid fragments indexed by a prefix of the first sort order.
1 Assignment
0 Petitions
Accused Products
Abstract
A method and apparatus of calculating data cubes is shown in which a data set is partitioned into memory sized data fragments and cuboid tuples are calculated from the data fragments. A search lattice of the data cube is used as a basis for ordering calculations of lower dimensional cuboids in the data cube. Identification of a minimum number of paths through the lattice that is sufficient to traverse all nodes in the lattice is achieved by iteratively duplicating twice all paths in a lower dimensional space, distributing a new attribute to the first duplicate, moving end points from paths of the second duplicate to a corresponding path in the first duplicate and merging the first and second duplicates.
137 Citations
41 Claims
-
1. A method of constructing a data cube data structure from a data set, comprising the steps of:
partitioning the data set by a first attribute into a plurality of data fragments, and for each data fragment, determining whether the size of the data fragment exceeds a predetermined threshold, and when the size of the data fragment does not exceed the predetermined threshold, computing cuboid tuples from the data fragment according to; sorting data of the data fragment by a first order of attributes, and calculating tuples for all cuboid fragments indexed by a prefix of the first sort order. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13)
-
14. A method of selecting a minimum number of paths necessary to traverse a search lattice of a data cube, comprising the steps of:
-
counting out a number of dimensions equal to a number of dimensions of the data cube, and for each counted dimension, duplicating twice a set of paths of a search lattice for a data cube having one less dimension than the counted dimension, distributing an attribute associated with the counted dimension to the first duplicate, for each path in the second duplicate, moving an end point of the path to a corresponding path in the first duplicate, and merging the paths of the first and second duplicate; and after all dimensions have been counted, reordering a prefix of each path in an order of end point of the path to first point of the path.
-
-
15. A data cube data structure constructed from a data set according to the method of:
partitioning the data set by a first attribute into a plurality of data fragments, and for each data fragment, determining whether the size of the data fragment exceeds a predetermined threshold, and when the size of the data fragment does not exceed the predetermined threshold, computing cuboid fragment tuples from the sorted data fragment according to; sorting data of the data fragment by a first order of attributes, calculating tuples for all cuboid fragments indexed by a prefix of the first sort order. - View Dependent Claims (16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27)
-
28. A method of calculating a data cube from a data set, comprising the steps of:
-
identifying a minimum number of paths necessary to traverse a search lattice representing the data cube, wherein each of the minimum paths is characterized by a root node of the path, and for each identified path, calculating the data cube from the data set indexed by each of the identified paths including, for each path; sorting data of a cuboid that is a parent to the root node by a sort order identifying the path, and calculating cuboids indexed by the sort order, wherein tuples of each cuboid indexed by one of the minimum paths are calculated in unison.
-
-
29. A data cube data structure constructed from a data set according to the method of:
-
identifying a minimum number of paths necessary to traverse a search lattice representing the data cube, wherein each of the minimum paths is characterized by a root node of the path, and calculating cuboids indexed by each of the identified paths including, for each path; sorting data of a cuboid that is a parent to the root node by a sort order identifying the path, and calculating cuboids indexed by the sort order, wherein tuples of each cuboid indexed by one of the minimum paths are calculated in unison.
-
-
30. A method for performing a multiple attribute query on a data set in a memory, comprising the steps of:
-
determining whether a size of the data set exceeds a predetermined threshold; when the size exceeds the predetermined threshold; partitioning the data set by a first attribute into a plurality of data fragments, and computing cuboid tuples from the data fragments according to; sorting data of the data fragment by a first order of attributes, calculating tuples for all cuboid fragments indexed by a prefix of the first sort order, and identifying cuboid tuples responsive to the query. - View Dependent Claims (31, 32, 33, 34, 35, 36)
-
-
37. Apparatus for performing a multiple attribute query on a data set, comprising:
-
a first memory for storing the data set, a second memory that implements data storage and retrieval faster than the first memory, and processor in communication with the first and second memories, wherein; the processor determines whether the data set fits within the second memory, when the data set does not fit within the second memory, the processor partitions the data set into a plurality of data fragments by a first attribute, for each data fragment, the processor loads the data fragment into the second memory, and the processor computes cuboid tuples from the loaded data fragment according to; sorting data of the data fragment by a first order of attributes, calculating tuples for all cuboid fragments indexed by a prefix of the first sort order wherein certain of the computed cuboid tuples are responsive to the query. - View Dependent Claims (38, 39, 40, 41)
-
Specification