Building a Distributed Dwarf Cube using Mapreduce Technique
Abstract
Systems and methods for building a distributed dwarf cube comprising dwarf cuboids using a mapreduce technique are disclosed. Data comprising cube values and a cube definition may be received, where the cube definition comprises dimensions defined for the cube values. The received data is processed and may be transformed to a format, and indexes may be generated based upon that format. The cube values in one or more dimensions may be sorted by cardinality, from highest to lowest, where the cardinality indicates the distinctiveness of the cube values in those dimensions. The data may be partitioned into data blocks. A dwarf cuboid may be built for one or more data blocks based upon the order of the cardinality of the cube values.
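The sort, partition, and per-block build steps summarized above can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the sample rows, the fixed `block_size`, and the use of a summed `sales` measure are all assumptions made for illustration, and the "cuboid" here is a plain nested-dictionary prefix tree that omits the suffix coalescing and aggregate cells of a full dwarf structure. The per-block loop stands in for what separate map tasks might do in a mapreduce job.

```python
def order_dimensions_by_cardinality(rows, dimensions):
    """Return the dimensions ordered from highest to lowest distinct-value count."""
    cardinality = {d: len({row[d] for row in rows}) for d in dimensions}
    # sorted() is stable, so dimensions with equal cardinality keep their input order.
    return sorted(dimensions, key=lambda d: cardinality[d], reverse=True)


def partition(rows, block_size):
    """Split the rows into consecutive blocks of at most block_size rows."""
    return [rows[i:i + block_size] for i in range(0, len(rows), block_size)]


def build_cuboid(block, ordered_dims, measure):
    """Build a simplified prefix tree for one block (hypothetical stand-in for a dwarf cuboid)."""
    root = {}
    for row in block:
        node = root
        # Descend one level per dimension, highest cardinality first.
        for d in ordered_dims[:-1]:
            node = node.setdefault(row[d], {})
        leaf_key = row[ordered_dims[-1]]
        # Leaves hold the summed measure for the full dimension path.
        node[leaf_key] = node.get(leaf_key, 0) + row[measure]
    return root


# Illustrative data only; not taken from the patent.
rows = [
    {"store": "S1", "product": "P1", "region": "East", "sales": 10},
    {"store": "S2", "product": "P1", "region": "East", "sales": 5},
    {"store": "S3", "product": "P2", "region": "East", "sales": 7},
    {"store": "S1", "product": "P2", "region": "East", "sales": 3},
]
dimensions = ["region", "product", "store"]

order = order_dimensions_by_cardinality(rows, dimensions)
# Sort the rows on the dimension values, taken in cardinality order.
sorted_rows = sorted(rows, key=lambda r: tuple(r[d] for d in order))
blocks = partition(sorted_rows, block_size=2)
# One cuboid per block; in a mapreduce setting each block could go to its own task.
cuboids = [build_cuboid(block, order, "sales") for block in blocks]
```

With this sample, `store` has three distinct values, `product` two, and `region` one, so the tree for each block branches on `store` at the top level and on `region` at the leaves.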
17 Claims
1. A method for building a distributed dwarf cube comprising a plurality of dwarf cuboids using a mapreduce technique, the method comprising:
receiving data comprising cube values and a cube definition, wherein the cube definition comprises dimensions defined for the cube values;
processing the data;
transforming the data to a format;
generating indexes based upon the format of the data;
sorting the cube values in one or more dimensions based on a cardinality of the cube values, wherein the cube values are sorted in an order of highest cardinality to lowest cardinality, and wherein the cardinality indicates distinctiveness of the cube values in the one or more dimensions;
partitioning the data into data blocks; and
building a dwarf cuboid for one or more data blocks based upon the order of the cardinality of the cube values.
(Dependent claims: 2, 3, 4, 5, 6, 7, 8)
9. A system for building a distributed dwarf cube comprising a plurality of dwarf cuboids using a mapreduce technique, the system comprising:
a processor; and
a memory coupled to the processor, wherein the processor executes program instructions stored in the memory, to:
receive data comprising cube values and a cube definition, wherein the cube definition comprises dimensions defined for the cube values;
process the data;
transform the data to a format;
generate indexes based upon the format of the data;
sort the cube values in one or more dimensions based on a cardinality of the cube values, wherein the cube values are sorted in an order of highest cardinality to lowest cardinality, and wherein the cardinality indicates distinctiveness of the cube values in the one or more dimensions;
partition the data into data blocks; and
build a dwarf cuboid for one or more data blocks based upon the order of the cardinality of the cube values.
(Dependent claims: 10, 11, 12, 13, 14, 15, 16)
17. A non-transitory computer readable medium embodying a program executable in a computing device for building a distributed dwarf cube comprising a plurality of dwarf cuboids using a mapreduce technique, the program comprising:
a program code for receiving data comprising cube values and a cube definition, wherein the cube definition comprises dimensions defined for the cube values;
a program code for processing the data;
a program code for transforming the data to a format;
a program code for generating indexes based upon the format of the data;
a program code for sorting the cube values in one or more dimensions based on a cardinality of the cube values, wherein the cube values are sorted in an order of highest cardinality to lowest cardinality, and wherein the cardinality indicates distinctiveness of the cube values in the one or more dimensions;
a program code for partitioning the data into data blocks; and
a program code for building a dwarf cuboid for one or more data blocks based upon the order of the cardinality of the cube values.
Specification