Hadoop OLAP engine
Abstract
In various example embodiments, systems and methods for building data cubes to be stored in a cube store are presented. In some embodiments, a metadata engine generates the cube metadata. In further embodiments, cube data is generated by a cube build engine based on the cube metadata and source data. The cube build engine performs a multi-stage MapReduce job on the source data to produce a multi-dimensional cube lattice having multiple cuboids. In further embodiments, the cube data is provided to the cube store.
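The multi-stage build described in the abstract can be illustrated with a small in-memory sketch. It is not the patented implementation: `map_reduce` and `build_cube` are hypothetical names, the single measure is assumed to be additive (summed), and deriving each layer from a cuboid one dimension larger is an assumption of this sketch. The text itself only states that the jobs use differently configured mappers and produce cuboids with decreasing dimension counts.

```python
from collections import defaultdict
from itertools import combinations


def map_reduce(records, mapper):
    """In-memory stand-in for one MapReduce job: map, group by key, sum values."""
    groups = defaultdict(int)
    for record in records:
        for key, value in mapper(record):
            groups[key] += value
    return dict(groups)


def build_cube(rows, dimensions, measure):
    """Build all cuboids of the lattice, one dimension count per stage (hypothetical helper)."""
    full = tuple(dimensions)

    # Stage 1: the base cuboid keeps every dimension of the source rows.
    def base_mapper(row):
        yield tuple(row[d] for d in full), row[measure]

    cube = {full: map_reduce(rows, base_mapper)}

    # Later stages: each stage emits cuboids with one dimension fewer.
    for size in range(len(full) - 1, -1, -1):
        parents = [p for p in cube if len(p) == size + 1]
        for child in combinations(full, size):
            # Assumption: reuse any already-built cuboid that contains the child's dimensions.
            parent = next(p for p in parents if set(child) <= set(p))
            keep = [parent.index(d) for d in child]

            # This stage's mapper is configured with the positions to keep.
            def mapper(item, keep=keep):
                key, value = item
                yield tuple(key[i] for i in keep), value

            cube[child] = map_reduce(cube[parent].items(), mapper)
    return cube


rows = [
    {"year": 2013, "country": "US", "category": "toys", "sales": 10},
    {"year": 2013, "country": "US", "category": "books", "sales": 5},
    {"year": 2014, "country": "DE", "category": "toys", "sales": 7},
]
cube = build_cube(rows, ["year", "country", "category"], "sales")
# cube[("year",)] is {(2013,): 15, (2014,): 7}
```

With three dimensions the lattice holds 2^3 = 8 cuboids, from the three-dimensional base cuboid down to the zero-dimensional grand total.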
19 Claims
1. A system comprising:
at least one processor of a machine;
a metadata engine to generate cube metadata using a mapping from a cube to Hbase table schema, the cube metadata comprising dimension and measure information for the cube;
a cube build engine to generate cube data for the cube based on the cube metadata received from the metadata engine and source data, executing on the at least one processor of the machine, by performing at least a first MapReduce job and a second MapReduce job on the source data to produce a multi-dimensional cube having multiple cuboids, the first MapReduce job and the second MapReduce job having differently configured sets of mappers such that the first MapReduce job generates a first cuboid having a first quantity of dimensions and the second MapReduce job generates a second cuboid having a second quantity of dimensions that is less than the first quantity of dimensions, the cube build engine further configured to store the cube data to a cube store; and
a query engine to receive a query and retrieve query results by accessing at least one of the first cuboid or the second cuboid.
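As an illustration of what cube metadata with dimension and measure information, mapped to a key-value table layout, might look like, here is a minimal sketch. Every name in it (`CubeMetadata`, `cuboid_id`, `row_key`, the `|` separator, the column family `m`) is hypothetical; the claim does not specify the row-key layout, and the bitmask-plus-values scheme below is only one plausible assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CubeMetadata:
    """Hypothetical cube description: dimensions, measures, and a target table."""
    name: str
    dimensions: tuple          # ordered dimension column names
    measures: tuple            # measure column names
    table: str                 # target table name (assumed)
    column_family: str = "m"   # assumed column family holding measure values

    def cuboid_id(self, cuboid_dims):
        """Bitmask with one bit per dimension; the first dimension is the highest bit."""
        mask = 0
        for d in self.dimensions:
            mask = (mask << 1) | (1 if d in cuboid_dims else 0)
        return mask

    def row_key(self, cuboid_dims, values):
        """Assumed layout: cuboid id, then the values of the dimensions present."""
        parts = [str(self.cuboid_id(cuboid_dims))]
        parts += [str(values[d]) for d in self.dimensions if d in cuboid_dims]
        return "|".join(parts).encode("utf-8")


meta = CubeMetadata(
    name="sales_cube",
    dimensions=("year", "country", "category"),
    measures=("sales",),
    table="SALES_CUBE",
)
key = meta.row_key({"year", "category"}, {"year": 2013, "category": "toys"})
# key is b"5|2013|toys" because year and category set bits 0b101
```

Prefixing each key with a cuboid identifier keeps rows of the same cuboid adjacent in a sorted key-value store, which is why this layout was chosen for the sketch.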
2. A method comprising:
receiving source data from a database;
receiving cube metadata generated from a metadata engine using a mapping from a cube to Hbase table schema, the cube metadata including dimension and measure information for the cube;
building the cube based on the cube metadata and the source data, executing on at least one processor of a machine, by performing at least a first MapReduce job and a second MapReduce job on the source data to produce a multi-dimensional cube having multiple cuboids representing cube data, the first MapReduce job and the second MapReduce job having differently configured sets of mappers such that the first MapReduce job generates a first cuboid having a first quantity of dimensions and the second MapReduce job generates a second cuboid having a second quantity of dimensions that is less than the first quantity of dimensions;
storing the cube data to a cube store;
receiving a query; and
retrieving query results by accessing at least one of the first cuboid or the second cuboid.
Dependent claims: 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
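The final two steps, receiving a query and answering it from one of the stored cuboids, can be sketched as follows. This is an assumption-laden illustration, not the claimed query engine: `pick_cuboid` and `run_query` are hypothetical names, the routing rule (use the cuboid with the fewest dimensions that still covers the query) is an assumed heuristic, and measures are assumed to be summed.

```python
from collections import defaultdict


def pick_cuboid(cuboids, needed):
    """Return the dimension tuple of the smallest cuboid that covers `needed`."""
    candidates = [dims for dims in cuboids if set(needed) <= set(dims)]
    if not candidates:
        raise LookupError(f"no cuboid covers {sorted(needed)}")
    return min(candidates, key=len)


def run_query(cuboids, group_by, filters=None):
    """Answer a group-by query with optional equality filters from one cuboid."""
    filters = filters or {}
    dims = pick_cuboid(cuboids, set(group_by) | set(filters))
    result = defaultdict(int)
    for key, value in cuboids[dims].items():
        row = dict(zip(dims, key))
        if all(row[d] == v for d, v in filters.items()):
            result[tuple(row[d] for d in group_by)] += value
    return dims, dict(result)


# Two precomputed cuboids: a first with two dimensions, a second with one.
cuboids = {
    ("year", "country"): {(2013, "US"): 15, (2014, "DE"): 7},
    ("year",): {(2013,): 15, (2014,): 7},
}

used_small, totals = run_query(cuboids, ["year"])
used_large, us_only = run_query(cuboids, ["year"], {"country": "US"})
```

The first query is served from the one-dimensional cuboid; the second needs the `country` filter and therefore falls back to the two-dimensional cuboid.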
19. A machine readable storage device storing instructions that, when executed by at least one processor of a machine, cause the machine to perform operations comprising:
receiving source data from a database;
receiving cube metadata generated from a metadata engine using a mapping from a cube to table schema, the cube metadata including dimension and measure information for the cube;
building the cube based on the cube metadata and the source data by performing at least a first mapping and reducing job and a second mapping and reducing job on the source data to produce a multi-dimensional cube having multiple cuboids representing cube data, the first mapping and reducing job and the second mapping and reducing job having differently configured sets of mappers such that the first mapping and reducing job generates a first cuboid having a first quantity of dimensions and the second mapping and reducing job generates a second cuboid having a second quantity of dimensions that is less than the first quantity of dimensions;
storing the cube data to a cube store;
receiving a query; and
retrieving query results by accessing at least one of the first cuboid or the second cuboid.
Specification