Distributed storage of aggregated data
First Claim
1. A computer-implemented method comprising:
- storing, by one or more configured computing nodes, a plurality of aggregated data values for an OLAP (“online analytical processing”) cube that has multiple dimensions by, for each of the plurality of aggregated data values:
  - determining a hash key for use with the aggregated data value, wherein the determined hash key is based at least in part on a combination of an aggregation metric used for aggregating the aggregated data value and multiple dimension category values for the multiple dimensions that are associated with the aggregated data value;
  - determining a storage location within a distributed key-value storage structure stored across multiple storage nodes by using the determined hash key as input to a hash function, wherein output of the hash function indicates the determined storage location, and wherein the determined storage location is within a subset of the distributed key-value storage structure that is stored on one of the multiple storage nodes; and
  - providing the aggregated data value for storage in the determined storage location on the one storage node as part of the OLAP cube; and
- after the storing of the plurality of aggregated data values for the OLAP cube,
  - receiving a request for one or more of the stored aggregated data values, the request indicating one or more dimension category values for the multiple dimensions of the OLAP cube, and
  - using the indicated one or more dimension category values to obtain and provide the one or more stored aggregated data values from the distributed key-value storage structure in response to the request.
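The storing and retrieval steps of the claim can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the dimension names, the `|`-joined key format, the use of SHA-256 as the hash function, the modulo placement rule, and the in-memory dicts standing in for storage nodes. The claim itself does not prescribe any of them, and the lookup below also passes the aggregation metric explicitly, which is a simplification.

```python
import hashlib

# Hypothetical cube dimensions; the claim does not name any.
DIMENSIONS = ("region", "product", "month")


def make_hash_key(metric, categories):
    """Combine the aggregation metric with one category value per dimension."""
    parts = [metric] + [categories[d] for d in DIMENSIONS]
    return "|".join(parts)


def node_index(hash_key, node_count):
    """Hash the key; the output selects which node's subset holds the value."""
    digest = hashlib.sha256(hash_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


class DistributedCubeStore:
    """Toy stand-in for a key-value structure spread across storage nodes."""

    def __init__(self, node_count):
        # Each dict represents the subset of the structure kept on one node.
        self.nodes = [{} for _ in range(node_count)]

    def put(self, metric, categories, value):
        key = make_hash_key(metric, categories)
        self.nodes[node_index(key, len(self.nodes))][key] = value

    def get(self, metric, categories):
        # Rebuild the same key from the requested category values.
        key = make_hash_key(metric, categories)
        return self.nodes[node_index(key, len(self.nodes))].get(key)


store = DistributedCubeStore(node_count=4)
cell = {"region": "EU", "product": "widgets", "month": "2011-03"}
store.put("sum_sales", cell, 1250.0)
store.put("count_orders", cell, 37)
```

Because the metric is part of the key, two aggregates over the same cube cell (here a sum and a count) occupy separate entries and may land on different nodes.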
Abstract
Techniques are described for managing aggregation of data in a distributed manner, such as for a particular client based on specified configuration information. The described techniques may include storing aggregated data values for an OLAP cube or other data structure in a distributed manner, such as in some situations in a distributed hash table. The aggregated data values to be stored may be generated in various manners, such as by performing multi-stage data manipulation operations—for example, a map-reduce architecture may be used, with a first stage involving the use of one or more specified map functions to be performed, and with at least a second stage involving the use of one or more specified reduce functions to be performed.
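The two-stage map-reduce flow mentioned in the abstract can be sketched as follows. The record fields, the choice of a sum as the aggregation, and the single-process driver are all assumptions for illustration; a real deployment would distribute the map and reduce stages across nodes.

```python
from collections import defaultdict

# Hypothetical input records; field names are illustrative only.
records = [
    {"region": "EU", "product": "widgets", "amount": 10.0},
    {"region": "EU", "product": "widgets", "amount": 5.0},
    {"region": "US", "product": "gadgets", "amount": 7.5},
]


def map_record(record):
    """Map stage: emit (dimension category values, measure) pairs."""
    yield (record["region"], record["product"]), record["amount"]


def reduce_sum(values):
    """Reduce stage: aggregate all measures that share a key."""
    return sum(values)


def run_map_reduce(records, map_fn, reduce_fn):
    grouped = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            grouped[key].append(value)  # shuffle: group mapped values by key
    return {key: reduce_fn(values) for key, values in grouped.items()}


aggregated = run_map_reduce(records, map_record, reduce_sum)
```

Each entry of `aggregated` pairs a tuple of dimension category values with one aggregated data value, which is the shape of data the storing step then places in the distributed structure.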
51 Citations
19 Claims
1. A computer-implemented method comprising:
- storing, by one or more configured computing nodes, a plurality of aggregated data values for an OLAP (“online analytical processing”) cube that has multiple dimensions by, for each of the plurality of aggregated data values:
  - determining a hash key for use with the aggregated data value, wherein the determined hash key is based at least in part on a combination of an aggregation metric used for aggregating the aggregated data value and multiple dimension category values for the multiple dimensions that are associated with the aggregated data value;
  - determining a storage location within a distributed key-value storage structure stored across multiple storage nodes by using the determined hash key as input to a hash function, wherein output of the hash function indicates the determined storage location, and wherein the determined storage location is within a subset of the distributed key-value storage structure that is stored on one of the multiple storage nodes; and
  - providing the aggregated data value for storage in the determined storage location on the one storage node as part of the OLAP cube; and
- after the storing of the plurality of aggregated data values for the OLAP cube,
  - receiving a request for one or more of the stored aggregated data values, the request indicating one or more dimension category values for the multiple dimensions of the OLAP cube, and
  - using the indicated one or more dimension category values to obtain and provide the one or more stored aggregated data values from the distributed key-value storage structure in response to the request.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A non-transitory computer-readable medium having stored contents that configure one or more computing systems of an online data aggregation service to:
- select, by the configured one or more computing systems and on behalf of a client of the online data aggregation service, multiple storage nodes from a plurality of storage nodes provided by the online data aggregation service;
- store, by the configured one or more computing systems and on behalf of the client, a plurality of aggregated data values generated by the online data aggregation service that are each associated with multiple dimension category values for multiple dimensions by, for each of the plurality of aggregated data values:
  - determining a value for use as a hash key with the aggregated data value, the determined value being based at least in part on a combination of an aggregation metric used for aggregating the aggregated data value and the multiple dimension category values associated with the aggregated data value;
  - determining a storage location within a key-value storage structure stored across the selected multiple storage nodes by using the determined value as a hash key input to a hash function, wherein output of the hash function indicates the determined storage location; and
  - providing the aggregated data value for storage in the determined storage location; and
- after the plurality of aggregated data values are stored, receive a request for one or more of the stored aggregated data values, the request indicating one or more dimension category values; and
- obtain, by the configured one or more computing systems and by using the indicated one or more dimension category values, the one or more stored aggregated data values from the key-value storage structure to provide in response to the request.
Dependent claims: 15, 16
17. A system, comprising:
- one or more hardware processors of one or more computing systems; and
- one or more memories with software instructions that, when executed by at least one of the one or more hardware processors, cause the at least one hardware processor to implement functionality of an online data aggregation service, including:
  - selecting, on behalf of a client of the online data aggregation service, multiple storage nodes from a plurality of storage nodes provided by the online data aggregation service;
  - generating, on behalf of the client, a plurality of aggregated data values for an OLAP (“online analytical processing”) cube having multiple dimensions, wherein each of the plurality of aggregated data values is further associated with an aggregation metric used for aggregating the aggregated data value; and
  - storing the plurality of aggregated data values by, for each of the plurality of aggregated data values:
    - determining a value for use as a hash key with the aggregated data value, the determined value being based at least in part on the aggregation metric used for aggregating the aggregated data value and on a combination of multiple dimension category values that correspond to the multiple dimensions and that are associated with the aggregated data value;
    - determining a storage location within a key-value storage structure stored across the selected multiple storage nodes, the determining of the storage location including using the determined value as a hash key input to a hash function, the determined storage location being within a subset of the key-value storage structure that is stored on one of the multiple storage nodes; and
    - providing the aggregated data value for storage in the determined storage location on the one storage node; and
  - after the storing of the plurality of aggregated data values, receiving a request for one or more of the stored aggregated data values, the request indicating one or more dimension category values, and using the indicated one or more dimension category values to obtain and provide the one or more stored aggregated data values from the key-value storage structure in response to the request.
Dependent claims: 18, 19
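Claims 14 and 17 add a step in which the service selects multiple storage nodes for a client from a larger pool before any values are placed. A minimal sketch of that step, under stated assumptions: the node names, the client identifier, and the per-client ranking rule are hypothetical, and the claims do not say how the selection is made.

```python
import hashlib

# Hypothetical pool of storage nodes offered by the service.
SERVICE_NODES = ["node-a", "node-b", "node-c", "node-d", "node-e", "node-f"]


def stable_hash(text):
    """Deterministic 64-bit hash derived from SHA-256."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def select_nodes_for_client(client_id, count):
    """Pick `count` nodes for a client by ranking the pool on a per-client hash.

    The ranking rule is an assumption; the claims only require that
    multiple nodes be selected from the service's plurality of nodes.
    """
    ranked = sorted(SERVICE_NODES, key=lambda n: stable_hash(client_id + ":" + n))
    return ranked[:count]


def locate(hash_key, selected_nodes):
    """Hash the key to choose one of the client's selected nodes."""
    return selected_nodes[stable_hash(hash_key) % len(selected_nodes)]


selected = select_nodes_for_client("client-42", count=3)
target = locate("sum_sales|EU|widgets|2011-03", selected)
```

Restricting placement to `selected` keeps one client's key-value structure on its own set of nodes, while the same hash-based lookup still finds any stored value from its key alone.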
Specification