Iterate-aggregate query parallelization
First Claim
Patent Images
1. A query execution system, comprising:
- a processor configured to facilitate execution of queries, wherein the queries comprise an iterate-aggregate shape query;
an identification component configured to identify the iterate-aggregate shape query;
an execution component configured to process the iterate-aggregate shape query recursively in parallel over multiple partitions of data; and
an optimization component configured to;
identify optimal partitioning of data, based, at least, in part, on information about the data or available resources; and
employ a modified matrix multiplication technique to identify an optimal manner for parenthesizing associative operations associated with a specified number of partitions of data,wherein the partitions of data have respective matrix sizes, andwherein the optimal manner for parenthesizing associative operations minimizes a number of operations for executing matrix multiplication of the partitions of data.
2 Assignments
0 Petitions
Accused Products
Abstract
Iterate-aggregate shape queries are executable in an efficient parallel manner. Techniques are utilized that leverage properties of aggregations to implement such a query in a highly parallelized manner utilizing one or both of vertical and horizontal parallelism. More specifically, queries can be recursively evaluated in parallel utilizing partitioning and repartitioning mechanisms. Distributed query execution results over a subset of input data are repartitioned and aggregated to produce a final result.
-
Citations
20 Claims
-
1. A query execution system, comprising:
-
a processor configured to facilitate execution of queries, wherein the queries comprise an iterate-aggregate shape query; an identification component configured to identify the iterate-aggregate shape query; an execution component configured to process the iterate-aggregate shape query recursively in parallel over multiple partitions of data; and an optimization component configured to; identify optimal partitioning of data, based, at least, in part, on information about the data or available resources; and employ a modified matrix multiplication technique to identify an optimal manner for parenthesizing associative operations associated with a specified number of partitions of data, wherein the partitions of data have respective matrix sizes, and wherein the optimal manner for parenthesizing associative operations minimizes a number of operations for executing matrix multiplication of the partitions of data. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
-
-
11. A parallel processing method, comprising:
executing computer-readable instructions stored on a computer-readable medium for performing the following acts; identifying an iterate-aggregate shape language integrated query; executing the iterate-aggregate shape language integrated query recursively in parallel; identifying an optimal manner for grouping associative operations associated with a specified number of partitions of data, wherein the partitions of data have respective matrix sizes, and wherein the optimal manner for grouping associative operations minimizes a number of operations for executing matrix multiplication of partitions of data; and segmenting the data to partition the data into the specified number of partitions based, at least, in part, on the identified optimal manner for grouping associative operations associated with the specified number of partitions of data. - View Dependent Claims (12, 13, 14, 15, 16, 17, 18)
-
19. An iterate-aggregate query execution system, comprising:
-
a processor configured to facilitate execution of queries, wherein the queries comprise a language integrated iterate-aggregate shape query; a partition component configured to partition data identified by the language integrated iterate-aggregate shape query; an evaluation component configured to perform parallel evaluation of the partitioned data in accordance with the language integrated iterate-aggregate shape query; a repartition component configured to repartition results of the parallel evaluation; an aggregation component configured to reduce the repartition results to a single response; and an optimization component configured to; identify optimal partitioning, based, at least, in part, on information about the data or available resources; and employ a modified matrix multiplication technique to identify an optimal manner for grouping mathematical associative property operations associated with a specified number of partitions of data, wherein the partitions of data have respective matrix sizes, and wherein the optimal manner for grouping mathematical associative property operations minimizes a number of operations for executing matrix multiplication of the partitions of data. - View Dependent Claims (20)
-
Specification