PARALLEL PROCESSING OF DATA
First Claim
1. A method comprising:
receiving, at a data center including one or more processing modules and providing a native processing environment, an untrusted application that includes a data parallel pipeline, wherein the data parallel pipeline specifies multiple parallel data objects that contain multiple elements and multiple parallel operations that are associated with untrusted functions that operate on the elements;
instantiating a first secured processing environment in the native processing environment and on one or more of the processing modules;
executing the untrusted application in the first secured processing environment, wherein executing the application generates a dataflow graph of deferred parallel data objects and deferred parallel operations corresponding to the data parallel pipeline;
communicating information representing the data flow graph outside of the first secured processing environment;
applying, outside of the first secured processing environment and in the native processing environment, one or more graph transformations to the information representing the dataflow graph to generate a revised dataflow graph that includes one or more of the deferred parallel data objects and deferred, combined parallel data operations that are associated with one or more of the untrusted functions; and
executing the deferred, combined parallel operations to produce materialized parallel data objects corresponding to the deferred parallel data objects, wherein executing the deferred, combined parallel operations comprises:
instantiating one or more second secured processing environments in the native processing environment and on one or more of the processing modules;
executing the untrusted functions associated with the deferred, combined parallel operations in the one or more second secured processing environments.
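The deferred-evaluation and graph-transformation steps recited in claim 1 can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the patented implementation: the names `Deferred`, `parallel_do`, `fuse`, and `materialize` are hypothetical, only a linear chain of element-wise operations is handled, and the secured processing environments are left out entirely.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Deferred:
    """A deferred parallel data object: it records how to compute its
    elements but holds no computed data (hypothetical illustration)."""
    source: Optional[list] = None        # set only for input collections
    parent: Optional["Deferred"] = None  # upstream deferred object, if any
    fns: List[Callable] = field(default_factory=list)  # element-wise functions

    def parallel_do(self, fn: Callable) -> "Deferred":
        # Record the operation in the graph instead of running it.
        return Deferred(parent=self, fns=[fn])


def fuse(node: Deferred) -> Deferred:
    """Graph transformation: collapse a chain of deferred element-wise
    operations into one combined operation on the original input."""
    fns: List[Callable] = []
    while node.parent is not None:
        fns = node.fns + fns  # keep upstream functions first
        node = node.parent
    return Deferred(parent=node, fns=fns)


def materialize(node: Deferred) -> list:
    """Execute a combined operation, visiting each input element once."""
    if node.parent is None or node.parent.source is None:
        raise ValueError("expected a fused node attached to an input collection")
    results = []
    for element in node.parent.source:
        for fn in node.fns:
            element = fn(element)
        results.append(element)
    return results


# Building the pipeline runs nothing; it only grows the dataflow graph.
source = Deferred(source=[1, 2, 3])
pipeline = source.parallel_do(lambda x: x + 1).parallel_do(lambda x: x * 2)

revised = fuse(pipeline)              # two deferred operations become one
materialized = materialize(revised)
```

Here `materialized` is `[4, 6, 8]`, and the revised graph holds a single combined operation carrying both user functions, so the input is traversed once rather than twice.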
Abstract
An untrusted application is received at a data center including one or more processing modules and providing a native processing environment. The untrusted application includes a data parallel pipeline. Secured processing environments are used to execute the untrusted application.
22 Claims
1. A method comprising:
receiving, at a data center including one or more processing modules and providing a native processing environment, an untrusted application that includes a data parallel pipeline, wherein the data parallel pipeline specifies multiple parallel data objects that contain multiple elements and multiple parallel operations that are associated with untrusted functions that operate on the elements;
instantiating a first secured processing environment in the native processing environment and on one or more of the processing modules;
executing the untrusted application in the first secured processing environment, wherein executing the application generates a dataflow graph of deferred parallel data objects and deferred parallel operations corresponding to the data parallel pipeline;
communicating information representing the data flow graph outside of the first secured processing environment;
applying, outside of the first secured processing environment and in the native processing environment, one or more graph transformations to the information representing the dataflow graph to generate a revised dataflow graph that includes one or more of the deferred parallel data objects and deferred, combined parallel data operations that are associated with one or more of the untrusted functions; and
executing the deferred, combined parallel operations to produce materialized parallel data objects corresponding to the deferred parallel data objects, wherein executing the deferred, combined parallel operations comprises:
instantiating one or more second secured processing environments in the native processing environment and on one or more of the processing modules;
executing the untrusted functions associated with the deferred, combined parallel operations in the one or more second secured processing environments.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A system comprising:
one or more processing modules configured to provide native processing environment and to implement the following:
a first secured processing environment in the native processing environment, the first secured processing environment configured to:
execute an untrusted application that includes a data parallel pipeline, the data parallel pipeline specifying multiple parallel data objects that contain multiple elements and multiple parallel operations that are associated with untrusted functions that operate on the elements;
wherein executing the application generates a dataflow graph of deferred parallel data objects and deferred parallel operations corresponding to the data parallel pipeline;
communicate information representing the data flow graph outside of the first secured processing environment;
a service located outside of the first secured processing environment and in the native processing environment, the service configured to:
receive the information representing the data flow graph from the first secured processing environment;
apply one or more graph transformations to the information representing the dataflow graph to generate a revised dataflow graph that includes one or more of the deferred parallel data objects and deferred, combined parallel data operations that are associated with one or more of the untrusted functions;
cause execution of the deferred, combined parallel operations to produce materialized parallel data objects corresponding to the deferred parallel data objects; and
one or more second secured processing environments in the native processing environment, the one or more second secured processing environments configured to execute the untrusted functions associated with the deferred, combined parallel operations to result in execution of the deferred, combined parallel operations.
Dependent claims: 13, 14, 15, 16, 17, 18, 19
20. A computer readable medium storing instructions that, when executed by one or more processing devices, cause the one or more processing devices to:
access information representing a dataflow graph of deferred parallel data objects and deferred parallel operations, the deferred parallel data objects and deferred parallel operations corresponding to parallel data objects and parallel operations specified by a data parallel pipeline included in an untrusted application, wherein the parallel data objects contain multiple elements and the parallel operations are associated with untrusted functions that operate on the elements;
apply one or more graph transformations to the information representing the dataflow graph to generate a revised dataflow graph that includes one or more of the deferred parallel data objects and deferred, combined parallel data operations that are associated with one or more of the untrusted functions; and
execute the deferred, combined parallel operations to produce materialized parallel data objects corresponding to the deferred parallel data objects, wherein, to execute the deferred, combined parallel operations, the instructions comprise instructions that cause the one or more processing devices to:
instantiate one or more secured processing environments;
execute the untrusted functions associated with the deferred, combined parallel operations in the one or more secured processing environments.
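The last step of claim 20, running the untrusted functions of a combined operation inside a secured processing environment, might be approximated as follows. This is a rough sketch under loud assumptions: a separate Python child process stands in for the secured environment, which gives process separation only and is not a real security sandbox, and the claims shown here do not say how such an environment is built. `UNTRUSTED_SOURCE`, `CHILD_DRIVER`, and `run_combined_in_child` are hypothetical names.

```python
import json
import subprocess
import sys

# Hypothetical untrusted user functions, kept as source text so the parent
# (native) process never imports or calls them itself.
UNTRUSTED_SOURCE = """
def add_one(x):
    return x + 1

def double(x):
    return x * 2
"""

# Driver run inside the child process: it loads the untrusted source, applies
# the named functions in order to each element, and writes the results as JSON.
CHILD_DRIVER = """
import json, sys
request = json.load(sys.stdin)
namespace = {}
exec(request["source"], namespace)
fns = [namespace[name] for name in request["fn_names"]]
results = []
for element in request["elements"]:
    for fn in fns:
        element = fn(element)
    results.append(element)
json.dump(results, sys.stdout)
"""


def run_combined_in_child(source, fn_names, elements, timeout=10):
    """Execute one combined operation in a child interpreter.

    Assumption: the child process is only a stand-in for a secured
    environment; it does NOT restrict what the untrusted code can do.
    """
    request = json.dumps(
        {"source": source, "fn_names": fn_names, "elements": elements}
    )
    proc = subprocess.run(
        [sys.executable, "-I", "-c", CHILD_DRIVER],  # -I: isolated mode
        input=request,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return json.loads(proc.stdout)


# The parent sees only function names and JSON data, never the functions.
materialized = run_combined_in_child(
    UNTRUSTED_SOURCE, ["add_one", "double"], [1, 2, 3]
)
```

With these inputs `materialized` is `[4, 6, 8]`. A production system would replace the child interpreter with an actual isolation mechanism such as a virtual machine or an operating-system sandbox; that choice is outside what this sketch shows.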
21. The medium of claim 20 wherein, to access information representing the dataflow graph of deferred parallel data objects and deferred parallel operations, the instructions include instructions that, when executed by the one or more processing devices, cause the one or more processing devices to:
receive the untrusted application that includes the data parallel pipeline;
instantiate an initial secured processing environment;
execute the untrusted application in the initial secured processing environment, wherein executing the application generates the dataflow graph of deferred parallel data objects and deferred parallel operations; and
communicate the information representing the data flow graph outside of the initial secured processing environment such that the graph transformations are applied to the information representing the dataflow graph outside of the initial secured processing environment.
Dependent claims: 22
Specification