Distributed data propagator
First Claim
12. A method for executing a message-passing parallel program, comprised of a plurality of concurrently-executable virtual nodes, each having one or more numbered step(s), with one or more associated executable instruction(s) and zero or more associated messaging task(s), the method comprising:
- maintaining a pool of available processing elements, wherein the number of processing elements in the pool may be smaller than the number of virtual nodes;
assigning each of the virtual nodes to at least one processing element from the pool of available processing elements; and
, executing the parallel program, starting with the lowest-numbered step, by;
(a) executing all instruction(s) associated with said step;
(b) completing all messaging task(s) associated with said step; and
, then, (c) repeating (a)-(b) for the next lowest-numbered step until execution of the parallel program is completed.
0 Assignments
0 Petitions
Accused Products
Abstract
The invention provides an off-the-shelf product solution to target the specific needs of commercial users with naturally parallel applications. A top-level, public API provides a simple “compute server” or “task farm” model that dramatically accelerates integration and deployment. A Propagator API allows parallel applications that require inter-node communication to be seamlessly deployed in heterogeneous environments, including networks of interruptible PCs. Implementation of parallel applications using the Propagator API does not require that the environment provide a separate node (or processor) for each block of concurrently-executable code. Nor does the Propagator API require that the assignment between particular blocks of code and processing resources remain static during execution of the parallel application.
348 Citations
36 Claims
-
12. A method for executing a message-passing parallel program, comprised of a plurality of concurrently-executable virtual nodes, each having one or more numbered step(s), with one or more associated executable instruction(s) and zero or more associated messaging task(s), the method comprising:
-
maintaining a pool of available processing elements, wherein the number of processing elements in the pool may be smaller than the number of virtual nodes;
assigning each of the virtual nodes to at least one processing element from the pool of available processing elements; and
,executing the parallel program, starting with the lowest-numbered step, by;
(a) executing all instruction(s) associated with said step;
(b) completing all messaging task(s) associated with said step; and
, then,(c) repeating (a)-(b) for the next lowest-numbered step until execution of the parallel program is completed. - View Dependent Claims (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30)
-
-
25-1. A fault-tolerant method for executing a message-passing parallel program on a network of interruptible processors, as defined in claim 24, wherein:
-
in (c), permitting instructions associated with one or more of the virtual nodes to be executed involves executing all instructions associated with a selected step; and
,in (d), returning to (c) to continue execution involves advancing the selected step prior to returning to (c).
-
-
31. A network-based computing system configured to execute a message-passing parallel program, wherein the system includes a network of processing elements in which the number, N, of available processing elements in the network can be less than the number, P, of concurrently-executable processes in the message-passing parallel program, the system comprising:
-
a plurality of virtual nodes, each corresponding to a concurrently-executable process in the parallel program, each virtual node including (i) state information, (ii) a plurality of executable instructions, and (iii) a messaging interface capable of sending and/or receiving messages to/from other virtual node(s);
an adaptive scheduler that assigns each of the virtual nodes to at least one of the available processing elements in the network for execution, such that at least some of the available processing elements have more than one assigned virtual node;
characterized in that the virtual nodes can migrate from one available processing element to another during execution of the parallel program. - View Dependent Claims (32, 33, 34, 35)
-
-
36. A fault-tolerant, network-based computing system configured to execute a message-passing parallel program on a network of interruptible processors, the system comprising:
-
(a) a plurality of concurrently-executable virtual nodes, each having associated state information;
(b) one or more network-accessible servers that collectively maintain a cache of the state information associated with each virtual node;
(c) at least one server that controls execution of the parallel program by permitting instructions associated with one or more of the virtual nodes to be executed on one or more available processing elements, and permits messages to be exchanged between the virtual nodes;
(d) the server including means for updating cached state information and continuing execution, or, upon fault detection or timeout, restoring the state of the virtual nodes using cached state information and repeating execution of selected instructions.
-
Specification