Resilient Message Passing Applications
First Claim
1. A system comprising:
- a plurality of compute nodes;
- an application execution environment that exposes said computing nodes to an application;
- for each of said compute nodes, at least two physical computing resources, each of said physical computing resources that perform identical computing tasks for said compute node;
- said application execution environment that further passing messages between compute nodes, said messages being generated by an application executing on said plurality of compute nodes.
Abstract
A message passing system may execute a parallel application on multiple compute nodes. Each compute node may perform a single workload on at least two physical computing resources. Messages may be passed from one compute node to another, and each physical computing resource assigned to a compute node may receive and process the messages. In some embodiments, the compute nodes may be virtualized so that a message passing system may only detect a single compute node and not the multiple underlying physical computing resources.
20 Claims
1. A system comprising:
- a plurality of compute nodes;
- an application execution environment that exposes said computing nodes to an application;
- for each of said compute nodes, at least two physical computing resources, each of said physical computing resources that perform identical computing tasks for said compute node;
- said application execution environment that further passing messages between compute nodes, said messages being generated by an application executing on said plurality of compute nodes.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A method for operating a first compute node in a message passing environment for executing a parallel processing application, said method comprising:
- receiving a first message from a second compute node, said message being addressed to said first compute node;
- identifying a plurality of physical computing resources, each of said plurality of computing resources executing a first execution thread for said parallel processing application;
- transferring said first message to each of said physical computing resources;
- receiving a second message from a first physical computing resource, said second message having an address for a third compute node; and
- transferring said second message to said third compute node.
Dependent claims: 13, 14
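The steps of claim 12 can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `Message`, `Replica`, `ComputeNode`, and the dictionary-based `network` are hypothetical, and an in-memory list stands in for whatever transport a real message passing environment would use. The sketch only shows the fan-out of an incoming message to every physical computing resource behind a node, and the forwarding of an outgoing message from one resource to the addressed node.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    source: str
    destination: str
    payload: str


@dataclass
class Replica:
    """Hypothetical stand-in for one physical computing resource."""
    name: str
    inbox: list = field(default_factory=list)

    def deliver(self, message):
        self.inbox.append(message)


class ComputeNode:
    """One logical compute node backed by several replicas (assumed model)."""

    def __init__(self, node_id, replicas, network):
        self.node_id = node_id
        self.replicas = replicas
        self.network = network  # dict mapping node id -> ComputeNode
        network[node_id] = self

    def receive(self, message):
        # Accept a message addressed to this node and transfer it
        # to every replica identified for the node.
        if message.destination != self.node_id:
            raise ValueError("message not addressed to this node")
        for replica in self.replicas:
            replica.deliver(message)

    def send_from_replica(self, replica, payload, destination):
        # Take an outgoing message produced by one replica and
        # transfer it to the node named in its address.
        outgoing = Message(self.node_id, destination, payload)
        self.network[destination].receive(outgoing)
        return outgoing


network = {}
node1 = ComputeNode("node-1", [Replica("1a"), Replica("1b")], network)
node2 = ComputeNode("node-2", [Replica("2a"), Replica("2b")], network)
node3 = ComputeNode("node-3", [Replica("3a"), Replica("3b")], network)

# First message: node-2 -> node-1, copied to both replicas of node-1.
node1.receive(Message("node-2", "node-1", "partial sum"))
# Second message: produced by replica 1a, addressed to node-3.
node1.send_from_replica(node1.replicas[0], "result", "node-3")
```

In this sketch the sender only ever names a compute node, never a replica, which mirrors the abstract's point that the message passing system may see a single compute node rather than the underlying physical resources.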
15. A system comprising:
- multiple physical computing resources;
- an application execution environment that:
  - exposes a first set of compute nodes to an application, each of said compute nodes executing a workflow portion of said application;
  - transmits communications from a first compute node to a second compute node;
  - for each of said compute nodes, provides at least two computing resources, each of said computing resources executing a similar workflow, and forwards communications received by a compute node to each of said computing resources associated with said compute node;
  - determines that a first message from said first compute node to said second compute node has not been received and retransmits said first message.
Dependent claims: 16, 17, 18, 19, 20
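The retransmission element of claim 15 can be sketched as follows. The claim does not say how the environment determines that a message was not received; this example assumes a simple acknowledgment returned by the link, and `LossyLink`, `send_with_retransmit`, and `max_attempts` are hypothetical names introduced only for illustration.

```python
class LossyLink:
    """Hypothetical link that drops the first `drops` transmissions."""

    def __init__(self, drops):
        self.drops = drops
        self.delivered = []

    def transmit(self, seq, payload):
        if self.drops > 0:
            self.drops -= 1
            return False  # message lost, so no acknowledgment comes back
        self.delivered.append((seq, payload))
        return True  # receiver acknowledged the message


def send_with_retransmit(link, seq, payload, max_attempts=5):
    """Send until acknowledged; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if link.transmit(seq, payload):
            return attempt
    raise TimeoutError(
        f"message {seq} unacknowledged after {max_attempts} attempts"
    )


link = LossyLink(drops=2)
attempts = send_with_retransmit(link, seq=1, payload="boundary data")
```

With two simulated losses, the third attempt succeeds and the receiver sees the message exactly once. A real environment would likely use timeouts rather than a synchronous return value, and would need sequence numbers to discard duplicates when an acknowledgment, rather than the message itself, is lost.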
Specification