Managing internode data communications for an uninitialized process in a parallel computer
First Claim
1. A method of managing internode data communications in a parallel computer, the parallel computer comprising a plurality of compute nodes, each compute node comprising main computer memory and a messaging unit (MU), each MU comprising a module of automated computing machinery coupling the plurality of compute nodes for data communications, each MU comprising the main computer memory, the main computer memory of the MU comprising one or more MU message buffers, each MU message buffer associated with an uninitialized process on one of the plurality of compute nodes, the method comprising:
receiving, by the MU of the one of the plurality of compute nodes, one or more data communications messages in one of the one or more MU message buffers associated with the uninitialized process on the one of the plurality of compute nodes;
determining, by an application agent, that the one of the one or more MU message buffers associated with the uninitialized process is full prior to initialization of the uninitialized process;
establishing, by the application agent, a temporary message buffer for the uninitialized process in the main computer memory; and
moving, by the application agent, the one or more data communications messages from the one of the one or more MU message buffers associated with the uninitialized process to the temporary message buffer in the main computer memory, wherein:
the parallel computer comprises a parallel active messaging interface (‘PAMI’) and the plurality of compute nodes execute a parallel application, the PAMI comprises data communications endpoints, each data communications endpoint comprising a specification of data communications parameters for a thread of execution on one of the plurality of compute nodes, including specifications of a client, a context, and a task, the data communications endpoints coupled for data communications through the PAMI; and
the uninitialized process comprises one of the data communications endpoints;
each client comprises a collection of data communications resources dedicated to exclusive use of an application-level data processing entity;
each context comprises a subset of the collection of data processing resources of a client, context functions, and a work queue of data transfer instructions to be performed by use of the subset through the context functions operated by an assigned thread of execution; and
each task represents a process of execution of the parallel application.
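The claim's endpoint definition, a specification of a client, a context, and a task for one thread of execution, can be modeled as a minimal data-structure sketch. This is illustrative only: the field names beyond client, context, and task (such as `resources`, `work_queue`, and `rank`) are assumptions, not terms from the patent.

```python
from dataclasses import dataclass, field

@dataclass
class Client:
    # A collection of data communications resources dedicated to the
    # exclusive use of one application-level data processing entity.
    name: str
    resources: list = field(default_factory=list)

@dataclass
class Context:
    # A subset of a client's resources plus a work queue of data transfer
    # instructions, operated by one assigned thread of execution.
    client: Client
    resources: list = field(default_factory=list)
    work_queue: list = field(default_factory=list)

@dataclass
class Task:
    # Represents one process of execution of the parallel application.
    rank: int

@dataclass
class Endpoint:
    # The claimed data communications endpoint: a (client, context, task)
    # specification for a thread of execution on a compute node.
    client: Client
    context: Context
    task: Task
```

Under this sketch, an uninitialized process simply corresponds to an `Endpoint` whose associated process has not yet started executing.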
Abstract
A parallel computer includes nodes, each having main memory and a messaging unit (MU). Each MU includes computer memory, which in turn includes MU message buffers. Each MU message buffer is associated with an uninitialized process on the compute node. In the parallel computer, managing internode data communications for an uninitialized process includes: receiving, by an MU of a compute node, one or more data communications messages in an MU message buffer associated with an uninitialized process on the compute node; determining, by an application agent, that the MU message buffer associated with the uninitialized process is full prior to initialization of the uninitialized process; establishing, by the application agent, a temporary message buffer for the uninitialized process in main computer memory; and moving, by the application agent, data communications messages from the MU message buffer associated with the uninitialized process to the temporary message buffer in main computer memory.
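The sequence the abstract describes (receive into a fixed MU buffer, detect that it is full before the target process initializes, establish a temporary buffer in main memory, and move the messages over) can be sketched as follows. This is a minimal illustration under assumed names; the class and method names (`MUMessageBuffer`, `ApplicationAgent`, `drain_if_full`) are not from the patent.

```python
from collections import deque

class MUMessageBuffer:
    """Fixed-capacity message buffer in the MU's memory region.
    The capacity is an illustrative assumption."""
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.messages = deque()

    def is_full(self):
        return len(self.messages) >= self.capacity

    def receive(self, msg):
        # Messages arrive for a process that may not yet be initialized.
        if self.is_full():
            raise OverflowError("MU message buffer full")
        self.messages.append(msg)

class ApplicationAgent:
    """Monitors MU buffers for processes that have not yet initialized."""
    def __init__(self):
        # process id -> temporary message buffer in main computer memory
        self.temp_buffers = {}

    def drain_if_full(self, proc_id, mu_buffer):
        # Act only when the uninitialized process's MU buffer has filled.
        if not mu_buffer.is_full():
            return 0
        # Establish a temporary buffer for the process in main memory.
        tmp = self.temp_buffers.setdefault(proc_id, [])
        # Move the messages out of the MU buffer, freeing it to receive more.
        moved = 0
        while mu_buffer.messages:
            tmp.append(mu_buffer.messages.popleft())
            moved += 1
        return moved
```

Once the process initializes, it would consume messages from its temporary buffer first, then from the MU buffer, preserving arrival order; that handoff step is omitted here.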
83 Citations
9 Claims
Specification