DMA engine for repeating communication patterns
Abstract
A parallel computer system is constructed as a network of interconnected compute nodes that operates a global message-passing application for performing communications across the network. Each compute node includes one or more individual processors with memories and runs a local instance of the global message-passing application, carrying out local processing operations independent of the processing operations carried out at other compute nodes. Each compute node also includes a DMA engine constructed to interact with the application via Injection FIFO Metadata describing multiple Injection FIFOs, where each Injection FIFO may contain an arbitrary number of message descriptors, allowing the DMA engine to process messages with a fixed processing overhead irrespective of the number of message descriptors included in the Injection FIFO.
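The metadata-to-buffer relationship described in the abstract can be sketched as a pair of C structures. All names and field widths here are hypothetical illustrations; the patent prescribes the roles of these objects, not any particular layout:

```c
#include <stdint.h>

/* One entry the DMA engine reads from an Injection FIFO buffer.
 * Illustrative layout; the abstract only requires that a descriptor
 * fully describes one message. */
typedef struct {
    uint64_t payload_addr;   /* where the message data lives in memory */
    uint32_t payload_len;    /* message length in bytes */
    uint32_t dest_node;      /* receiving node identifier */
} msg_descriptor_t;

/* Injection FIFO Metadata: what the application hands the DMA engine.
 * It points at a buffer of descriptors; the engine's per-message work
 * is the same no matter how many descriptors the buffer holds. */
typedef struct {
    msg_descriptor_t *fifo_base;  /* address of the Injection FIFO buffer */
    uint32_t          capacity;   /* number of descriptor slots */
    uint32_t          head;       /* next slot the DMA engine will read */
    uint32_t          tail;       /* next slot the application will fill */
    int               active;     /* 0 = deactivated, engine must not read */
} injection_fifo_metadata_t;
```

Because the engine consumes descriptors one at a time starting from a head index, its per-message injection cost stays constant regardless of how full the buffer is, which is the "fixed processing overhead" property the abstract claims.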
18 Claims
1. A parallel computer system comprising a network of interconnected compute nodes that operates a global message-passing application for performing communications across the network, wherein each of the compute nodes comprises one or more individual processors with memories, wherein local instances of the global message-passing application operate at each compute node to carry out local processing operations independent of processing operations carried out at other compute nodes, and wherein each compute node further comprises:
a DMA engine constructed to interact with the application via Injection FIFO Metadata, wherein the Injection FIFO Metadata comprises an address that points to an Injection FIFO buffer having one or more message descriptors; the DMA engine is operable to retrieve a message descriptor from the Injection FIFO buffer and to inject a message that corresponds to the message descriptor into the network, the DMA engine having a fixed processing overhead for said injecting irrespective of the number of message descriptors stored in the Injection FIFO buffer; and wherein, if there are more communication patterns than available Injection FIFO Metadata, the local instance of the global message-passing application determines whether an Injection FIFO buffer described by the Injection FIFO Metadata at the DMA network interface has completed the messages of the current communication pattern and is available for a new communication pattern, and deactivates that Injection FIFO Metadata so that the DMA engine does not access the current content of the Injection FIFO buffer while it is rewritten as the Injection FIFO buffer of the new communication pattern. (Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10.)
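The fixed-overhead injection behavior recited in claim 1 can be sketched as a service loop over one Injection FIFO. This is a minimal illustrative model, not the patented hardware: all names are hypothetical, and `inject_into_network` stands in for the actual network injection:

```c
#include <stdint.h>

/* Hypothetical descriptor and metadata layouts (not from the patent). */
typedef struct {
    uint64_t payload_addr;
    uint32_t payload_len;
    uint32_t dest_node;
} msg_descriptor_t;

typedef struct {
    msg_descriptor_t *fifo_base;
    uint32_t capacity, head, tail;
    int active;
} injection_fifo_metadata_t;

/* Stand-in for the hardware injecting one message into the network. */
static uint32_t inject_into_network(const msg_descriptor_t *d) {
    /* constant work per message: read one descriptor, start one transfer */
    return d->payload_len;   /* report bytes injected */
}

/* One service pass of the DMA engine over a single Injection FIFO.
 * The per-message work is fixed: the loop body does not depend on how
 * many descriptors the buffer can hold, only on how many are pending. */
uint32_t dma_service_fifo(injection_fifo_metadata_t *m) {
    uint32_t injected = 0;
    if (!m->active)
        return 0;                    /* deactivated: engine must not read */
    while (m->head != m->tail) {     /* pending descriptors remain */
        msg_descriptor_t *d = &m->fifo_base[m->head % m->capacity];
        inject_into_network(d);
        m->head++;                   /* consume exactly one descriptor */
        injected++;
    }
    return injected;
}
```

Note how the deactivation check comes first: a deactivated Injection FIFO Metadata entry causes the engine to skip the buffer entirely, which is the mechanism claim 1 uses to make buffer rewriting safe.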
11. A parallel computer system comprising a compute node comprising:
a DMA engine and a network of interconnected compute nodes, said DMA engine supporting a global message-passing operation and controlled from an application via Injection FIFO Metadata, wherein the Injection FIFO Metadata comprises an address pointing to an Injection FIFO buffer located in said DMA engine, said Injection FIFO buffer having one or more message descriptors, the DMA engine operable to process messages with a fixed processing overhead irrespective of the number of message descriptors stored in the Injection FIFO buffer; wherein, if there are more communication patterns than available Injection FIFO Metadata, a local instance of the global message-passing application determines whether an Injection FIFO buffer described by the Injection FIFO Metadata at a DMA network interface has completed the messages of a current communication pattern and is available for a new communication pattern, and deactivates that Injection FIFO Metadata so that the DMA engine does not access the current content of the Injection FIFO buffer while it is rewritten as the Injection FIFO buffer of the new communication pattern.
12. A method for passing messages within a parallel computer system comprising a network of interconnected compute nodes, wherein each of the compute nodes comprises one or more individual processors, memory, and a DMA engine, the method comprising the steps of:
running a global message-passing application across the parallel computer system, including running local instances at each compute node for passing messages into and out of the compute node by operation independent of other compute nodes; and exchanging messages among the compute nodes through a DMA engine via Injection FIFO Metadata, wherein the Injection FIFO Metadata comprises an address pointing to an Injection FIFO buffer located in said DMA engine, said Injection FIFO buffer having one or more message descriptors, and the DMA engine is operable to retrieve a message descriptor stored at the memory address, the DMA engine injecting a message that corresponds to the message descriptor into the network with a fixed processing overhead for injecting irrespective of the number of message descriptors stored in the Injection FIFO buffer; wherein, if there are more communication patterns than available Injection FIFO Metadata, the local instance of the global message-passing application further performs the steps of: determining whether an Injection FIFO buffer described by the Injection FIFO Metadata at the DMA network interface has completed the messages of a current communication pattern and is available for a new communication pattern; and deactivating the Injection FIFO Metadata so that the DMA engine does not access the current content of the Injection FIFO buffer while it is rewritten as the Injection FIFO buffer of the new communication pattern. (Dependent claims: 13, 14.)
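The determining/deactivating steps of the method claim can be sketched as a FIFO-reuse routine: check that the current pattern has completed, deactivate the metadata so the engine cannot read a half-rewritten buffer, rewrite the descriptors, then reactivate. Again a minimal sketch with hypothetical names and layouts, not the patented implementation:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts (not from the patent). */
typedef struct {
    uint64_t payload_addr;
    uint32_t payload_len, dest_node;
} msg_descriptor_t;

typedef struct {
    msg_descriptor_t *fifo_base;
    uint32_t capacity, head, tail;
    int active;
} injection_fifo_metadata_t;

/* The FIFO has completed the current pattern once the DMA engine has
 * consumed every descriptor the application deposited. */
static int fifo_pattern_complete(const injection_fifo_metadata_t *m) {
    return m->head == m->tail;
}

/* Reuse a FIFO for a new communication pattern, following the claimed
 * steps: determine completion, deactivate, rewrite, reactivate. */
int reuse_fifo_for_pattern(injection_fifo_metadata_t *m,
                           const msg_descriptor_t *pattern, uint32_t n) {
    if (!fifo_pattern_complete(m) || n > m->capacity)
        return -1;                  /* still busy, or pattern too large */
    m->active = 0;                  /* deactivate: engine must not access */
    memcpy(m->fifo_base, pattern, n * sizeof *pattern);
    m->head = 0;
    m->tail = n;                    /* n descriptors now pending */
    m->active = 1;                  /* hand the FIFO back to the engine */
    return 0;
}
```

The deactivate-before-rewrite ordering is the point of the claim: the engine never observes a buffer that mixes descriptors of the old and new communication patterns.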
15. A computer program product comprising:
a non-transitory storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising: passing messages within a parallel computer system comprising a network of interconnected compute nodes, wherein each of the compute nodes comprises one or more individual processors, memory, and a DMA engine, the method comprising the steps of: running a global message-passing application across the parallel computer system, including running local instances at each compute node for passing messages into and out of the compute node by operation independent of other compute nodes; exchanging messages among the compute nodes, in which each compute node also includes a DMA engine constructed to interact with the application via Injection FIFO Metadata, wherein the Injection FIFO Metadata comprises an address pointing to an Injection FIFO buffer located in said DMA engine, said Injection FIFO buffer having one or more message descriptors, and the DMA engine is operable to retrieve the message descriptor stored at the memory address, the DMA engine further injecting a message that corresponds to the message descriptor into the network with a fixed processing overhead for injecting irrespective of the number of message descriptors stored in the Injection FIFO buffer; determining whether the Injection FIFO buffer described by the Injection FIFO Metadata at the DMA network interface has completed the messages of a current communication pattern and is available for a new communication pattern; and deactivating the Injection FIFO Metadata so that the DMA engine does not access the current content of the Injection FIFO buffer while it is rewritten as the Injection FIFO buffer of the new communication pattern. (Dependent claims: 16, 17.)
18. A computer system comprising a sending node comprising a first DMA engine for facilitating injection of one or more message descriptors and a first memory, coupled by a network interface to a receiving node comprising a second DMA engine and a second memory, wherein the sending node communicates with the receiving node via Injection FIFO Metadata, wherein the Injection FIFO Metadata comprises an address pointing to an Injection FIFO buffer located in the first DMA engine, said Injection FIFO buffer having one or more message descriptors, the first DMA engine having a fixed processing overhead for injecting irrespective of the number of message descriptors stored in the Injection FIFO buffer; wherein the sending node further comprises a processor, and communication between the sending node and the receiving node is initiated by the processor modifying the Injection FIFO buffer with a put message descriptor and updating the Injection FIFO Metadata to correspond to the modified Injection FIFO buffer; the first DMA engine reads the Injection FIFO Metadata, retrieves the corresponding Injection FIFO buffer, retrieves a message described by the put message descriptor within the Injection FIFO buffer, and injects the message into the network interface; the second DMA engine receives the message from the network interface and writes the message to the second memory using information stored within the put message descriptor; and the put message descriptor comprises an injection counter, a reception counter, a message length, an injection offset, a reception offset, and a receiving node identifier.
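Claim 18 enumerates the fields of the put message descriptor. A sketch of that descriptor and of the receive-side write it enables might look as follows; the claim fixes the set of fields, while the field widths, names, and the `receive_put` helper here are illustrative assumptions:

```c
#include <stdint.h>

/* A "put" message descriptor with the fields claim 18 enumerates.
 * Widths are illustrative; only the field set comes from the claim. */
typedef struct {
    uint32_t injection_counter;   /* tracks bytes injected at the sender */
    uint32_t reception_counter;   /* tracks bytes landed at the receiver */
    uint32_t message_length;      /* total bytes to move */
    uint64_t injection_offset;    /* where the sender reads from */
    uint64_t reception_offset;    /* where the receiver writes to */
    uint32_t receiving_node;      /* receiving node identifier */
} put_descriptor_t;

/* Sketch of the receive side: the second DMA engine uses the descriptor's
 * reception offset to place the payload in the second memory, and the
 * counters let both sides detect completion without processor involvement. */
void receive_put(put_descriptor_t *d, uint8_t *second_memory,
                 const uint8_t *payload) {
    for (uint32_t i = 0; i < d->message_length; i++)
        second_memory[d->reception_offset + i] = payload[i];
    d->reception_counter += d->message_length;  /* completion bookkeeping */
}
```

The pairing of an injection counter at the sender with a reception counter at the receiver is what makes the put one-sided: each node can compare its counter against the message length to detect completion locally.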
Specification