PREVENTING MESSAGING QUEUE DEADLOCKS IN A DMA ENVIRONMENT
First Claim
1. A method for managing message queues in a parallel computing system having a plurality of compute nodes, comprising:
determining that a first queue, on a first compute node, storing a set of message descriptors has become full, wherein a direct memory access controller (DMA) is configured to inject message descriptors into the first queue; and
generating an interrupt delivered to an interrupt handler, wherein the interrupt handler is configured to perform the steps of:
stopping the DMA controller;
generating a second queue, wherein the second queue is larger than the first queue;
swapping the first queue with the second queue such that the DMA controller is configured to inject message descriptors into the second queue;
copying the set of message descriptors from the first queue to the second queue; and
restarting the DMA controller.
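The handler steps recited above can be illustrated with a minimal software sketch. This is a toy model under stated assumptions, not the claimed implementation: `DescriptorQueue`, `ToyDmaController`, `grow_queue_handler`, and the `growth_factor` of 2 are all hypothetical names and values, and a plain callback stands in for the hardware interrupt.

```python
from dataclasses import dataclass, field


@dataclass
class DescriptorQueue:
    """Hypothetical fixed-capacity queue of message descriptors."""
    capacity: int
    items: list = field(default_factory=list)

    def is_full(self) -> bool:
        return len(self.items) >= self.capacity


class ToyDmaController:
    """Hypothetical stand-in for a DMA controller; not a real hardware API."""

    def __init__(self, queue, interrupt_handler):
        self.queue = queue
        self.interrupt_handler = interrupt_handler
        self.running = True

    def inject(self, descriptor):
        if not self.running:
            raise RuntimeError("DMA controller is stopped")
        self.queue.items.append(descriptor)
        # Determine that the queue has become full, then "generate an interrupt"
        # (modelled here as a direct call to the handler).
        if self.queue.is_full():
            self.interrupt_handler(self)


def grow_queue_handler(dma, growth_factor=2):
    """Interrupt handler following the claimed order of steps."""
    dma.running = False                                       # stop the DMA controller
    first = dma.queue
    second = DescriptorQueue(first.capacity * growth_factor)  # generate a larger second queue
    dma.queue = second                                        # swap: DMA now injects into the second queue
    second.items.extend(first.items)                          # copy descriptors from first to second
    dma.running = True                                        # restart the DMA controller
```

In this sketch the controller never observes a full queue after the handler returns, because the replacement queue always has free slots; how large the second queue should be is a design choice the claim leaves open.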
Abstract
Embodiments of the invention may be used to manage message queues in a parallel computing environment to prevent message queue deadlock. A direct memory access controller of a compute node may determine when a messaging queue is full. In response, the DMA may generate an interrupt. An interrupt handler may stop the DMA and swap all descriptors from the full messaging queue into a larger queue (or enlarge the original queue). The interrupt handler then restarts the DMA. Alternatively, the interrupt handler stops the DMA, allocates a memory block to hold queue data, and then moves descriptors from the full messaging queue into the allocated memory block. The interrupt handler then restarts the DMA. During a normal messaging advance cycle, a messaging manager attempts to inject the descriptors in the memory block into other messaging queues until the descriptors have all been processed.
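The alternative described in the abstract, where descriptors are moved into an allocated memory block and re-injected during later advance cycles, might look roughly like the following. This is a simplified sketch: `BoundedQueue`, `ToyDma`, `spill_handler`, and `advance` are hypothetical names, a `deque` stands in for the allocated memory block, and the policy of picking the first queue with room is an assumption.

```python
from collections import deque


class BoundedQueue:
    """Hypothetical fixed-capacity messaging queue."""

    def __init__(self, capacity, items=()):
        self.capacity = capacity
        self.items = deque(items)

    def has_room(self):
        return len(self.items) < self.capacity


class ToyDma:
    """Hypothetical stand-in for a DMA controller with a run/stop flag."""

    def __init__(self):
        self.running = True


def spill_handler(dma, full_queue):
    """Interrupt handler: move all descriptors out of the full queue."""
    dma.running = False               # stop the DMA
    block = deque(full_queue.items)   # "allocate" a memory block holding the queue data
    full_queue.items.clear()          # the original queue is now empty
    dma.running = True                # restart the DMA
    return block


def advance(queues, block):
    """One advance cycle: inject spilled descriptors into queues that have room.

    Returns the number of descriptors still waiting in the block.
    """
    while block:
        target = next((q for q in queues if q.has_room()), None)
        if target is None:
            break                     # no space anywhere; retry on the next cycle
        target.items.append(block.popleft())
    return len(block)
```

A messaging manager would call `advance` on each cycle until it returns 0, at which point the memory block can be released.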
24 Claims
1. A method for managing message queues in a parallel computing system having a plurality of compute nodes, comprising:
determining that a first queue, on a first compute node, storing a set of message descriptors has become full, wherein a direct memory access controller (DMA) is configured to inject message descriptors into the first queue; and
generating an interrupt delivered to an interrupt handler, wherein the interrupt handler is configured to perform the steps of:
stopping the DMA controller;
generating a second queue, wherein the second queue is larger than the first queue;
swapping the first queue with the second queue such that the DMA controller is configured to inject message descriptors into the second queue;
copying the set of message descriptors from the first queue to the second queue; and
restarting the DMA controller.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9

10. A computer-readable storage medium containing a program which, when executed, performs an operation for managing message queues in a parallel computing system having a plurality of compute nodes, the operation comprising:
determining that a first queue, on a first compute node, storing a set of message descriptors has become full, wherein a direct memory access controller (DMA) is configured to inject message descriptors into the first queue; and
generating an interrupt delivered to an interrupt handler, wherein the interrupt handler is configured to perform the steps of:
stopping the DMA controller;
generating a second queue, wherein the second queue is larger than the first queue;
swapping the first queue with the second queue such that the DMA controller is configured to inject message descriptors into the second queue;
copying the set of message descriptors from the first queue to the second queue; and
restarting the DMA controller.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18

19. A parallel computing system, comprising:
a plurality of compute nodes, each having at least a processor, a memory and a direct memory access controller (DMA), wherein the plurality of compute nodes are configured to move messages between two compute nodes of the plurality, and wherein the DMA on a first compute node is configured to:
determine that a first queue, on the first compute node, storing a set of message descriptors has become full, and
generate an interrupt delivered to an interrupt handler on the first compute node, wherein the interrupt handler is configured to perform the steps of:
stopping the DMA controller;
generating a second queue, wherein the second queue is larger than the first queue;
swapping the first queue with the second queue such that the DMA controller is configured to inject message descriptors into the second queue;
copying the set of message descriptors from the first queue to the second queue; and
restarting the DMA controller.
Dependent claims: 20, 21, 22, 23, 24
Specification