PROVIDING IMPROVED MESSAGE HANDLING PERFORMANCE IN COMPUTER SYSTEMS UTILIZING SHARED NETWORK DEVICES
First Claim
1. A parallel computer system, comprising:
an input/output (I/O) node;
a plurality of compute nodes coupled to each other and to the I/O node via a collective network, each compute node comprising:
a compute logic block having a plurality of processors, wherein one of the processors runs a first thread;
a memory array block shared by the processors;
a network logic block having one or more communication blocks, wherein at least one of the communication blocks comprises a collective network device for facilitating communication of messages between the compute node and the I/O node, each message comprising a plurality of packets;
wherein when receiving a message at the compute node, the compute node performs the steps of:
(a) obtaining a lock on the network device;
(b) checking a shared storage location of the memory array block to see if a message is pending for the first thread;
(c) if a message is pending for the first thread based on the checking step (b), receiving the remaining packets in the message directly to a user's buffer, unlocking the network device, and returning;
(d) if no message is pending for the first thread based on the checking step (b), receiving at least one packet of a message from the network device;
(e) if the at least one packet received in step (d) indicates that the message is for the first thread, receiving the remaining packets in the message directly to the user's buffer, unlocking the network device, and returning;
(f) if the at least one packet received in step (d) indicates that the message is for a thread other than the first thread, updating the shared storage location of the memory array block with a thread id of the other thread, unlocking the network device, waiting for a time out to expire, obtaining a lock on the network device, and repeating from the checking step (b).
Abstract
In a massively parallel computer system embodiment, when receiving a message at a compute node from an input/output node, the compute node performs the steps of: obtaining a lock on a collective network device; checking a shared storage location for a message pending for a thread; if such a message is pending, receiving the message's remaining packets directly to a user's buffer, unlocking, and returning; if no such message is pending, receiving one packet from the network device; if the packet indicates that the message is for the thread, receiving the message's remaining packets directly to the user's buffer, unlocking, and returning; and if the packet indicates that the message is for another thread, updating the shared storage location with a thread id of the other thread, unlocking, waiting for a time out, locking, and repeating from the checking step. Accordingly, data copying is eliminated with an attendant performance benefit.
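The receive steps summarized in the abstract can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the patented implementation: the names `Header`, `Device`, `receive`, and `demo` are hypothetical, the network device is modeled as an in-memory FIFO, the first packet of each message is assumed to be a header carrying the destination thread id and the number of payload packets, and the shared storage location is assumed to hold that whole header rather than only the thread id. The sketch also assumes that a thread backs off, rather than reading from the device, whenever the shared location already names another thread.

```python
import threading
import time
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Header:
    """First packet of a message (hypothetical layout, assumed for this sketch)."""
    thread_id: int   # destination thread
    n_packets: int   # number of payload packets that follow


@dataclass
class Device:
    """Stand-in for the shared network device and the shared storage location."""
    fifo: deque
    lock: threading.Lock = field(default_factory=threading.Lock)
    pending: Optional[Header] = None   # shared storage location


def receive(dev, my_id, user_buffer, timeout=0.001):
    dev.lock.acquire()                           # obtain the lock
    while True:
        header = dev.pending                     # check the shared storage location
        if header is None and dev.fifo:
            header = dev.fifo.popleft()          # nothing pending: read one packet
        if header is not None and header.thread_id == my_id:
            dev.pending = None
            for _ in range(header.n_packets):    # remaining packets go straight
                user_buffer.extend(dev.fifo.popleft())  # into the user's buffer
            dev.lock.release()
            return
        dev.pending = header                     # message is for another thread
        dev.lock.release()                       # unlock, wait for the time out,
        time.sleep(timeout)
        dev.lock.acquire()                       # lock again and repeat the check


def demo():
    # Two messages queued on the device: one for thread 2, then one for thread 1.
    dev = Device(deque([Header(2, 2), b"cc", b"dd", Header(1, 1), b"ab"]))
    bufs = {1: bytearray(), 2: bytearray()}
    threads = [threading.Thread(target=receive, args=(dev, tid, bufs[tid]))
               for tid in bufs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return bufs
```

Calling `demo()` runs two receiver threads against the same simulated device. Whichever thread reads the header addressed to the other records it in the shared location and backs off, so each payload is written once, into its owner's buffer, with no intermediate copy.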
63 Citations
21 Claims
1. A parallel computer system, comprising:
an input/output (I/O) node;
a plurality of compute nodes coupled to each other and to the I/O node via a collective network, each compute node comprising:
a compute logic block having a plurality of processors, wherein one of the processors runs a first thread;
a memory array block shared by the processors;
a network logic block having one or more communication blocks, wherein at least one of the communication blocks comprises a collective network device for facilitating communication of messages between the compute node and the I/O node, each message comprising a plurality of packets;
wherein when receiving a message at the compute node, the compute node performs the steps of:
(a) obtaining a lock on the network device;
(b) checking a shared storage location of the memory array block to see if a message is pending for the first thread;
(c) if a message is pending for the first thread based on the checking step (b), receiving the remaining packets in the message directly to a user's buffer, unlocking the network device, and returning;
(d) if no message is pending for the first thread based on the checking step (b), receiving at least one packet of a message from the network device;
(e) if the at least one packet received in step (d) indicates that the message is for the first thread, receiving the remaining packets in the message directly to the user's buffer, unlocking the network device, and returning;
(f) if the at least one packet received in step (d) indicates that the message is for a thread other than the first thread, updating the shared storage location of the memory array block with a thread id of the other thread, unlocking the network device, waiting for a time out to expire, obtaining a lock on the network device, and repeating from the checking step (b).
Dependent claims: 2, 3, 4, 5, 6
7. A computer-implemented method for providing improved message handling performance in a parallel computer system utilizing a shared network device, wherein the parallel computer system comprises an input/output (I/O) node and a plurality of compute nodes coupled to each other and to the I/O node via a collective network, each compute node comprises:
a compute logic block having a plurality of processors, wherein one of the processors runs a first thread;
a memory array block shared by the processors;
a network logic block having a collective network device for facilitating communication of messages between the compute node and the I/O node, each message comprising a plurality of packets;
wherein when receiving a message at the compute node, the compute node performs the computer-implemented method comprising the steps of:
(a) obtaining a lock on the network device;
(b) checking a shared storage location of the memory array block to see if a message is pending for the first process;
(c) if a message is pending for the first thread based on the checking step (b), receiving the remaining packets in the message directly to a user's buffer, unlocking the network device, and returning;
(d) if no message is pending for the first thread based on the checking step (b), receiving at least one packet of a message from the network device;
(e) if the at least one packet received in step (d) indicates that the message is for the first thread, receiving the remaining packets in the message directly to the user's buffer, unlocking the network device, and returning;
(f) if the at least one packet received in step (d) indicates that the message is for a thread other than the first thread, updating the shared storage location of the memory array block with a thread id of the other thread, unlocking the network device, waiting for a time out to expire, obtaining a lock on the network device, and repeating from the checking step (b).
Dependent claims: 8, 9, 10, 11
12. A computer readable medium for providing improved message handling performance in a parallel computer system utilizing a shared network device, wherein the parallel computer system comprises an input/output (I/O) node and a plurality of compute nodes coupled to each other and to the I/O node via a collective network, each compute node comprises:
a compute logic block having a plurality of processors, wherein one of the processors runs a first thread;
a memory array block shared by the processors; and
a network logic block having a collective network device for facilitating communication of messages between the compute node and the I/O node, each message comprising a plurality of packets;
the computer readable medium comprising instructions that when executed by one or more of the processors of the compute node cause the compute node when receiving a message to perform the steps of:
(a) obtaining a lock on the network device;
(b) checking a shared storage location of the memory array block to see if a message is pending for the first thread;
(c) if a message is pending for the first thread based on the checking step (b), receiving the remaining packets in the message directly to a user's buffer, unlocking the network device, and returning;
(d) if no message is pending for the first thread based on the checking step (b), receiving at least one packet of a message from the network device;
(e) if the at least one packet received in step (d) indicates that the message is for the first thread, receiving the remaining packets in the message directly to the user's buffer, unlocking the network device, and returning;
(f) if the at least one packet received in step (d) indicates that the message is for a thread other than the first thread, updating the shared storage location of the memory array block with a thread id of the other thread, unlocking the network device, waiting for a time out to expire, obtaining a lock on the network device, and repeating from the checking step (b).
Dependent claims: 13, 14, 15, 16
17. A computer-implemented method for providing improved message handling performance in a distributed computer system utilizing a shared network device, wherein the parallel computer system comprises a control system and a plurality of compute nodes coupled to the control system via a network, each compute node comprises:
a compute logic block having a plurality of processors, wherein one of the processors runs a first thread;
a memory array block shared by the processors;
a network logic block having a network device for facilitating communication of messages between the compute node and the control system, each message comprising a plurality of packets;
wherein when receiving a message at the compute node from the control system, the compute node performs the computer-implemented method comprising the steps of:
(a) obtaining a lock on the network device;
(b) checking a shared storage location of the memory array block to see if a message is pending for the first thread;
(c) if a message is pending for the first thread based on the checking step (b), receiving the remaining packets in the message directly to a user's buffer, unlocking the network device, and returning;
(d) if no message is pending for the first thread based on the checking step (b), receiving at least one packet of a message from the network device;
(e) if the at least one packet received in step (d) indicates that the message is for the first thread, receiving the remaining packets in the message directly to the user's buffer, unlocking the network device, and returning;
(f) if the at least one packet received in step (d) indicates that the message is for a thread other than the first thread, updating the shared storage location of the memory array block with a thread id of the other thread, unlocking the network device, waiting for a time out to expire, obtaining a lock on the network device, and repeating from the checking step (b).
Dependent claims: 18, 19, 20, 21
Specification