
Providing improved message handling performance in computer systems utilizing shared network devices

  • US 8,166,146 B2
  • Filed: 09/29/2008
  • Issued: 04/24/2012
  • Est. Priority Date: 09/29/2008
  • Status: Expired due to Fees
First Claim

1. A parallel computer system, comprising:

  • an input/output (I/O) node;

    a plurality of compute nodes coupled to each other and to the I/O node via a collective network, each compute node comprising:

    a compute logic block having a plurality of processors, wherein one of the processors runs a first thread, and wherein another of the processors runs a main process of an application program that spawned the first thread to run on said one of the processors;

    a memory array block shared by the processors;

    a network logic block having one or more communication blocks, wherein at least one of the communication blocks comprises a collective network device for facilitating communication of messages between the compute node and the I/O node, each message comprising a plurality of packets, wherein the collective network has a point-to-point mode that allows messages to be sent to a specific node in the collective network, wherein when sending a message to one of the compute nodes from the I/O node, all of the packets in the message are sent together so a complete message with the packets in order is delivered to the compute node, and wherein each of the messages has a one packet header that includes a thread ID identifying a thread to which the message is to be delivered;

    wherein when receiving a message at the compute node, the compute node performs the steps of:

    (a) obtaining a lock on the network device;

    (b) checking a shared storage location of the memory array block for a one packet header containing a thread ID identifying the first thread to see if a message is pending for the first thread;

    (c) if a message is pending for the first thread based on the checking step (b), receiving the remaining packets in the message directly to a user's buffer, unlocking the network device, and returning;

    (d) if no message is pending for the first thread based on the checking step (b), receiving a one packet header of a message from the network device;

    (e) if the one packet header received in step (d) indicates that the message is for the first thread, receiving the remaining packets in the message directly to the user's buffer, unlocking the network device, and returning;

    (f) if the one packet header received in step (d) indicates that the message is for a thread other than the first thread, updating the shared storage location of the memory array block with a thread ID of the other thread, unlocking the network device, waiting for a time out to expire, obtaining a lock on the network device, and repeating from the checking step (b);

    wherein an I/O node daemon runs on the I/O node, wherein a compute node kernel (CNK) runs on each of the processors, and wherein the compute node operates in at least one of a symmetric multi-processor (SMP) mode and a dual mode, comprising:

    when the compute node operates in the SMP mode, a first one of the processors runs a main process of an application program in the SMP mode, wherein the first thread is spawned to run on a second one of the processors by the application program's main process running on the first processor, wherein steps (a)-(f) are performed by the CNK running on the second processor, and wherein a third one of the processors runs a second thread spawned by the application program's main process running on the first processor;

    when the compute node operates in the dual mode, a first one and a second one of the processors each runs a main process of an application program in the dual mode, wherein the first thread is spawned to run on a third one of the processors by the application program's main process running on the first processor, wherein steps (a)-(f) are performed by the CNK running on the third processor, and wherein a fourth one of the processors runs a second thread spawned by the application program's main process running on the second processor.
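
The receive procedure of steps (a)-(f) can be illustrated with a minimal Python sketch. It is not the patented implementation, and every name in it (`Header`, `ComputeNode`, `receive`, `num_packets`, `timeout`) is hypothetical. It makes three assumptions the claim does not state: the shared storage location holds the whole one-packet header rather than only the thread ID, so the payload length is still known later; a thread that finds a header addressed to another thread in shared storage goes straight to the unlock-and-wait part of step (f), because the packets still on the device belong to that other thread; and a header is always available on the device when step (d) runs.

```python
import threading
import time
from collections import deque
from dataclasses import dataclass


@dataclass
class Header:
    thread_id: int     # thread the message is addressed to
    num_packets: int   # payload packets after the header (assumed field)


class ComputeNode:
    """Hypothetical model: one device FIFO, one lock, one shared header slot."""

    def __init__(self, packets):
        self.device_lock = threading.Lock()  # lock on the network device
        self.fifo = deque(packets)           # packets arrive complete and in order
        self.pending = None                  # shared storage location


def receive(node, my_tid, user_buffer, timeout=0.001):
    node.device_lock.acquire()                   # (a) lock the network device
    while True:
        hdr = node.pending                       # (b) check shared storage
        if hdr is None:
            hdr = node.fifo.popleft()            # (d) read a one-packet header
        elif hdr.thread_id == my_tid:
            node.pending = None                  # pending message is ours: claim it
        if hdr.thread_id == my_tid:              # (c) or (e)
            for _ in range(hdr.num_packets):     # payload goes straight to the user's buffer
                user_buffer.append(node.fifo.popleft())
            node.device_lock.release()
            return
        node.pending = hdr                       # (f) record the header for its owner
        node.device_lock.release()
        time.sleep(timeout)                      # wait for the time out to expire
        node.device_lock.acquire()               # re-lock and repeat from (b)


# Two threads share one device; the message for thread 2 arrives first.
node = ComputeNode([Header(2, 2), "b0", "b1", Header(1, 1), "a0"])
buf1, buf2 = [], []
workers = [
    threading.Thread(target=receive, args=(node, 1, buf1), daemon=True),
    threading.Thread(target=receive, args=(node, 2, buf2), daemon=True),
]
for w in workers:
    w.start()
for w in workers:
    w.join(5)
```

In this sketch the payload packets are never staged in an intermediate buffer: whichever thread reads a header meant for another thread only parks that header in the shared slot, and the addressed thread later drains the payload into its own buffer while holding the lock.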
