Failover mechanisms in RDMA operations
Abstract
In remote direct memory access transfers in a multinode data processing system in which the nodes communicate with one another through communication adapters coupled to a switch or network, failures in the nodes or in the communication adapters can produce the phenomenon known as trickle traffic: stale data received from the switch or from the network that may nonetheless have all the signatures of valid packet data. The present invention addresses the trickle traffic problem in two situations: node failure and adapter failure. In the node failure situation, randomly generated keys are used to reestablish connections to the adapter while providing a mechanism for the recognition of stale packets. In the adapter failure situation, a round robin context allocation approach is used, with adapter state contexts being provided with state information which helps to identify stale packets.
51 Citations
18 Claims
1. A method for recovery from failure in a sending node in a data processing system which includes said sending node and at least one communications adapter coupled to the sending node, said method comprising:
receiving at a receiving adapter of the data processing system a data packet transferred via remote direct memory access protocol from the sending node after restarting of the sending node, the data packet comprising a data packet header with a new key generated by the sending node, after restarting thereof, based on a random number seeded key generator, the new key being provided in a data structure of the data packet header of the data packet received by said receiving adapter;
checking, by said receiving adapter, that the new key provided within the data structure of the data packet header of the data packet received by said receiving adapter matches a key loaded by a device driver into a table of a plurality of tables resident in a node in said data processing system coupled to said receiving adapter, the plurality of tables shared by the receiving adapter and one or more other adapters coupled to the node and mapping to memory buffers for remote direct memory access via the receiving adapter and the one or more other adapters, the data structure referencing the table of the plurality of tables, and the checking comprising using, by the receiving adapter, the data structure to reference the table of the plurality of tables to obtain the key loaded therein by the device driver; and
responsive to the checking indicating a failed match, dropping the received data packet at the receiving adapter and signaling to the sending node that a fatal error has occurred with respect to the data packet transferred via remote direct memory access protocol.
(Dependent claims: 2, 3, 4, 5, 6)
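Illustrative note (not part of the claims): the key check recited in claim 1 can be pictured with a minimal Python sketch. Every name here (`PacketHeader`, `Node`, `ReceivingAdapter`, `driver_load_key`, `new_key`) is hypothetical, as is the 64-bit key width. The claim does not say how the restarted sender's new key reaches the receiving side's device driver, so the sketch simply loads it directly. Signaling the fatal error is modeled as appending to a list.

```python
import secrets
from dataclasses import dataclass, field


def new_key() -> int:
    """Stand-in for the random-number-seeded key generator (64 bits assumed)."""
    return secrets.randbits(64)


@dataclass
class PacketHeader:
    table_index: int  # the header's data structure references one table
    key: int          # key generated by the sending node after restart


@dataclass
class Node:
    # One key per table; in the claim the tables also map to RDMA buffers
    # and are shared by several adapters. Only the key is modeled here.
    table_keys: dict = field(default_factory=dict)

    def driver_load_key(self, table_index: int, key: int) -> None:
        """Models the device driver loading a key into a table."""
        self.table_keys[table_index] = key


class ReceivingAdapter:
    def __init__(self, node: Node) -> None:
        self.node = node
        self.errors_signaled = []  # stand-in for signaling the sender

    def receive(self, header: PacketHeader) -> bool:
        """Return True if the packet is accepted, False if it is dropped."""
        expected = self.node.table_keys.get(header.table_index)
        if expected is None or expected != header.key:
            # Failed match: drop the packet and report a fatal error.
            self.errors_signaled.append("fatal_error")
            return False
        return True
```

In this sketch a packet that still carries the pre-restart key is dropped once the driver has loaded the new key, which is the stale-packet behavior the abstract describes for the node failure case.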
7. A method for recovery from failure in a communications adapter coupled to a node in a data processing system which includes said node and said communications adapter, said method comprising the steps of:
initiating recovery operations within said failed adapter;
acquiring, by said adapter, from said node, adapter state information from a set of available adapter state contexts used for remote direct memory access, said state contexts being supplied in round robin fashion from said set of available state contexts;
firstly checking, by said failed adapter, to insure that said acquired adapter state information is owned by a current window by checking a window identifier field in memory in said adapter; and
secondly, checking, by said failed adapter, to verify that a state context identifier key for a received remote direct memory access data packet matches a state context identifier key field in adapter memory.
(Dependent claims: 8, 9, 10, 11)
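Illustrative note (not part of the claims): the round robin supply of state contexts and the two checks recited in claim 7 might be sketched as follows. The names `StateContext`, `ContextPool`, `acquire`, and `accept_packet` are hypothetical, and the sketch simplifies by treating every context in the pool as available rather than tracking which ones are in use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StateContext:
    context_id: int
    window_id: Optional[int] = None  # window that currently owns the context
    key: int = 0                     # state context identifier key


class ContextPool:
    """Node-side pool that supplies contexts in round robin order."""

    def __init__(self, size: int) -> None:
        self.contexts = [StateContext(i) for i in range(size)]
        self.next_index = 0

    def acquire(self, window_id: int, key: int) -> StateContext:
        ctx = self.contexts[self.next_index]
        # Advance the cursor so the next request gets the next context.
        self.next_index = (self.next_index + 1) % len(self.contexts)
        ctx.window_id = window_id
        ctx.key = key
        return ctx


def accept_packet(ctx: StateContext, current_window: int, packet_key: int) -> bool:
    # First check: the context must be owned by the current window.
    if ctx.window_id != current_window:
        return False
    # Second check: the packet's context key must match the stored key.
    return ctx.key == packet_key
```

A packet passes only if both checks succeed, so a packet addressed to a context that has since been handed to another window, or that carries an outdated key, is rejected in this model.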
12. A method for determining failure in a communications adapter coupled to a node in a data processing system which includes said node and said communications adapter, said method comprising the steps of:
maintaining within said node a count of adapter recovery events for said adapter;
establishing an adapter state information context residing within said node and in said adapter, said state information context including an initial indication of said count;
transferring data packets with packet header information including count information present in said node; and
comparing said count information of said packet header information with said count included in said established adapter state information context residing in said adapter to determine that an adapter failure has occurred.
(Dependent claims: 13, 14, 15)
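Illustrative note (not part of the claims): the count comparison recited in claim 12 can be sketched in a few lines of Python. The names `AdapterContext`, `Node`, `record_recovery`, `make_header`, and `adapter_failure_detected` are hypothetical, and representing the packet header as a dictionary is an assumption made for brevity.

```python
from dataclasses import dataclass


@dataclass
class AdapterContext:
    # Count of recovery events captured when the context was established.
    recovery_count: int


class Node:
    def __init__(self) -> None:
        self.recovery_count = 0  # count of adapter recovery events

    def establish_context(self) -> AdapterContext:
        """Create a context carrying the node's current count."""
        return AdapterContext(self.recovery_count)

    def record_recovery(self) -> None:
        """Called once per adapter recovery event."""
        self.recovery_count += 1

    def make_header(self) -> dict:
        """Packet headers carry the count currently present in the node."""
        return {"recovery_count": self.recovery_count}


def adapter_failure_detected(header: dict, ctx: AdapterContext) -> bool:
    # A mismatch means a recovery event happened after the context
    # was established, so the context predates an adapter failure.
    return header["recovery_count"] != ctx.recovery_count
```

For example, a context established before a recovery event compares unequal to every header built afterwards, which is how this model flags that a failure has occurred.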
16. A multinode data processing system comprising:
a plurality of data processing nodes together with a respective plurality of communications adapters coupled respectively thereto which enable said nodes to communicate through a switch or network to which said adapters are coupled; and
programming means within said nodes and said adapters for receiving at a receiving adapter of the data processing system a data packet transferred via remote direct memory access protocol from a sending node after restarting of the sending node, the data packet comprising a data packet header with a new key generated by the sending node, after restarting thereof, based on a random number seeded key generator, the new key being provided in a data structure of the data packet header of the data packet received by said receiving adapter; and
for checking, by the receiving adapter, that the new key provided within the data structure of the data packet header of the data packet received by said receiving adapter matches a key loaded by a device driver into a table of a plurality of tables resident in the node coupled to said receiving adapter, the plurality of tables shared by the receiving adapter and one or more other adapters coupled to the node and mapping to memory buffers for remote direct memory access via the receiving adapter and the one or more other adapters, the data structure referencing the table of the plurality of tables, and the checking comprising using, by the receiving adapter, the data structure to reference the table of the plurality of tables to obtain the key loaded therein by the device driver; and
responsive to said checking indicating a failed match, dropping the received data packet at the receiving adapter and signaling to the sending node that a fatal error has occurred with respect to the data packet transferred via remote direct memory access protocol.
17. A multinode data processing system comprising:
a plurality of data processing nodes together with a respective plurality of communications adapters coupled respectively thereto which enable said nodes to communicate through a switch or network to which said adapters are coupled; and
programming means within said nodes and said adapters for;
initiating recovery operations within a failed adapter;
for acquiring, by said failed adapter, from its respectively coupled node, adapter state information from a set of available adapter state contexts used for remote direct memory access, said state contexts being supplied in round robin fashion from said set of available state contexts; and
for firstly checking, by said failed adapter, to insure that said acquired adapter state information is owned by a current window by checking a window identifier field in said adapter state information in memory in said adapter; and
secondly, for checking, by said failed adapter, to verify that a state context identifier key for a received remote direct memory access data packet matches a state context identifier key field in adapter memory.
18. A multinode data processing system comprising:
a plurality of data processing nodes together with a respective plurality of communications adapters coupled respectively thereto which enable said nodes to communicate through a switch or network to which said adapters are coupled; and
programming means within said nodes and said adapters for;
maintaining, within said nodes, counts of adapter recovery events for respective ones of said adapters; and
for establishing adapter state information contexts residing within said nodes and in said adapters, said adapter state information contexts including respective ones of said counts of adapter recovery events; and
for transferring data packets with packet header information including count information present in the nodes; and
for comparing said count information of said packet header information with said counts of adapter recovery events included in said established adapter state information contexts residing in said adapters to determine that an adapter failure has occurred.
Specification