NON-DISRUPTIVE FAILOVER OF RDMA CONNECTION
Abstract
A novel RDMA connection failover technique that minimizes disruption to upper subsystem modules (executed on a computer node), which create requests for data transfer. A new failover virtual layer performs failover of an RDMA connection in error so that the upper subsystem that created a request does not have knowledge of an error (which is recoverable in software and hardware), or of a failure on the RDMA connection due to the error. Since the upper subsystem does not have knowledge of a failure on the RDMA connection or of a performed failover of the RDMA connection, the upper subsystem continues providing requests to the failover virtual layer without interruption, thereby minimizing downtime of the data transfer activity.
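The idea in the abstract can be illustrated with a minimal sketch: a layer accepts requests, absorbs an error on one path, and completes the work on another path, so the caller only ever sees successful results. Everything named here (`Path`, `PathError`, `FailoverVirtualLayer.submit`, the path names) is a hypothetical stand-in for illustration and is not taken from the patent or from any RDMA library.

```python
class PathError(Exception):
    """Raised by a simulated connection path that is in error."""


class Path:
    """Hypothetical stand-in for one physical RDMA connection."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def transfer(self, payload):
        if not self.healthy:
            raise PathError(self.name)
        return f"{payload} via {self.name}"


class FailoverVirtualLayer:
    """Accepts requests and hides path errors from the caller."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.failovers = 0  # internal counter, kept only for illustration

    def submit(self, payload):
        for path in self.paths:
            try:
                return path.transfer(payload)
            except PathError:
                # Absorb the error and try the next path; the caller is not told.
                self.failovers += 1
        raise RuntimeError("no healthy path left")


primary, standby = Path("path-A"), Path("path-B")
layer = FailoverVirtualLayer([primary, standby])

results = [layer.submit("req-1")]
primary.healthy = False  # the connection sustains an error
results.append(layer.submit("req-2"))  # the upper module keeps submitting
results.append(layer.submit("req-3"))
```

The caller's code path is identical before and after the error; only the layer's internal counter records that a failover took place.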
48 Citations
25 Claims
1. A system for performing non-disruptive failover of an RDMA connection between a first computer node and a second computer node, the system comprising:
- an upper subsystem module at the first computer node configured to create a request; and
- a failover virtual layer in communication with the upper subsystem module at the first computer node, the failover virtual layer configured to establish the RDMA connection between the first computer node and the second computer node, to detect an error on the RDMA connection, and to perform a failover of the RDMA connection so that the upper subsystem module does not have knowledge of the error on the RDMA connection and continues providing requests to the failover virtual layer while the RDMA connection sustains the error.
Dependent claims: 2, 3, 4.
5. A system for performing non-disruptive failover of an RDMA connection between a first computer node and a second computer node, the system comprising:
- an upper subsystem module at the first computer node configured to create a request, the request including a source memory address of data to be read over an RDMA connection and a destination memory address at which to store the data; and
- a failover virtual layer configured to create at least one virtual queue structure (VQS), the VQS being accessible by the upper subsystem module and not accessible by an interconnect adapter at the first computer node, the failover virtual layer further configured to create at least two physical queue structures (PQS) associated with a VQS, the PQS not being accessible by the upper subsystem module.
Dependent claims: 6, 7, 8, 9, 10, 11, 12, 13.
14. A method for performing an RDMA transfer between a first and a second computer node, the method comprising:
- creating at least one virtual queue structure (VQS) in a memory, the VQS for storing data transfer requests and completion status;
- providing a handle of the created VQS to an upper subsystem module so that the upper subsystem module is adapted to access the VQS using the handle while an interconnect adapter at the first computer node is not adapted to access the VQS, the upper subsystem module being configured to provide data transfer requests;
- creating physical queue structures (PQS) in the memory;
- registering the created PQS with the interconnect adapter so that the interconnect adapter is adapted to access the PQS and the upper subsystem is adapted not to access the PQS;
- associating the created VQS with the two or more created PQS;
- connecting the created VQS at the first computer node with a VQS at the second computer node; and
- connecting the created PQS at the first computer node with a PQS at the second computer node.
Dependent claims: 15.
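The setup steps of claim 14 can be sketched as a small in-memory model. This is only an illustration under stated assumptions: `InterconnectAdapter`, `QueueStructure`, `FailoverVirtualLayer`, the integer handle, and the `peer` attribute are all hypothetical names, and "registering" is modeled as adding a queue to a set rather than calling any real adapter API.

```python
import itertools


class InterconnectAdapter:
    """Hypothetical adapter that may only touch queues registered with it."""

    def __init__(self):
        self._registered = set()

    def register(self, queue):
        self._registered.add(id(queue))

    def can_access(self, queue):
        return id(queue) in self._registered


class QueueStructure:
    """Holds data transfer requests and completion status."""

    def __init__(self, kind):
        self.kind = kind  # "VQS" or "PQS"
        self.requests = []
        self.completions = []
        self.peer = None  # matching queue on the other node


class FailoverVirtualLayer:
    def __init__(self, adapter):
        self.adapter = adapter
        self._next_handle = itertools.count(1)
        self.vqs_by_handle = {}
        self.pqs_for_vqs = {}

    def create_vqs(self, n_pqs=2):
        """Create a VQS plus its PQS and return an opaque handle to the VQS."""
        handle = next(self._next_handle)
        self.vqs_by_handle[handle] = QueueStructure("VQS")  # never registered
        pqs_list = [QueueStructure("PQS") for _ in range(n_pqs)]
        for pqs in pqs_list:
            self.adapter.register(pqs)  # only PQS are visible to the adapter
        self.pqs_for_vqs[handle] = pqs_list  # associate the VQS with its PQS
        return handle

    def connect(self, handle, remote, remote_handle):
        """Pair the local VQS and each local PQS with its remote counterpart."""
        local_vqs = self.vqs_by_handle[handle]
        remote_vqs = remote.vqs_by_handle[remote_handle]
        local_vqs.peer, remote_vqs.peer = remote_vqs, local_vqs
        pairs = zip(self.pqs_for_vqs[handle], remote.pqs_for_vqs[remote_handle])
        for local_pqs, remote_pqs in pairs:
            local_pqs.peer, remote_pqs.peer = remote_pqs, local_pqs


node1 = FailoverVirtualLayer(InterconnectAdapter())
node2 = FailoverVirtualLayer(InterconnectAdapter())
h1 = node1.create_vqs()
h2 = node2.create_vqs()
node1.connect(h1, node2, h2)
```

In this model the upper subsystem module would hold only the integer handle, so it has no reference through which to reach a PQS, while the adapter can reach only what was registered with it.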
16. A method for performing non-disruptive failover of an RDMA connection between a first computer node and a second computer node, the method comprising:
- receiving a request from an upper subsystem module, wherein the request may include source and destination memory addresses;
- posting the request to a virtual queue structure (VQS) associated with two or more physical queue structures (PQS) and posting the request to a first PQS;
- accessing the first PQS to obtain the source and destination addresses of the data;
- sending the data over the RDMA connection between the first computer node and the second computer node;
- responsive to determining that an error occurred on the RDMA connection, identifying a second PQS associated with the VQS, the second PQS not in error;
- moving the request from the first PQS to the second PQS associated with the VQS;
- using the second PQS to perform data transfer over the RDMA connection without notifying the upper subsystem module of the error so that the upper subsystem module continues posting requests to the VQS;
- posting a completion status in the second PQS at the first computer node; and
- responsive to the completion status being a successful one, moving the completion status to the VQS, thereby making the upper subsystem module aware of the successful completion.
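The flow of claim 16 can be sketched with plain queues. The following is a simulation under stated assumptions, not the claimed implementation: `Request`, `PQS`, `VQS`, `post`, `process`, the `in_error` flag, and the dictionaries standing in for node memory are all hypothetical, and the "transfer" is a dictionary copy rather than an RDMA operation.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Request:
    src: int  # source memory address
    dst: int  # destination memory address


@dataclass
class PQS:
    name: str
    in_error: bool = False
    requests: deque = field(default_factory=deque)
    completions: deque = field(default_factory=deque)


@dataclass
class VQS:
    pqs_list: list
    active: int = 0  # index of the PQS currently in use
    requests: deque = field(default_factory=deque)
    completions: deque = field(default_factory=deque)


def post(vqs, request):
    """Post the request to the VQS and to the currently active PQS."""
    vqs.requests.append(request)
    vqs.pqs_list[vqs.active].requests.append(request)


def process(vqs, local_memory, remote_memory):
    """Drain the active PQS, failing over to a healthy PQS on error."""
    active = vqs.pqs_list[vqs.active]
    while active.requests:
        if active.in_error:
            # Identify a second PQS that is not in error.
            spare = next((i for i, p in enumerate(vqs.pqs_list)
                          if not p.in_error), None)
            if spare is None:
                raise RuntimeError("no healthy PQS available")
            # Move the outstanding requests; the VQS owner is not notified.
            vqs.pqs_list[spare].requests.extend(active.requests)
            active.requests.clear()
            vqs.active = spare
            active = vqs.pqs_list[spare]
            continue
        req = active.requests.popleft()
        remote_memory[req.dst] = local_memory[req.src]  # simulated transfer
        active.completions.append((req, "success"))
    # Move successful completions up to the VQS, where the upper module sees them.
    while active.completions:
        req, status = active.completions.popleft()
        if status == "success":
            vqs.requests.remove(req)
            vqs.completions.append((req, status))


local_memory = {0x10: b"log-1", 0x20: b"log-2"}
remote_memory = {}
vqs = VQS([PQS("pqs-0"), PQS("pqs-1")])

post(vqs, Request(0x10, 0x100))
vqs.pqs_list[0].in_error = True  # error occurs on the first path
post(vqs, Request(0x20, 0x200))  # the upper module keeps posting
process(vqs, local_memory, remote_memory)
```

Only successful completions ever reach the VQS in this sketch, which is how the error stays invisible to whoever holds the VQS.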
17. A method for performing non-disruptive failover of an RDMA connection between a first computer node and a second computer node, the method comprising:
- receiving a data transfer request from an upper subsystem module; and
- providing a failover virtual layer at the first and the second nodes, the failover virtual layer at each node being configured to establish an RDMA connection between the first and the second computer node, to receive the data transfer request, and to perform a failover of an RDMA transfer without notifying the upper subsystem module so that the upper subsystem module continues providing data transfer requests to the failover virtual layer.
Dependent claims: 18, 19, 20, 21, 22, 23.
24. A high availability (HA) cluster system for performing non-disruptive failover of an RDMA connection between a first storage node and a second storage node engaged in a transfer of write logs, the system comprising:
- an upper subsystem module at the first computer node configured to create a request for transfer of the write logs, the request including a source memory address of the write logs to be transferred over an RDMA connection and a destination memory address at the second computer node; and
- a failover virtual layer in communication with the upper subsystem module at the first computer node, the failover virtual layer configured to establish the RDMA connection between the first computer node and the second computer node, obtain the write logs from the memory and store the write logs to a memory of the second storage node, detect an error on the RDMA connection, and perform a failover of the RDMA connection so that the upper subsystem module does not have knowledge of the error on the RDMA connection and continues providing requests to the failover virtual layer while the RDMA connection sustains the error.
25. A computer-program product comprising a computer-readable medium having computer program code embodied thereon for performing an RDMA transfer between a first and a second computer node, the computer program code adapted to:
- create at least one virtual queue structure (VQS) in a memory, the VQS for storing data transfer requests and completion status;
- provide a handle of the created VQS to an upper subsystem module so that the upper subsystem module is adapted to access the VQS using the handle while an interconnect adapter at the first computer node is not adapted to access the VQS, the upper subsystem module being configured to provide data transfer requests;
- create physical queue structures (PQS) in the memory;
- register the created PQS with the interconnect adapter so that the interconnect adapter is adapted to access the PQS and the upper subsystem is adapted not to access the PQS;
- associate the created VQS with the two or more created PQS;
- connect the created VQS at the first computer node with a VQS at the second computer node; and
- connect the created PQS at the first computer node with a PQS at the second computer node.
Specification