NON-DISRUPTIVE FAILOVER OF RDMA CONNECTION
First Claim
1. A system for performing non-disruptive failover of an RDMA connection between a first computer node and a second computer node, the system comprising:
- an upper subsystem module, associated with a first computer node, configured to:
  create a request comprising a source memory address of data that is to be read over an RDMA connection and a destination memory address at which the data is to be stored; and
- a failover virtual layer configured to:
  establish the RDMA connection between the first computer node and a second computer node based upon the request; and
  responsive to detecting an error on the RDMA connection:
    perform a failover of the RDMA connection; and
    during the error on the RDMA connection, accept one or more requests from the upper subsystem module.
Abstract
A novel RDMA connection failover technique minimizes disruption to upper subsystem modules (executed on a computer node), which create requests for data transfer. A new failover virtual layer performs failover of an RDMA connection in error so that the upper subsystem that created a request does not have knowledge of an error (which is recoverable in software and hardware), or of a failure on the RDMA connection due to the error. Since the upper subsystem does not have knowledge of a failure on the RDMA connection or of a performed failover of the RDMA connection, the upper subsystem continues providing requests to the failover virtual layer without interruption, thereby minimizing downtime of the data transfer activity.
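The behavior described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation and uses no real RDMA API: `ReadRequest`, `FakeConnection`, `LinkError`, and `FailoverVirtualLayer` are hypothetical names introduced only for illustration, and the simulated connection simply fails after a fixed number of reads. The point shown is that the layer accepts every request into a queue first, so the caller is never exposed to the link error, and queued work is replayed on a standby connection after failover.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadRequest:
    source_addr: int  # address the data is read from
    dest_addr: int    # address at which the data is to be stored


class LinkError(Exception):
    """Hypothetical error raised by the simulated connection."""


class FakeConnection:
    """Stand-in for an RDMA connection (assumption: no real RDMA calls)."""

    def __init__(self, name, fail_after=None):
        self.name = name
        self.fail_after = fail_after  # fail once this many reads completed
        self.completed = []

    def read(self, request):
        if self.fail_after is not None and len(self.completed) >= self.fail_after:
            raise LinkError(self.name)
        self.completed.append(request)


class FailoverVirtualLayer:
    """Hides connection errors from the upper subsystem."""

    def __init__(self, connections):
        self._standby = deque(connections)
        self._active = self._standby.popleft()
        self._pending = deque()

    def submit(self, request):
        # Requests are always accepted, even while the link is in error.
        self._pending.append(request)
        self._drain()

    def _drain(self):
        while self._pending:
            try:
                self._active.read(self._pending[0])
            except LinkError:
                if not self._standby:
                    return  # keep requests queued until a path exists
                self._active = self._standby.popleft()  # fail over
                continue  # retry the same request on the new connection
            self._pending.popleft()


# The upper subsystem just keeps submitting; it never sees LinkError.
primary = FakeConnection("primary", fail_after=2)
backup = FakeConnection("backup")
layer = FailoverVirtualLayer([primary, backup])
for i in range(5):
    layer.submit(ReadRequest(source_addr=0x1000 + i, dest_addr=0x2000 + i))
```

In this run the first two requests complete on the primary connection and the remaining three complete on the backup, in submission order, without the submitting loop handling any exception.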
258 Citations
20 Claims
1. A system for performing non-disruptive failover of an RDMA connection between a first computer node and a second computer node, the system comprising:
- an upper subsystem module, associated with a first computer node, configured to:
  create a request comprising a source memory address of data that is to be read over an RDMA connection and a destination memory address at which the data is to be stored; and
- a failover virtual layer configured to:
  establish the RDMA connection between the first computer node and a second computer node based upon the request; and
  responsive to detecting an error on the RDMA connection:
    perform a failover of the RDMA connection; and
    during the error on the RDMA connection, accept one or more requests from the upper subsystem module.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17
18. A method for performing non-disruptive failover of an RDMA connection between a first computer node and a second computer node, comprising:
- invoking an upper subsystem module, associated with a first computer node, to:
  create a request comprising a source memory address of data that is to be read over an RDMA connection and a destination memory address at which the data is to be stored; and
- invoking a failover virtual layer to:
  establish the RDMA connection between the first computer node and a second computer node based upon the request; and
  responsive to detecting an error on the RDMA connection:
    perform a failover of the RDMA connection; and
    during the error on the RDMA connection, accept one or more requests from the upper subsystem module.
Dependent claims: 19
20. A computer-program product comprising a non-transitory computer-readable medium having computer program code embodied thereon for performing non-disruptive failover of an RDMA connection between a first computer node and a second computer node, the computer program code adapted to:
- invoke an upper subsystem module, associated with a first computer node, to:
  create a request comprising a source memory address of data that is to be read over an RDMA connection and a destination memory address at which the data is to be stored; and
- invoke a failover virtual layer to:
  establish the RDMA connection between the first computer node and a second computer node based upon the request; and
  responsive to detecting an error on the RDMA connection:
    perform a failover of the RDMA connection; and
    during the error on the RDMA connection, accept one or more requests from the upper subsystem module.
Specification