NON-DISRUPTIVE FAILOVER OF RDMA CONNECTION
First Claim
1. A system for performing non-disruptive failover of an RDMA connection between a first computer node and a second computer node, the system comprising:
- an upper subsystem module, associated with a first computer node, configured to:
create a request comprising a source memory address of data that is to be read over an RDMA connection and a destination memory address at which the data is to be stored; and
- a failover virtual layer configured to:
establish the RDMA connection between the first computer node and a second computer node based upon the request; and
responsive to detecting an error on the RDMA connection:
perform a failover of the RDMA connection; and
during the error on the RDMA connection, accept one or more requests from the upper subsystem module.
Abstract
A novel RDMA connection failover technique minimizes disruption to upper subsystem modules (executed on a computer node), which create requests for data transfer. A new failover virtual layer performs failover of an RDMA connection in error so that the upper subsystem that created a request has no knowledge of the error (which is recoverable in software and hardware) or of a failure on the RDMA connection due to the error. Because the upper subsystem has no knowledge of a failure on the RDMA connection or of a performed failover of the RDMA connection, it continues providing requests to the failover virtual layer without interruption, thereby minimizing downtime of the data transfer activity.
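The behavior described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `ReadRequest`, `Link`, and `FailoverVirtualLayer` are hypothetical, `Link` is an in-memory stand-in rather than a real RDMA API, and holding requests in a queue during the error is an assumption, since the text above does not say how accepted requests are kept until failover completes.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadRequest:
    """Request created by the upper subsystem (hypothetical layout)."""
    source_addr: int  # address of the data to be read over the connection
    dest_addr: int    # address at which the data is to be stored


class Link:
    """Hypothetical stand-in for one RDMA path; not a real RDMA library."""

    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.completed = []

    def post_read(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is in error")
        self.completed.append(request)


class FailoverVirtualLayer:
    """Sits between the upper subsystem and the links, hiding link errors."""

    def __init__(self, primary, standby):
        self.active = primary
        self.standby = standby
        self.pending = deque()  # assumption: requests wait here during an error

    def submit(self, request):
        """Always accepts the request; the caller never sees a link error."""
        self.pending.append(request)
        self._drain()

    def _drain(self):
        while self.pending:
            request = self.pending[0]
            try:
                self.active.post_read(request)
            except ConnectionError:
                if self.standby is None:
                    return  # no path left; keep the requests queued
                # Fail over to the standby path and retry the same request.
                self.active, self.standby = self.standby, None
                continue
            self.pending.popleft()


primary, standby = Link("primary"), Link("standby")
layer = FailoverVirtualLayer(primary, standby)

first = ReadRequest(source_addr=0x1000, dest_addr=0x2000)
second = ReadRequest(source_addr=0x3000, dest_addr=0x4000)

layer.submit(first)      # served on the primary path
primary.healthy = False  # simulate an error on the connection
layer.submit(second)     # still accepted; served after failover
```

In this sketch the upper subsystem only ever calls `submit`, which never raises, so from its point of view nothing changed when the primary path failed. A real failover layer would also have to deal with requests that were in flight when the error occurred, which this example leaves out.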
20 Claims
1. A system for performing non-disruptive failover of an RDMA connection between a first computer node and a second computer node, the system comprising:
- an upper subsystem module, associated with a first computer node, configured to:
create a request comprising a source memory address of data that is to be read over an RDMA connection and a destination memory address at which the data is to be stored; and
- a failover virtual layer configured to:
establish the RDMA connection between the first computer node and a second computer node based upon the request; and
responsive to detecting an error on the RDMA connection:
perform a failover of the RDMA connection; and
during the error on the RDMA connection, accept one or more requests from the upper subsystem module.
(Dependent claims: 2 to 17)
18. A method for performing non-disruptive failover of an RDMA connection between a first computer node and a second computer node, comprising:
- invoking an upper subsystem module, associated with a first computer node, to:
create a request comprising a source memory address of data that is to be read over an RDMA connection and a destination memory address at which the data is to be stored; and
- invoking a failover virtual layer to:
establish the RDMA connection between the first computer node and a second computer node based upon the request; and
responsive to detecting an error on the RDMA connection:
perform a failover of the RDMA connection; and
during the error on the RDMA connection, accept one or more requests from the upper subsystem module.
(Dependent claim: 19)
20. A computer-program product comprising a non-transitory computer-readable medium having computer program code embodied thereon for performing non-disruptive failover of an RDMA connection between a first computer node and a second computer node, the computer program code adapted to:
- invoke an upper subsystem module, associated with a first computer node, to:
create a request comprising a source memory address of data that is to be read over an RDMA connection and a destination memory address at which the data is to be stored; and
- invoke a failover virtual layer to:
establish the RDMA connection between the first computer node and a second computer node based upon the request; and
responsive to detecting an error on the RDMA connection:
perform a failover of the RDMA connection; and
during the error on the RDMA connection, accept one or more requests from the upper subsystem module.
Specification