Fault tolerance solution for stateful applications
Abstract
A fault tolerance method and system for stateful applications running on VMs in a cluster identifies a client state for each client session of those applications. The method replicates the state of each client session onto a primary VM and a backup VM, and uses a network controller and an orchestrator to direct network traffic to the primary VM and to periodically replicate the state onto the backup VM. If a VM fails, the method reroutes the network traffic of every state for which the failed VM served as primary to the corresponding backup, and replicates any state left without a backup onto another VM to create a new backup. The method may be used as part of a method or system implementing the split/merge paradigm.
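The periodic replication mentioned in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class `ReplicatedState`, its `interval` parameter, and the choice to count updates instead of elapsed time are all assumptions made for illustration.

```python
import copy


class ReplicatedState:
    """Toy client state with a primary copy and a periodically refreshed backup."""

    def __init__(self, interval):
        # Hypothetical knob: number of updates between two replications.
        self.interval = interval
        self.primary = {}
        self.backup = {}
        self._since_sync = 0

    def update(self, key, value):
        # Traffic is applied to the primary copy only.
        self.primary[key] = value
        self._since_sync += 1
        if self._since_sync >= self.interval:
            self.replicate()

    def replicate(self):
        # Snapshot the primary onto the backup.
        self.backup = copy.deepcopy(self.primary)
        self._since_sync = 0

    def fail_over(self):
        # The backup becomes the new primary, and a fresh backup is made from it.
        self.primary = self.backup
        self.backup = copy.deepcopy(self.primary)
        self._since_sync = 0
```

In this toy model, any update applied after the last replication is absent from the promoted copy. The abstract does not say whether or how the described method handles that window.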
20 Claims
1. A method for providing fault tolerance on a virtual machine (VM) cluster, comprising:
maintaining a plurality of VMs in a VM cluster servicing a plurality of client sessions each having a network traffic flow directed to the VM cluster;
generating a primary client state and a backup client state for each client session according to a predefined criteria, wherein the primary client state and the backup client state are hosted on separate instances of the VMs in the VM cluster;
directing the network traffic flow of each of the client sessions to the VM hosting the primary client state of the client session;
detecting a failing VM in the VM cluster;
designating the backup client states of the primary client states hosted on the failing VM as new primary client states and directing the network traffic flow of the corresponding client sessions to the VMs hosting the new primary client states; and
generating a new backup client state for each of the backup client states hosted on the failing VM and a new backup for each of the new primary client states.
Dependent claims: 2, 3, 4, 5, 6, 7.
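The steps of claim 1 can be sketched as a small bookkeeping model. This is an illustration under stated assumptions, not the claimed implementation: the `Cluster` class and its method names are hypothetical, and the "predefined criteria" is taken here to be "least-loaded VM, ties broken by name", which the claim does not specify.

```python
from collections import Counter


class Cluster:
    """Toy model tracking which VM hosts each session's primary and backup state."""

    def __init__(self, vms):
        self.vms = set(vms)
        self.primary = {}  # session id -> VM hosting the primary client state
        self.backup = {}   # session id -> VM hosting the backup client state

    def _least_loaded(self, exclude):
        # Assumed placement criterion: fewest hosted states, ties broken by name.
        load = Counter(self.primary.values()) + Counter(self.backup.values())
        candidates = sorted(self.vms - exclude)
        if not candidates:
            raise RuntimeError("no VM available for placement")
        return min(candidates, key=lambda vm: load[vm])

    def add_session(self, session):
        # Primary and backup are always placed on separate VMs.
        self.primary[session] = self._least_loaded(exclude=set())
        self.backup[session] = self._least_loaded(exclude={self.primary[session]})

    def route(self, session):
        # Traffic for a session goes to the VM holding its primary state.
        return self.primary[session]

    def handle_failure(self, failed):
        self.vms.discard(failed)
        for session in list(self.primary):
            if self.primary[session] == failed:
                # Promote the backup; route() follows because it reads primary.
                self.primary[session] = self.backup.pop(session)
            elif self.backup.get(session) == failed:
                del self.backup[session]
            if session not in self.backup:
                # Re-create the missing backup on a VM other than the primary.
                self.backup[session] = self._least_loaded(
                    exclude={self.primary[session]}
                )
```

With three VMs and a session whose primary sits on the failed VM, `handle_failure` promotes that session's backup and then places a new backup on a remaining VM, so every session again has two copies on distinct VMs.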
8. A system for providing fault tolerance on a virtual machine (VM) cluster, comprising:
a first computer having a processor, and a computer-readable storage device; and
a program embodied on the storage device for execution by the processor, the program having a plurality of program modules, including:
a maintaining module configured to maintain a plurality of VMs in a VM cluster servicing a plurality of client sessions each having a network traffic flow directed to the VM cluster;
a first generating module configured to generate a primary client state and a backup client state for each client session according to a predefined criteria, wherein the primary client state and the backup client state are hosted on separate instances of the VMs in the VM cluster;
a directing module configured to direct the network traffic flow of each of the client sessions to the VM hosting the primary client state of the client session;
a detecting module configured to detect a failing VM in the VM cluster;
a designating module configured to designate the backup client states of the primary client states hosted on the failing VM as new primary client states and to direct the network traffic flow of the corresponding client sessions to the VMs hosting the new primary client states; and
a second generating module configured to generate a new backup client state for each of the backup client states hosted on the failing VM and a new backup for each of the new primary client states.
Dependent claims: 9, 10, 11, 12, 13, 14, 15.
16. A computer program product for providing fault tolerance on a virtual machine (VM) cluster, the computer program product comprising a non-transitory computer-readable storage medium having program code embodied therewith, the program code readable/executable by a first processor of a first computer to perform a method comprising:
maintaining a plurality of VMs, by the processor, in a VM cluster servicing a plurality of client sessions each having a network traffic flow directed to the VM cluster;
generating a primary client state and a backup client state, by the processor, for each client session according to a predefined criteria, wherein the primary client state and the backup client state are hosted on separate instances of the VMs in the VM cluster;
directing the network traffic flow of each of the client sessions, by the processor, to the VM hosting the primary client state of the client session;
detecting a failing VM in the VM cluster, by the processor;
designating the backup client states of the primary client states hosted on the failing VM as new primary client states, by the processor, and directing the network traffic flow of the corresponding client sessions, by the processor, to the VMs hosting the new primary client states; and
generating a new backup client state, by the processor, for each of the backup client states hosted on the failing VM and a new backup for each of the new primary client states.
Dependent claims: 17, 18, 19, 20.
Specification