System for live-migration and automated recovery of applications in a distributed system
First Claim
1. A method of mounting a filesystem holding data for a live application at a server, the method comprising:
prior to an event causing mounting of the application at the server, receiving changes in the live application data at the server from a current master server hosting the application and maintaining a version of the live application data;
responsive to the event, the server recognizing itself as the new master server and mounting the filesystem for the live application using its maintained version of the live application data;
receiving requests for the application at the server and servicing the request to deliver a service using the live application; and
managing load in a cluster of servers in which the server is connected, by at least:
detecting the number of servers in the cluster and their current application load; and
exchanging messages with other servers in the cluster to migrate applications to balance the load, wherein the server identifies itself as the new master server after exchanging messages with other servers in the cluster to determine the version of the filesystem of the highest centre of mass metric, based on analysis of snapshots of changes in the live application data which have been received.
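The election step in the final clause can be illustrated with a minimal Python sketch. The claim text here does not define the centre of mass metric, so the definition used below (the mean of snapshot times, weighted by the size of the changes each snapshot records) is purely an assumption for illustration. The names `Snapshot`, `centre_of_mass`, and `elect_master` are likewise hypothetical, and the dictionary passed to `elect_master` stands in for the messages the servers would exchange.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    taken_at: float      # time the snapshot was taken, in seconds
    changed_bytes: int   # size of the changes the snapshot records


def centre_of_mass(snapshots):
    """Assumed metric: change-weighted mean of snapshot times."""
    total = sum(s.changed_bytes for s in snapshots)
    if total == 0:
        return 0.0
    return sum(s.taken_at * s.changed_bytes for s in snapshots) / total


def elect_master(reported):
    """Pick the server whose received snapshots score highest.

    `reported` maps server name -> list of snapshots that server holds.
    Ties are broken by server name so that every server, given the same
    reports, reaches the same answer.
    """
    return max(reported, key=lambda name: (centre_of_mass(reported[name]), name))


reported = {
    "server-a": [Snapshot(100.0, 10), Snapshot(200.0, 10)],
    "server-b": [Snapshot(100.0, 10), Snapshot(200.0, 10), Snapshot(300.0, 20)],
}
new_master = elect_master(reported)  # server-b holds the later, larger snapshot
```

Because each server evaluates the same deterministic function over the same reports, a server can recognize itself as the new master without a separate coordinator, which is the behaviour the claim describes.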
Abstract
A method and apparatus for distributing applications among a number of servers, ensuring that changes to an application's data on its master server are asynchronously replicated to a number of slave servers for that application. Servers may be geographically dispersed; the invention permits data replication over high-latency and lossy network connections and tolerates hardware and network failures. Access to applications is mediated by a distributed protocol handler that allows any request for any application to be addressed to any server and that, working in tandem with the replication system, pauses connections momentarily to allow seamless, consistent live-migration of applications and their state between servers. A further system controls this live-migration based on dynamic measurement of the load generated by each application and on each application's topological preferences, so as to keep servers automatically at an optimum utilization level.
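The load-driven migration described in the abstract can be sketched as a simple greedy planner. This is only an illustration under stated assumptions: the function name `plan_migrations`, the `tolerance` parameter, and the greedy strategy are all hypothetical, and the sketch ignores the topological preferences that the abstract also mentions.

```python
def plan_migrations(app_load, placement, tolerance=0.1):
    """Return a list of (app, source_server, target_server) moves.

    app_load:  app name -> measured load (arbitrary units)
    placement: server name -> set of app names currently hosted there
    tolerance: stop once the busiest and idlest servers differ by no more
               than this fraction of the mean server load
    """
    placement = {s: set(apps) for s, apps in placement.items()}  # work on a copy
    load = {s: sum(app_load[a] for a in apps) for s, apps in placement.items()}
    if not load:
        return []
    mean = sum(load.values()) / len(load)
    moves = []
    while True:
        busiest = max(load, key=lambda s: (load[s], s))
        idlest = min(load, key=lambda s: (load[s], s))
        gap = load[busiest] - load[idlest]
        if gap <= tolerance * mean:
            break
        # Moving an app only narrows the imbalance if 0 < its load < gap.
        candidates = [a for a in placement[busiest] if 0 < app_load[a] < gap]
        if not candidates:
            break
        # Prefer the app whose load is closest to half the gap.
        app = min(candidates, key=lambda a: (abs(app_load[a] - gap / 2), a))
        placement[busiest].remove(app)
        placement[idlest].add(app)
        load[busiest] -= app_load[app]
        load[idlest] += app_load[app]
        moves.append((app, busiest, idlest))
    return moves


app_load = {"blog": 4, "shop": 3, "wiki": 2, "forum": 1}
placement = {"s1": {"blog", "shop", "wiki"}, "s2": {"forum"}}
moves = plan_migrations(app_load, placement)
```

In this example `s1` starts with a load of 9 and `s2` with a load of 1; moving `blog` leaves both at 5, so the planner stops after a single migration. Each move strictly reduces the imbalance between the two servers involved, so the loop always terminates.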
16 Claims
1. A method of mounting a filesystem holding data for a live application at a server, as set out in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6
7. Computer software which, when executed by appropriate processing means, causes the processing means to implement a method of mounting a filesystem holding data for a live application at a server, the method comprising:
prior to an event causing mounting of the application at the server, receiving changes in the live application data at the server from a current master server hosting the application and maintaining a version of the live application data;
responsive to the event, the server recognizing itself as the new master server and mounting the filesystem for the live application using its maintained version of the live application data;
receiving requests for the application at the server and servicing the request to deliver a service using the live application; and
managing load in a cluster of servers in which the server is connected, by at least:
detecting the number of servers in the cluster and their current application load; and
exchanging messages with other servers in the cluster to migrate applications to balance the load, wherein the server identifies itself as the new master server after exchanging messages with other servers in the cluster to determine the version of the filesystem of the highest centre of mass metric, based on analysis of snapshots of changes in the live application data which have been received.
Dependent claims: 8, 9, 10, 11
12. A method for replicating a filesystem between a first server and a second server prior to and following a partition between the first server and the second server, the method comprising:
at the first server, taking snapshots of a current state of the filesystem at predetermined points in time following modification of the filesystem, each snapshot recording differences between the current state of the filesystem on the server and the state of the filesystem on the server at the time point of a previous snapshot;
continually replicating the snapshots taken on the first server to the second server as soon as they are taken;
upon detection of a partition, both the first and the second server becoming masters for the filesystem and accepting new modifications to the filesystems;
after recovery of the partition, performing an update process to update the filesystem, the update process comprising:
identifying which of the first server and the second server contains the most current version of the filesystem;
nominating the server so identified as the master server and the other server as the slave server;
identifying a snapshot that is common to both the master server and the slave server; and
replicating subsequent snapshots from the master server to the slave server.
Dependent claims: 13, 14, 15, 16
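The last two steps of the update process in claim 12 (finding a common snapshot, then replicating what follows it) can be sketched in Python as below. This is a minimal illustration, assuming each server keeps an ordered list of snapshot identifiers, oldest first, and that the master has already been nominated. The function names and the snapshot identifiers are hypothetical, and the sketch does not address what happens to snapshots the slave took on its own during the partition, since the claim does not say.

```python
def latest_common_snapshot(master_snaps, slave_snaps):
    """Return the most recent snapshot ID present on both servers, or None."""
    on_slave = set(slave_snaps)
    for snap_id in reversed(master_snaps):
        if snap_id in on_slave:
            return snap_id
    return None


def snapshots_to_replicate(master_snaps, slave_snaps):
    """Return (common_snapshot, snapshots the master must send to the slave)."""
    common = latest_common_snapshot(master_snaps, slave_snaps)
    if common is None:
        # Nothing shared: every master snapshot has to be sent.
        return None, list(master_snaps)
    start = master_snaps.index(common) + 1
    return common, master_snaps[start:]


# Both servers shared snap-1 and snap-2 before the partition, then diverged.
master = ["snap-1", "snap-2", "snap-3a", "snap-4a"]
slave = ["snap-1", "snap-2", "snap-3b"]
common, to_send = snapshots_to_replicate(master, slave)
```

Here `common` is `snap-2` and `to_send` is `["snap-3a", "snap-4a"]`. Because each snapshot records differences relative to the previous one, replication only needs to resume from the last point both servers agree on.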
Specification