System for live-migration and automated recovery of applications in a distributed system
Abstract
A method and apparatus for distributing applications among a number of servers, ensuring that changes to application data on the master for an application are asynchronously replicated to a number of slaves for that application. Servers may be located in geographically diverse locations; the invention permits data replication over high-latency and lossy network connections and provides fault tolerance under hardware and network failure conditions. Access to applications is mediated by a distributed protocol handler which allows any request for any application to be addressed to any server and which, working in tandem with the replication system, pauses connections momentarily to allow seamless, consistent live-migration of applications and their state between servers. Also described is a system which controls this live-migration based on dynamic measurement of the load generated by each application and on each application's topological preferences, in order to automatically keep servers at an optimum utilization level.
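The pause-and-switch behaviour attributed to the protocol handler can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions: the class `ProtocolHandler`, its methods, the `replicate` callback, and the application and server names are all hypothetical and are not taken from the patent, which does not give an implementation here.

```python
from collections import defaultdict, deque


class ProtocolHandler:
    """Toy router (hypothetical): any request for any application may arrive here."""

    def __init__(self, masters):
        self.masters = dict(masters)      # application -> current master server
        self.paused = set()               # applications currently being migrated
        self.held = defaultdict(deque)    # application -> requests held during a pause
        self.delivered = []               # log of (server, application, request)

    def handle(self, app, request):
        if app in self.paused:
            # Hold the request instead of dropping or misrouting it.
            self.held[app].append(request)
            return None
        server = self.masters[app]
        self.delivered.append((server, app, request))
        return server

    def migrate(self, app, new_master, replicate):
        """Pause the app, ship its final state, switch master, release held requests."""
        self.paused.add(app)
        replicate(app, self.masters[app], new_master)  # stand-in for the final state transfer
        self.masters[app] = new_master
        self.paused.discard(app)
        while self.held[app]:
            self.handle(app, self.held[app].popleft())


handler = ProtocolHandler({"blog": "server-a"})
handler.handle("blog", "GET /1")


def replicate(app, source, target):
    # Simulate a request that arrives while the migration is in progress.
    handler.handle(app, "GET /2")


handler.migrate("blog", "server-b", replicate)
```

In this sketch the request that arrives mid-migration is delivered to the new master only after the switch, which is the consistency property the abstract describes; a real handler would hold open network connections rather than queue strings.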
18 Claims
1. Computer software which, when executed by appropriate processing means, causes the processing means to implement a method of managing snapshots of a filesystem, where the filesystem is replicated across multiple servers connected in a cluster comprising:
- identifying each snapshot by a snapnode object in the form of a binary sequence comprising a snapshot identifier, a parent pointer to an earlier snapshot on a specific server where the snapshot was taken, and the set of servers where this snapshot is presently stored;
- storing a graph of snapnode objects of a set of snapshots of a filesystem on each of the multiple servers, one of the servers being an active master of the file system;
- the active master taking a new snapshot of the filesystem and creating a snapnode object for the new snapshot identifying the active master as a server where the new snapshot is stored; and
- transmitting the new snapshot to the other servers of the multiple servers; and
- modifying the snapnode object to identify the other servers as servers where the new snapshot is stored,
wherein the method is used to manage recovery of a file system after an event in which the active master is to confirm or modify its status.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
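The steps recited in claim 1 can be sketched as a small in-memory model. This is a minimal illustration, not the patented implementation: the names `Snapnode`, `Cluster`, `take_snapshot`, and `replicate` are hypothetical, the JSON encoding used for the "binary sequence" is an assumption, and the actual transfer of snapshot data is replaced by a comment.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Snapnode:
    snapshot_id: str
    parent_id: Optional[str]   # pointer to the earlier snapshot
    origin_server: str         # server where the snapshot was taken
    stored_on: Set[str] = field(default_factory=set)  # servers presently holding it

    def to_bytes(self) -> bytes:
        # Assumed encoding: the claim only requires "a binary sequence".
        return json.dumps({
            "id": self.snapshot_id,
            "parent": self.parent_id,
            "origin": self.origin_server,
            "stored_on": sorted(self.stored_on),
        }).encode("utf-8")


class Cluster:
    def __init__(self, servers, master):
        self.servers = list(servers)
        self.master = master
        # Each server stores its own copy of the snapnode graph.
        self.graphs: Dict[str, Dict[str, Snapnode]] = {s: {} for s in self.servers}
        self.head: Optional[str] = None  # newest snapshot taken on the master

    def take_snapshot(self, snapshot_id: str) -> Snapnode:
        # The new snapnode initially names only the active master as a holder.
        node = Snapnode(snapshot_id, self.head, self.master, {self.master})
        self.graphs[self.master][snapshot_id] = node
        self.head = snapshot_id
        return node

    def replicate(self, snapshot_id: str) -> None:
        node = self.graphs[self.master][snapshot_id]
        for server in self.servers:
            if server != self.master:
                # Stand-in for transmitting the snapshot data to this server.
                node.stored_on.add(server)
        # Record the updated snapnode in every server's copy of the graph.
        for server in self.servers:
            self.graphs[server][snapshot_id] = Snapnode(
                node.snapshot_id, node.parent_id, node.origin_server, set(node.stored_on)
            )


cluster = Cluster(["a", "b", "c"], master="a")
first = cluster.take_snapshot("snap-1")
stored_initially = set(first.stored_on)   # only the master holds it so far
cluster.replicate("snap-1")
cluster.take_snapshot("snap-2")           # taken but not yet replicated
```

After `replicate("snap-1")`, every server's graph lists all three servers as holders of `snap-1`, while `snap-2` exists only in the master's graph with `snap-1` as its parent.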
9. A method of managing snapshots of a filesystem, where the filesystem is replicated across multiple servers connected in a cluster comprising:
- identifying each snapshot by a snapnode object in the form of a binary sequence comprising a snapshot identifier, a parent pointer to an earlier snapshot on a specific server where the snapshot was taken, and the set of servers where this snapshot is presently stored;
- storing a graph of snapnode objects of a set of snapshots of a filesystem on each of the multiple servers, one of the servers being an active master of the file system;
- the active master taking a new snapshot of the filesystem and creating a snapnode object for the new snapshot identifying the active master as a server where the new snapshot is stored; and
- transmitting the new snapshot to the other servers of the multiple servers; and
- modifying the snapnode object to identify the other servers as servers where the new snapshot is stored,
wherein the method is used to manage recovery of a file system after an event in which the active master is to confirm or modify its status.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17, 18
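The closing "wherein" clause ties the snapnode graph to recovery but does not say how the graph is consulted. One plausible use, shown here purely as an assumption, is to walk the parent pointers from the newest snapshot until reaching one that a given server already stores, for example when choosing a recovery point for a server that may take over as master. The function `latest_snapshot_on` and the sample graph are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

# snapshot_id -> (parent_id, servers where the snapshot is presently stored)
Graph = Dict[str, Tuple[Optional[str], Set[str]]]


def latest_snapshot_on(graph: Graph, head: Optional[str], server: str) -> Optional[str]:
    """Follow parent pointers from `head` until a snapshot stored on `server` is found."""
    current = head
    while current is not None:
        parent_id, stored_on = graph[current]
        if server in stored_on:
            return current
        current = parent_id
    return None  # the server holds no snapshot in this chain


graph: Graph = {
    "snap-1": (None, {"a", "b", "c"}),
    "snap-2": ("snap-1", {"a", "b"}),
    "snap-3": ("snap-2", {"a"}),
}

recovery_points = {
    server: latest_snapshot_on(graph, "snap-3", server) for server in ("b", "c", "d")
}
```

Here server `b` could resume from `snap-2`, server `c` from `snap-1`, and server `d`, which holds nothing, would need a full copy.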
Specification