Scalable cloud backup
First Claim
1. A method comprising:
maintaining a distributed file system operable within a cluster of nodes wherein files and directories of the distributed file system are associated with a unique logical inode (“LIN”) and a set of data blocks;
establishing a cloud backup policy wherein the cloud backup policy is associated with a backup set of files and directories of the distributed file system;
initiating a cloud backup coordinator process on a coordinator node among the cluster of nodes;
initiating a set of worker processes by the cloud backup coordinator process wherein a worker process among the set of worker processes is assigned to each node in the cluster of nodes, wherein the set of worker processes are in communication with the cloud backup coordinator process;
assigning at least one of the backup set of files and directories to each worker processes in the set of worker processes;
for each worker process in the set of worker processes;
packaging at least one assigned backup set of files and directories into a node local upload object wherein the packaging includes arranging the set of data blocks associated with the at least one backup set of files and directories into the node local upload object and generating a set of metadata tables associated with the node local upload object;
in response to at least one of the node local upload object reaching an object capacity or the worker process finishing packing the at least one backup set of files and directories into the node local upload object, uploading the node local object and the set of metadata tables associated with the node local object to a cloud storage provider; and
coalescing the set of metadata tables associated with each uploaded node local object into a cloud hosted metadata table and a cloud hosted file block location table.
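The per-worker packaging and upload steps recited above can be illustrated with a minimal sketch. It is not the patented implementation: the `UploadObject` class, the `package` function, the tuple layout of the block table, and the `upload` callable (a stand-in for a cloud provider's upload call) are all hypothetical names chosen for illustration, and object capacity is assumed to be counted in blocks.

```python
from dataclasses import dataclass, field


@dataclass
class UploadObject:
    """Hypothetical node-local upload object with its own metadata tables."""
    object_id: str
    capacity_blocks: int
    blocks: list = field(default_factory=list)
    file_table: dict = field(default_factory=dict)   # LIN -> file attributes
    block_table: list = field(default_factory=list)  # (LIN, file block index, offset in object)

    def is_full(self):
        return len(self.blocks) >= self.capacity_blocks


def package(node_id, files, capacity_blocks, upload):
    """Pack a worker's assigned files into upload objects.

    files: dict mapping LIN -> (path, list of data blocks)
    upload: callable standing in for the cloud upload (an assumption)

    An object is uploaded when it reaches capacity or when the worker
    has finished packing. Files with zero blocks are skipped in this sketch.
    """
    seq = 0
    obj = UploadObject(f"{node_id}-{seq}", capacity_blocks)
    for lin, (path, blocks) in files.items():
        for index, block in enumerate(blocks):
            if obj.is_full():
                upload(obj)  # capacity reached: ship this object
                seq += 1
                obj = UploadObject(f"{node_id}-{seq}", capacity_blocks)
            obj.file_table[lin] = {"path": path, "num_blocks": len(blocks)}
            # Record where this file block lands inside the upload object.
            obj.block_table.append((lin, index, len(obj.blocks)))
            obj.blocks.append(block)
    if obj.blocks:
        upload(obj)  # worker finished packing: flush the partial object
```

In this sketch a file may span two upload objects, which is why each object carries its own file table entry for every LIN it holds blocks for.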
Abstract
Implementations are provided for scalable cloud backup. A coordinator process can manage worker processes on nodes to package file system data that is targeted for cloud backup into node local upload objects. File data can be arranged into distinct block offsets of the node local upload object. A set of metadata tables can be generated that characterize each file that is backed up as well as file block location information for each data block. The node local upload objects can be uploaded to a cloud service provider. The set of metadata tables generated by the worker process can be coalesced into a global set of metadata tables that describe the data that has been backed up. In one implementation, after an initial cloud backup has occurred, a snapshot service of the file system can be used to incrementally back up blocks of the file that have been changed.
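As a rough illustration of the coalescing step described in the abstract, the sketch below merges per-object metadata tables into one global file table and one global block location table. The function names `coalesce` and `locate`, the table shapes, and the object identifiers are assumptions made for this example; the source does not specify a concrete data layout.

```python
def coalesce(per_object_tables):
    """Merge per-object metadata into global tables (illustrative only).

    per_object_tables: iterable of (object_id, file_table, block_table), where
    file_table maps LIN -> attributes and block_table lists
    (LIN, file block index, offset in object) tuples.
    """
    global_files = {}   # LIN -> file attributes
    global_blocks = {}  # (LIN, file block index) -> (object_id, offset)
    for object_id, file_table, block_table in per_object_tables:
        for lin, attrs in file_table.items():
            # A file split across objects appears in several tables; keep one entry.
            global_files.setdefault(lin, attrs)
        for lin, index, offset in block_table:
            global_blocks[(lin, index)] = (object_id, offset)
    return global_files, global_blocks


def locate(global_blocks, lin, index):
    """Return (object_id, offset) for one block of a backed-up file."""
    return global_blocks[(lin, index)]
```

With tables of this shape, restoring a single block would only need a lookup by LIN and block index to find which uploaded object holds it and at what offset.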
18 Claims
7. A system comprising at least one storage device and at least one hardware processor configured to:
maintaining a distributed file system operable within a cluster of nodes wherein files and directories of the distributed file system are associated with a unique logical inode (“LIN”) and a set of data blocks;
establishing a cloud backup policy wherein the cloud backup policy is associated with a backup set of files and directories of the distributed file system;
initiating a cloud backup coordinator process on a coordinator node among the cluster of nodes;
initiating a set of worker processes by the cloud backup coordinator process wherein a worker process among the set of worker processes is assigned to each node in the cluster of nodes, wherein the set of worker processes are in communication with the cloud backup coordinator process;
assigning at least one of the backup set of files and directories to each worker processes in the set of worker processes;
for each worker process in the set of worker processes;
packaging at least one assigned backup set of files and directories into a node local upload object wherein the packaging includes arranging the set of data blocks associated with the at least one backup set of files and directories into the node local upload object and generating a set of metadata tables associated with the node local upload object;
in response to at least one of the node local upload object reaching an object capacity or the worker process finishing packing the at least one backup set of files and directories into the node local upload object, uploading the node local object and the set of metadata tables associated with the node local object to a cloud storage provider; and
coalescing the set of metadata tables associated with each uploaded node local object into a cloud hosted metadata table and a cloud hosted file block location table.
13. A non-transitory computer readable medium with program instructions stored thereon to cause a computer to perform the following acts:
maintaining a distributed file system operable within a cluster of nodes wherein files and directories of the distributed file system are associated with a unique logical inode (“LIN”) and a set of data blocks;
establishing a cloud backup policy wherein the cloud backup policy is associated with a backup set of files and directories of the distributed file system;
initiating a cloud backup coordinator process on a coordinator node among the cluster of nodes;
initiating a set of worker processes by the cloud backup coordinator process wherein a worker process among the set of worker processes is assigned to each node in the cluster of nodes, wherein the set of worker processes are in communication with the cloud backup coordinator process;
assigning at least one of the backup set of files and directories to each worker processes in the set of worker processes;
for each worker process in the set of worker processes;
packaging at least one assigned backup set of files and directories into a node local upload object wherein the packaging includes arranging the set of data blocks associated with the at least one backup set of files and directories into the node local upload object and generating a set of metadata tables associated with the node local upload object;
in response to at least one of the node local upload object reaching an object capacity or the worker process finishing packing the at least one backup set of files and directories into the node local upload object, uploading the node local object and the set of metadata tables associated with the node local object to a cloud storage provider; and
coalescing the set of metadata tables associated with each uploaded node local object into a cloud hosted metadata table and a cloud hosted file block location table.
Specification