Providing disaster recovery for a distributed filesystem
First Claim
1. A computer-implemented method for providing disaster recovery for a distributed filesystem, the method comprising:
- collectively managing the data of the distributed filesystem using two or more cloud controllers, wherein collectively managing the data comprises:
collectively presenting a unified namespace for the distributed filesystem to the clients of the distributed filesystem via the two or more cloud controllers, wherein the clients can only access the distributed filesystem via the cloud controllers, wherein the file data for the distributed filesystem is stored in a remote cloud storage system using fixed-size cloud files, wherein each cloud controller caches a subset of the file data from the remote cloud storage system that is being actively accessed by that cloud controller's respective clients, wherein all new file data received by each cloud controller from its clients is written to the remote cloud storage system via the receiving cloud controller;
maintaining at each cloud controller a copy of the complete metadata for all of the files stored in the distributed filesystem, wherein each cloud controller communicates any changes to the metadata for the distributed filesystem to the full set of cloud controllers for the distributed filesystem to ensure that the clients of the distributed filesystem share a consistent view of each file in the distributed filesystem;
upon receiving in a cloud controller new data from a client, storing the new file data for the distributed filesystem in the remote cloud storage system, wherein the cloud file is sent from the cloud controller to the remote cloud storage system as part of an incremental data snapshot; and
upon receiving confirmation that the cloud file has been successfully stored in the remote cloud storage system, sending from the cloud controller an incremental metadata snapshot that includes new metadata for the distributed filesystem that describes the new data, wherein the incremental metadata snapshot is received by the other cloud controllers of the distributed filesystem;
allocating a backup cloud controller associated with the distributed filesystem, wherein the backup cloud controller maintains a copy of the complete metadata for all of the files stored in the distributed filesystem and receives all of the metadata changes for the distributed filesystem, including the incremental metadata snapshot sent by the cloud controller;
detecting the failure of the cloud controller; and
rerouting data requests from clients associated with the failed cloud controller to the backup cloud controller.
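The ordering in the claim above (data snapshot first, metadata snapshot only after the upload is confirmed) can be sketched as follows. This is a minimal illustration, not the patented implementation: `CloudStorage`, `CloudController`, `apply_metadata`, the 4-byte `CLOUD_FILE_SIZE`, the key naming scheme, and the zero-padding of the last cloud file are all hypothetical choices made for the example.

```python
CLOUD_FILE_SIZE = 4  # hypothetical fixed cloud-file size in bytes (tiny, for illustration)


class CloudStorage:
    """In-memory stand-in for the remote cloud storage system (assumption)."""

    def __init__(self):
        self.objects = {}
        self.available = True

    def put(self, key, blob):
        """Store one cloud file; the True return value plays the role of the confirmation."""
        if not self.available:
            return False
        self.objects[key] = blob
        return True


class CloudController:
    """Hypothetical controller: full metadata copy plus a cache of its own cloud files."""

    def __init__(self, name, storage):
        self.name = name
        self.storage = storage
        self.peers = []      # other controllers, including any backup controller
        self.metadata = {}   # path -> {"keys": [...], "size": int}
        self.cache = {}      # cloud-file key -> bytes

    def apply_metadata(self, snapshot):
        """Merge an incremental metadata snapshot into the local metadata copy."""
        self.metadata.update(snapshot)

    def write(self, path, data):
        """Handle new client data: upload cloud files, then announce the metadata."""
        keys = []
        for offset in range(0, len(data), CLOUD_FILE_SIZE):
            # Pad the final chunk so every cloud file has the same fixed size.
            chunk = data[offset:offset + CLOUD_FILE_SIZE].ljust(CLOUD_FILE_SIZE, b"\0")
            key = f"{self.name}{path}/{offset // CLOUD_FILE_SIZE}"
            if not self.storage.put(key, chunk):
                return False  # no confirmation, so no metadata snapshot is sent
            self.cache[key] = chunk
            keys.append(key)
        # Only after every upload is confirmed is the metadata snapshot propagated.
        snapshot = {path: {"keys": keys, "size": len(data)}}
        self.apply_metadata(snapshot)
        for peer in self.peers:
            peer.apply_metadata(snapshot)
        return True
```

Because the metadata snapshot is sent only after confirmation, a peer in this sketch never learns about a file whose cloud files are missing from storage.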
Abstract
The disclosed embodiments provide a system that distributes data for a distributed filesystem across multiple cloud storage systems. Two or more cloud controllers collectively manage distributed filesystem data that is stored in one or more cloud storage systems; the cloud controllers cache and ensure data consistency for the stored data. Whenever each cloud controller receives new data from a client, it outputs an incremental metadata snapshot for the new data that is propagated to the other cloud controllers and an incremental data snapshot containing the new data that is sent to a cloud storage system. During operation, a backup cloud controller associated with the distributed filesystem is also configured to receive each (incremental) metadata snapshot, such that, upon determining the failure of a cloud controller, the backup cloud controller can immediately begin receiving data requests from clients associated with the failed cloud controller.
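The failover behavior described in the abstract can be sketched roughly as follows, assuming a plain dictionary stands in for cloud storage. The names `BackupController`, `Router`, `mark_failed`, and `route` are hypothetical, and how a failure is actually detected is not specified here, so the sketch simply takes the failure as an external signal.

```python
class BackupController:
    """Hypothetical backup: receives every metadata snapshot, serves clients only after failover."""

    def __init__(self, storage):
        self.storage = storage  # dict standing in for cloud storage: key -> bytes
        self.metadata = {}      # path -> ordered list of cloud-file keys
        self.cache = {}         # empty until the backup starts serving requests

    def apply_metadata(self, snapshot):
        """Merge an incremental metadata snapshot sent by any cloud controller."""
        self.metadata.update(snapshot)

    def read(self, path):
        """Serve a read by fetching uncached cloud files from cloud storage."""
        data = b""
        for key in self.metadata[path]:
            if key not in self.cache:
                self.cache[key] = self.storage[key]
            data += self.cache[key]
        return data


class Router:
    """Hypothetical request router that maps each client to a controller name."""

    def __init__(self, assignments, backup_name):
        self.assignments = dict(assignments)  # client -> controller name
        self.backup_name = backup_name
        self.failed = set()

    def mark_failed(self, controller_name):
        """Record a detected failure (the detection mechanism is outside this sketch)."""
        self.failed.add(controller_name)

    def route(self, client):
        """Return the controller serving this client, rerouting to the backup on failure."""
        target = self.assignments[client]
        return self.backup_name if target in self.failed else target
```

The backup can answer requests immediately in this sketch because it already holds the metadata; only the file data has to be pulled from cloud storage on first access.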
49 Citations
19 Claims
1. A computer-implemented method for providing disaster recovery for a distributed filesystem, the method comprising:
- collectively managing the data of the distributed filesystem using two or more cloud controllers, wherein collectively managing the data comprises:
collectively presenting a unified namespace for the distributed filesystem to the clients of the distributed filesystem via the two or more cloud controllers, wherein the clients can only access the distributed filesystem via the cloud controllers, wherein the file data for the distributed filesystem is stored in a remote cloud storage system using fixed-size cloud files, wherein each cloud controller caches a subset of the file data from the remote cloud storage system that is being actively accessed by that cloud controller's respective clients, wherein all new file data received by each cloud controller from its clients is written to the remote cloud storage system via the receiving cloud controller;
maintaining at each cloud controller a copy of the complete metadata for all of the files stored in the distributed filesystem, wherein each cloud controller communicates any changes to the metadata for the distributed filesystem to the full set of cloud controllers for the distributed filesystem to ensure that the clients of the distributed filesystem share a consistent view of each file in the distributed filesystem;
upon receiving in a cloud controller new data from a client, storing the new file data for the distributed filesystem in the remote cloud storage system, wherein the cloud file is sent from the cloud controller to the remote cloud storage system as part of an incremental data snapshot; and
upon receiving confirmation that the cloud file has been successfully stored in the remote cloud storage system, sending from the cloud controller an incremental metadata snapshot that includes new metadata for the distributed filesystem that describes the new data, wherein the incremental metadata snapshot is received by the other cloud controllers of the distributed filesystem;
allocating a backup cloud controller associated with the distributed filesystem, wherein the backup cloud controller maintains a copy of the complete metadata for all of the files stored in the distributed filesystem and receives all of the metadata changes for the distributed filesystem, including the incremental metadata snapshot sent by the cloud controller;
detecting the failure of the cloud controller; and
rerouting data requests from clients associated with the failed cloud controller to the backup cloud controller.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A non-transitory computer-readable storage medium storing instructions that when executed by a computer cause the computer to perform a method for providing disaster recovery for a distributed filesystem, the method comprising:
- collectively managing the data of the distributed filesystem using two or more cloud controllers, wherein collectively managing the data comprises:
collectively presenting a unified namespace for the distributed filesystem to the clients of the distributed filesystem via the two or more cloud controllers, wherein the clients can only access the distributed filesystem via the cloud controllers, wherein the file data for the distributed filesystem is stored in a remote cloud storage system using fixed-size cloud files, wherein each cloud controller caches a subset of the file data from the remote cloud storage system that is being actively accessed by that cloud controller's respective clients, wherein all new file data received by each cloud controller from its clients is written to the remote cloud storage system via the receiving cloud controller;
maintaining at each cloud controller a copy of the complete metadata for all of the files stored in the distributed filesystem, wherein each cloud controller communicates any changes to the metadata for the distributed filesystem to the full set of cloud controllers for the distributed filesystem to ensure that the clients of the distributed filesystem share a consistent view of each file in the distributed filesystem;
upon receiving in a cloud controller new data from a client, storing the new file data for the distributed filesystem in the remote cloud storage system, wherein the cloud file is sent from the cloud controller to the remote cloud storage system as part of an incremental data snapshot; and
upon receiving confirmation that the cloud file has been successfully stored in the remote cloud storage system, sending from the cloud controller an incremental metadata snapshot that includes new metadata for the distributed filesystem that describes the new data, wherein the incremental metadata snapshot is received by the other cloud controllers of the distributed filesystem;
allocating a backup cloud controller associated with the distributed filesystem, wherein the backup cloud controller maintains a copy of the complete metadata for all of the files stored in the distributed filesystem and receives all of the metadata changes for the distributed filesystem, including the incremental metadata snapshot sent by the cloud controller;
detecting the failure of the cloud controller; and
rerouting data requests from clients associated with the failed cloud controller to the backup cloud controller.
Dependent claims: 12, 13, 14, 15, 16, 17, 18
19. A backup cloud controller that provides disaster recovery for a distributed filesystem, comprising:
- a processor;
a storage mechanism that stores metadata for the distributed filesystem; and
a receiving mechanism;
wherein two or more cloud controllers collectively manage the data of the distributed filesystem, wherein collectively managing the data comprises:
collectively presenting a unified namespace for the distributed filesystem to the clients of the distributed filesystem via the two or more cloud controllers, wherein the clients can only access the distributed filesystem via the cloud controllers, wherein the file data for the distributed filesystem is stored in a remote cloud storage system using fixed-size cloud files, wherein each cloud controller caches a subset of the file data from the remote cloud storage system that is being actively accessed by that cloud controller's respective clients, wherein all new file data received by each cloud controller from its clients is written to the remote cloud storage system via the receiving cloud controller;
maintaining at each cloud controller a copy of the complete metadata for all of the files stored in the distributed filesystem, wherein each cloud controller communicates any changes to the metadata for the distributed filesystem to the full set of cloud controllers for the distributed filesystem to ensure that the clients of the distributed filesystem share a consistent view of each file in the distributed filesystem;
upon receiving in a cloud controller new data from a client, storing the new file data for the distributed filesystem in the remote cloud storage system, wherein the cloud file is sent from the cloud controller to the remote cloud storage system as part of an incremental data snapshot;
upon receiving confirmation that the cloud file has been successfully stored in the remote cloud storage system, sending from the cloud controller an incremental metadata snapshot that includes new metadata for the distributed filesystem that describes the new data, wherein the incremental metadata snapshot is received by the other cloud controllers of the distributed filesystem;
wherein the storage mechanism is configured to maintain a copy of the complete metadata for all of the files stored in the distributed filesystem;
wherein the receiving mechanism is configured to receive all of the metadata changes for the distributed filesystem, including the incremental metadata snapshot, and store the received metadata in the storage mechanism;
wherein the backup cloud controller is configured to detect the failure of the cloud controller and reroute data requests from the clients associated with the failed cloud controller to the backup cloud controller; and
wherein the receiving mechanism is further configured to, upon receiving data requests from clients associated with the failed cloud controller.
Specification