Restoring an archived file in a distributed filesystem
First Claim
1. A computer-implemented method for restoring an archived file in a distributed filesystem, the method comprising:
collectively managing the data of the distributed filesystem using two or more cloud controllers, wherein collectively managing the data comprises:
storing the data for the distributed filesystem in one or more cloud storage systems, wherein the cloud controllers cache and ensure data consistency for data stored in the cloud storage systems;
maintaining in each cloud controller a metadata hierarchy that reflects the current state of the distributed filesystem, wherein changes to the metadata for the distributed filesystem are synchronized across the cloud controllers for the distributed filesystem to ensure that the clients of the distributed filesystem share a consistent view of the files in the distributed filesystem;
collectively presenting a unified namespace for the distributed filesystem to the clients of the distributed filesystem via the two or more cloud controllers, wherein the clients access the distributed filesystem via the cloud controllers, wherein the file data for the distributed filesystem is stored in the cloud storage systems, wherein cloud controllers cache in their local storage devices a subset of the file data from the remote cloud storage system that is being actively accessed by each respective cloud controller's clients, wherein new file data received by each cloud controller from its clients is written to the cloud storage systems; and
archiving one or more infrequently accessed files for the distributed filesystem in an archival cloud storage system;
determining a billing model that discloses cost trade-offs for restore operations for the archival cloud storage system;
receiving at a cloud controller a request from a client system to access an archived file in the distributed filesystem, wherein the cloud controller tracks one or more distributed restore operations that are currently being performed by other cloud controllers for the distributed filesystem upon the archival cloud storage system; and
requesting to restore the archived file from the archival cloud storage system, wherein requesting to restore the archived file comprises:
determining a set of restore options for the archived file based on the billing model, wherein the set of restore options span a range of different restore time intervals and costs that are available for restoring the archived file; and
adjusting the restore behavior for the distributed filesystem by contacting one or more other cloud controllers for the distributed filesystem to collectively adjust one or more of the distributed restore operations for the distributed filesystem to ensure that the full set of restore operations for the distributed filesystem do not exceed a cost and bandwidth constraint for the billing model.
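The last two steps of the claim (deriving restore options from the billing model, then keeping the combined restore activity within a cost and bandwidth constraint) can be pictured as a small selection routine. The following is only a minimal sketch under stated assumptions, not the patented implementation: the tier names, prices, completion times, and the functions `restore_options` and `pick_option` are hypothetical, and the in-flight totals are assumed to have already been gathered from the other cloud controllers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RestoreTier:
    """One tier of a hypothetical archival billing model."""
    name: str
    hours: int        # time until the restored file becomes available
    cost_per_gb: int  # restore charge per gigabyte, in arbitrary cost units


@dataclass(frozen=True)
class RestoreOption:
    tier: str
    hours: int
    cost: int           # total charge for restoring this file
    gb_per_hour: float  # bandwidth the restore would consume


def restore_options(size_gb: int, billing_model: List[RestoreTier]) -> List[RestoreOption]:
    """Build one option per tier, ordered fastest first."""
    options = [
        RestoreOption(t.name, t.hours, t.cost_per_gb * size_gb, size_gb / t.hours)
        for t in billing_model
    ]
    return sorted(options, key=lambda o: o.hours)


def pick_option(
    options: List[RestoreOption],
    in_flight_cost: int,
    in_flight_rate: float,
    cost_budget: int,
    rate_limit: float,
) -> Optional[RestoreOption]:
    """Return the fastest option that keeps the combined restores within
    both the cost budget and the bandwidth limit, or None if none fits."""
    for option in options:
        within_cost = in_flight_cost + option.cost <= cost_budget
        within_rate = in_flight_rate + option.gb_per_hour <= rate_limit
        if within_cost and within_rate:
            return option
    return None


# Hypothetical billing model; the numbers are illustrative only.
BILLING_MODEL = [
    RestoreTier("expedited", hours=1, cost_per_gb=30),
    RestoreTier("standard", hours=5, cost_per_gb=10),
    RestoreTier("bulk", hours=12, cost_per_gb=2),
]

options = restore_options(10, BILLING_MODEL)
choice = pick_option(options, in_flight_cost=120, in_flight_rate=3.0,
                     cost_budget=250, rate_limit=6.0)
print(choice.tier if choice else "defer restore")
```

When no option fits, a controller following the claim would contact its peers to adjust other in-flight restores; that negotiation is not modeled in this sketch.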
Abstract
The disclosed embodiments disclose techniques for restoring an archived file in a distributed filesystem. Two or more cloud controllers collectively manage distributed filesystem data that is stored in one or more cloud storage systems; the cloud controllers ensure data consistency for the stored data, and each cloud controller caches portions of the distributed filesystem. Furthermore, cloud controllers may archive infrequently-accessed files in an archival cloud storage system. During operation, a cloud controller receives a request from a client system to access an archived file, and restores this archived file from the archival cloud storage system.
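To make the read path in the abstract concrete, here is a minimal sketch, assuming an in-memory model in which plain dictionaries stand in for the cloud storage system and the archival cloud storage system. The class `CloudController` and its methods are hypothetical names chosen for illustration, and the restore is shown as an immediate copy, whereas a real archival store would typically complete it after a delay.

```python
class CloudController:
    """Toy model of one cloud controller (hypothetical, for illustration)."""

    def __init__(self, cloud: dict, archive: dict):
        self.cache = {}         # local cache: path -> bytes
        self.cloud = cloud      # stands in for the regular cloud storage system
        self.archive = archive  # stands in for the archival cloud storage system

    def archive_file(self, path: str) -> None:
        """Move an infrequently accessed file to archival storage."""
        self.archive[path] = self.cloud.pop(path)
        self.cache.pop(path, None)  # drop any cached copy

    def read(self, path: str) -> bytes:
        """Serve a client read from the cache, cloud storage, or the archive."""
        if path in self.cache:
            return self.cache[path]
        if path in self.cloud:
            data = self.cloud[path]
        elif path in self.archive:
            # Restore step: assumed here to be a synchronous copy.
            data = self.archive[path]
        else:
            raise FileNotFoundError(path)
        self.cache[path] = data
        return data


cloud = {"/projects/report.txt": b"hot data", "/old/log.txt": b"cold data"}
controller = CloudController(cloud, archive={})
controller.archive_file("/old/log.txt")
restored = controller.read("/old/log.txt")
```

In this model the restored file lands in the controller's local cache; whether a restored file is also written back to the regular cloud storage system is not stated in the abstract and is left out.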
20 Claims
1. A computer-implemented method for restoring an archived file in a distributed filesystem, the method comprising:
collectively managing the data of the distributed filesystem using two or more cloud controllers, wherein collectively managing the data comprises:
storing the data for the distributed filesystem in one or more cloud storage systems, wherein the cloud controllers cache and ensure data consistency for data stored in the cloud storage systems;
maintaining in each cloud controller a metadata hierarchy that reflects the current state of the distributed filesystem, wherein changes to the metadata for the distributed filesystem are synchronized across the cloud controllers for the distributed filesystem to ensure that the clients of the distributed filesystem share a consistent view of the files in the distributed filesystem;
collectively presenting a unified namespace for the distributed filesystem to the clients of the distributed filesystem via the two or more cloud controllers, wherein the clients access the distributed filesystem via the cloud controllers, wherein the file data for the distributed filesystem is stored in the cloud storage systems, wherein cloud controllers cache in their local storage devices a subset of the file data from the remote cloud storage system that is being actively accessed by each respective cloud controller's clients, wherein new file data received by each cloud controller from its clients is written to the cloud storage systems; and
archiving one or more infrequently accessed files for the distributed filesystem in an archival cloud storage system;
determining a billing model that discloses cost trade-offs for restore operations for the archival cloud storage system;
receiving at a cloud controller a request from a client system to access an archived file in the distributed filesystem, wherein the cloud controller tracks one or more distributed restore operations that are currently being performed by other cloud controllers for the distributed filesystem upon the archival cloud storage system; and
requesting to restore the archived file from the archival cloud storage system, wherein requesting to restore the archived file comprises:
determining a set of restore options for the archived file based on the billing model, wherein the set of restore options span a range of different restore time intervals and costs that are available for restoring the archived file; and
adjusting the restore behavior for the distributed filesystem by contacting one or more other cloud controllers for the distributed filesystem to collectively adjust one or more of the distributed restore operations for the distributed filesystem to ensure that the full set of restore operations for the distributed filesystem do not exceed a cost and bandwidth constraint for the billing model.
Dependent claims: 2-14.
15. A non-transitory computer-readable storage medium storing instructions that when executed by a computer cause the computer to perform a method for restoring an archived file in a distributed filesystem, the method comprising:
collectively managing the data of the distributed filesystem using two or more cloud controllers, wherein collectively managing the data comprises:
storing the data for the distributed filesystem in one or more cloud storage systems, wherein the cloud controllers cache and ensure data consistency for data stored in the cloud storage systems;
maintaining in each cloud controller a metadata hierarchy that reflects the current state of the distributed filesystem, wherein changes to the metadata for the distributed filesystem are synchronized across the cloud controllers for the distributed filesystem to ensure that the clients of the distributed filesystem share a consistent view of the files in the distributed filesystem;
collectively presenting a unified namespace for the distributed filesystem to the clients of the distributed filesystem via the two or more cloud controllers, wherein the clients access the distributed filesystem via the cloud controllers, wherein the file data for the distributed filesystem is stored in the cloud storage systems, wherein cloud controllers cache in their local storage devices a subset of the file data from the remote cloud storage system that is being actively accessed by each respective cloud controller's clients, wherein new file data received by each cloud controller from its clients is written to the cloud storage systems; and
archiving one or more infrequently accessed files for the distributed filesystem in an archival cloud storage system;
determining a billing model that discloses cost trade-offs for restore operations for the archival cloud storage system;
receiving at a cloud controller a request from a client system to access an archived file in the distributed filesystem, wherein the cloud controller tracks one or more distributed restore operations that are currently being performed by other cloud controllers for the distributed filesystem upon the archival cloud storage system; and
requesting to restore the archived file from the archival cloud storage system, wherein requesting to restore the archived file comprises:
determining a set of restore options for the archived file based on the billing model, wherein the set of restore options span a range of different restore time intervals and costs that are available for restoring the archived file; and
adjusting the restore behavior for the distributed filesystem by contacting one or more other cloud controllers for the distributed filesystem to collectively adjust one or more of the distributed restore operations for the distributed filesystem to ensure that the full set of restore operations for the distributed filesystem do not exceed a cost and bandwidth constraint for the billing model.
Dependent claims: 16-19.
20. A cloud controller that restores an archived file in a distributed filesystem, comprising:
a processor;
a storage mechanism that stores metadata for the distributed filesystem; and
a storage management mechanism;
wherein two or more cloud controllers collectively manage the data of the distributed filesystem by:
storing the data for the distributed filesystem in one or more cloud storage systems, wherein the cloud controllers cache and ensure data consistency for data stored in the cloud storage systems;
maintaining in each cloud controller a metadata hierarchy that reflects the current state of the distributed filesystem, wherein changes to the metadata for the distributed filesystem are synchronized across the cloud controllers for the distributed filesystem to ensure that the clients of the distributed filesystem share a consistent view of the files in the distributed filesystem; and
collectively presenting a unified namespace for the distributed filesystem to the clients of the distributed filesystem via the two or more cloud controllers, wherein the clients access the distributed filesystem via the cloud controllers, wherein the file data for the distributed filesystem is stored in the cloud storage systems, wherein cloud controllers cache in their local storage devices a subset of the file data from the remote cloud storage system that is being actively accessed by each respective cloud controller's clients, wherein new file data received by each cloud controller from its clients is written to the cloud storage systems;
wherein the cloud controller is configured to archive one or more infrequently accessed files for the distributed filesystem in an archival cloud storage system;
wherein the cloud controller is configured to determine a billing model that discloses cost trade-offs for restore operations for the archival cloud storage system;
wherein the cloud controller is further configured to receive a request from a client system to access an archived file in the distributed filesystem, wherein the cloud controller tracks one or more distributed restore operations that are currently being performed by other cloud controllers for the distributed filesystem upon the archival cloud storage system; and
wherein the cloud controller is further configured to request to restore the archived file from the archival cloud storage system, wherein requesting to restore the archived file comprises the cloud controller:
determining a set of restore options for the archived file based on the billing model, wherein the set of restore options span a range of different restore time intervals and costs that are available for restoring the archived file; and
adjusting the restore behavior for the distributed filesystem by contacting one or more other cloud controllers for the distributed filesystem to collectively adjust one or more of the distributed restore operations for the distributed filesystem to ensure that the full set of restore operations for the distributed filesystem do not exceed a cost and bandwidth constraint for the billing model.
Specification