ARCHIVING DATA FOR A DISTRIBUTED FILESYSTEM
Abstract
The disclosed embodiments provide a system that archives data for a distributed filesystem. Two or more cloud controllers collectively manage distributed filesystem data that is stored in one or more cloud storage systems; the cloud controllers cache and ensure data consistency for the stored data. During operation, a cloud controller determines that a cloud file in a previously stored data snapshot is no longer being actively referenced in the distributed filesystem. The cloud controller transfers this cloud file from the (first) cloud storage system to an archival cloud storage system, thereby reducing storage costs while preserving the data in the cloud file in case it is ever needed again.
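The archival step described in the abstract can be sketched as follows. This is a minimal illustration, not the disclosed implementation: `CloudStore`, `archive_unreferenced`, and the `live_refs` reference-count map are hypothetical names, and both storage systems are modeled as in-memory dictionaries rather than real cloud services.

```python
from dataclasses import dataclass, field


@dataclass
class CloudStore:
    """Hypothetical in-memory stand-in for a cloud storage system."""
    name: str
    objects: dict = field(default_factory=dict)  # cloud file id -> bytes


def archive_unreferenced(primary, archive, live_refs):
    """Move every cloud file with no live references from primary to archive.

    live_refs maps a cloud file id to the number of filesystem references
    that still point into that cloud file (an assumed bookkeeping structure).
    Returns the ids that were moved, in sorted order.
    """
    moved = []
    # sorted() builds a separate list, so deleting from the dict is safe here.
    for file_id in sorted(primary.objects):
        if live_refs.get(file_id, 0) == 0:
            # Copy first, then delete, so the data always exists somewhere.
            archive.objects[file_id] = primary.objects[file_id]
            del primary.objects[file_id]
            moved.append(file_id)
    return moved
```

In this sketch a cloud file counts as "no longer actively referenced" when its count in `live_refs` is zero or missing; how a real cloud controller makes that determination is not specified in this section.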
20 Claims
1. A computer-implemented method for archiving data for a distributed filesystem, the method comprising:
collectively managing the data of the distributed filesystem using two or more cloud controllers, wherein collectively managing the data comprises:
upon receiving in a cloud controller new data from a client, sending from the cloud controller an incremental metadata snapshot for the new data, wherein the incremental metadata snapshot is received by the other cloud controllers of the distributed filesystem;
storing the data for the distributed filesystem in one or more cloud storage systems, wherein the cloud controllers cache and ensure data consistency for data stored in the cloud storage systems; and
sending an incremental data snapshot containing the new data from the cloud controller to a first cloud storage system;
at a subsequent time, determining that a cloud file in the incremental data snapshot is no longer actively referenced in the distributed filesystem; and
transferring the cloud file from the first cloud storage system to an archival cloud storage system.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A non-transitory computer-readable storage medium storing instructions that when executed by a computer cause the computer to perform a method for archiving data from a distributed filesystem, the method comprising:
collectively managing the data of the distributed filesystem using two or more cloud controllers, wherein collectively managing the data comprises:
upon receiving in a cloud controller new data from a client, sending from the cloud controller an incremental metadata snapshot for the new data, wherein the incremental metadata snapshot is received by the other cloud controllers of the distributed filesystem;
storing the data for the distributed filesystem in one or more cloud storage systems, wherein the cloud controllers cache and ensure data consistency for data stored in the cloud storage systems; and
sending an incremental data snapshot containing the new data from the cloud controller to a first cloud storage system;
at a subsequent time, determining that a cloud file in the incremental data snapshot is no longer actively referenced in the distributed filesystem; and
transferring the cloud file from the first cloud storage system to an archival cloud storage system.
Dependent claims: 14, 15, 16, 17, 18, 19
20. A cloud controller that archives data from a distributed filesystem, comprising:
a processor;
a storage mechanism that stores metadata for the distributed filesystem; and
a storage management mechanism;
wherein two or more cloud controllers collectively manage the data of the distributed filesystem, wherein collectively managing the data comprises:
upon receiving in a cloud controller new data from a client, sending from the cloud controller an incremental metadata snapshot for the new data, wherein the incremental metadata snapshot is received by the other cloud controllers of the distributed filesystem;
storing the data for the distributed filesystem in one or more cloud storage systems, wherein the storage management mechanisms of the cloud controllers are configured to cache and ensure data consistency for data stored in the cloud storage systems; and
sending an incremental data snapshot containing the new data from the cloud controller to a first cloud storage system; and
wherein the storage management mechanism is configured to:
at a subsequent time, determine that a cloud file in the incremental data snapshot is no longer actively referenced in the distributed filesystem; and
transfer the cloud file from the first cloud storage system to an archival cloud storage system.
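The collective-management steps recited in the independent claims (receive new data, send an incremental metadata snapshot to the other cloud controllers, send an incremental data snapshot to the first cloud storage system) can be sketched as below. This is an illustrative model under stated assumptions: the `CloudController` class, its method names, the cloud file id scheme, and the use of a shared dictionary as the first cloud storage system are all hypothetical and do not come from the claims.

```python
class CloudController:
    """Hypothetical controller holding a metadata view and a list of peers."""

    def __init__(self, name, cloud_storage):
        self.name = name
        # Shared dict standing in for the first cloud storage system.
        self.cloud_storage = cloud_storage
        self.metadata = {}  # path -> cloud file id
        self.peers = []     # the other cloud controllers
        self._next_id = 0

    def write(self, path, data):
        """Handle new client data: update metadata, notify peers, upload data."""
        file_id = f"{self.name}-cf{self._next_id}"  # assumed id scheme
        self._next_id += 1

        # Incremental metadata snapshot: only the entries that changed.
        meta_snapshot = {path: file_id}
        self.metadata.update(meta_snapshot)
        for peer in self.peers:
            peer.receive_metadata(meta_snapshot)

        # Incremental data snapshot: the new data goes to cloud storage.
        self.cloud_storage[file_id] = data
        return file_id

    def receive_metadata(self, meta_snapshot):
        """Merge a peer's incremental metadata snapshot into the local view."""
        self.metadata.update(meta_snapshot)
```

After a write on one controller, every peer's `metadata` maps the path to the same cloud file id, so any controller can later locate the data in the shared store. Caching and consistency handling, which the claims also recite, are left out of this sketch.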
Specification