Deleting a file from a distributed filesystem
First Claim
1. A computer-implemented method for deleting a file from a distributed filesystem, the method comprising:
collectively managing the data of the distributed filesystem using two or more cloud controllers, wherein the cloud controllers store metadata for the distributed filesystem;
collectively presenting a unified namespace for the distributed filesystem to the clients of the distributed filesystem via the two or more cloud controllers, wherein the clients access the distributed filesystem via the cloud controllers, wherein the file data for the distributed filesystem is stored in the cloud storage system, wherein each cloud controller caches a subset of the file data from the remote cloud storage system that is being actively accessed by that cloud controller's respective clients, wherein new file data received by each cloud controller from its clients is written to the remote cloud storage system via the receiving cloud controller;
maintaining at each cloud controller a copy of the metadata for the files stored in the distributed filesystem, wherein changes to the metadata for the distributed filesystem are synchronized across the cloud controllers, wherein synchronizing metadata includes synchronizing deduplication information across the cloud controllers, wherein each cloud controller maintains a deduplication table that tracks deduplicated data for the distributed filesystem;
receiving at a cloud controller a request from a client to delete a target file in the distributed filesystem;
updating a user view of the distributed filesystem to present an appearance of the target file being deleted to clients of the cloud controller;
notifying the set of cloud controllers of the pending delete of the target file via an incremental metadata snapshot, wherein upon receiving the incremental metadata snapshot the other cloud controllers in the set of cloud controllers also present an appearance of the target file being deleted to their clients to ensure that the delete operation is consistent across the distributed filesystem; and
initiating a background deletion operation across the cloud controllers to delete the target file without negatively affecting the performance of other users of the distributed filesystem, wherein the background deletion operation comprises using the incremental metadata snapshot to update the deduplication tables across the set of cloud controllers to reflect deduplication updates for the deletion of the target file.
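The steps recited in the claim can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the names `CloudController`, `MetadataSnapshot`, `request_delete`, `receive_snapshot`, and `background_delete` are hypothetical, and modeling the deduplication table as a per-block reference count is an assumption that the claim does not spell out.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataSnapshot:
    """Hypothetical incremental metadata snapshot announcing pending deletes."""
    deleted_files: dict  # path -> list of block ids the file referenced


@dataclass
class CloudController:
    name: str
    files: dict = field(default_factory=dict)   # local metadata copy: path -> block ids
    hidden: set = field(default_factory=set)    # paths presented to clients as deleted
    dedup: dict = field(default_factory=dict)   # assumed dedup table: block id -> refcount

    def add_file(self, path, blocks):
        """Record a file's metadata and count one reference per block."""
        self.files[path] = list(blocks)
        for block in blocks:
            self.dedup[block] = self.dedup.get(block, 0) + 1

    def visible_files(self):
        """The user view: every known file that is not pending deletion."""
        return sorted(p for p in self.files if p not in self.hidden)

    def request_delete(self, path):
        """Hide the file from local clients at once and build the snapshot."""
        self.hidden.add(path)
        return MetadataSnapshot({path: list(self.files[path])})

    def receive_snapshot(self, snapshot):
        """A peer hides the same files so every controller shows the same view."""
        self.hidden.update(snapshot.deleted_files)

    def background_delete(self, snapshot):
        """Later, drop the metadata and release block references.

        Returns the block ids whose reference count reached zero.
        """
        freed = []
        for path, blocks in snapshot.deleted_files.items():
            if path not in self.files:
                continue  # already processed
            for block in blocks:
                self.dedup[block] -= 1
                if self.dedup[block] == 0:
                    del self.dedup[block]
                    freed.append(block)
            del self.files[path]
            self.hidden.discard(path)
        return freed
```

In this sketch, a controller that receives a delete request calls `request_delete`, passes the resulting snapshot to each peer's `receive_snapshot`, and every controller later applies `background_delete` with the same snapshot. A block shared with another file stays in the table because its count remains above zero.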
Abstract
The disclosed embodiments provide techniques for deleting a file from a distributed filesystem. Two or more cloud controllers collectively manage distributed filesystem data that is stored in one or more cloud storage systems; the cloud controllers store metadata for the distributed filesystem, and cache and ensure data consistency for the data stored in the cloud storage systems. During operation, a cloud controller receives a request from a client to delete a file from the distributed filesystem. The cloud controller updates a user view of the distributed filesystem to present the appearance of the target file being deleted to the client, and then initiates a background deletion operation to delete the target file without negatively affecting the performance of the other users of the distributed filesystem.
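The split between an immediate view update and a deferred background deletion can be sketched as follows. This is a hedged illustration only: the class `DeferredDeleter`, its methods, and the idea of limiting work to a small batch per background step are assumptions chosen to show one way of keeping deletion from competing with foreground requests; the abstract does not state how the background operation is scheduled.

```python
from collections import deque


class DeferredDeleter:
    """Hypothetical two-phase delete: hide now, reclaim storage later."""

    def __init__(self, batch_size=2):
        self.visible = set()     # paths shown in the user view
        self.stored = set()      # paths whose data still occupies storage
        self.pending = deque()   # paths waiting for background deletion
        self.batch_size = batch_size  # assumed cap on work per background step

    def create(self, path):
        self.visible.add(path)
        self.stored.add(path)

    def delete(self, path):
        """Foreground path: cheap, only updates the user view."""
        self.visible.remove(path)
        self.pending.append(path)

    def run_background_step(self):
        """Background path: reclaim at most batch_size files per call."""
        done = []
        for _ in range(min(self.batch_size, len(self.pending))):
            path = self.pending.popleft()
            self.stored.discard(path)
            done.append(path)
        return done
```

Calling `delete` returns as soon as the file disappears from the view, while `run_background_step` can be invoked from an idle loop or timer to drain the queue a few files at a time.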
20 Claims
1. A computer-implemented method for deleting a file from a distributed filesystem, the method comprising:
collectively managing the data of the distributed filesystem using two or more cloud controllers, wherein the cloud controllers store metadata for the distributed filesystem;
collectively presenting a unified namespace for the distributed filesystem to the clients of the distributed filesystem via the two or more cloud controllers, wherein the clients access the distributed filesystem via the cloud controllers, wherein the file data for the distributed filesystem is stored in the cloud storage system, wherein each cloud controller caches a subset of the file data from the remote cloud storage system that is being actively accessed by that cloud controller's respective clients, wherein new file data received by each cloud controller from its clients is written to the remote cloud storage system via the receiving cloud controller;
maintaining at each cloud controller a copy of the metadata for the files stored in the distributed filesystem, wherein changes to the metadata for the distributed filesystem are synchronized across the cloud controllers, wherein synchronizing metadata includes synchronizing deduplication information across the cloud controllers, wherein each cloud controller maintains a deduplication table that tracks deduplicated data for the distributed filesystem;
receiving at a cloud controller a request from a client to delete a target file in the distributed filesystem;
updating a user view of the distributed filesystem to present an appearance of the target file being deleted to clients of the cloud controller;
notifying the set of cloud controllers of the pending delete of the target file via an incremental metadata snapshot, wherein upon receiving the incremental metadata snapshot the other cloud controllers in the set of cloud controllers also present an appearance of the target file being deleted to their clients to ensure that the delete operation is consistent across the distributed filesystem; and
initiating a background deletion operation across the cloud controllers to delete the target file without negatively affecting the performance of other users of the distributed filesystem, wherein the background deletion operation comprises using the incremental metadata snapshot to update the deduplication tables across the set of cloud controllers to reflect deduplication updates for the deletion of the target file.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A non-transitory computer-readable storage medium storing instructions that when executed by a computer cause the computer to perform a method for deleting a file from a distributed filesystem, the method comprising:
collectively managing the data of the distributed filesystem using two or more cloud controllers, wherein the cloud controllers store metadata for the distributed filesystem;
collectively presenting a unified namespace for the distributed filesystem to the clients of the distributed filesystem via the two or more cloud controllers, wherein the clients access the distributed filesystem via the cloud controllers, wherein the file data for the distributed filesystem is stored in the cloud storage system, wherein each cloud controller caches a subset of the file data from the remote cloud storage system that is being actively accessed by that cloud controller's respective clients, wherein new file data received by each cloud controller from its clients is written to the remote cloud storage system via the receiving cloud controller;
maintaining at each cloud controller a copy of the metadata for the files stored in the distributed filesystem, wherein changes to the metadata for the distributed filesystem are synchronized across the cloud controllers, wherein synchronizing metadata includes synchronizing deduplication information across the cloud controllers, wherein each cloud controller maintains a deduplication table that tracks deduplicated data for the distributed filesystem;
receiving at a cloud controller a request from a client to delete a target file in the distributed filesystem;
updating a user view of the distributed filesystem to present an appearance of the target file being deleted to clients of the cloud controller;
notifying the set of cloud controllers of the pending delete of the target file via an incremental metadata snapshot, wherein upon receiving the incremental metadata snapshot the other cloud controllers in the set of cloud controllers also present an appearance of the target file being deleted to their clients to ensure that the delete operation is consistent across the distributed filesystem; and
initiating a background deletion operation across the cloud controllers to delete the target file without negatively affecting the performance of other users of the distributed filesystem, wherein the background deletion operation comprises using the incremental metadata snapshot to update the deduplication tables across the set of cloud controllers to reflect deduplication updates for the deletion of the target file.
Dependent claims: 15, 16, 17, 18, 19
20. A cloud controller that deletes a file from a distributed filesystem, comprising:
a processor;
a storage mechanism that stores metadata for the distributed filesystem; and
a storage management mechanism;
wherein two or more cloud controllers collectively manage the data of the distributed filesystem, wherein collectively managing the data of the distributed filesystem comprises:
collectively presenting a unified namespace for the distributed filesystem to the clients of the distributed filesystem via the two or more cloud controllers, wherein the clients access the distributed filesystem via the cloud controllers, wherein the file data for the distributed filesystem is stored in the cloud storage system, wherein each cloud controller caches a subset of the file data from the remote cloud storage system that is being actively accessed by that cloud controller's respective clients, wherein new file data received by each cloud controller from its clients is written to the remote cloud storage system via the receiving cloud controller; and
maintaining at each cloud controller a copy of the metadata for the files stored in the distributed filesystem, wherein changes to the metadata for the distributed filesystem are synchronized across the cloud controllers, wherein synchronizing metadata includes synchronizing deduplication information across the cloud controllers, wherein each cloud controller maintains a deduplication table that tracks deduplicated data for the distributed filesystem;
wherein the cloud controller is configured to receive a request from a client to delete a target file in the distributed filesystem;
wherein the storage management mechanism is configured to update a user view of the distributed filesystem to present an appearance of the target file being deleted to clients of the cloud controller;
wherein the storage management mechanism is configured to notify the set of cloud controllers of the pending delete of the target file via an incremental metadata snapshot, wherein upon receiving the incremental metadata snapshot the other cloud controllers in the set of cloud controllers also present an appearance of the target file being deleted to their clients to ensure that the delete operation is consistent across the distributed filesystem; and
wherein the storage management mechanism is further configured to initiate a background deletion operation across the cloud controllers to delete the target file without negatively affecting the performance of other users of the distributed filesystem, wherein the background deletion operation comprises using the incremental metadata snapshot to update the deduplication tables across the set of cloud controllers to reflect deduplication updates for the deletion of the target file.
Specification