Performing anti-virus checks for a distributed filesystem
First Claim
1. A computer-implemented method for performing file checks for a distributed filesystem, the method comprising:
- collectively managing the data of the distributed filesystem using two or more cloud controllers, wherein collectively managing the data comprises:
storing the data for the distributed filesystem in a cloud storage system, wherein the cloud controllers cache and ensure data consistency for data stored in the cloud storage system;
maintaining in each cloud controller a metadata hierarchy that reflects the current state of the distributed filesystem, wherein changes to the metadata for the distributed filesystem are synchronized across the cloud controllers, wherein the metadata hierarchy in the cloud controller tracks the location of distributed filesystem data blocks in the cloud storage system and cached distributed filesystem data blocks in the cloud controller, wherein the cloud controller uses the metadata hierarchy to locate and download requested, uncached data blocks in the distributed filesystem from the cloud storage system, wherein the file data for the distributed filesystem is stored in the cloud storage systems, wherein cloud controllers cache in their local storage devices a subset of the file data from the cloud storage system that is being actively accessed by each respective cloud controller's clients;
upon receiving at a cloud controller a write request from a client system that seeks to store a target file in the distributed filesystem, sending an incremental metadata snapshot containing metadata for the target file from the cloud controller to the other cloud controllers for the distributed filesystem;
wherein a third cloud controller that is associated with scanning operations uses the incremental metadata snapshot to retrieve the target file from the cloud storage system and initiates a scan for the target file, wherein the third cloud controller is co-located with the cloud storage system and geographically distinct and separate from the cloud controller receiving the target file and the client system;
wherein the third cloud controller determines that the target file has been successfully scanned and sends a subsequent incremental metadata snapshot indicating that the target file has been scanned to the other cloud controllers of the distributed filesystem;
wherein a second cloud controller receiving a client request to access the target file uses the contents of the subsequent incremental metadata snapshot to confirm that the target file has been scanned before allowing access to the target file.
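The access gate described in the claim can be illustrated with a minimal sketch. It assumes each controller keeps a per-file scan state in its metadata and that an incremental metadata snapshot is simply a mapping of changed paths to their new state; the names `CloudController`, `MetadataSnapshot`, `ScanState`, and `can_access` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field
from enum import Enum


class ScanState(Enum):
    PENDING = "pending"   # written, not yet scanned
    SCANNED = "scanned"   # scan completed successfully


@dataclass
class MetadataSnapshot:
    """Hypothetical incremental snapshot: only the entries that changed."""
    entries: dict  # path -> ScanState


@dataclass
class CloudController:
    name: str
    metadata: dict = field(default_factory=dict)  # path -> ScanState

    def apply_snapshot(self, snapshot):
        # Merge the incremental changes into the local metadata hierarchy.
        self.metadata.update(snapshot.entries)

    def write(self, path, peers):
        # On a client write, record the file as unscanned and notify peers.
        snapshot = MetadataSnapshot({path: ScanState.PENDING})
        self.apply_snapshot(snapshot)
        for peer in peers:
            peer.apply_snapshot(snapshot)
        return snapshot

    def can_access(self, path):
        # Allow access only once a later snapshot marked the file as scanned.
        return self.metadata.get(path) is ScanState.SCANNED
```

In this sketch a controller denies access to a file it has only seen in the first snapshot, and allows it after applying a subsequent snapshot that carries `ScanState.SCANNED` for the same path.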
Abstract
The disclosed embodiments disclose techniques that facilitate the process of performing anti-virus checks for a distributed filesystem. Two or more cloud controllers collectively manage distributed filesystem data that is stored in one or more cloud storage systems; the cloud controllers ensure data consistency for the stored data, and each cloud controller caches portions of the distributed filesystem. During operation, a cloud controller receives a write request from a client system that seeks to store a target file in the distributed system. A scan is then performed for this target file. For instance, the scan may be an anti-virus scan that ensures that viruses are not spread to the distributed filesystem or the clients of the distributed filesystem.
20 Claims
13. A non-transitory computer-readable storage medium storing instructions that when executed by a computer cause the computer to perform a method for performing anti-virus checks for a distributed filesystem, the method comprising:
- collectively managing the data of the distributed filesystem using two or more cloud controllers, wherein collectively managing the data comprises:
storing the data for the distributed filesystem in a cloud storage system, wherein the cloud controllers cache and ensure data consistency for data stored in the cloud storage system;
maintaining in each cloud controller a metadata hierarchy that reflects the current state of the distributed filesystem, wherein changes to the metadata for the distributed filesystem are synchronized across the cloud controllers, wherein the metadata hierarchy in the cloud controller tracks the location of distributed filesystem data blocks in the cloud storage system and cached distributed filesystem data blocks in the cloud controller, wherein the cloud controller uses the metadata hierarchy to locate and download requested, uncached data blocks in the distributed filesystem from the cloud storage system, wherein the file data for the distributed filesystem is stored in the cloud storage systems, wherein cloud controllers cache in their local storage devices a subset of the file data from the cloud storage system that is being actively accessed by each respective cloud controller's clients;
upon receiving at a cloud controller a write request from a client system that seeks to store a target file in the distributed filesystem, sending an incremental metadata snapshot containing metadata for the target file from the cloud controller to the other cloud controllers for the distributed filesystem;
wherein a third cloud controller that is associated with scanning operations uses the incremental metadata snapshot to retrieve the target file from the cloud storage system and initiates a scan for the target file, wherein the third cloud controller is co-located with the cloud storage system and geographically distinct and separate from the cloud controller receiving the target file and the client system;
wherein the third cloud controller determines that the target file has been successfully scanned and sends a subsequent incremental metadata snapshot indicating that the target file has been scanned to the other cloud controllers of the distributed filesystem;
wherein a second cloud controller receiving a client request to access the target file uses the contents of the subsequent incremental metadata snapshot to confirm that the target file has been scanned before allowing access to the target file.
20. A cloud controller that initiates anti-virus checks for a distributed filesystem, comprising:
- a processor;
a storage mechanism that stores metadata for the distributed filesystem; and
a storage management mechanism;
wherein two or more cloud controllers collectively manage the data of the distributed filesystem, wherein collectively managing the data comprises:
storing the data for the distributed filesystem in a cloud storage system, wherein the cloud controllers cache and ensure data consistency for data stored in the cloud storage system;
maintaining in each cloud controller a metadata hierarchy that reflects the current state of the distributed filesystem, wherein changes to the metadata for the distributed filesystem are synchronized across the cloud controllers, wherein the metadata hierarchy in the cloud controller tracks the location of distributed filesystem data blocks in the cloud storage system and cached distributed filesystem data blocks in the cloud controller, wherein the cloud controller uses the metadata hierarchy to locate and download requested, uncached data blocks in the distributed filesystem from the cloud storage system, wherein the file data for the distributed filesystem is stored in the cloud storage systems, wherein cloud controllers cache in their local storage devices a subset of the data from the cloud storage system that is being actively accessed by each respective cloud controller's clients; and
wherein the cloud controller is configured to:
receive an incremental metadata snapshot containing metadata for a new file that has been written to the distributed filesystem by a distinct cloud controller for the distributed filesystem;
use the incremental metadata snapshot to download the new file from the cloud storage system, wherein the cloud controller is co-located with the cloud storage system and geographically distinct and separate from the distinct cloud controller receiving the target file;
transfer the target file to an anti-virus service to initiate an anti-virus scan for the target file; and
upon determining from the anti-virus service that no virus was found for the target file, send a subsequent incremental metadata snapshot that indicates that the target file is clean to the other cloud controllers for the distributed filesystem;
wherein a second cloud controller receiving a client request to access the target file uses the contents of the subsequent incremental metadata snapshot to confirm that the target has been successfully scanned before allowing access to the target file.
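The scanning controller's steps (receive snapshot, download, hand off to an anti-virus service, publish a follow-up snapshot for clean files) might look roughly like the sketch below. Everything here is an assumption made for illustration: the snapshot is modeled as a plain dictionary, and `download` and `scan_is_clean` are hypothetical callables standing in for the cloud storage client and the anti-virus service, neither of which the claim specifies.

```python
from typing import Callable, Dict, Optional


def handle_snapshot(
    snapshot: Dict[str, dict],
    download: Callable[[str], bytes],
    scan_is_clean: Callable[[bytes], bool],
) -> Optional[Dict[str, dict]]:
    """Scan each new file named in an incoming snapshot.

    Returns a subsequent snapshot containing only the files found clean,
    or None when there is nothing to announce.
    """
    clean: Dict[str, dict] = {}
    for path, meta in snapshot.items():
        if meta.get("scanned"):
            continue  # already marked by an earlier snapshot
        # Hypothetical field: where the file's data lives in cloud storage.
        data = download(meta["cloud_key"])
        if scan_is_clean(data):
            clean[path] = {**meta, "scanned": True}
    return clean or None
```

Files that fail the scan are simply left out of the returned snapshot in this sketch, so other controllers never see them marked as scanned; how an infected file is actually handled is not covered by the claim text above.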
Specification