
Managing metadata and data storage for a cloud controller in a distributed filesystem

  • US 9,792,298 B1
  • Filed: 02/15/2013
  • Issued: 10/17/2017
  • Est. Priority Date: 05/03/2010
  • Status: Active Grant
First Claim

1. A computer-implemented method for managing metadata and data storage for a cloud controller in a distributed filesystem, the method comprising:

  • collectively managing the data of the distributed filesystem using two or more cloud controllers, wherein collectively managing the data comprises:

    storing the data for the distributed filesystem in a remote cloud storage system, wherein the cloud controllers cache and ensure data consistency for data stored in the remote cloud storage system, wherein the cloud controller includes a local storage device, wherein the local storage device comprises a rotating disk drive that comprises one or more disk platters;

    maintaining in each cloud controller a metadata hierarchy that reflects the current state of the distributed filesystem, wherein the metadata hierarchy is stored in the local storage device, wherein changes to the metadata for the distributed filesystem are synchronized across the cloud controllers for the distributed filesystem to ensure that the clients of the distributed filesystem share a consistent view of the files in the distributed filesystem; and

    collectively presenting a unified namespace for the distributed filesystem to the clients of the distributed filesystem via the two or more cloud controllers, wherein the clients access the distributed filesystem via the cloud controllers, wherein the file data for the distributed filesystem is stored in the remote cloud storage system, wherein cloud controllers cache in their local storage devices a subset of the file data from the remote cloud storage system that is being actively accessed by each respective cloud controller's clients, wherein new file data received by each cloud controller from its clients is written to the remote cloud storage system, wherein the metadata hierarchy in the cloud controller tracks the location of distributed filesystem data blocks in the remote cloud storage system and cached distributed filesystem data blocks in the cloud controller, wherein the cloud controller uses the metadata hierarchy to locate and download requested, uncached data blocks in the distributed filesystem from the remote cloud storage system;

    defining in a disk platter of the rotating disk drive two or more metadata regions in which the cloud controller stores metadata for the distributed filesystem, wherein the metadata regions are distinct from two or more allocated data regions that are defined in the disk platter that cache distributed filesystem data, wherein different regions of the disk platter in the local storage device have different levels of performance, wherein a metadata region is defined in an outer region of the disk platter that supports the highest access bandwidth and lower access latency;

    receiving an incremental metadata snapshot that references new data written to the distributed filesystem;

    storing a new metadata entry for the distributed filesystem from the incremental metadata snapshot in the metadata region on the disk platter; and

    upon receiving a client request to access a new data block referenced in the incremental metadata snapshot, selecting a data region that is in near proximity to the metadata region and caching the new data block in that data region to ensure that the new metadata entry and the new data block are in relative proximity on the disk platter, thereby ensuring that associated metadata and data can be read without substantially degrading access performance, wherein the data region is distinct from the metadata region;

    wherein the cloud controller predicts that the new metadata entry and the new data block are likely to be accessed frequently, wherein the cloud controller selects the metadata region and the data region for the new metadata entry and the new data block respectively because they are on an outer region of the disk platter and hence more favorable for frequent accesses, wherein outer regions of the disk platter have higher spatial density and hence higher effective data bandwidth that improves access rates for frequently accessed data stored in such regions.
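
The placement step recited above (storing a metadata entry in a metadata region, then caching the referenced data block in a nearby data region) can be illustrated with a minimal sketch. This is not the patented implementation. The Region class, the nearest_data_region and cache_block helpers, the block counts, and the interleaved layout are all hypothetical names and values chosen for illustration. The sketch also assumes that lower block offsets correspond to outer tracks of the platter, and it uses the distance between region midpoints as a rough stand-in for physical proximity.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Region:
    """A contiguous range of blocks on one platter (hypothetical model)."""
    name: str
    kind: str        # "metadata" or "data"
    start: int       # first block offset; lower offsets assumed to be outer tracks
    length: int      # region size in blocks
    used: int = 0    # blocks already allocated in this region

    @property
    def free(self) -> int:
        return self.length - self.used

    @property
    def midpoint(self) -> float:
        return self.start + self.length / 2


def nearest_data_region(meta: Region, regions: List[Region],
                        blocks_needed: int) -> Optional[Region]:
    """Pick the data region with enough free space closest to the metadata region."""
    candidates = [r for r in regions
                  if r.kind == "data" and r.free >= blocks_needed]
    if not candidates:
        return None
    # Midpoint distance approximates seek distance between the two regions.
    return min(candidates, key=lambda r: abs(r.midpoint - meta.midpoint))


def cache_block(meta: Region, regions: List[Region],
                blocks_needed: int = 1) -> Optional[Tuple[str, int]]:
    """Reserve space for a newly requested block near its metadata entry.

    Returns (region name, starting block offset), or None if no data
    region has room.
    """
    region = nearest_data_region(meta, regions, blocks_needed)
    if region is None:
        return None
    offset = region.start + region.used
    region.used += blocks_needed
    return region.name, offset


# Hypothetical layout: metadata and data regions interleaved from the
# outer edge (offset 0) inward.
layout = [
    Region("meta0", "metadata", start=0, length=100),
    Region("data0", "data", start=100, length=1000),
    Region("meta1", "metadata", start=1100, length=100),
    Region("data1", "data", start=1200, length=1000),
]

# A block referenced by an entry in meta0 is placed in the nearest data region.
print(cache_block(layout[0], layout))
```

Interleaving metadata and data regions, as in this layout, keeps a candidate data region close to every metadata region. A real controller would also need eviction and a policy for predicting which blocks are accessed frequently, neither of which is modeled here.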

  • 9 Assignments