
ZFS block-level deduplication at cloud scale

  • US 10,698,941 B2
  • Filed: 05/31/2017
  • Issued: 06/30/2020
  • Est. Priority Date: 01/06/2017
  • Status: Active Grant
First Claim

1. A method of deduplicating data blocks on a cloud object store that is remote from a block storage system, the method comprising:

  • receiving, at an application layer of the block storage system and through a system call interface of an interface layer of the block storage system, a first request to store or modify a file, the first request including file data;

    generating, at a transactional object layer of the block storage system, a plurality of data blocks, each data block of the plurality of data blocks corresponding to at least a portion of the file data;

    generating, at the transactional object layer of the block storage system and for each data block of the plurality of data blocks, a generated name using a naming protocol, the generated name being based on a content of the data block;

    determining, at a data management unit, that a generated name of a first data block of the plurality of data blocks is equivalent to an existing name associated with an existing data block, the existing data block corresponding to an existing cloud storage object stored in the cloud object store, and the existing name generated using the naming protocol;

    generating, at the transactional object layer of the block storage system, a set of data blocks, the set of data blocks including the plurality of data blocks while excluding the first data block;

    generating, at the transactional object layer of the block storage system, a plurality of metadata blocks corresponding to the existing data block and each data block of the set of data blocks, the plurality of metadata blocks being configured to hierarchically point to lower-level blocks associated with the file and thereby correspond to at least part of a tree hierarchy for the file, wherein:

    each metadata block of the plurality of metadata blocks includes one or more address pointers, each address pointer of the one or more address pointers pointing to the existing data block, a data block of the set of data blocks, or to a metadata block in the plurality of metadata blocks;

    the plurality of metadata blocks includes a root block that is positioned at a top of the tree hierarchy for the file and one or more non-root metadata blocks;

    each non-root metadata block of the plurality of metadata blocks being pointed to by at least one metadata block of the plurality of metadata blocks of the tree hierarchy for the file; and

    each data block of the set of data blocks is pointed to by a metadata block of the plurality of metadata blocks of the tree hierarchy for the file;

    causing a set of cloud storage objects to be stored in the cloud object store by transmitting the set of data blocks and the plurality of metadata blocks to a hybrid cloud storage system, the hybrid cloud storage system managing data storage in the cloud object store, wherein causing a set of cloud storage objects to be stored includes:

    generating the set of cloud storage blocks based on the data blocks of the set of data blocks; and

    generating, for each cloud storage object, an address pointer that points to the cloud storage object, the address pointer generated based on an identifier of the cloud storage object and a path specification of the cloud storage object;

    transmitting, to the hybrid cloud storage system, one or more second requests for a set of addresses, each address of the set of addresses corresponding to a cloud storage object of the set of cloud storage objects that correspond to the set of data blocks;

    receiving, from the hybrid cloud storage system, one or more responses to the one or more second requests, each response of the one or more responses identifying an address corresponding to a data block of the set of data blocks or a metadata block of the plurality of metadata blocks, the address identifying a storage location in the cloud object store; and

    generating, using the tree hierarchy and the set of addresses, a mapping between each data block of the plurality of data blocks to a cloud storage object of the set of cloud storage objects.
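
As an informal illustration only, and not the patented implementation, the claimed flow can be sketched in Python: file data is split into blocks, each block gets a content-based name, a block whose name already exists in the store is not uploaded again, and a tree of metadata blocks points down to the stored objects. Everything here is an assumption made for the sketch: SHA-256 stands in for the unspecified "naming protocol", `CloudStore` is a hypothetical in-memory substitute for the cloud object store, the `bucket/blocks/<name>` address layout is invented to show a pointer built from a path specification and an identifier, and the metadata blocks are kept in memory rather than transmitted.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field

BLOCK_SIZE = 4  # deliberately tiny so the example is easy to trace (assumption)


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Split file data into fixed-size data blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def block_name(block: bytes) -> str:
    """Content-based name; SHA-256 is an assumed stand-in for the naming protocol."""
    return hashlib.sha256(block).hexdigest()


@dataclass
class CloudStore:
    """Hypothetical in-memory stand-in for a remote cloud object store."""
    bucket: str = "pool0"
    objects: dict[str, bytes] = field(default_factory=dict)

    def address(self, name: str) -> str:
        # Address pointer built from a path specification plus an object identifier.
        return f"{self.bucket}/blocks/{name}"

    def put(self, name: str, payload: bytes) -> str:
        self.objects[name] = payload
        return self.address(name)

    def get(self, address: str) -> bytes:
        # The object identifier is the last path component of the address.
        return self.objects[address.rsplit("/", 1)[1]]


def build_tree(pointers: list, fanout: int = 2) -> dict:
    """Group pointers into metadata blocks, level by level, until one root remains."""
    if not pointers:
        return {"pointers": []}
    level = pointers
    while True:
        level = [{"pointers": level[i:i + fanout]}
                 for i in range(0, len(level), fanout)]
        if len(level) == 1:
            return level[0]


def store_file(data: bytes, store: CloudStore) -> tuple[dict, int]:
    """Store a file; return (root metadata block, number of blocks uploaded)."""
    uploads = 0
    addresses = []
    for block in split_blocks(data):
        name = block_name(block)
        if name in store.objects:
            # Name matches an existing object: reference it, do not upload again.
            addresses.append(store.address(name))
        else:
            addresses.append(store.put(name, block))
            uploads += 1
    return build_tree(addresses), uploads


def leaf_addresses(node) -> list[str]:
    """Walk the tree depth-first and return data-block addresses in file order."""
    if isinstance(node, str):
        return [node]
    found = []
    for child in node["pointers"]:
        found.extend(leaf_addresses(child))
    return found


def read_file(root: dict, store: CloudStore) -> bytes:
    """Rebuild the file by following the tree's pointers to stored objects."""
    return b"".join(store.get(addr) for addr in leaf_addresses(root))
```

Under these assumptions, calling `store_file(b"AAAABBBBAAAACCCC", CloudStore())` uploads three objects for four blocks, because the repeated `AAAA` block resolves to the same name and address, while `read_file` still reconstructs the full file through the tree.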

  • 1 Assignment