
Deduplicating snapshots associated with a backup operation

  • US 9,990,156 B1
  • Filed: 06/13/2014
  • Issued: 06/05/2018
  • Est. Priority Date: 06/13/2014
  • Status: Active Grant
First Claim

1. A system, comprising:

  • a processor configured to:

    receive an indication to perform a backup operation on a plurality of storage areas of a source system;

    in response to the indication, perform the backup operation including by generating a plurality of snapshots corresponding to respective ones of the plurality of storage areas associated with the backup operation, wherein a snapshot corresponds to a point-in-time state of a corresponding storage area;

    maintain, at the source system, deduplication data corresponding to one or more data blocks that have already been written to backup media during the backup operation, wherein the deduplication data comprises a plurality of identifiers corresponding to respective ones of data blocks that have already been written to backup media; and

    use the deduplication data, at the source system, to deduplicate backup data across the plurality of snapshots associated with the backup operation, wherein to use the deduplication data comprises to compare, at the source system, an identifier associated with a data block to back up in a first snapshot of the plurality of snapshots to the plurality of identifiers, wherein in response to a first determination that a matching identifier is not found in the plurality of identifiers:

    determine, at the source system, that the data block has not already been written to backup media;

    send, from the source system, to a backup storage underlying data of the data block to be stored as an entry associated with the data block in the first snapshot at the backup media at the backup storage;

    send, from the source system, to the backup storage a metadata block corresponding to the data block, wherein the metadata block is to be stored in the first snapshot at the backup media at the backup storage, wherein the metadata block is configured to be used to determine to which file or directory, or both, the data block belongs; and

    update, at the source system, the deduplication data to include the identifier associated with the data block;

    wherein in response to a second determination that the matching identifier is found in the plurality of identifiers:

    determine, at the source system, that the data block has already been written to the backup media; and

    send, from the source system, to the backup storage a representation of the data block to be stored as the entry associated with the data block in the first snapshot on the backup media at the backup storage, wherein the representation of the data block comprises associating data to a location at the backup media to which the data block was previously written, wherein the representation of the data block is determined based at least in part on information stored in the deduplication data, wherein the data block was previously written to the location at the backup media for a second snapshot of the plurality of snapshots; and

    a memory coupled to the processor and configured to store the deduplication data.
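
The flow recited in the claim can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the patented implementation: the names `BackupStorage`, `write_data`, `write_reference`, and `backup` are hypothetical, a SHA-256 digest stands in for the "identifier" (the claim does not name a hash or fingerprint scheme), and the backup media is modeled as an in-memory list. As a simplification, the sketch also deduplicates repeats within a single snapshot, whereas the claim recites a match against a block written for a second snapshot.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class BackupStorage:
    """Hypothetical stand-in for the backup media at the backup storage."""
    blocks: list = field(default_factory=list)     # raw block data, indexed by location
    snapshots: dict = field(default_factory=dict)  # snapshot id -> list of entries

    def write_data(self, snapshot_id, data, metadata):
        # Store the underlying data plus a metadata block in the given snapshot.
        location = len(self.blocks)
        self.blocks.append(data)
        self.snapshots.setdefault(snapshot_id, []).append(
            {"kind": "data", "location": location, "metadata": metadata})
        return location

    def write_reference(self, snapshot_id, location):
        # Store only a representation that points at previously written data.
        self.snapshots.setdefault(snapshot_id, []).append(
            {"kind": "ref", "location": location})


def backup(snapshots, storage):
    """Back up every snapshot, deduplicating at the source across all of them.

    snapshots: dict mapping a snapshot id to a list of (path, data) pairs.
    Returns the deduplication data: identifier -> location on backup media.
    """
    dedup = {}  # maintained at the source for the duration of the backup operation
    for snapshot_id, blocks in snapshots.items():
        for path, data in blocks:
            identifier = hashlib.sha256(data).hexdigest()  # assumed identifier scheme
            if identifier not in dedup:
                # No match: send the data and its metadata, then record the identifier.
                location = storage.write_data(snapshot_id, data, {"path": path})
                dedup[identifier] = location
            else:
                # Match: send only a reference to the earlier location.
                storage.write_reference(snapshot_id, dedup[identifier])
    return dedup
```

For example, backing up two snapshots that share one identical block sends that block's data only once; the later snapshot receives a reference entry pointing at the location written for the earlier one.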
