Reducing data duplication in cloud storage

  • US 8,583,599 B2
  • Filed: 11/29/2010
  • Issued: 11/12/2013
  • Est. Priority Date: 11/29/2010
  • Status: Active Grant
First Claim

1. A method to reduce data duplication in cloud storage, the method comprising:

  • receiving a first snapshot of a remote volume via a network, the first snapshot including a copy of the remote volume at a first instant in time, the remote volume including a plurality of clusters, individual ones of the plurality of clusters being identified as valid or invalid, a valid cluster containing data to be backed up, and an invalid cluster being devoid of data to be backed up;

    identifying, responsive to and based on the first snapshot, unique clusters and duplicate clusters among the valid clusters, the duplicate clusters being valid clusters in the remote volume containing identical data;

    storing, in a backup file, the unique clusters and single instances of the duplicate clusters such that the backup file is devoid of duplicate clusters;

    receiving a second snapshot of the remote volume via the network, the second snapshot including a copy of the remote volume at a second instant in time, the second instant in time being after the first instant in time;

    identifying, responsive to and based on the second snapshot, a valid cluster in the remote volume not yet stored in the backup file and a cluster in the backup file that is no longer valid; and

    utilizing, responsive to the second snapshot, the cluster in the backup file that is no longer valid to store the valid cluster in the remote volume not yet stored in the backup file.
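The steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the `BackupFile` class, its in-memory slot list, and the use of SHA-256 digests to detect identical clusters are all assumptions made for illustration, since the claim does not say how duplicates are detected or how the backup file is laid out. A snapshot is modeled as a list in which `None` marks an invalid cluster, and a stored cluster is treated as "no longer valid" when its content does not appear in any valid cluster of the newest snapshot.

```python
import hashlib
from typing import Dict, List, Optional

# Hypothetical snapshot model: one entry per volume cluster, None = invalid cluster.
Snapshot = List[Optional[bytes]]


class BackupFile:
    """Illustrative in-memory stand-in for the backup file in claim 1."""

    def __init__(self) -> None:
        self.slots: List[Optional[bytes]] = []  # stored cluster data; None = free slot
        self.index: Dict[str, int] = {}         # content digest -> slot number
        self.volume_map: Dict[int, int] = {}    # volume cluster number -> slot number

    @staticmethod
    def _digest(data: bytes) -> str:
        # Assumption: identical data is detected by comparing SHA-256 digests.
        return hashlib.sha256(data).hexdigest()

    def apply_snapshot(self, snapshot: Snapshot) -> None:
        # Digests of every valid cluster in the incoming snapshot.
        wanted = {self._digest(c) for c in snapshot if c is not None}

        # Stored clusters whose content is no longer valid become reusable slots.
        free: List[int] = []
        for digest in list(self.index):
            if digest not in wanted:
                slot = self.index.pop(digest)
                self.slots[slot] = None
                free.append(slot)
        free.sort(reverse=True)  # pop() then hands out the lowest slot first

        self.volume_map = {}
        for number, data in enumerate(snapshot):
            if data is None:
                continue  # invalid clusters are never backed up
            digest = self._digest(data)
            if digest not in self.index:
                if free:
                    slot = free.pop()       # reuse a slot that is no longer valid
                    self.slots[slot] = data
                else:
                    slot = len(self.slots)  # otherwise grow the backup file
                    self.slots.append(data)
                self.index[digest] = slot
            # Duplicates map to the single stored instance.
            self.volume_map[number] = self.index[digest]


backup = BackupFile()
backup.apply_snapshot([b"A", b"B", b"A", None, b"C"])  # first snapshot
backup.apply_snapshot([b"A", None, b"A", b"D", b"C"])  # second snapshot
```

In this example the first snapshot stores three clusters, with the two `b"A"` clusters sharing one slot. In the second snapshot `b"B"` is no longer valid, so its slot is reused for the new cluster `b"D"` and the backup does not grow.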
