Creation of synthetic backups within deduplication storage system by a backup application

  • US 10,585,857 B2
  • Filed: 11/17/2017
  • Issued: 03/10/2020
  • Est. Priority Date: 12/01/2010
  • Status: Active Grant
First Claim

1. A method for deduplicating input backup data with data of a synthetic backup previously constructed by a deduplication storage system using a processor device, comprising:

  • constructing the synthetic backup by processing a plurality of metadata instructions provided by a backup application, the synthetic backup to be independent of referenced stored backups;

    processing each of the plurality of metadata instructions by each of:

      partitioning each data segment input into each of a plurality of fixed-sized data sub-segments, each sub-segment referencing a plurality of stored sub-segments,

      for each of the plurality of data sub-segments, during the construction of the synthetic backup, calculating each of a plurality of input deduplication digests based on a retrieved plurality of stored deduplication digests by aggregating calculated deduplication digests of the plurality of data sub-segments to produce a respective one of the plurality of input deduplication digests for each data segment input,

      locating those of the plurality of data sub-segments in the deduplication storage system specified by the data segment in each of the plurality of metadata instructions, and

      creating metadata references to each of the plurality of data sub-segments and adding the metadata references to metadata of the synthetic backup being created;

    wherein the metadata references include physical and logical data patterns;

    transforming a set of the plurality of metadata instructions into a transformed set of the plurality of metadata instructions;

    creating the synthetic backup by the deduplication system and the backup application by consolidating the plurality of metadata instructions that reference adjacent backup data segments into a single metadata instruction;

    wherein the synthetic backup includes data from an existing full backup and subsequent incremental backups of the existing full backup dating until a specific point in time;

    calculating deduplication digests based on the data of the synthetic backup; and

    locating matching digests of previously constructed synthetic backups in a digests index, wherein each of the located matching digest references stored data included in the synthetic backup, and the stored data is similar to the input backup data.
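
The steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the `Instruction` record, the 4-byte sub-segment size, the use of SHA-256, the hash-of-concatenation aggregation, and the dictionary used as a digests index are all assumptions chosen for illustration. The sketch shows consolidating metadata instructions that reference adjacent backup data, computing per-sub-segment digests over fixed-size sub-segments, aggregating them into one digest per segment, and looking up input digests in an index built from a previously constructed synthetic backup.

```python
import hashlib
from dataclasses import dataclass

SUB_SEGMENT_SIZE = 4  # hypothetical size, kept tiny so the example is easy to follow


@dataclass(frozen=True)
class Instruction:
    """Hypothetical metadata instruction: reference `length` bytes at `offset` of a stored backup."""
    backup_id: str
    offset: int
    length: int


def consolidate(instructions):
    """Merge consecutive instructions that reference adjacent ranges of the same backup."""
    merged = []
    for ins in instructions:
        prev = merged[-1] if merged else None
        if prev and prev.backup_id == ins.backup_id and prev.offset + prev.length == ins.offset:
            merged[-1] = Instruction(prev.backup_id, prev.offset, prev.length + ins.length)
        else:
            merged.append(ins)
    return merged


def sub_segment_digests(data, size=SUB_SEGMENT_SIZE):
    """Partition data into fixed-size sub-segments and return one digest per sub-segment."""
    return [hashlib.sha256(data[i:i + size]).hexdigest() for i in range(0, len(data), size)]


def segment_digest(sub_digests):
    """Aggregate sub-segment digests into a single segment digest (assumed: hash of their concatenation)."""
    return hashlib.sha256("".join(sub_digests).encode()).hexdigest()


def build_index(backup_id, data, size=SUB_SEGMENT_SIZE):
    """Build a digests index mapping each sub-segment digest to its first (backup_id, offset)."""
    index = {}
    for n, digest in enumerate(sub_segment_digests(data, size)):
        index.setdefault(digest, (backup_id, n * size))
    return index


def find_matches(input_data, index, size=SUB_SEGMENT_SIZE):
    """Return (input_offset, stored_location) for each input sub-segment whose digest is in the index."""
    matches = []
    for n, digest in enumerate(sub_segment_digests(input_data, size)):
        if digest in index:
            matches.append((n * size, index[digest]))
    return matches
```

For example, four instructions referencing bytes 0-3 and 4-7 of a full backup, bytes 0-3 of an incremental, and bytes 12-15 of the full backup consolidate to three, because only the first two are adjacent. Input data whose first sub-segment equals the second sub-segment of an indexed synthetic backup yields a single match pointing at offset 4 of that backup.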
