Detection and deduplication of backup sets exhibiting poor locality

  • US 9,122,639 B2
  • Filed: 01/25/2011
  • Issued: 09/01/2015
  • Est. Priority Date: 01/25/2011
  • Status: Active Grant
First Claim

1. A computerized method for storing data comprising:

    determining, by a computing device, a first set of summaries of a first data set, each summary of the first set of summaries being indicative of a data pattern in the first data set at an associated location in the first data set;

    determining, by the computing device, a second set of summaries of a second data set, each summary of the second set of summaries being indicative of a data pattern in the second data set at an associated location in the second data set;

    calculating, by the computing device, a set of comparison metrics, each comparison metric being based on a first subset of summaries from the first set of summaries and a second subset of summaries from the second set of summaries;

    calculating, by the computing device, a locality metric based on the set of comparison metrics, the locality metric being indicative of a ratio of data within the first data set which is distributed as redundant data within the second data set with distance greater than a predetermined threshold;

    adjusting at least one parameter of a deduplication process based on the locality metric, the at least one parameter including at least one of a detection parameter and a deduplication parameter; and

    deduplicating the first data set and the second data set using the deduplication process.
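
The steps recited in the claim can be illustrated with a toy sketch. This is not the patented implementation; it only shows one way the pieces could fit together. Everything below is an assumption made for illustration: fixed-size chunks hashed with SHA-1 stand in for "summaries", the list index stands in for "location", each comparison metric counts chunks of a window that are redundant in the second set only outside a nearby range (so distance is measured from the window, not from each chunk), and the hypothetical `search_radius` parameter stands in for the adjusted deduplication parameter.

```python
import hashlib

CHUNK = 4  # hypothetical fixed chunk size in bytes


def summarize(data, chunk=CHUNK):
    """One summary (SHA-1 digest) per chunk; the list index is the location."""
    return [hashlib.sha1(data[i:i + chunk]).hexdigest()
            for i in range(0, len(data), chunk)]


def comparison_metrics(first, second, window=2, threshold=0):
    """Per window of `first`, count summaries that occur somewhere in
    `second` but not within `threshold` chunks of the same window."""
    everywhere = set(second)
    metrics = []
    for start in range(0, len(first), window):
        subset_a = first[start:start + window]
        lo = max(0, start - threshold)
        hi = start + window + threshold
        nearby_b = set(second[lo:hi])
        far = sum(1 for s in subset_a if s in everywhere and s not in nearby_b)
        metrics.append(far)
    return metrics


def locality_metric(metrics, total_chunks):
    """Share of chunks in the first set that are redundant only far away."""
    return sum(metrics) / total_chunks if total_chunks else 0.0


def choose_search_radius(locality, default_radius=1, cutoff=0.5):
    """Assumed policy: poor locality switches to an unbounded search (None)."""
    return None if locality > cutoff else default_radius


def deduplicate(first, second, search_radius):
    """Count chunks of `first` replaceable by references into `second`."""
    saved = 0
    for i, s in enumerate(first):
        if search_radius is None:
            candidates = second
        else:
            candidates = second[max(0, i - search_radius):i + search_radius + 1]
        if s in candidates:
            saved += 1
    return saved


# Two data sets with the same chunks in a shuffled order (poor locality).
first = summarize(b"AAAABBBBCCCCDDDD")
second = summarize(b"CCCCDDDDAAAABBBB")
metrics = comparison_metrics(first, second)
locality = locality_metric(metrics, len(first))
radius = choose_search_radius(locality)
saved = deduplicate(first, second, radius)
```

In this example every chunk of the first set reappears in the second set, but never near its original position, so a search restricted to a small radius finds nothing while the adjusted, unbounded search finds all four chunks.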
