Tuning global digests caching in a data deduplication system

  • US 10,007,610 B2
  • Filed: 11/30/2017
  • Issued: 06/26/2018
  • Est. Priority Date: 07/15/2013
  • Status: Active Grant
First Claim

1. A method for tuning the density of a global digests cache in a data deduplication system using a processor device in a computing environment, comprising:

    partitioning input data into input data chunks;

    wherein input digest values are calculated for each of the input data chunks;

    finding positions of similar repository data in a repository of data for each of the input data chunks;

    loading a sample of the repository digests into a search mechanism within the global digests cache;

    applying the sampling of the repository digests for loading the repository digests into a hash table; and

    using the positions of the similar repository data to locate and linearly load into the global digests cache, digests and digest block boundaries of the similar repository data in a sequence corresponding to a placement order of calculated values of the digests of the similar repository data, the placement order of the calculated values of the digests of the similar repository data correlative to an order in which the input digest values were individually calculated such that the digests of the similar repository data are each individually stored in the global digests cache based on a calculation time and order of when each of the input digests were first calculated when in un-deduplicated form, thereby storing the digests of the similar repository data in a linear and sequential form independent of a deduplicated form by which data the digests describe is stored, wherein the global digest cache comprises a pool of a plurality of sequential arrays of digest entries of the digests and a hash table for pointing to contents within the plurality of sequential arrays.
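The cache layout recited at the end of the claim (a pool of sequential arrays of digest entries, plus a hash table pointing into those arrays, with only a sample of the digests indexed) can be illustrated with a minimal Python sketch. It is not the patented implementation. The class name `GlobalDigestsCache`, the methods `load_run` and `match_run`, the `sample_modulus` parameter, the modulus-based sampling rule, and the use of SHA-1 are all assumptions made for illustration. Digest block boundaries are omitted for brevity.

```python
import hashlib
from typing import Dict, List, Optional, Tuple


def digest(chunk: bytes) -> bytes:
    """Digest of one data chunk (SHA-1 is an assumption, not from the claim)."""
    return hashlib.sha1(chunk).digest()


class GlobalDigestsCache:
    """Hypothetical sketch: a pool of sequential digest arrays plus a hash table."""

    def __init__(self, sample_modulus: int = 4) -> None:
        # Assumed sampling rule: index a digest only if its leading
        # 32 bits are divisible by sample_modulus (1 indexes everything).
        self.sample_modulus = sample_modulus
        self.arrays: List[List[bytes]] = []           # pool of sequential arrays
        self.index: Dict[bytes, Tuple[int, int]] = {}  # digest -> (array id, offset)

    def _is_sampled(self, d: bytes) -> bool:
        return int.from_bytes(d[:4], "big") % self.sample_modulus == 0

    def load_run(self, digests: List[bytes]) -> int:
        """Linearly load one run of repository digests, preserving their order."""
        array_id = len(self.arrays)
        self.arrays.append(list(digests))
        for offset, d in enumerate(digests):
            if self._is_sampled(d):
                # Keep the first position seen for a repeated digest.
                self.index.setdefault(d, (array_id, offset))
        return array_id

    def match_run(self, inputs: List[bytes]) -> Optional[Tuple[int, int, int, int]]:
        """Find the first hash-table hit, then extend it along the sequential array.

        Returns (input start, array id, array start, match length) or None.
        """
        for i, d in enumerate(inputs):
            hit = self.index.get(d)
            if hit is None:
                continue
            array_id, offset = hit
            arr = self.arrays[array_id]
            back = 0  # extend backward over unsampled neighbours
            while (i - back - 1 >= 0 and offset - back - 1 >= 0
                   and inputs[i - back - 1] == arr[offset - back - 1]):
                back += 1
            fwd = 0   # extend forward over unsampled neighbours
            while (i + fwd + 1 < len(inputs) and offset + fwd + 1 < len(arr)
                   and inputs[i + fwd + 1] == arr[offset + fwd + 1]):
                fwd += 1
            return (i - back, array_id, offset - back, back + fwd + 1)
        return None
```

In this sketch, a larger `sample_modulus` yields a sparser hash table. Because the digests are stored in sequence, a single sampled hit is enough to recover the neighbouring unsampled matches by walking the array.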
