×

Probabilistic summary data structure based encoding for garbage collection in backup systems

  • US 7,783,682 B1
  • Filed: 07/27/2007
  • Issued: 08/24/2010
  • Est. Priority Date: 06/30/2003
  • Status: Active Grant
First Claim
Patent Images

1. A backup system for performing garbage collection of an allocated address space in a set of one or more storage devices comprising:

  • a set of one or more active storage trees each that are maintained within the set of storage devices, the set of active storage trees representing a version of multiple data, each leaf node of the set of active storage trees representing a block of data from the multiple data that has been backed up onto the set of storage devices; and

    a garbage collection logic, executing on a processor, responsive to deletion of a version of the multiple data, to generate a probabilistic summary data structure, wherein the probabilistic summary data structure represents blocks of data that are referenced by the set of active storage trees, wherein the garbage collection logic includes an encoder to generate the probabilistic summary data structure, and wherein the encoder includes a hash module to perform a first plurality of hash functions on the referenced blocks to generate a set of offset values and to set bits within the probabilistic summary data structure that are at offsets equal to the set of offset values generated by the plurality of hash functions.

View all claims
  • 12 Assignments
Timeline View
Assignment View
    ×
    ×