×

Method for cleaning a delta storage system

  • US 9,141,301 B1
  • Filed: 06/13/2012
  • Issued: 09/22/2015
  • Est. Priority Date: 06/13/2012
  • Status: Active Grant
First Claim
Patent Images

1. A computer-implemented method for performing garbage collection in a delta compressed data storage system, the method comprising:

  • selecting each of a plurality of file recipes stored in the data storage system to traverse to identify a plurality of data chunks, the plurality of file recipes representing a plurality of files, wherein each file recipe includes a plurality of chunk identifiers identifying a plurality of data chunks that constitute a corresponding file represented by the file recipe and locations of the data chunks within the corresponding file, wherein each chunk identifier includes a fingerprint of an associated data chunk;

    selecting each chunk identifier included in each of the file recipes;

    adding each of the chunk identifiers to a set of live data chunks;

    for each of the data chunks identified by the chunk identifiers of the file recipes, performing a lookup operation in an index of the storage system to determine whether there is a delta reference associated with the data chunk, the index including a plurality of entries corresponding to the plurality of data chunks stored in the data storage system respectively, wherein each entry of the index stores a fingerprint of a corresponding data chunk, a storage location of the corresponding data chunk, and a delta reference if the corresponding data chunk has been delta encoded;

    adding delta references for the file recipes corresponding to the chunk identifiers that are obtained from the index to the set of live data chunks, wherein the index is stored separately from the file recipes; and

    discarding data chunks in a data storage system not identified by the set of live data chunks, wherein the set of live data chunks is a listing that identifies live data chunks in the data storage system distinct from the listing of data chunks in the data storage system and the file recipes.

View all claims
  • 9 Assignments
Timeline View
Assignment View
    ×
    ×