Techniques for managing deduplication based on recently written extents

  • US 8,799,601 B1
  • Filed: 06/28/2012
  • Issued: 08/05/2014
  • Est. Priority Date: 06/28/2012
  • Status: Active Grant
First Claim

1. In a data storage apparatus having processing circuitry and memory which stores extents, a method of managing deduplication of the extents, the method comprising:

  • constructing, by the processing circuitry, a recently written extent list which identifies recently written extents stored within the memory;

    referencing the recently written extent list to bypass extents identified by the recently written extent list when obtaining a candidate extent for possible deduplication; and

    processing the candidate extent for possible deduplication;

    wherein the data storage apparatus maintains an extent sharing index table having entries which (i) have existing hash values and (ii) identify extents; and

    wherein processing the candidate extent for possible deduplication includes:

    digesting the candidate extent to produce a current hash value,

    searching the extent sharing index table for an existing entry having an existing hash value which matches the current hash value,

    when an existing entry in the extent sharing index table is found to have an existing hash value which matches the current hash value,

      (i) searching the recently written extent list to confirm that an existing extent, which is identified by the existing entry, is not identified by the recently written extent list,

      (ii) when the existing extent is not identified by the recently written extent list, performing a comprehensive compare operation to determine whether to deduplicate the candidate extent with the existing extent, and

      (iii) when the existing extent is identified by the recently written extent list, adding a new entry to the extent sharing index table, the new entry having the current hash value and identifying the candidate extent, and

    when no existing entry in the extent sharing index table is found to have an existing hash value which matches the current hash value, adding a new entry to the extent sharing index table, the new entry having the current hash value and identifying the candidate extent.
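The control flow recited in claim 1 can be illustrated with a minimal Python sketch. This is not the patented implementation. The names `DedupState`, `next_candidate`, and `process_candidate`, the in-memory dictionaries, the choice of SHA-256 as the digest, the use of the first matching index entry, and the handling of a failed comprehensive compare are all assumptions made for illustration; the claim does not specify them. The sketch also assumes each extent is submitted for processing only once.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class DedupState:
    extents: dict                                        # extent id -> bytes (hypothetical store)
    recently_written: set = field(default_factory=set)   # recently written extent list
    index: list = field(default_factory=list)            # extent sharing index: (hash, extent id)
    shared: dict = field(default_factory=dict)           # deduplicated id -> retained id


def next_candidate(state, scan_order):
    """Return the first extent id that is not on the recently written list."""
    for extent_id in scan_order:
        if extent_id not in state.recently_written:
            return extent_id
    return None


def process_candidate(state, candidate_id):
    """Process one candidate extent for possible deduplication."""
    data = state.extents[candidate_id]
    current_hash = hashlib.sha256(data).hexdigest()  # digest choice is an assumption

    # Assumption: when several entries share a hash, use the first one found.
    existing_id = next(
        (eid for h, eid in state.index if h == current_hash), None
    )

    if existing_id is None:
        # No matching entry: index the candidate.
        state.index.append((current_hash, candidate_id))
        return "indexed"

    if existing_id in state.recently_written:
        # Step (iii): the matching extent is recently written, so do not
        # share with it; add a new entry for the candidate instead.
        state.index.append((current_hash, candidate_id))
        return "indexed"

    # Step (ii): comprehensive compare, modeled here as a full byte comparison.
    if state.extents[existing_id] == data:
        state.shared[candidate_id] = existing_id
        return "deduplicated"

    # Hash match but different content; the claim does not say what happens
    # next, so this sketch simply reports the mismatch.
    return "mismatch"
```

In this sketch, an extent that was indexed earlier and later lands on the recently written list is never used as a deduplication target; a later candidate with the same hash gets its own index entry, so the table can hold more than one entry per hash value.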

  • 9 Assignments