Methods and apparatus for content-aware data de-duplication

  • US 7,925,683 B2
  • Filed: 12/18/2009
  • Issued: 04/12/2011
  • Est. Priority Date: 12/18/2008
  • Status: Active Grant
First Claim

1. A method comprising:

    partitioning digital data into a plurality of blocks, including a first block, and additional data, wherein the additional data includes at least one of position-dependent data, instance-dependent data, format-specific headers or footers, and format-specific transformations, and wherein a combination of the plurality of blocks and the additional data together represents all of the digital data;

    generating a file identifier based at least in part on the digital data;

    associating the file identifier with the digital data;

    generating a block identifier based at least in part on the first block;

    associating the block identifier with the first block;

    determining if the first block has already been stored;

    storing the first block if the first block has not already been stored;

    determining if a block map associated with the file identifier has already been stored, wherein the block map includes block identifiers associated respectively with each block of which the digital data is comprised, and if the block map associated with the file identifier has not already been stored;

    creating the block map, storing the block map, associating the additional data with the block map, and associating the file identifier with the block map.
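
The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions made only for illustration: identifiers are SHA-256 digests, the "additional data" is modeled as a fixed-length format-specific header, blocks are fixed-size slices of the remaining payload, and the names `DedupStore`, `BlockMap`, `put`, `get`, `header_len`, and `block_size` are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field


def digest(data: bytes) -> str:
    """Identifier derived from content (SHA-256 is an assumption, not from the claim)."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class BlockMap:
    block_ids: list[str]      # one block identifier per block, in order
    additional_data: bytes    # data kept outside the de-duplicated blocks


@dataclass
class DedupStore:
    blocks: dict[str, bytes] = field(default_factory=dict)         # block id -> block
    block_maps: dict[str, BlockMap] = field(default_factory=dict)  # file id -> block map

    def put(self, data: bytes, header_len: int = 4, block_size: int = 8) -> str:
        # Partition into additional data (here: a header) and payload blocks.
        additional, payload = data[:header_len], data[header_len:]
        chunks = [payload[i:i + block_size]
                  for i in range(0, len(payload), block_size)]

        # File identifier based on the digital data.
        file_id = digest(data)

        block_ids = []
        for chunk in chunks:
            block_id = digest(chunk)          # block identifier based on the block
            if block_id not in self.blocks:   # store the block only if it is new
                self.blocks[block_id] = chunk
            block_ids.append(block_id)

        # Create and store the block map only if none exists for this file id.
        if file_id not in self.block_maps:
            self.block_maps[file_id] = BlockMap(block_ids, additional)
        return file_id

    def get(self, file_id: str) -> bytes:
        # Blocks plus additional data together reproduce all of the original data.
        block_map = self.block_maps[file_id]
        payload = b"".join(self.blocks[b] for b in block_map.block_ids)
        return block_map.additional_data + payload
```

In this sketch, two inputs that differ only in their header or in some blocks share storage for every identical block, while each input keeps its own block map and additional data. For example, storing `b"HDR1" + b"A" * 8 + b"B" * 8` and then `b"HDR2" + b"A" * 8 + b"C" * 8` keeps three distinct blocks rather than four.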
