Using index partitioning and reconciliation for data deduplication

  • US 9,785,666 B2
  • Filed: 07/13/2015
  • Issued: 10/10/2017
  • Est. Priority Date: 12/28/2010
  • Status: Active Grant
First Claim

1. An apparatus, comprising:

  • a processor; and

    a memory containing logic operative on the processor to cause the processor to perform a process comprising:

    loading a subspace index comprising less than all index entries of a signature index service from a secondary media into a primary memory, wherein the loaded subspace index corresponds to a set of subspaces individually containing a set of signatures each corresponding to a data chunk stored in a data store, two of the subspaces having at least one signature representative of one subspace generally matching at least one signature representative of the other subspace; and

    reconciling the two subspaces associated with the loaded subspace index to remove at least one duplicate chunk by:

    using a resemblance metric to compare a signature of a data chunk in one of the two subspaces to multiple signatures of data chunks in the other of the two subspaces; and

    in response to determining that the signature of the data chunk in one of the two subspaces matches another signature in the other of the two subspaces, marking a data chunk corresponding to the signature or the another signature for deletion from a corresponding data store.
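
The reconciliation recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Subspace` class, the `representatives`, `resemblance`, and `reconcile` functions, the use of SHA-256 as the chunk signature, the "k smallest signatures" choice of representatives, and the 0.25 threshold are all assumptions made for illustration. An in-memory dictionary stands in for a subspace index already loaded from secondary media into primary memory.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field


def signature(chunk: bytes) -> str:
    """Hypothetical chunk signature: SHA-256 hex digest of the chunk bytes."""
    return hashlib.sha256(chunk).hexdigest()


@dataclass
class Subspace:
    """A partition of the signature index: signature -> chunk location."""
    name: str
    index: dict[str, str] = field(default_factory=dict)

    def add(self, chunk: bytes, location: str) -> None:
        self.index[signature(chunk)] = location

    def representatives(self, k: int = 4) -> set[str]:
        # Assumption: the k smallest signatures represent the subspace.
        return set(sorted(self.index)[:k])


def resemblance(a: Subspace, b: Subspace, k: int = 4) -> float:
    """Fraction of representative signatures shared by the two subspaces."""
    ra, rb = a.representatives(k), b.representatives(k)
    if not ra or not rb:
        return 0.0
    return len(ra & rb) / min(len(ra), len(rb))


def reconcile(a: Subspace, b: Subspace,
              threshold: float = 0.25, k: int = 4) -> list[str]:
    """Return locations of chunks in b that duplicate chunks in a.

    The returned chunks are only marked for deletion; nothing is removed here.
    """
    if resemblance(a, b, k) < threshold:
        return []  # subspaces do not resemble each other enough to reconcile
    marked = []
    for sig, location in b.index.items():
        # Compare one signature from b against all signatures held in a.
        if sig in a.index:
            marked.append(location)
    return marked


# Usage: two subspaces that share the chunks b"beta" and b"gamma".
a = Subspace("subspace-a")
b = Subspace("subspace-b")
for i, chunk in enumerate([b"alpha", b"beta", b"gamma"]):
    a.add(chunk, f"store-a/{i}")
for i, chunk in enumerate([b"beta", b"gamma", b"delta"]):
    b.add(chunk, f"store-b/{i}")

marked = reconcile(a, b)
print(marked)
```

In this sketch the duplicate copies are always marked in the second subspace, whereas the claim allows the chunk behind either matching signature to be marked. A cheap comparison of representative signatures gates the full signature-by-signature pass, so subspaces with little in common are skipped.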
