System and method for sampling based elimination of duplicate data

  • US 8,165,221 B2
  • Filed: 04/28/2006
  • Issued: 04/24/2012
  • Est. Priority Date: 04/28/2006
  • Status: Active Grant
First Claim

1. A method for removing duplicate data from a data set, the method comprising:

  • identifying, by a processor, an anchor within the data set, wherein the anchor is a specific section within the data set that defines a region of interest for potential data de-duplication;

    determining, by the processor, whether the identified anchor already exists within an anchor database;

    in response to determining that the identified anchor already exists within the anchor database, performing, by the processor, a data comparison between the data set and a stored data set to identify a forward delta value and a backward delta value which collectively identify a number of consecutive bits of data that match between the data set and the stored data set forward and backward from the identified anchor, respectively; and

    replacing, by the processor, a specific region of the data set identified by the anchor, the forward delta value and the backward delta value as duplicate data with a storage indicator to form a modified data set.
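
The steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation; it makes several simplifying assumptions: matching is done on bytes rather than bits, anchors have a fixed length (`ANCHOR_LEN`), the rule that decides what counts as an anchor is supplied by the caller (`is_anchor`), the "anchor database" is an in-memory dictionary, and the "storage indicator" is a tuple `(stored_set_id, offset, length)`. All names here (`Deduplicator`, `extend_match`, `add`, `restore`) are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple, Union

ANCHOR_LEN = 4  # assumed fixed anchor size, in bytes

Ref = Tuple[int, int, int]      # storage indicator: (stored set id, offset, length)
Segment = Union[bytes, Ref]     # output is a mix of literal bytes and indicators


def extend_match(new: bytes, n: int, old: bytes, o: int) -> Tuple[int, int]:
    """Return (backward_delta, forward_delta) around an anchor found at
    new[n:n+ANCHOR_LEN] and old[o:o+ANCHOR_LEN], counted in bytes."""
    backward = 0
    while (n - backward - 1 >= 0 and o - backward - 1 >= 0
           and new[n - backward - 1] == old[o - backward - 1]):
        backward += 1
    forward = 0
    n_end, o_end = n + ANCHOR_LEN, o + ANCHOR_LEN
    while (n_end + forward < len(new) and o_end + forward < len(old)
           and new[n_end + forward] == old[o_end + forward]):
        forward += 1
    return backward, forward


class Deduplicator:
    def __init__(self, is_anchor: Callable[[bytes], bool]):
        self.is_anchor = is_anchor                       # caller-supplied anchor rule
        self.stored: List[bytes] = []                    # originals, kept whole for simplicity
        self.anchors: Dict[bytes, Tuple[int, int]] = {}  # anchor -> (set id, offset)

    def add(self, data: bytes) -> List[Segment]:
        """Return `data` as segments, with duplicate regions replaced by indicators."""
        set_id = len(self.stored)
        self.stored.append(data)
        out: List[Segment] = []
        literal_start = 0  # first byte not yet emitted
        i = 0
        while i + ANCHOR_LEN <= len(data):
            window = data[i:i + ANCHOR_LEN]
            if not self.is_anchor(window):
                i += 1
                continue
            hit = self.anchors.get(window)
            if hit is None:
                # New anchor: record it so later data sets can match against it.
                self.anchors[window] = (set_id, i)
                i += ANCHOR_LEN
                continue
            old_id, o = hit
            back, fwd = extend_match(data, i, self.stored[old_id], o)
            back = min(back, i - literal_start)  # never reach into emitted output
            start, end = i - back, i + ANCHOR_LEN + fwd
            if start > literal_start:
                out.append(data[literal_start:start])
            out.append((old_id, o - back, end - start))
            literal_start = i = end
        if literal_start < len(data):
            out.append(data[literal_start:])
        return out

    def restore(self, segments: List[Segment]) -> bytes:
        """Rebuild the original bytes from literals and storage indicators."""
        parts = []
        for seg in segments:
            if isinstance(seg, bytes):
                parts.append(seg)
            else:
                sid, off, length = seg
                parts.append(self.stored[sid][off:off + length])
        return b"".join(parts)
```

As a usage example, with an anchor rule that accepts only the literal bytes `b"ANCH"`, adding `b"hello ANCHored world"` and then `b"xx lo ANCHored wor yy"` yields a backward delta of 3 bytes and a forward delta of 8 bytes for the second data set, so its middle 15 bytes are replaced by a single indicator pointing into the first. Keeping whole originals in `stored` is a shortcut for the sketch; a real system would presumably store only non-duplicate data.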
