Integrated approach for deduplicating data in a distributed environment that involves a source and a target

  • US 9,058,298 B2
  • Filed: 07/16/2009
  • Issued: 06/16/2015
  • Est. Priority Date: 07/16/2009
  • Status: Active Grant
First Claim

1. A method for deduplicating a data file at each of a source and a target location in a distributed storage management system, the storage management system containing a source computing system connected to a target computing system and a target data store located within the target computing system, the method comprising:

    maintaining, by use of a processor, a shared index for tracking deduplicated data chunks stored within the target data store, wherein the shared index is accessed by the source computing system and the target computing system;

    deduplicating a data file that is located at the source computing system into a set of deduplicated data chunks with the source computing system and transmitting the data file to the target system as a result of determining that the data file satisfies a policy;

    transmitting another data file to the target computing system and deduplicating the other data file into the set of deduplicated data chunks with the target computing system as a result of determining that the other data file does not satisfy the policy, wherein the policy is not satisfied if the data file contains sensitive data;

    wherein deduplicating at the source computer comprises fingerprinting and hashing the data chunks with a first set of fingerprinting and hashing algorithms on the source computing system if the data file satisfies the policy;

    wherein deduplicating at the target computing system comprises fingerprinting and hashing the data chunks with a second set of fingerprinting and hashing algorithms on the target computing system if the data file does not satisfy the policy;

    storing the set of deduplicated data chunks within the target data store; and

    updating deduplication information for the set of deduplicated data chunks within the shared index.
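
The flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the names `SharedIndex`, `TargetStore`, `chunk_keys`, `satisfies_policy`, and `back_up` are hypothetical, fixed-size chunking stands in for the unspecified fingerprinting step, and SHA-256 and SHA-1 are assumed stand-ins for the "first" and "second" sets of hashing algorithms. Network transfer between the source and target systems is not modeled; the sketch only records where deduplication would run.

```python
import hashlib
from dataclasses import dataclass, field

CHUNK_SIZE = 4  # tiny fixed-size chunks, for illustration only (assumption)


@dataclass
class SharedIndex:
    """Hypothetical shared index: maps a chunk key to a reference count."""
    refs: dict = field(default_factory=dict)


@dataclass
class TargetStore:
    """Hypothetical target data store: one copy of each unique chunk."""
    chunks: dict = field(default_factory=dict)


def chunk_keys(data: bytes, hash_name: str):
    """Split data into fixed-size chunks and yield (key, chunk) pairs."""
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.new(hash_name, chunk).hexdigest()
        # The algorithm name is part of the key so digests produced by
        # different algorithms are never compared with each other.
        yield (hash_name, digest), chunk


def satisfies_policy(contains_sensitive_data: bool) -> bool:
    """Per the claim, the policy is not satisfied for sensitive data."""
    return not contains_sensitive_data


def back_up(data: bytes, contains_sensitive_data: bool,
            index: SharedIndex, store: TargetStore):
    """Deduplicate one file; return (where dedup ran, list of chunk keys)."""
    if satisfies_policy(contains_sensitive_data):
        where, hash_name = "source", "sha256"  # assumed first algorithm set
    else:
        where, hash_name = "target", "sha1"    # assumed second algorithm set

    recipe = []
    for key, chunk in chunk_keys(data, hash_name):
        if key not in index.refs:
            store.chunks[key] = chunk          # store only unseen chunks
        index.refs[key] = index.refs.get(key, 0) + 1  # update shared index
        recipe.append(key)
    return where, recipe


def restore(recipe, store: TargetStore) -> bytes:
    """Rebuild a file from its ordered list of chunk keys."""
    return b"".join(store.chunks[key] for key in recipe)


if __name__ == "__main__":
    index, store = SharedIndex(), TargetStore()
    where, recipe = back_up(b"abcdabcdabcd", False, index, store)
    print(where, len(recipe), len(store.chunks))
    print(restore(recipe, store))
```

In this sketch both paths write to the same store and update the same index, which mirrors the claim's requirement that the shared index be accessible to both systems. Only the place where chunking and hashing occur changes with the policy decision.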
