Integrated Approach for Deduplicating Data in a Distributed Environment that Involves a Source and a Target
First Claim
1. A method for enabling deduplication of a data file at each of a source and a target location in a distributed storage management system, the storage management system containing a source computing system connected to a target computing system and a target data store located within the target computing system, the method comprising:
maintaining a shared index for tracking deduplicated data chunks stored within the target data store;
providing a deduplication process for deduplication of data chunks to be stored in deduplicated form within the target data store;
enabling execution of deduplication instructions by the target computing system and execution of deduplication instructions by the source computing system;
deduplicating a data file into a set of deduplicated data chunks with use of the deduplication process, the deduplication process comprising a set of deduplication instructions executed by either of the source computing system or the target computing system;
storing the set of deduplicated data chunks within the target data store; and
updating deduplication information for the set of deduplicated data chunks within the shared index.
Abstract
One aspect of the present invention includes a configuration of a storage management system that enables the performance of deduplication activities at both the client (source) and at the server (target) locations. The location of deduplication operations can then be optimized based on system conditions or predefined policies. In one embodiment, seamless switching of deduplication activities between the client and the server is enabled by utilizing uniform deduplication process algorithms and accessing the same deduplication index (containing information on the hashed data chunks). Additionally, any data transformations on the chunks are performed subsequent to identification of the data chunks. Accordingly, with use of this storage configuration, the storage system can find and utilize matching chunks generated with either client- or server-side deduplication.
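The arrangement described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the fixed 4-byte chunk size, SHA-256 hashing, zlib compression, and every name in it (`TargetStore`, `chunk_and_hash`, `dedup_at_source`, `dedup_at_target`) are assumptions chosen for illustration. It shows only the two properties the abstract relies on: both locations run the same chunking and hashing routine against one shared index, and the data transformation (here, compression) is applied after the chunks have been identified.

```python
import hashlib
import zlib

CHUNK_SIZE = 4  # assumption: tiny fixed-size chunks, purely for illustration


def chunk_and_hash(data: bytes):
    """Uniform chunking and hashing routine used by both source and target."""
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        yield hashlib.sha256(chunk).hexdigest(), chunk


class TargetStore:
    """Hypothetical target data store together with the shared index."""

    def __init__(self):
        self.index = {}   # chunk digest -> reference count (the shared index)
        self.chunks = {}  # chunk digest -> compressed chunk bytes
        self.files = {}   # file name -> ordered list of chunk digests

    def has_chunk(self, digest):
        return digest in self.index

    def put(self, name, digests, new_chunks):
        """Store unseen chunks, then update the index for the whole file."""
        for digest, chunk in new_chunks.items():
            # The transformation happens only after the chunk is identified,
            # so digests match regardless of where deduplication ran.
            self.chunks.setdefault(digest, zlib.compress(chunk))
        for digest in digests:
            self.index[digest] = self.index.get(digest, 0) + 1
        self.files[name] = list(digests)

    def restore(self, name):
        return b"".join(zlib.decompress(self.chunks[d]) for d in self.files[name])


def _split(store, data):
    """Return the file's digest list and the chunks the index does not hold."""
    digests, missing = [], {}
    for digest, chunk in chunk_and_hash(data):
        digests.append(digest)
        if not store.has_chunk(digest):
            missing[digest] = chunk
    return digests, missing


def dedup_at_source(store, name, data):
    """Source chunks the file and transmits only the missing chunks."""
    digests, missing = _split(store, data)
    store.put(name, digests, missing)
    return sum(len(c) for c in missing.values())  # payload bytes transmitted


def dedup_at_target(store, name, data):
    """Source transmits the whole file; the target runs the same routine."""
    digests, missing = _split(store, data)
    store.put(name, digests, missing)
    return len(data)  # payload bytes transmitted
```

In this sketch, storing `b"AAAABBBBCCCC"` with `dedup_at_target` and then `b"AAAABBBBDDDD"` with `dedup_at_source` transmits 12 bytes and then 4 bytes: the source-side pass finds the `AAAA` and `BBBB` chunks that the target-side pass recorded, because both passes consult the same index.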
20 Claims
1. A method for enabling deduplication of a data file at each of a source and a target location in a distributed storage management system, the storage management system containing a source computing system connected to a target computing system and a target data store located within the target computing system, the method comprising:
maintaining a shared index for tracking deduplicated data chunks stored within the target data store;
providing a deduplication process for deduplication of data chunks to be stored in deduplicated form within the target data store;
enabling execution of deduplication instructions by the target computing system and execution of deduplication instructions by the source computing system;
deduplicating a data file into a set of deduplicated data chunks with use of the deduplication process, the deduplication process comprising a set of deduplication instructions executed by either of the source computing system or the target computing system;
storing the set of deduplicated data chunks within the target data store; and
updating deduplication information for the set of deduplicated data chunks within the shared index.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method of enabling deduplication of a data file at a selected source or target location in a distributed storage management system, the storage management system containing a source computing system connected to a target computing system and a target data store located within the target computing system, the method comprising:
tracking deduplication information for deduplicated data chunks stored within the target data store;
providing a deduplication process for deduplication of a data file to be stored within the target data store;
applying a selected policy from a plurality of defined policies to determine a location for execution of the deduplication process at either of the source computing system or the target computing system;
deduplicating the data file at the determined location by executing the deduplication process; and
updating the tracked deduplication information for the data file.
Dependent claims: 12
13. A storage management system, comprising:
a source computing system;
a target computing system connected to the source computing system;
a target data store located within the target computing system;
at least one processor within the storage management system; and
at least one memory within the storage management system storing instructions operable with the at least one processor for enabling deduplication of a data file at each of a source and a target location in the storage management system, the instructions being executed for:
maintaining a shared index for tracking deduplicated data chunks stored within the target data store;
providing a deduplication process for deduplication of data chunks to be stored in deduplicated form within the target data store;
enabling execution of deduplication instructions by the target computing system and execution of deduplication instructions by the source computing system;
deduplicating a data file into a set of deduplicated data chunks with use of the deduplication process, the deduplication process comprising a set of deduplication instructions executed by either of the source computing system or the target computing system;
storing the set of deduplicated data chunks within the target data store; and
updating deduplication information for the set of deduplicated data chunks within the shared index.
Dependent claims: 14, 15, 16, 17, 18
19. A storage management system, comprising:
a source computing system;
a target computing system connected to the source computing system;
a target data store located within the target computing system;
at least one processor within the storage management system; and
at least one memory within the storage management system storing instructions operable with the at least one processor for enabling deduplication of a data file at a selected source or target location in the storage management system, the instructions being executed for:
tracking deduplication information for deduplicated data chunks stored within the target data store;
providing a deduplication process for deduplication of a data file to be stored within the target data store;
applying a selected policy from a plurality of defined policies to determine a location for execution of the deduplication process at either of the source computing system or the target computing system;
deduplicating the data file at the determined location by executing the deduplication process; and
updating the tracked deduplication information for the data file.
Dependent claims: 20
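The policy step recited in claims 11 and 19 (applying a selected policy from a plurality of defined policies to determine where the deduplication process executes) might be sketched as follows. The claims do not enumerate the policies or the conditions they examine, so the three policies, the `Conditions` fields, and the thresholds below are all hypothetical and serve only to make the selection mechanism concrete.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict


class Location(Enum):
    SOURCE = "source"
    TARGET = "target"


@dataclass
class Conditions:
    """Hypothetical system conditions that a policy might inspect."""
    network_mbps: float      # available bandwidth between source and target
    source_cpu_load: float   # 0.0 (idle) to 1.0 (saturated)


Policy = Callable[[Conditions], Location]


def always_source(_: Conditions) -> Location:
    return Location.SOURCE


def always_target(_: Conditions) -> Location:
    return Location.TARGET


def adaptive(c: Conditions) -> Location:
    # Assumed rule: deduplicate at the source when the link is slow and the
    # source has spare CPU; otherwise leave the work to the target.
    if c.network_mbps < 100 and c.source_cpu_load < 0.8:
        return Location.SOURCE
    return Location.TARGET


# The "plurality of defined policies", keyed by a hypothetical policy name.
POLICIES: Dict[str, Policy] = {
    "always_source": always_source,
    "always_target": always_target,
    "adaptive": adaptive,
}


def choose_location(policy_name: str, conditions: Conditions) -> Location:
    """Apply the selected policy to determine where deduplication runs."""
    return POLICIES[policy_name](conditions)
```

Because every policy returns only a location, the deduplication process itself stays identical at both ends; the policy decides where it runs, not how.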
Specification