Policy based sharing of redundant data across storage pools in a deduplicating system
First Claim
1. A method in a computer system for implementing policy-based sharing of redundant data across deduplicating storage pools within a storage management system, comprising:
- deduplicating a data object as data chunks across more than one of a plurality of storage pools based on a defined policy comprising a table with an entry for each storage pool, wherein an object index tracks a location of the data object, a deduplication index tracks data chunks, the plurality of storage pools are organized in a storage management system, each storage pool is designated as one of a deduplicating pool and a non-deduplicating pool, the table is stored on a storage management system level and is accessible by each of the plurality of storage pools, each entry of the table specifies allowed storage pools from which the storage pool references data chunks, tape storage pools are allowed to access disk storage pools and disk storage pools are not allowed to access tape storage pools, a first deduplicating storage pool stores a first link for a first data chunk of the data object linking to an instance of the first data chunk in an allowed deduplicating storage pool and does not store the first data chunk in response to the instance of the first data chunk existing in a third deduplicating storage pool that the allowed deduplicating storage pool is allowed to access, the first deduplicating storage pool stores the first data chunk of the data object in response to the instance of the first data chunk not existing in the allowed deduplicating storage pool, and a second non-deduplicating pool stores the data object without deduplication;
accessing the first data chunk from the allowed deduplicating storage pool using the first link if the first link is stored in the allowed deduplicating storage pool and accessing the first data chunk from the first storage pool if the first link is not stored in the allowed deduplicating storage pool; and
deleting the location of the data object from the object index and tracking the data chunks of the data object in the deduplication index in response turning off deduplication in the first storage pool.
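The store and access steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the class `StorageSystem`, its field names, the fixed-size chunking, and the use of SHA-256 digests as chunk identifiers are all assumptions made for illustration. The sketch covers only the direct case (link to an instance in an allowed pool, otherwise store the chunk locally, and store whole objects in non-deduplicating pools); it omits the transitive "third pool" condition and the handling of turning deduplication off.

```python
import hashlib


class StorageSystem:
    """Hypothetical system-level view of pools, chunks, and links."""

    def __init__(self, allowed, dedup_pools):
        self.allowed = allowed               # pool -> set of pools it may reference
        self.dedup_pools = set(dedup_pools)  # pools designated as deduplicating
        self.chunks = {}                     # (pool, digest) -> chunk bytes
        self.links = {}                      # (pool, digest) -> pool holding the instance
        self.objects = {}                    # (pool, name) -> raw bytes or list of digests

    def store(self, pool, name, data, chunk_size=4):
        # A non-deduplicating pool stores the object without deduplication.
        if pool not in self.dedup_pools:
            self.objects[(pool, name)] = data
            return
        digests = []
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            digests.append(digest)
            if (pool, digest) in self.chunks or (pool, digest) in self.links:
                continue  # this pool already holds or links the chunk
            # Look for an existing instance in a pool the policy allows.
            owner = next(
                (p for p in sorted(self.allowed.get(pool, ()))
                 if (p, digest) in self.chunks),
                None,
            )
            if owner is not None:
                self.links[(pool, digest)] = owner   # store a link, not the chunk
            else:
                self.chunks[(pool, digest)] = chunk  # no allowed instance: store it
        self.objects[(pool, name)] = digests

    def read(self, pool, name):
        entry = self.objects[(pool, name)]
        if isinstance(entry, bytes):
            return entry
        # Follow the link when one exists; otherwise read from the pool itself.
        return b"".join(
            self.chunks[(self.links.get((pool, d), pool), d)] for d in entry
        )


system = StorageSystem({"TAPE": {"DISK"}, "DISK": set()}, ["TAPE", "DISK"])
system.store("DISK", "a", b"aaaabbbb")
system.store("TAPE", "b", b"aaaacccc")  # first chunk becomes a link to DISK
```

Keeping the chunk and link maps at the system level, rather than inside each pool, mirrors the claim's requirement that the policy and indexes be accessible to every pool.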
Abstract
One aspect of the present invention includes enabling data chunks to be shared among different storage pools within a storage management system, according to the use of deduplication and storage information kept at the system level, and applied with policy-based rules that define the scope of deduplication. In one embodiment, the parameters of performing deduplication are defined within the policy, particularly which of the plurality of storage pools allow deduplication to which other pools. Accordingly, a data object may be linked to deduplicated data chunks existent within other storage pools, and the transfer of a data object may occur by simply creating references to existing data chunks in other pools, provided the policy allows the pool to reference chunks in these other pools. Additionally, a group of storage pools may be defined within the policy to perform a common set of deduplication activities across all pools within the group.
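As a rough illustration of the policy described above, the following sketch models a table with one entry per storage pool, where each entry lists the pools from which that pool may reference chunks. The class `DedupPolicy`, its method names, and the pool names are hypothetical; the reading of a "group" as mutual permission among its members is likewise an assumption, not something the abstract spells out.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set


@dataclass
class DedupPolicy:
    # One entry per storage pool: the pools it is allowed to reference.
    allowed: Dict[str, Set[str]] = field(default_factory=dict)

    def allow(self, pool: str, source: str) -> None:
        """Permit `pool` to reference chunks held in `source` (one direction only)."""
        self.allowed.setdefault(pool, set()).add(source)

    def add_group(self, pools: Iterable[str]) -> None:
        """Assumed group semantics: every member may reference every other member."""
        members = list(pools)
        for p in members:
            self.allowed.setdefault(p, set()).update(q for q in members if q != p)

    def may_reference(self, pool: str, source: str) -> bool:
        return source in self.allowed.get(pool, set())


policy = DedupPolicy()
policy.allow("TAPE1", "DISK1")          # tape may reference disk, not the reverse
policy.add_group(["DISK1", "DISK2"])    # disk pools share chunks with each other
```

Because permissions are directional, the claimed rule that tape pools may access disk pools while disk pools may not access tape pools falls out of simply never adding the reverse entry.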
21 Claims
1. A method in a computer system for implementing policy-based sharing of redundant data across deduplicating storage pools within a storage management system, comprising:
deduplicating a data object as data chunks across more than one of a plurality of storage pools based on a defined policy comprising a table with an entry for each storage pool, wherein an object index tracks a location of the data object, a deduplication index tracks data chunks, the plurality of storage pools are organized in a storage management system, each storage pool is designated as one of a deduplicating pool and a non-deduplicating pool, the table is stored on a storage management system level and is accessible by each of the plurality of storage pools, each entry of the table specifies allowed storage pools from which the storage pool references data chunks, tape storage pools are allowed to access disk storage pools and disk storage pools are not allowed to access tape storage pools, a first deduplicating storage pool stores a first link for a first data chunk of the data object linking to an instance of the first data chunk in an allowed deduplicating storage pool and does not store the first data chunk in response to the instance of the first data chunk existing in a third deduplicating storage pool that the allowed deduplicating storage pool is allowed to access, the first deduplicating storage pool stores the first data chunk of the data object in response to the instance of the first data chunk not existing in the allowed deduplicating storage pool, and a second non-deduplicating pool stores the data object without deduplication; accessing the first data chunk from the allowed deduplicating storage pool using the first link if the first link is stored in the allowed deduplicating storage pool and accessing the first data chunk from the first storage pool if the first link is not stored in the allowed deduplicating storage pool; and deleting the location of the data object from the object index and tracking the data chunks of the data object in the deduplication index in response turning off deduplication in the first storage pool. 
- View Dependent Claims (2, 3, 4, 5, 6, 7)
8. A system, comprising:
at least one processor; and at least one memory which stores instructions operable with the at least one processor for implementing policy-based sharing of redundant data across deduplicating storage pools within a storage management system, the instructions being executed for; deduplicating a data object as data chunks across more than one of a plurality of storage pools based on a defined policy comprising a table with an entry for each storage pool, wherein an object index tracks a location of the data object, a deduplication index tracks data chunks, the plurality of storage pools are organized in a storage management system, each storage pool is designated as one of a deduplicating pool and a non-deduplicating pool, the table is stored on a storage management system level and is accessible by each of the plurality of storage pools, each entry of the table specifies allowed storage pools from which the storage pool references data chunks, tape storage pools are allowed to access disk storage pools and disk storage pools are not allowed to access tape storage pools, a first deduplicating storage pool stores a first link for a first data chunk of the data object linking to an instance of the first data chunk in an allowed deduplicating storage pool and does not store the first data chunk in response to the instance of the first data chunk existing in a third deduplicating storage pool that the allowed deduplicating storage pool is allowed to access, the first deduplicating storage pool stores the first data chunk of the data object in response to the instance of the first data chunk not existing in the allowed deduplicating storage pool, and a second non-deduplicating pool stores the data object without deduplication; accessing the first data chunk from the allowed deduplicating storage pool using the first link if the first link is stored in the allowed deduplicating storage pool and accessing the first data chunk from the first storage pool if the first link is not stored in the allowed deduplicating storage pool; and deleting the location of the data object from the object index and tracking the data chunks of the data object in the deduplication index in response turning off deduplication in the first storage pool.
- View Dependent Claims (9, 10, 11, 12, 13, 14)
15. A computer program product comprising a non-transitory computer readable medium storing a computer readable program for implementing policy-based sharing of redundant data across deduplicating storage pools within a storage management system, wherein the computer readable program when executed on a computer causes the computer to:
deduplicate a data object as data chunks across more than one of a plurality of storage pools based on a defined policy comprising a table with an entry for each storage pool, wherein an object index tracks a location of the data object, a deduplication index tracks data chunks, the plurality of storage pools are organized in a storage management system, each storage pool is designated as one of a deduplicating pool and a non-deduplicating pool, the table is stored on a storage management system level and is accessible by each of the plurality of storage pools, each entry of the table specifies allowed storage pools from which the storage pool references data chunks, tape storage pools are allowed to access disk storage pools and disk storage pools are not allowed to access tape storage pools, a first deduplicating storage pool stores a first link for a first data chunk of the data object linking to an instance of the first data chunk in an allowed deduplicating storage pool and does not store the first data chunk in response to the instance of the first data chunk existing in a third deduplicating storage pool that the allowed deduplicating storage pool is allowed to access, the first deduplicating storage pool stores the first data chunk of the data object in response to the instance of the first data chunk not existing in the allowed deduplicating storage pool, and a second non-deduplicating pool stores the data object without deduplication; accessing the first data chunk from the allowed deduplicating storage pool using the first link if the first link is stored in the allowed deduplicating storage pool and accessing the first data chunk from the first storage pool if the first link is not stored in the allowed deduplicating storage pool; and deleting the location of the data object from the object index and tracking the data chunks of the data object in the deduplication index in response turning off deduplication in the first storage pool. 
- View Dependent Claims (16, 17, 18, 19, 20, 21)
Specification