Removal of reference information for storage blocks in a deduplication system
First Claim
1. A method for managing data in a data storage having data deduplication, comprising:
under control of a processor and memory having executable instructions, performing;
for a back reference data structure incorporating reference information for at least one user data segment to a storage block, removing, by the processor, a user data segment identification (ID) representative of the at least one user data segment from the back reference data structure, the storage block being associated with both a reference counter and the user data segment ID of the back reference data structure;
wherein the removal of the user data segment ID is performed in response to determining that the at least one user data segment no longer references the storage block caused by failed data, thereby maintaining the back reference data structure so as to facilitate an efficient search operation for recovering or reclaiming the failed data within the data storage;
configuring the back reference data structure by partitioning the back reference data structure as form type bits specifying a type of the ID representative of the at least one user data segment, and storage bits storing the ID of a representation of the ID thereof;
defining a plurality of form types corresponding to the form type bits;
defining a first form type structure incorporating a full representation of the ID of the at least one user data segment to be stored in the back reference data structure;
storing the defined first form type structure in the back reference data structure;
defining second, intermediate form type structures implementing a hashed form of the at least one user data segment ID to be stored in the back reference data structure;
storing the defined second, intermediate form type structures in the back reference data structure;
defining a third form type structure implementing a representation of the at least one user data segment ID as a bit bucket in a hash table of the at least one user data segment to be stored in the back reference data structure; and
storing the third form type structure in the back reference data structure;
wherein a total number of the at least one user data segment ID increases as a bit per ID correspondingly decreases when migrating from the first form type through the second form types to the third form type.
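The partitioning recited above can be pictured with a small sketch. It is only an illustration under stated assumptions, not the patented implementation: the 64-bit word, the 2 form type bits, the slot widths (62, 31, 15 and 1 bits), the SHA-256 hash, and every name (`FULL`, `HASH31`, `HASH15`, `BUCKET`, `pack`, `capacity`, `form_of`) are hypothetical choices made for this example. It shows how the number of IDs an entry can hold grows as the bits spent per ID shrink.

```python
import hashlib

WORD_BITS = 64                        # assumed size of one back reference entry
TYPE_BITS = 2                         # assumed number of form type bits
STORAGE_BITS = WORD_BITS - TYPE_BITS  # remaining storage bits: 62

# Hypothetical form types and the bits each one spends per ID.
FULL, HASH31, HASH15, BUCKET = 0, 1, 2, 3
SLOT_WIDTH = {FULL: 62, HASH31: 31, HASH15: 15, BUCKET: 1}


def capacity(form: int) -> int:
    """Number of slots (or buckets) that fit in the storage bits."""
    return STORAGE_BITS // SLOT_WIDTH[form]


def _hash(seg_id: int, modulus: int) -> int:
    """Deterministic hash of a segment ID, reduced to range(modulus)."""
    digest = hashlib.sha256(seg_id.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:8], "big") % modulus


def encode_slot(form: int, seg_id: int) -> int:
    """Representation of one segment ID under the given form type."""
    if form == FULL:
        if not 0 < seg_id < (1 << STORAGE_BITS):
            raise ValueError("ID does not fit in the storage bits")
        return seg_id                       # full representation
    if form == BUCKET:
        return _hash(seg_id, STORAGE_BITS)  # bucket index, 0..61
    # Hashed form; the value 0 is reserved here to mean "empty slot".
    return 1 + _hash(seg_id, (1 << SLOT_WIDTH[form]) - 1)


def pack(form: int, seg_ids: list[int]) -> int:
    """Build one entry: form type bits on top, storage bits below."""
    storage = 0
    if form == BUCKET:
        # Any number of IDs can be recorded; each one sets its bucket bit.
        for seg_id in seg_ids:
            storage |= 1 << encode_slot(form, seg_id)
    else:
        if len(seg_ids) > capacity(form):
            raise ValueError("too many IDs for this form type")
        width = SLOT_WIDTH[form]
        for i, seg_id in enumerate(seg_ids):
            storage |= encode_slot(form, seg_id) << (i * width)
    return (form << STORAGE_BITS) | storage


def form_of(entry: int) -> int:
    """Read the form type bits back out of an entry."""
    return entry >> STORAGE_BITS
```

In this sketch the bit bucket form gives up exactness: two different IDs may set the same bucket bit, so a lookup can only say that an ID may be present. That is the trade the final wherein clause describes, more IDs per entry in exchange for fewer bits per ID.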
Abstract
Various embodiments for managing data in a data storage having data deduplication are provided. For a back reference data structure incorporating reference information for at least one user data segment to a storage block, a user data segment identification (ID) representative of the at least one user data segment is removed from the back reference data structure.
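As a rough illustration of the removal step, the sketch below keeps a reference counter and a set of user data segment IDs per storage block. It is a minimal model under assumptions that the text does not state: the `StorageBlock` class, its method names, the use of a Python set for the back references, and the rule that the counter moves in step with the set are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StorageBlock:
    """A storage block with a reference counter and back references."""
    ref_count: int = 0
    back_refs: set[int] = field(default_factory=set)  # user data segment IDs

    def add_reference(self, seg_id: int) -> None:
        """Record that a user data segment references this block."""
        if seg_id not in self.back_refs:
            self.back_refs.add(seg_id)
            self.ref_count += 1

    def remove_reference(self, seg_id: int) -> bool:
        """Drop a segment ID once the segment no longer references the block.

        Returns True if the ID was present and has been removed.
        """
        if seg_id in self.back_refs:
            self.back_refs.remove(seg_id)
            self.ref_count -= 1
            return True
        return False


def segments_referencing(blocks: dict[str, StorageBlock], block_id: str) -> set[int]:
    """Search step: which segments are affected if this block fails?"""
    return set(blocks[block_id].back_refs)


# Usage: two segments share one deduplicated block, then one stops using it.
blocks = {"blk-1": StorageBlock()}
blocks["blk-1"].add_reference(7)
blocks["blk-1"].add_reference(9)
blocks["blk-1"].remove_reference(7)
```

Keeping stale IDs out of the back references is what lets a later search for the segments touched by failed data return only segments that still depend on the block.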
16 Citations
24 Claims
1. A method for managing data in a data storage having data deduplication, comprising:
under control of a processor and memory having executable instructions, performing;
for a back reference data structure incorporating reference information for at least one user data segment to a storage block, removing, by the processor, a user data segment identification (ID) representative of the at least one user data segment from the back reference data structure, the storage block being associated with both a reference counter and the user data segment ID of the back reference data structure;
wherein the removal of the user data segment ID is performed in response to determining that the at least one user data segment no longer references the storage block caused by failed data, thereby maintaining the back reference data structure so as to facilitate an efficient search operation for recovering or reclaiming the failed data within the data storage;
configuring the back reference data structure by partitioning the back reference data structure as form type bits specifying a type of the ID representative of the at least one user data segment, and storage bits storing the ID of a representation of the ID thereof;
defining a plurality of form types corresponding to the form type bits;
defining a first form type structure incorporating a full representation of the ID of the at least one user data segment to be stored in the back reference data structure;
storing the defined first form type structure in the back reference data structure;
defining second, intermediate form type structures implementing a hashed form of the at least one user data segment ID to be stored in the back reference data structure;
storing the defined second, intermediate form type structures in the back reference data structure;
defining a third form type structure implementing a representation of the at least one user data segment ID as a bit bucket in a hash table of the at least one user data segment to be stored in the back reference data structure; and
storing the third form type structure in the back reference data structure;
wherein a total number of the at least one user data segment ID increases as a bit per ID correspondingly decreases when migrating from the first form type through the second form types to the third form type.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A system for managing data in a data storage having data deduplication, comprising:
a memory device; and
a processor executing instructions stored in the memory device, wherein when executing the instructions, the processor;
for a back reference data structure incorporating reference information for at least one user data segment to a storage block, removes a user data segment identification (ID) representative of the at least one user data segment from the back reference data structure, the storage block being associated with both a reference counter and the user data segment ID of the back reference data structure;
wherein the removal of the user data segment ID is performed in response to determining that the at least one user data segment no longer references the storage block caused by failed data, thereby maintaining the back reference data structure so as to facilitate an efficient search operation for recovering or reclaiming the failed data within the data storage;
configures the back reference data structure by partitioning the back reference data structure as form type bits specifying a type of the ID representative of the at least one user data segment, and storage bits storing the ID of a representation of the ID thereof;
defines a plurality of form types corresponding to the form type bits;
defines a first form type structure incorporating a full representation of the ID of the at least one user data segment to be stored in the back reference data structure;
stores the defined first form type structure in the back reference data structure;
defines second, intermediate form type structures implementing a hashed form of the at least one user data segment ID to be stored in the back reference data structure;
stores the defined second, intermediate form type structures in the back reference data structure;
defines a third form type structure implementing a representation of the at least one user data segment ID as a bit bucket in a hash table of the at least one user data segment to be stored in the back reference data structure; and
stores the third form type structure in the back reference data structure;
wherein a total number of user data segment IDs increases as a bit per ID correspondingly decreases when migrating from the first form type through the second form types to the third form type.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. A computer program product for managing data in a data storage having data deduplication, the computer program product comprising a non-transitory computer-readable storage medium having computer-readable program code portions stored therein, the computer-readable program code portions comprising:
a first executable portion that;
for a back reference data structure incorporating reference information for at least one user data segment to a storage block, removes, by the processor, a user data segment identification (ID) representative of the at least one user data segment from the back reference data structure, the storage block being associated with both a reference counter and the user data segment ID of the back reference data structure;
wherein the removal of the user data segment ID is performed in response to determining that the at least one user data segment no longer references the storage block caused by failed data, thereby maintaining the back reference data structure so as to facilitate an efficient search operation for recovering or reclaiming the failed data within the data storage;
configures the back reference data structure by partitioning the back reference data structure as form type bits specifying a type of the ID representative of the at least one user data segment, and storage bits storing the ID of a representation of the ID thereof;
defines a plurality of form types corresponding to the form type bits;
defines a first form type structure incorporating a full representation of the ID of the at least one user data segment to be stored in the back reference data structure;
stores the defined first form type structure in the back reference data structure;
defines second, intermediate form type structures implementing a hashed form of the at least one user data segment ID to be stored in the back reference data structure;
stores the defined second, intermediate form type structures in the back reference data structure;
defines a third form type structure implementing a representation of the at least one user data segment ID as a bit bucket in a hash table of the at least one user data segment to be stored in the back reference data structure; and
stores the third form type structure in the back reference data structure;
wherein a total number of the at least one user data segment ID increases as a bit per ID correspondingly decreases when migrating from the first form type through the second form types to the third form type.
Dependent claims: 18, 19, 20, 21, 22, 23, 24
Specification