Managing dereferenced chunks in a deduplication system
First Claim
1. A computer program product for maintaining data objects in a storage space, the computer program product comprising a non-transitory computer readable storage medium having computer readable program code embodied therein that executes to perform operations, the operations comprising:
maintaining a chunk index having information on chunks in the storage space referenced in the data objects to provide deduplication of the chunks, wherein the chunk index includes a reference count for each chunk indicating a number of the data objects in which the chunk is referenced and a reference measurement representing a level of the data objects references to the chunk;
incrementing the reference count for one chunk in response to including a reference to the chunk in one of the data objects;
selecting one chunk to remove from the storage space based on a criteria applied to the reference measurements of chunks having reference counts indicating that the chunks are not referenced in one of the data objects in the storage space; and
returning indication of the selected chunk to remove from the storage space.
Abstract
A chunk index has information on chunks in a storage space referenced in objects in the storage space. The chunk index includes a reference count for each chunk indicating a number of objects in which the chunk is referenced and a reference measurement representing a level of data object references to the chunk. One chunk is selected to remove from the storage space based on a criteria applied to the reference measurements of chunks having reference counts indicating that the chunks are not referenced in one object in the storage space.
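The bookkeeping described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The names `ChunkEntry`, `ChunkIndex`, `add_reference`, `remove_reference`, and `select_chunk_to_remove` are hypothetical. The sketch also makes three assumptions the text above does not state: the reference measurement is taken to be the cumulative number of references ever made to a chunk, a decrement step is included so that chunks can become dereferenced, and the selection criterion is "lowest reference measurement first".

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ChunkEntry:
    # Number of data objects that currently reference the chunk.
    ref_count: int = 0
    # ASSUMPTION: the "reference measurement" is modeled here as the
    # cumulative number of references ever made to the chunk.
    ref_measurement: int = 0


class ChunkIndex:
    """Hypothetical in-memory chunk index keyed by chunk identifier."""

    def __init__(self) -> None:
        self.entries: Dict[str, ChunkEntry] = {}

    def add_reference(self, chunk_id: str) -> None:
        # Increment the reference count when a data object includes
        # a reference to the chunk.
        entry = self.entries.setdefault(chunk_id, ChunkEntry())
        entry.ref_count += 1
        entry.ref_measurement += 1

    def remove_reference(self, chunk_id: str) -> None:
        # ASSUMPTION: a decrement step, so a chunk can reach a count of
        # zero. The measurement is kept so it can guide later removal.
        entry = self.entries[chunk_id]
        if entry.ref_count > 0:
            entry.ref_count -= 1

    def select_chunk_to_remove(self) -> Optional[str]:
        # Only chunks with no remaining references are candidates.
        candidates = [
            (entry.ref_measurement, chunk_id)
            for chunk_id, entry in self.entries.items()
            if entry.ref_count == 0
        ]
        if not candidates:
            return None
        # ASSUMPTION: the criterion picks the lowest measurement, on the
        # guess that rarely referenced chunks are least likely to recur.
        return min(candidates)[1]


index = ChunkIndex()
for _ in range(3):
    index.add_reference("chunk-a")
index.add_reference("chunk-b")
for _ in range(3):
    index.remove_reference("chunk-a")
index.remove_reference("chunk-b")
selected = index.select_chunk_to_remove()
```

In this sketch both chunks end with a reference count of zero, and `chunk-b` is returned because its assumed measurement (1) is lower than that of `chunk-a` (3). A different criterion, such as an age-weighted measurement, would only change the key used in `select_chunk_to_remove`.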
20 Claims
1. A computer program product for maintaining data objects in a storage space, the computer program product comprising a non-transitory computer readable storage medium having computer readable program code embodied therein that executes to perform operations, the operations comprising:
maintaining a chunk index having information on chunks in the storage space referenced in the data objects to provide deduplication of the chunks, wherein the chunk index includes a reference count for each chunk indicating a number of the data objects in which the chunk is referenced and a reference measurement representing a level of the data objects references to the chunk;
incrementing the reference count for one chunk in response to including a reference to the chunk in one of the data objects;
selecting one chunk to remove from the storage space based on a criteria applied to the reference measurements of chunks having reference counts indicating that the chunks are not referenced in one of the data objects in the storage space; and
returning indication of the selected chunk to remove from the storage space.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system for maintaining data objects in a storage space, comprising:
a non-transitory computer readable storage medium including a chunk index having information on chunks in the storage space referenced in data objects to provide deduplication of the chunks, wherein the chunk index includes a reference count for each chunk indicating a number of the data objects in which the chunk is referenced and a reference measurement representing a level of the data objects references to the chunk;
a processor executing code to perform operations comprising:
incrementing the reference count for one chunk in response to including a reference to the chunk in one of the data objects;
selecting one chunk to remove from the storage space based on a criteria applied to the reference measurements of chunks having reference counts indicating that the chunks are not referenced in one of the data objects in the storage space; and
returning indication of the selected chunk to remove from the storage space.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer-implemented method for maintaining, by a processor, data objects in a storage space, comprising:
maintaining a chunk index having information on chunks in the storage space referenced in data objects to provide deduplication of the chunks, wherein the chunk index includes a reference count for each chunk indicating a number of the data objects in which the chunk is referenced and a reference measurement representing a level of the data objects references to the chunk;
incrementing the reference count for one chunk in response to including a reference to the chunk in one of the data objects;
selecting one chunk to remove from the storage space based on a criteria applied to the reference measurements of chunks having reference counts indicating that the chunks are not referenced in one of the data objects in the storage space; and
returning indication of the selected chunk to remove from the storage space.
Dependent claims: 16, 17, 18, 19, 20
Specification