Detection of data affected by inaccessible storage blocks in a deduplication system
First Claim
1. A method, performed by a processor, for managing data in a data storage having data deduplication, comprising:
- to efficiently recover or reclaim failed data in the data storage, in response to a portion of the data storage determined to be inaccessible, querying, by the processor, an identifier of a user data segment by examining a corresponding back reference data structure to determine if the user data segment references a particular storage block, the storage block being associated with both a reference counter and the identifier of the back reference data structure;
wherein:
if the outcome of the query is negative, the user data segment is determined not to be associated with the particular storage block, and
if the outcome of the query is positive, the user data segment is warranted to be examined further to determine if the user data segment is associated with the particular storage block;
wherein further examining the user data segment includes performing:
inspecting metadata of the back reference data structure associated with the inaccessible portion of the data storage,
inspecting the identifier of the user data segment in the data storage, and
inspecting metadata of user data segments whose identifiers returned a positive query outcome for at least one of the back reference data structures associated with the inaccessible portion of the data storage, and
wherein the metadata and the identifier of the user data segment are inspected to determine the association with the particular storage block in lieu of scanning all metadata of all user objects in the data storage, thereby efficiently identifying the failed data for reclamation.
Abstract
Various embodiments for managing data in a data storage having data deduplication are provided. In response to a portion of the data storage determined to be inaccessible, an identifier of a user data segment is queried by examining a corresponding back reference data structure, the back reference data structure being implemented as an approximation of a relationship between the user data segment and a particular storage block in the data storage. If the outcome of the query is negative, the user data segment is determined not to be associated with the particular storage block. If the outcome of the query is positive, the user data segment is warranted to be examined further to determine if the user data segment is associated with the particular storage block.
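The abstract describes the back reference data structure only as an approximation whose negative answers are definitive and whose positive answers require further examination. One well-known structure with exactly that behavior is a Bloom filter. The sketch below assumes a Bloom filter purely for illustration; the class name `BackReferenceFilter`, its parameters, and the hashing scheme are hypothetical and are not taken from the patent.

```python
import hashlib


class BackReferenceFilter:
    """Hypothetical approximate set of segment identifiers referencing one storage block."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, segment_id):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{segment_id}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, segment_id):
        # Record that segment_id references the block this filter belongs to.
        for pos in self._positions(segment_id):
            self.bits |= 1 << pos

    def may_reference(self, segment_id):
        # False: the segment definitely does not reference the block.
        # True: the segment possibly references it and must be examined further.
        return all((self.bits >> pos) & 1 for pos in self._positions(segment_id))
```

A filter of this kind never reports a negative for an identifier that was added, which is what allows a negative outcome to rule a segment out without inspecting its metadata.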
19 Claims
1. A method, performed by a processor, for managing data in a data storage having data deduplication, comprising:
to efficiently recover or reclaim failed data in the data storage, in response to a portion of the data storage determined to be inaccessible, querying, by the processor, an identifier of a user data segment by examining a corresponding back reference data structure to determine if the user data segment references a particular storage block, the storage block being associated with both a reference counter and the identifier of the back reference data structure;
wherein:
if the outcome of the query is negative, the user data segment is determined not to be associated with the particular storage block, and
if the outcome of the query is positive, the user data segment is warranted to be examined further to determine if the user data segment is associated with the particular storage block;
wherein further examining the user data segment includes performing:
inspecting metadata of the back reference data structure associated with the inaccessible portion of the data storage,
inspecting the identifier of the user data segment in the data storage, and
inspecting metadata of user data segments whose identifiers returned a positive query outcome for at least one of the back reference data structures associated with the inaccessible portion of the data storage, and
wherein the metadata and the identifier of the user data segment are inspected to determine the association with the particular storage block in lieu of scanning all metadata of all user objects in the data storage, thereby efficiently identifying the failed data for reclamation.
Dependent claims: 2, 3, 4, 5, 6
7. A system for managing data in a data storage having data deduplication, comprising:
a processor, operational in the data storage, wherein the processor, to efficiently recover or reclaim failed data in the data storage, in response to a portion of the data storage determined to be inaccessible, queries an identifier of a user data segment by examining a corresponding back reference data structure to determine if the user data segment references a particular storage block, the storage block being associated with both a reference counter and the identifier of the back reference data structure;
further wherein:
if the outcome of the query is negative, the user data segment is determined not to be associated with the particular storage block, and
if the outcome of the query is positive, the user data segment is warranted to be examined further to determine if the user data segment is associated with the particular storage block;
wherein further examining the user data segment includes performing:
inspecting metadata of the back reference data structure associated with the inaccessible portion of the data storage,
inspecting the identifier of the user data segment in the data storage, and
inspecting metadata of user data segments whose identifiers returned a positive query outcome for at least one of the back reference data structures associated with the inaccessible portion of the data storage, and
wherein the metadata and the identifier of the user data segment are inspected to determine the association with the particular storage block in lieu of scanning all metadata of all user objects in the data storage, thereby efficiently identifying the failed data for reclamation.
Dependent claims: 8, 9, 10, 11, 12, 13
14. A computer program product for managing data in a data storage having data deduplication, the computer program product comprising a non-transitory computer-readable storage medium having computer-readable program code portions stored therein, the computer-readable program code portions comprising:
a first executable portion that, to efficiently recover or reclaim failed data in the data storage, in response to a portion of the data storage determined to be inaccessible, queries an identifier of a user data segment by examining a corresponding back reference data structure to determine if the user data segment references a particular storage block, the storage block being associated with both a reference counter and the identifier of the back reference data structure;
wherein:
if the outcome of the query is negative, the user data segment is determined not to be associated with the particular storage block, and
if the outcome of the query is positive, the user data segment is warranted to be examined further to determine if the user data segment is associated with the particular storage block;
wherein further examining the user data segment includes performing:
inspecting metadata of the back reference data structure associated with the inaccessible portion of the data storage,
inspecting the identifier of the user data segment in the data storage, and
inspecting metadata of user data segments whose identifiers returned a positive query outcome for at least one of the back reference data structures associated with the inaccessible portion of the data storage, and
wherein the metadata and the identifier of the user data segment are inspected to determine the association with the particular storage block in lieu of scanning all metadata of all user objects in the data storage, thereby efficiently identifying the failed data for reclamation.
Dependent claims: 15, 16, 17, 18, 19
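The two-stage identification recited in the independent claims (a cheap query on segment identifiers, followed by metadata inspection only for segments with a positive outcome) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: `SupersetBackRef`, `find_affected_segments`, and the dictionary layouts are hypothetical names introduced here, and the approximate back reference is modeled as a plain superset of the true referrers so that false positives are easy to see.

```python
class SupersetBackRef:
    """Toy stand-in for an approximate back reference (hypothetical).

    It may answer True for identifiers that do not really reference the
    block (false positives), but never answers False for one that does.
    """

    def __init__(self, possible_ids):
        self.possible_ids = set(possible_ids)

    def may_reference(self, segment_id):
        return segment_id in self.possible_ids


def find_affected_segments(segment_ids, failed_blocks, back_refs, segment_metadata):
    """Return the segment ids confirmed to reference at least one failed block.

    back_refs:        block id -> object with may_reference(segment_id)
    segment_metadata: segment id -> set of block ids the segment really uses
    """
    affected = []
    for seg in segment_ids:
        # Stage 1: query only the identifier against each failed block's back reference.
        candidates = [b for b in failed_blocks if back_refs[b].may_reference(seg)]
        if not candidates:
            continue  # negative outcome: not associated with any failed block
        # Stage 2: positive outcome, so inspect this one segment's metadata to confirm.
        if any(b in segment_metadata[seg] for b in candidates):
            affected.append(seg)
    return affected


# Example: block "b2" is inaccessible; "s2" is a false positive in its back reference.
metadata = {"s1": {"b1", "b2"}, "s2": {"b3"}, "s3": {"b2"}}
back_refs = {"b2": SupersetBackRef({"s1", "s2", "s3"})}
print(find_affected_segments(["s1", "s2", "s3"], ["b2"], back_refs, metadata))
# prints ['s1', 's3']
```

In this sketch the metadata of a segment is read only after its identifier returns a positive outcome, which is the saving the claims describe relative to scanning the metadata of every user object.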
Specification