Self-healing by hash-based deduplication
First Claim
1. A method for self-healing in a hash-based deduplication system using a processor device in a computing environment, the method comprising:
maintaining deduplication digests of data with a corresponding list of the deduplication digests in a table of contents (TOC) for the self-healing of data that is one of lost and unreadable within each one of a plurality of user-level stored entities, the user-level stored entities each comprising a file portion of a virtual tape cartridge, such that only the deduplication digests corresponding to the data stored on a given virtual tape cartridge are listed in the TOC of the given virtual tape cartridge;
comparing input data digests to the TOC if directed to data that is one of lost and unreadable and using the input data digests to repair the one of lost and unreadable data;
storing by each one of the plurality of user-level stored entities a deduplication digest of data belonging to each one of the plurality of user-level stored entities and the corresponding list of the deduplication digest in the TOC, wherein the TOC is a list of references to storage blocks in a common storage area where each entry lists one of a block and range of the storage blocks and an offset and range within one of the block and list of blocks, and wherein the data in the one of the plurality of user-level stored entities is a concatenation of the data in the offset and range within one of the block and list of blocks that are listed in the list of references to storage blocks;
determining if a digest-to-block mapping module contains an entry for a deduplication digest, the digest-to-block mapping module searching only for the deduplication digest within the given virtual tape cartridge of which the data for the entry resides;
using the digest-to-block mapping module to look up a storage block containing the deduplication digest in the digest-to-block mapping module; and
removing the deduplication digest from the digest-to-block mapping module when the storage block is found to be unreadable.
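The TOC and digest-to-block steps of the claim can be illustrated with a minimal Python sketch. Everything named here (`TocEntry`, `BlockStore`, `read_entity`, `lookup_block`, the use of SHA-256, and modeling an unreadable block as `None`) is a hypothetical stand-in chosen for illustration, not the patent's implementation. The mapping is built from one cartridge's own TOC, so lookups are confined to that cartridge.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class TocEntry:
    digest: str    # deduplication digest of the referenced bytes
    block_id: int  # storage block in the common storage area
    offset: int    # start of the data within the block
    length: int    # number of bytes to read from the offset


class UnreadableBlock(Exception):
    """Raised when a storage block is lost or cannot be read."""


class BlockStore:
    """Hypothetical common storage area; a value of None marks an unreadable block."""

    def __init__(self):
        self.blocks = {}

    def read(self, block_id):
        data = self.blocks.get(block_id)
        if data is None:
            raise UnreadableBlock(block_id)
        return data


def digest_of(data: bytes) -> str:
    # SHA-256 is an assumption; the claim does not name a hash function.
    return hashlib.sha256(data).hexdigest()


def read_entity(toc, store):
    """Rebuild a stored entity by concatenating the ranges listed in its TOC."""
    return b"".join(
        store.read(e.block_id)[e.offset:e.offset + e.length] for e in toc
    )


def lookup_block(digest, digest_to_block, store):
    """Return the block id for a digest, or None if there is no usable entry.

    If the mapped block turns out to be unreadable, the digest is removed
    from the mapping so later lookups do not return a stale reference.
    """
    block_id = digest_to_block.get(digest)
    if block_id is None:
        return None
    try:
        store.read(block_id)
    except UnreadableBlock:
        del digest_to_block[digest]
        return None
    return block_id


store = BlockStore()
store.blocks[0] = b"hello world"
store.blocks[1] = b"tape data"
tape_digest = digest_of(b"tape")
toc = [
    TocEntry(digest_of(b"hello"), 0, 0, 5),
    TocEntry(tape_digest, 1, 0, 4),
]
# Per-cartridge mapping: only digests from this cartridge's TOC are present.
digest_to_block = {e.digest: e.block_id for e in toc}

entity = read_entity(toc, store)   # b"hellotape"
store.blocks[1] = None             # simulate block 1 becoming unreadable
found = lookup_block(tape_digest, digest_to_block, store)
```

After the simulated failure, `found` is `None` and the digest for block 1 is no longer in `digest_to_block`, while the entry for block 0 is untouched.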
Abstract
For self-healing in a hash-based deduplication system using a processor device in a computing environment, deduplication digests of data and a corresponding list of the deduplication digests in a table of contents (TOC) are maintained for the self-healing of data that is lost or unreadable. Input data digests are compared to the TOC if directed to data that is lost or unreadable, and the input data digests are used to repair the lost or unreadable data.
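The repair path described in the abstract can be sketched as follows. This is a simplified illustration under stated assumptions: the TOC is reduced to a dictionary from digest to block id, an unreadable block is modeled as `None`, SHA-256 stands in for the unspecified hash, and the block is rewritten in place. The function name `heal_from_input` is hypothetical.

```python
import hashlib


def digest_of(chunk: bytes) -> str:
    # SHA-256 is an assumption; the abstract does not name a hash function.
    return hashlib.sha256(chunk).hexdigest()


def heal_from_input(chunks, toc, blocks):
    """Repair unreadable blocks from matching input data.

    chunks: iterable of incoming data chunks (bytes)
    toc:    dict mapping digest -> block id (simplified TOC)
    blocks: dict mapping block id -> bytes, or None when unreadable
    Returns the list of digests whose data was repaired.
    """
    repaired = []
    for chunk in chunks:
        d = digest_of(chunk)
        block_id = toc.get(d)
        # Act only when the digest is in the TOC and its data is unreadable.
        if block_id is not None and blocks.get(block_id) is None:
            blocks[block_id] = chunk  # restore the data from the input
            repaired.append(d)
    return repaired


toc = {digest_of(b"alpha"): 0, digest_of(b"beta"): 1}
blocks = {0: b"alpha", 1: None}  # block 1 is lost or unreadable

repaired = heal_from_input([b"gamma", b"beta"], toc, blocks)
```

Here `b"gamma"` is ignored because its digest is not in the TOC, while `b"beta"` matches a TOC digest whose block is unreadable, so its data is restored.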
15 Claims
6. A system for self-healing in a hash-based deduplication system in a computing environment, the system comprising:
the hash-based deduplication system;
a plurality of user-level stored entities in the hash-based deduplication system;
storage blocks in the hash-based deduplication system;
a digest-to-block mapping module in the hash-based deduplication system;
a repaired digest map in the hash-based deduplication system;
a plurality of lists in the hash-based deduplication system, wherein the plurality of lists includes at least a corresponding list of the deduplication digests in a table of contents (TOC) and a damaged deduplication digests list; and
at least one processor device operable in the hash-based deduplication system, wherein the at least one processor device:
maintains deduplication digests of data with the corresponding list of the deduplication digests in the TOC for the self-healing of data that is one of lost and unreadable within each one of a plurality of user-level stored entities, the user-level stored entities each comprising a file portion of a virtual tape cartridge, such that only the deduplication digests corresponding to the data stored on a given virtual tape cartridge are listed in the TOC of the given virtual tape cartridge,
compares input data digests to the TOC if directed to data that is one of lost and unreadable and uses the input data digests to repair the one of lost and unreadable data,
stores by each one of the plurality of user-level stored entities a deduplication digest of data belonging to each one of the plurality of user-level stored entities and the corresponding list of the deduplication digest in the TOC, wherein the TOC is a list of references to storage blocks in a common storage area where each entry lists one of a block and range of the storage [...]
determines if a digest-to-block mapping module contains an entry for a deduplication digest, the digest-to-block mapping module searching only for the deduplication digest within the given virtual tape cartridge of which the data for the entry resides,
uses the digest-to-block mapping module to look up a storage block containing the deduplication digest in the digest-to-block mapping module, and
removes the deduplication digest from the digest-to-block mapping module when the storage block is found to be unreadable.
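The system claim names two bookkeeping components beyond the TOC: a damaged deduplication digests list and a repaired digest map. The claim does not say how they interact, so the sketch below is only one plausible arrangement. The class `HealingState` and its methods are hypothetical, and the rule that a repair moves a digest from the damaged list into the repaired map is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class HealingState:
    # digest -> block id for readable data (digest-to-block mapping)
    digest_to_block: dict = field(default_factory=dict)
    # digests whose stored data was found lost or unreadable
    damaged_digests: list = field(default_factory=list)
    # digest -> block id holding the repaired copy (assumed usage)
    repaired_digests: dict = field(default_factory=dict)

    def mark_damaged(self, digest):
        """Drop the stale mapping and remember the digest as damaged."""
        self.digest_to_block.pop(digest, None)
        if digest not in self.damaged_digests:
            self.damaged_digests.append(digest)

    def record_repair(self, digest, new_block_id):
        """Record that a damaged digest now has readable data again."""
        if digest in self.damaged_digests:
            self.damaged_digests.remove(digest)
        self.repaired_digests[digest] = new_block_id
        self.digest_to_block[digest] = new_block_id

    def resolve(self, digest):
        """Return the current block id for a digest, or None if unavailable."""
        return self.digest_to_block.get(digest)


state = HealingState(digest_to_block={"d1": 7})
state.mark_damaged("d1")          # block 7 was found unreadable
after_damage = state.resolve("d1")
state.record_repair("d1", 12)     # matching input data was written to block 12
after_repair = state.resolve("d1")
```

Keeping the damaged list separate from the mapping means a lookup never returns a reference to unreadable data, while the system still knows which digests to watch for in incoming input.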
11. A computer program product for self-healing in a hash-based deduplication system by a processor device, the computer program product comprising a non-transitory computer-readable storage medium having computer-readable program code portions stored therein, the computer-readable program code portions comprising:
a first executable portion that maintains deduplication digests of data with a corresponding list of the deduplication digests in a table of contents (TOC) for the self-healing of data that is one of lost and unreadable within each one of a plurality of user-level stored entities, the user-level stored entities each comprising a file portion of a virtual tape cartridge, such that only the deduplication digests corresponding to the data stored on a given virtual tape cartridge are listed in the TOC of the given virtual tape cartridge;
a second executable portion that compares input data digests to the TOC if directed to data that is one of lost and unreadable and uses the input data digests to repair the one of lost and unreadable data;
a third executable portion that stores by each one of the plurality of user-level stored entities a deduplication digest of data belonging to each one of the plurality of user-level stored entities and the corresponding list of the deduplication digest in the TOC, wherein the TOC is a list of references to storage blocks in a common storage area where each entry lists one of a block and range of the storage blocks and an offset and range within one of the block and list of blocks, and wherein the data in the one of the plurality of user-level stored entities is a [...]
a fourth executable portion that determines if a digest-to-block mapping module contains an entry for a deduplication digest, the digest-to-block mapping module searching only for the deduplication digest within the given virtual tape cartridge of which the data for the entry resides;
a fifth executable portion that uses the digest-to-block mapping module to look up a storage block containing the deduplication digest in the digest-to-block mapping module; and
a sixth executable portion that removes the deduplication digest from the digest-to-block mapping module when the storage block is found to be unreadable.
Specification