Backup and restore strategies for data deduplication
Abstract
Techniques for backup and restore of optimized data streams are described. A chunk store includes each optimized data stream as a plurality of chunks including at least one data chunk and corresponding optimized stream metadata. The chunk store includes data chunks in a deduplicated manner. Optimized data streams stored in the chunk store are identified for backup. At least a portion of the chunk store is stored in backup storage according to an optimized backup technique, an un-optimized backup technique, an item level backup technique, or a data chunk identifier backup technique. Optimized data streams stored in the backup storage may be restored. A file reconstructor includes a callback module that generates calls to a restore application to request optimized stream metadata and any referenced data chunks from the backup storage. The file reconstructor reconstructs the data streams from the referenced data chunks.
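As an illustration of the chunk store layout the abstract describes, the following is a minimal sketch, not the patented implementation. It assumes fixed-size chunking and SHA-256 digests as data chunk identifiers; neither is stated in the abstract, and the names `ChunkStore`, `add_stream`, and `read_stream` are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field

CHUNK_SIZE = 4  # assumption: tiny fixed-size chunks, for illustration only


@dataclass
class ChunkStore:
    # data chunk identifier -> chunk bytes (each distinct chunk stored once)
    chunks: dict = field(default_factory=dict)
    # stream name -> ordered list of data chunk identifiers (stream metadata)
    streams: dict = field(default_factory=dict)

    def add_stream(self, name: str, data: bytes) -> None:
        """Split a stream into chunks and store only chunks not already present."""
        ids = []
        for start in range(0, len(data), CHUNK_SIZE):
            piece = data[start:start + CHUNK_SIZE]
            chunk_id = hashlib.sha256(piece).hexdigest()  # assumed identifier scheme
            self.chunks.setdefault(chunk_id, piece)       # deduplicate
            ids.append(chunk_id)
        self.streams[name] = ids

    def read_stream(self, name: str) -> bytes:
        """Rebuild a stream from the chunks its metadata references."""
        return b"".join(self.chunks[cid] for cid in self.streams[name])


store = ChunkStore()
store.add_stream("a.txt", b"AAAABBBBCCCC")
store.add_stream("b.txt", b"AAAABBBBDDDD")
```

The two example streams share their first two chunks, so the store holds four distinct data chunks while the two metadata lists together hold six references.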
20 Claims
1. A method for data stream backup, comprising:
- identifying for backup a plurality of optimized data streams stored in a chunk store, the chunk store including each optimized data stream as a plurality of chunks and corresponding optimized stream metadata, the plurality of chunks including at least one data chunk, the corresponding optimized stream metadata referencing the at least one data chunk, and the chunk store including all included data chunks in a deduplicated manner before the identification for the backup; and
- storing at least a portion of the chunk store in a backup storage to backup the plurality of optimized data streams identified for backup, wherein said storing comprises
  - determining whether the plurality of optimized data streams were selected for backup according to an exclude mode or an include mode, the exclude mode being a first backup mode where at least one volume is specifically selected for backup and one or more data streams are specifically selected to be excluded from backup, and the include mode being a second backup mode where at least one data stream is specifically selected for backup; and
  - selecting the backup technique based on which of the exclude mode or include mode was determined to be selected.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
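The mode determination in claim 1 can be sketched roughly as follows. The claim states only that the technique is selected based on the mode; it does not say which technique goes with which mode. The mapping in `TECHNIQUE_BY_MODE` is therefore an illustrative assumption, and `BackupSelection`, `determine_mode`, and `select_technique` are hypothetical names. The technique labels are taken from the abstract.

```python
from dataclasses import dataclass, field
from enum import Enum


class BackupMode(Enum):
    EXCLUDE = "exclude"  # volume(s) selected, specific streams excluded
    INCLUDE = "include"  # specific streams selected


@dataclass
class BackupSelection:
    volumes: list = field(default_factory=list)
    excluded_streams: list = field(default_factory=list)
    included_streams: list = field(default_factory=list)


# ASSUMPTION: this pairing is not given in the claim; it is for illustration only.
TECHNIQUE_BY_MODE = {
    BackupMode.EXCLUDE: "optimized",
    BackupMode.INCLUDE: "item_level",
}


def determine_mode(selection: BackupSelection) -> BackupMode:
    """Classify a selection using the two mode definitions in claim 1."""
    if selection.volumes and selection.excluded_streams:
        return BackupMode.EXCLUDE
    if selection.included_streams:
        return BackupMode.INCLUDE
    raise ValueError("selection matches neither exclude mode nor include mode")


def select_technique(selection: BackupSelection) -> str:
    """Pick a backup technique from the determined mode."""
    return TECHNIQUE_BY_MODE[determine_mode(selection)]


volume_backup = BackupSelection(volumes=["D:"], excluded_streams=["scratch.tmp"])
file_backup = BackupSelection(included_streams=["report.docx"])
```

Keeping the mode-to-technique pairing in a table rather than in branching logic makes it easy to swap in whatever pairing a given embodiment uses.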
9. A method for restoring files from backup, comprising:
- receiving a request for an optimized data stream to be retrieved from a chunk store in backup storage, the request including an identifier for the optimized stream metadata corresponding to the data stream;
- generating a first call to a restore application based on the optimized stream metadata, the first call specifying a file name for a first chunk container in backup storage that stores optimized stream metadata identified by the optimized stream metadata identifier, and specifying an offset for the optimized stream metadata in the first chunk container;
- receiving the optimized stream metadata in response to the first call;
- determining at least one data chunk identifier referenced in the optimized stream metadata;
- generating at least one additional call to the restore application corresponding to the at least one data chunk identifier to obtain at least one data chunk from at least one chunk container in backup storage; and
- receiving the at least one data chunk in response to the at least one additional call.
Dependent claims: 10, 11, 12, 13, 14, 15
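The call sequence in claim 9 can be sketched as below. This is a simplified model under several assumptions: backup storage is an in-memory dictionary of container files, the stream metadata is serialized as JSON, each call carries a length in addition to the file name and offset named in the claim, and a data chunk identifier is a (file, offset, length) record. `RestoreApp`, `reconstruct`, and the container names `stream.ccc` and `data.ccc` are hypothetical.

```python
import json


class RestoreApp:
    """Stand-in for a restore application that serves reads from backup storage."""

    def __init__(self, containers: dict):
        self.containers = containers  # container file name -> bytes
        self.calls = []               # record of (file name, offset, length) calls

    def read(self, file_name: str, offset: int, length: int) -> bytes:
        self.calls.append((file_name, offset, length))
        return self.containers[file_name][offset:offset + length]


def reconstruct(restore_app: RestoreApp, metadata_id: dict) -> bytes:
    """Rebuild one data stream by calling back into the restore application."""
    # First call: fetch the stream metadata by container file name and offset.
    raw = restore_app.read(
        metadata_id["file"], metadata_id["offset"], metadata_id["length"]
    )
    # Determine the data chunk identifiers referenced by the metadata.
    chunk_ids = json.loads(raw.decode("utf-8"))
    # Additional calls: one per referenced data chunk.
    parts = [
        restore_app.read(cid["file"], cid["offset"], cid["length"])
        for cid in chunk_ids
    ]
    return b"".join(parts)


metadata = json.dumps([
    {"file": "data.ccc", "offset": 0, "length": 5},
    {"file": "data.ccc", "offset": 5, "length": 5},
]).encode("utf-8")
app = RestoreApp({
    "data.ccc": b"HELLOWORLD",
    "stream.ccc": b"\x00" * 8 + metadata,  # metadata stored at offset 8
})
metadata_id = {"file": "stream.ccc", "offset": 8, "length": len(metadata)}
restored = reconstruct(app, metadata_id)
```

In this example the reconstructor issues three calls: one for the metadata and one for each of the two referenced data chunks.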
16. A system, comprising:
- a data backup module that receives an identification of a plurality of optimized data streams stored in a chunk store for backup, the chunk store including each optimized data stream as a plurality of chunks and corresponding optimized stream metadata, the plurality of chunks including at least one data chunk, the corresponding optimized stream metadata referencing the at least one data chunk, the chunk store including all included data chunks in a deduplicated configuration before the identification for the backup; and
- the data backup module being configured to
  - store at least a portion of the chunk store in a backup storage to backup the plurality of optimized data streams identified for backup,
  - determine a first amount of space in the chunk store consumed by data chunks that are not referenced by the plurality of optimized data streams identified for backup,
  - determine a second amount of space as an amount of space to store all of the plurality of optimized data streams in un-optimized form minus an amount of space to store all of the plurality of optimized data streams in optimized form, and
  - select a backup technique based on the determined first and second amounts of space.
Dependent claims: 17, 18, 19, 20
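The two space amounts in claim 16 can be computed as in the sketch below. The claim does not state how the two amounts are compared, so the rule in `select_technique_by_space` is an assumption: it picks the optimized technique when the unreferenced space is smaller than the deduplication savings, on the reasoning that copying the whole chunk store is then smaller than copying the streams in un-optimized form. The sketch also ignores the size of the stream metadata, and all function names are hypothetical.

```python
def referenced_chunks(selected_streams: dict) -> set:
    """Collect every data chunk identifier referenced by the selected streams."""
    return {cid for ids in selected_streams.values() for cid in ids}


def unreferenced_space(chunk_sizes: dict, selected_streams: dict) -> int:
    """First amount: space used by chunks that no selected stream references."""
    used = referenced_chunks(selected_streams)
    return sum(size for cid, size in chunk_sizes.items() if cid not in used)


def dedup_savings(chunk_sizes: dict, selected_streams: dict) -> int:
    """Second amount: un-optimized size minus optimized size of the selection."""
    unoptimized = sum(
        chunk_sizes[cid] for ids in selected_streams.values() for cid in ids
    )
    optimized = sum(chunk_sizes[cid] for cid in referenced_chunks(selected_streams))
    return unoptimized - optimized


def select_technique_by_space(chunk_sizes: dict, selected_streams: dict) -> str:
    first = unreferenced_space(chunk_sizes, selected_streams)
    second = dedup_savings(chunk_sizes, selected_streams)
    # ASSUMPTION: comparison rule chosen for illustration, not taken from the claim.
    return "optimized" if first < second else "un-optimized"


chunk_sizes = {"c1": 100, "c2": 100, "c3": 100, "c4": 500}
few_streams = {"a": ["c1", "c2"], "b": ["c1", "c3"]}
all_streams = {"a": ["c1", "c2"], "b": ["c1", "c3"], "c": ["c4", "c4"]}
```

With `few_streams`, 500 bytes of the chunk store are unreferenced while deduplication saves only 100 bytes, so the sketch picks the un-optimized technique. With `all_streams`, nothing is unreferenced and the savings are 600 bytes, so it picks the optimized technique.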
Specification