HIGH PERFORMANCE DATA DEDUPLICATION IN A VIRTUAL TAPE SYSTEM
First Claim
1. A method for data deduplication comprising:
receiving a plurality of backup datasets, each backup dataset comprising a plurality of data blocks;
storing metadata in a plurality of metadata disk segments (meta-segment(s));
storing the received data blocks in a plurality of data disk segments (data-segment(s));
identifying one or more data-segment(s) comprising duplicate data, wherein the duplicate data in a data-segment is identical to data from one or more previous data-segment(s), and for each identified data-segment modifying metadata corresponding to duplicate data to correspond to the identical data, and releasing the identified data-segment; and
updating metadata for each data-segment checked for data deduplication.
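The steps of claim 1 can be illustrated with a minimal in-memory sketch. Everything named here is hypothetical and not taken from the patent: the `Store` class, the `BLOCKS_PER_SEGMENT` constant, and the `checked` set are illustrative only. The claim does not say how duplicates are identified, so the SHA-256 lookup followed by a byte comparison is an assumption, as is the choice to release a segment only when every block in it already exists in an earlier segment.

```python
import hashlib
from dataclasses import dataclass, field

BLOCKS_PER_SEGMENT = 2  # hypothetical; kept tiny for illustration


@dataclass
class Store:
    data_segments: dict = field(default_factory=dict)  # seg_id -> list of blocks
    meta: dict = field(default_factory=dict)    # (dataset, block_no) -> (seg_id, slot)
    index: dict = field(default_factory=dict)   # digest -> (seg_id, slot) of a kept block
    checked: set = field(default_factory=set)   # segments already examined
    next_seg: int = 0

    def ingest(self, dataset, blocks):
        """Backup path: write blocks and metadata only, no deduplication work."""
        for start in range(0, len(blocks), BLOCKS_PER_SEGMENT):
            seg_id, self.next_seg = self.next_seg, self.next_seg + 1
            chunk = list(blocks[start:start + BLOCKS_PER_SEGMENT])
            self.data_segments[seg_id] = chunk
            for slot in range(len(chunk)):
                self.meta[(dataset, start + slot)] = (seg_id, slot)

    def deduplicate(self):
        """Later pass: release segments whose data already exists in earlier ones."""
        for seg_id in sorted(self.data_segments):
            if seg_id in self.checked:
                continue
            blocks = self.data_segments[seg_id]
            digests = [hashlib.sha256(b).digest() for b in blocks]
            targets = [self.index.get(d) for d in digests]
            # Assumption: a digest hit is confirmed by comparing the bytes.
            duplicate = all(
                t is not None and self.data_segments[t[0]][t[1]] == b
                for t, b in zip(targets, blocks)
            )
            if duplicate:
                remap = {(seg_id, slot): t for slot, t in enumerate(targets)}
                for key, loc in self.meta.items():
                    if loc in remap:
                        self.meta[key] = remap[loc]  # point at the identical data
                del self.data_segments[seg_id]       # release the segment
            else:
                for slot, d in enumerate(digests):
                    self.index.setdefault(d, (seg_id, slot))
            self.checked.add(seg_id)  # record that this segment was checked

    def restore(self, dataset, n_blocks):
        """Restore path: follow metadata straight to the stored blocks."""
        locations = (self.meta[(dataset, i)] for i in range(n_blocks))
        return [self.data_segments[s][slot] for s, slot in locations]


store = Store()
store.ingest("monday", [b"A", b"B", b"C", b"D"])
store.ingest("tuesday", [b"A", b"B", b"X", b"Y"])
store.deduplicate()
print(sorted(store.data_segments))  # segment ids still holding data
print(store.restore("tuesday", 4))
```

In this sketch the backup path does no hashing or lookups, and a restore only follows metadata to a stored block, which is one way to read the abstract's emphasis on low backup overhead and minimal restore impact.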
Abstract
Data deduplication in a storage system achieves high performance through minimal overhead during a backup operation, fewer disk read operations to locate duplicate data, and minimal impact on restore operations involving deduplicated data.
156 Citations
23 Claims
22. A system configured for data deduplication, the system comprising:
means for receiving a plurality of backup datasets, each backup dataset comprising a plurality of data blocks;
means for storing metadata in a plurality of metadata disk segments (meta-segment(s));
means for storing the received data blocks in a plurality of data disk segments (data-segment(s));
means for identifying one or more data-segment(s) comprising duplicate data, wherein the duplicate data in a data-segment is identical to data from one or more previous data-segment(s), and for each identified data-segment means for modifying metadata corresponding to duplicate data to correspond to the identical data and releasing the identified data-segment; and
means for updating metadata for each data-segment checked for data deduplication.
23. A computer readable medium for data deduplication, the computer readable medium including program instructions for performing the steps of:
receiving a plurality of backup datasets, each backup dataset comprising a plurality of data blocks;
storing metadata in a plurality of metadata disk segments (meta-segment(s));
storing the received data blocks in a plurality of data disk segments (data-segment(s));
identifying one or more data-segment(s) comprising duplicate data, wherein the duplicate data in a data-segment is identical to data from one or more previous data-segment(s), and for each identified data-segment modifying metadata corresponding to duplicate data to correspond to the identical data, and releasing the identified data-segment; and
updating metadata for each data-segment checked for data deduplication.
Specification