Tape drive memory deduplication
First Claim
1. A tape drive memory storage improvement method comprising:
receiving, by a processor of a storage tape drive hardware device, a data stream, wherein said storage tape drive hardware device internally comprises a deduplication software engine, a first non-volatile memory device (NVS1), a second non-volatile memory device (NVS2), and a first data storage tape cartridge;
passing, by said processor through said NVS2, said data stream;
dividing, by said processor executing said deduplication software engine within said NVS2, said data stream into a plurality of adjacent variable length data chunks;
generating, by said processor, a chunk list file comprising similarity identifiers associated with each of said plurality of adjacent variable length data chunks;
storing, by said processor within said NVS1, said chunk list file;
identifying, by said processor, duplicate data chunks of said plurality of adjacent variable length data chunks, wherein said duplicate data chunks comprise duplicated data with respect to a first group of data chunks of said plurality of adjacent variable length data chunks;
deleting, by said processor from said NVS2, said duplicate data chunks such that said first group of data chunks remain within said NVS2;
writing, by said processor from said NVS2 to said first data storage tape cartridge, said first group of data chunks for storage;
generating, by said processor, pointers identifying each data chunk of said first group of data chunks and an associated storage position, within said first data storage tape cartridge, for each said data chunk of said first group of data chunks;
storing, by said processor, said pointers within said chunk list file located within said NVS1; and
writing, by said processor from said NVS1 to said first data storage tape cartridge, said chunk list file comprising said pointers for storage.
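The steps of the first claim can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names `deduplicate_to_tape` and `restore` are hypothetical, NVS1, NVS2 and the tape cartridge are modeled as ordinary in-memory Python objects, the similarity identifier is assumed to be a SHA-256 digest (the claim does not name a hash), and a pointer is assumed to be a byte offset and length rather than a real tape block position.

```python
import hashlib
import json
from typing import Dict, List, Tuple


def deduplicate_to_tape(chunks: List[bytes]) -> Tuple[bytearray, List[dict]]:
    """Model the claimed steps on in-memory stand-ins for NVS1, NVS2 and tape."""
    nvs2 = list(chunks)  # staging buffer holding every chunk of the stream

    # Chunk list file (kept in "NVS1"): one entry per chunk, in stream order.
    # Assumption: the similarity identifier is a SHA-256 digest.
    chunk_list = [{"id": hashlib.sha256(c).hexdigest()} for c in nvs2]

    # Identify duplicates and delete them from the staging buffer, keeping
    # the first chunk seen for each identifier (the "first group").
    seen = set()
    first_group = []
    for entry, chunk in zip(chunk_list, nvs2):
        if entry["id"] not in seen:
            seen.add(entry["id"])
            first_group.append((entry["id"], chunk))
    nvs2 = [chunk for _, chunk in first_group]

    # Write the first group to tape, recording a pointer for each chunk.
    # Assumption: a pointer is a (byte offset, length) pair.
    tape = bytearray()
    pointers: Dict[str, Tuple[int, int]] = {}
    for (ident, _), chunk in zip(first_group, nvs2):
        pointers[ident] = (len(tape), len(chunk))
        tape += chunk

    # Store the pointers in the chunk list file, then write that file to tape.
    for entry in chunk_list:
        entry["offset"], entry["length"] = pointers[entry["id"]]
    tape += json.dumps(chunk_list).encode("utf-8")
    return tape, chunk_list


def restore(tape: bytes, chunk_list: List[dict]) -> bytes:
    """Rebuild the original stream by following the pointers (for checking only)."""
    return b"".join(tape[e["offset"]:e["offset"] + e["length"]] for e in chunk_list)
```

Calling `deduplicate_to_tape([b"aaa", b"bb", b"aaa", b"c", b"bb"])` stores only three chunks on the modeled tape, and `restore` is included solely to check that the pointers are sufficient to rebuild the stream; reading data back is not part of the claim.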
Abstract
A method and system for improving tape drive memory storage are provided. The method includes receiving, by a storage tape drive, a data stream for storage. The data stream is passed through a second non-volatile memory device (NVS2) of the storage tape drive. The data stream is divided into adjacent variable length data chunks, and a chunk list file including similarity identifiers for each of the adjacent variable length data chunks is generated and stored within a first non-volatile memory device (NVS1). Duplicate data chunks, comprising data duplicated with respect to a group of data chunks of the adjacent variable length data chunks, are identified and deleted from the NVS2 such that the group of data chunks remains within the NVS2. The group of data chunks is written to a data storage tape cartridge. Pointers identifying each data chunk and an associated storage position are generated and stored.
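The abstract does not say how the stream is divided into adjacent variable length chunks. One common approach is content-defined chunking, where a boundary is placed wherever a rolling value computed over the most recent bytes meets a fixed condition. The sketch below assumes that approach; the function name `chunk_stream`, the rolling byte sum, and all parameter values are illustrative choices, not details taken from the patent.

```python
from typing import List


def chunk_stream(data: bytes, window: int = 16, mask: int = 0x3F,
                 min_size: int = 32, max_size: int = 1024) -> List[bytes]:
    """Split data into adjacent variable-length chunks.

    A boundary is declared after a byte when the sum of the last `window`
    bytes has all bits of `mask` clear, provided the chunk has reached
    `min_size`; a chunk is also cut once it reaches `max_size`.
    """
    chunks: List[bytes] = []
    start = 0      # index of the first byte of the current chunk
    rolling = 0    # sum of the most recent `window` bytes
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]  # slide the window forward
        length = i - start + 1
        if (length >= min_size and (rolling & mask) == 0) or length >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks
```

Because boundaries depend on nearby content rather than fixed offsets, inserting a few bytes early in a stream tends to shift only the chunks around the insertion point, which is why variable length chunks usually deduplicate better than fixed-size blocks. A production engine would typically use a stronger rolling hash than a byte sum.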
25 Claims
1. A tape drive memory storage improvement method (recited in full under First Claim above).
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A computer program product, comprising a computer readable hardware storage device storing a computer readable program code, said computer readable program code comprising an algorithm that when executed by a processor of a storage tape drive hardware device implements a tape drive memory storage improvement method, said method comprising:
receiving, by said processor, a data stream for storage, wherein said storage tape drive hardware device internally comprises a deduplication software engine, a first non-volatile memory device (NVS1), a second non-volatile memory device (NVS2), and a first data storage tape cartridge;
passing, by said processor through said NVS2, said data stream;
dividing, by said processor executing said deduplication software engine within said NVS2, said data stream into a plurality of adjacent variable length data chunks;
generating, by said processor, a chunk list file comprising similarity identifiers associated with each of said plurality of adjacent variable length data chunks;
storing, by said processor within said NVS1, said chunk list file;
identifying, by said processor, duplicate data chunks of said plurality of adjacent variable length data chunks, wherein said duplicate data chunks comprise duplicated data with respect to a first group of data chunks of said plurality of adjacent variable length data chunks;
deleting, by said processor from said NVS2, said duplicate data chunks such that said first group of data chunks remain within said NVS2;
writing, by said processor from said NVS2 to said first data storage tape cartridge, said first group of data chunks for storage;
generating, by said processor, pointers identifying each data chunk of said first group of data chunks and an associated storage position, within said first data storage tape cartridge, for each said data chunk of said first group of data chunks;
storing, by said processor, said pointers within said chunk list file located within said NVS1; and
writing, by said processor from said NVS1 to said first data storage tape cartridge, said chunk list file comprising said pointers for storage.
Dependent claims: 10, 11, 12, 13, 14, 15
16. A storage tape drive hardware device comprising a processor coupled to a computer-readable memory unit, said memory unit comprising instructions that when executed by the processor implements a tape drive memory storage improvement method comprising:
receiving, by said processor, a data stream for storage, wherein said storage tape drive hardware device internally comprises a deduplication software engine, a first non-volatile memory device (NVS1), a second non-volatile memory device (NVS2), and a first data storage tape cartridge;
passing, by said processor through said NVS2, said data stream;
dividing, by said processor executing said deduplication software engine within said NVS2, said data stream into a plurality of adjacent variable length data chunks;
generating, by said processor, a chunk list file comprising similarity identifiers associated with each of said plurality of adjacent variable length data chunks;
storing, by said processor within said NVS1, said chunk list file;
identifying, by said processor, duplicate data chunks of said plurality of adjacent variable length data chunks, wherein said duplicate data chunks comprise duplicated data with respect to a first group of data chunks of said plurality of adjacent variable length data chunks;
deleting, by said processor from said NVS2, said duplicate data chunks such that said first group of data chunks remain within said NVS2;
writing, by said processor from said NVS2 to said first data storage tape cartridge, said first group of data chunks for storage;
generating, by said processor, pointers identifying each data chunk of said first group of data chunks and an associated storage position, within said first data storage tape cartridge, for each said data chunk of said first group of data chunks;
storing, by said processor, said pointers within said chunk list file located within said NVS1; and
writing, by said processor from said NVS1 to said first data storage tape cartridge, said chunk list file comprising said pointers for storage.
Dependent claims: 17, 18, 19, 20
21. A tape drive memory storage improvement method comprising:
receiving, by a processor of a storage tape drive hardware device, a data file for storage, wherein said storage tape drive hardware device internally comprises a deduplication software engine, a first non-volatile memory device (NVS1), a second non-volatile memory device (NVS2), and a first data storage tape cartridge;
dividing, by said processor executing said deduplication software engine, said data file into a plurality of adjacent variable length data chunks;
identifying, by said processor executing said deduplication software engine, duplicate data chunks of said plurality of adjacent variable length data chunks, wherein said duplicate data chunks comprise duplicated data with respect to a first group of data chunks of said plurality of adjacent variable length data chunks;
storing, by said processor within a first database within said NVS2, said first group of data chunks;
generating, by said processor, pointers identifying each data chunk of said first group of data chunks and an associated storage position, within said first database of said NVS2, for each said data chunk of said first group of data chunks;
storing, by said processor within a second database within said NVS1, said pointers;
first writing, by said processor from said NVS2 to said first data storage tape cartridge, said first group of data chunks for storage; and
second writing, by said processor from said NVS1 to said first data storage tape cartridge, said pointers.
Dependent claims: 22, 23
24. A computer program product, comprising a computer readable hardware storage device storing a computer readable program code, said computer readable program code comprising an algorithm that when executed by a processor of a storage tape drive hardware device implements a tape drive memory storage improvement method, said method comprising:
receiving, by said processor, a data file for storage, wherein said storage tape drive hardware device internally comprises a deduplication software engine, a first non-volatile memory device (NVS1), a second non-volatile memory device (NVS2), and a first data storage tape cartridge;
dividing, by said processor executing said deduplication software engine, said data file into a plurality of adjacent variable length data chunks;
identifying, by said processor executing said deduplication software engine, duplicate data chunks of said plurality of adjacent variable length data chunks, wherein said duplicate data chunks comprise duplicated data with respect to a first group of data chunks of said plurality of adjacent variable length data chunks;
storing, by said processor within a first database within said NVS2, said first group of data chunks;
generating, by said processor, pointers identifying each data chunk of said first group of data chunks and an associated storage position, within said first database of said NVS2, for each said data chunk of said first group of data chunks;
storing, by said processor within a second database within said NVS1, said pointers;
first writing, by said processor from said NVS2 to said first data storage tape cartridge, said first group of data chunks for storage; and
second writing, by said processor from said NVS1 to said first data storage tape cartridge, said pointers.
Dependent claims: 25
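Claims 21 and 24 differ from claim 1 in that the unique chunks are staged in a first database within NVS2, the pointers refer to positions within that database and are kept in a second database within NVS1, and the two are then written to tape in separate steps. A rough sketch of that arrangement follows. The names `stage_file` and `write_to_tape` are hypothetical, both databases are modeled as plain Python lists, SHA-256 is an assumed way of detecting duplicates, and recording one pointer per chunk of the file in file order is one reading of the claim wording chosen so the file could later be rebuilt.

```python
import hashlib
from typing import Dict, List, Tuple


def stage_file(chunks: List[bytes]) -> Tuple[List[bytes], List[int]]:
    """Return (first_db, second_db) for the chunks of one data file.

    first_db  : unique chunks in first-seen order (stand-in for NVS2).
    second_db : for each chunk of the file, the slot in first_db that
                holds its data (stand-in for the pointers kept in NVS1).
    """
    first_db: List[bytes] = []
    second_db: List[int] = []
    slot_by_digest: Dict[bytes, int] = {}
    for chunk in chunks:
        digest = hashlib.sha256(chunk).digest()  # assumed duplicate test
        if digest not in slot_by_digest:
            slot_by_digest[digest] = len(first_db)
            first_db.append(chunk)               # store only the first copy
        second_db.append(slot_by_digest[digest])
    return first_db, second_db


def write_to_tape(first_db: List[bytes], second_db: List[int]) -> List[bytes]:
    """Model the tape as a list of records: chunks first, pointers last."""
    tape = list(first_db)                                # first writing
    tape.append(",".join(map(str, second_db)).encode())  # second writing
    return tape
```

For the chunks `[b"x", b"y", b"x"]`, the first database holds two chunks and the pointer record refers to slot 0 twice, so the repeated chunk is written to the modeled tape only once.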
Specification