Method for optimizing the memory usage and performance of data deduplication storage systems
Abstract
A method and system for optimizing the memory usage and performance of data deduplication storage systems include organizing the meta data of data blocks needed by deduplicating storage systems. A three level hierarchy is used. Level 1 stores the meta data on disk along with the user data. Level 2 uses low latency storage (e.g. RAM and Solid State Disks) to cache the on-disk meta data for faster direct access. Level 3 organizes the fingerprints using a Trie and is entirely resident in RAM. Thus, the search to determine whether a data block is unique or not, and therefore a candidate for transfer, can be executed more efficiently while ensuring that the meta data is transactionally secure.
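The level 3 lookup described above can be illustrated with a minimal sketch. It is not taken from the patent: the names `MetaRecord`, `TrieNode`, `FingerprintTrie`, `fingerprint_of` and `level2_cache` are hypothetical, SHA-256 is an assumed fingerprint function, and a Python list stands in for the low latency cache.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class MetaRecord:
    """Meta data for one stored block (hypothetical layout)."""
    fingerprint: bytes  # hash of the block contents
    address: int        # location of the block on disk
    ref_count: int      # number of references to the block


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[int, "TrieNode"] = {}
        self.slot: Optional[int] = None  # index into the level 2 cache


class FingerprintTrie:
    """Level 3: an in-RAM trie keyed by the bytes of a fingerprint."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, fingerprint: bytes, slot: int) -> None:
        node = self.root
        for byte in fingerprint:
            node = node.children.setdefault(byte, TrieNode())
        node.slot = slot

    def find(self, fingerprint: bytes) -> Optional[int]:
        node = self.root
        for byte in fingerprint:
            node = node.children.get(byte)
            if node is None:
                return None  # no stored block has this fingerprint
        return node.slot


def fingerprint_of(block: bytes) -> bytes:
    # Assumption: SHA-256 is used only for illustration.
    return hashlib.sha256(block).digest()


# Level 2 stand-in: a list of cached meta data records.
level2_cache: List[MetaRecord] = []
trie = FingerprintTrie()

fp = fingerprint_of(b"hello world")
level2_cache.append(MetaRecord(fingerprint=fp, address=0, ref_count=1))
trie.insert(fp, len(level2_cache) - 1)

is_duplicate = trie.find(fp) is not None
is_unique = trie.find(fingerprint_of(b"other data")) is None
```

In this sketch the trie stores only a slot number per fingerprint, so the full record stays in the cache and the in-RAM structure holds references rather than copies of the meta data.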
20 Claims
1. A method for optimizing the memory usage and performance of data deduplication storage systems, said data deduplication storage system having a memory, a low-latency storage, a disk storage location having data blocks and associate meta data stored thereon, the method comprising:
dividing the meta data into a three level hierarchy including a first level that stores the meta data on disk along with the data blocks;
a second level that uses low latency storage to cache a copy of the on-disk meta data for faster direct access; and
a third level that organizes references to fingerprints using a Trie that is entirely resident in random access memory of the data deduplication storage system, the meta data comprising a fingerprint of the data block, data address of the data block, and a reference count; and
conducting a search of the Trie for fingerprints to determine whether a data block is unique or a duplicate within the data deduplication storage system.
Dependent claims: 2, 3, 4
5. A data deduplication system, comprising:
a storage location having data blocks and associate meta data stored thereon, said meta data comprising fingerprints of the data block, a reference count and a physical address of the data blocks referenced;
a low latency storage containing a copy of the meta data; and
a memory containing a searchable trie of fingerprints referencing the copy of the meta data stored in the low-latency storage.
Dependent claims: 6, 7, 8
9. A method for optimizing the memory usage and performance of data deduplication storage systems, said data deduplication storage system having a random access memory, a low-latency storage, a disk storage location having data blocks and associate meta data stored thereon, said meta data comprising a fingerprint of a particular data block, a reference count, and physical address of said particular data block, the method comprising:
storing data blocks and meta data on a disk storage location within a data deduplication storage system;
storing a copy of said meta data in said low latency storage of the data deduplication storage system for fast access;
building and maintaining a Trie consisting of a reference to a fingerprint of said meta data stored in said low latency storage of the data deduplication storage system;
storing said Trie entirely in random access memory of the data deduplication storage system; and
conducting a search in said Trie to determine whether a data block is unique or a duplicate in response to a request to copy data blocks to said data deduplication storage system by comparing a fingerprint of said data blocks requested to be copied to said data deduplication storage system to fingerprints of data blocks currently stored in said data deduplication storage system.
Dependent claims: 10, 11, 12
13. A computer program product, comprising a non-transitory computer-readable medium having a computer-readable program code embodied therein, said computer-readable program code adapted to be executed to implement a method for optimizing the memory usage and performance of data deduplication storage systems, said data deduplication storage system having a memory, a low-latency storage, a disk storage location having data blocks and associate meta data stored thereon, the method comprising:
dividing the meta data into a three level hierarchy including a first level that stores the meta data on disk along with the data blocks;
a second level that uses low latency storage to cache a copy of the on-disk meta data for faster direct access; and
a third level that organizes references to fingerprints using a Trie that is entirely resident in random access memory of the data deduplication storage system, the meta data comprising a fingerprint of the data block, data address of the data block, and a reference count; and
conducting a search of the Trie for fingerprints to determine whether a data block is unique or a duplicate within the data deduplication storage system.
Dependent claims: 14, 15, 16
17. A computer program product, comprising a non-transitory computer-readable medium having a computer-readable program code embodied therein, said computer-readable program code adapted to be executed to implement a method for optimizing the memory usage and performance of data deduplication storage systems, said data deduplication storage system having a random access memory, a low-latency storage, a disk storage location having data blocks and associate meta data stored thereon, said meta data comprising a fingerprint of a particular data block, a reference count, and physical address of said particular data block, the method comprising:
storing data blocks and meta data on a disk storage location within the data deduplication storage system;
storing a copy of said meta data in said low latency storage of the data deduplication storage system for fast access;
building and maintaining a Trie consisting of a reference to a fingerprint of said meta data stored in said low latency storage of the data deduplication storage system;
storing said Trie entirely in random access memory of the data deduplication storage system; and
conducting a search in said Trie to determine whether a data block is unique or a duplicate in response to a request to copy data blocks to said data deduplication storage system by comparing a fingerprint of said data blocks requested to be copied to said data deduplication storage system to fingerprints of data blocks currently stored in said data deduplication storage system.
Dependent claims: 18, 19, 20
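The store-and-search steps recited in claims 9 and 17 can be sketched as a single write path. This is an illustration under stated assumptions, not the claimed implementation: the class `DedupStore` and its members are hypothetical, SHA-256 is an assumed fingerprint function, Python lists stand in for the disk and the low latency storage, and incrementing the reference count on a duplicate is an assumed policy that the claims do not spell out.

```python
import hashlib
from typing import List, Optional, Tuple


class DedupStore:
    """Hypothetical three-level store; lists stand in for real devices."""

    def __init__(self) -> None:
        self.disk: List[Tuple[bytes, dict]] = []  # level 1: block plus meta data
        self.cache: List[dict] = []               # level 2: copy of the meta data
        self.trie: dict = {}                      # level 3: nested dicts in RAM

    def _find(self, fp: bytes) -> Optional[int]:
        node = self.trie
        for byte in fp:
            node = node.get(byte)
            if node is None:
                return None
        return node.get("slot")

    def _insert(self, fp: bytes, slot: int) -> None:
        node = self.trie
        for byte in fp:
            node = node.setdefault(byte, {})
        node["slot"] = slot  # reference into the level 2 cache

    def write(self, block: bytes) -> Tuple[int, bool]:
        """Store a block; return (address, was_duplicate)."""
        fp = hashlib.sha256(block).digest()  # assumed fingerprint function
        slot = self._find(fp)
        if slot is not None:
            # Duplicate: no new block is written (assumed policy: bump the count).
            meta = self.cache[slot]
            meta["ref_count"] += 1
            self.disk[meta["address"]][1]["ref_count"] = meta["ref_count"]
            return meta["address"], True
        # Unique: write block and meta data, cache a copy, index it in the trie.
        address = len(self.disk)
        meta = {"fingerprint": fp, "address": address, "ref_count": 1}
        self.disk.append((block, dict(meta)))
        self.cache.append(meta)
        self._insert(fp, len(self.cache) - 1)
        return address, False


store = DedupStore()
a1, dup1 = store.write(b"A" * 4096)
a2, dup2 = store.write(b"B" * 4096)
a3, dup3 = store.write(b"A" * 4096)  # same contents as the first block
```

Here the third write finds its fingerprint in the trie, so only two blocks reach the simulated disk and the first block's reference count rises to 2.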
Specification