Indexing a deduplicated cache system by integrating fingerprints of underlying deduplicated storage system
Abstract
A computer-implemented method for indexing content stored in a cache memory device is disclosed. The method starts with maintaining a file index having a plurality of extent entries, each extent entry corresponding to one of a plurality of file extents stored in a cache memory device that caches data stored in a persistent storage device of a storage system. The method continues with maintaining a fingerprint index having a plurality of fingerprint entries, each mapping a fingerprint to a data region of a file indexed in the file index, wherein each fingerprint indexed in the fingerprint index is retrieved from metadata stored in the persistent storage device of the storage system when one or more corresponding data chunks were accessed, and deduplicating and accessing the file extents stored in the cache memory device using the file index and the fingerprint index.
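As an illustration only, the two indexes named in the abstract can be pictured as a pair of in-memory maps. The sketch below is not taken from the patent: the names `Region`, `ExtentEntry`, `CacheIndexes`, `add_extent`, `lookup_region`, and `lookup_fingerprint` are hypothetical, a data region is assumed to be a `(file id, offset, length)` tuple, and a cache location is assumed to be a plain integer address.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Assumption: a data region is identified by (file id, offset, length).
Region = Tuple[str, int, int]


@dataclass(frozen=True)
class ExtentEntry:
    """One file-index entry: a data region and where its extent sits in the cache."""
    region: Region
    cache_location: int  # assumed to be an address on the cache memory device


class CacheIndexes:
    """Holds a file index and a fingerprint index side by side."""

    def __init__(self) -> None:
        # File index: data region -> extent entry.
        self.file_index: Dict[Region, ExtentEntry] = {}
        # Fingerprint index: fingerprint -> a data region present in the file index.
        self.fingerprint_index: Dict[bytes, Region] = {}

    def add_extent(self, region: Region, cache_location: int, fingerprint: bytes) -> None:
        """Record a newly cached extent.

        The fingerprint is assumed to come from the storage system's metadata
        at the time the chunk was accessed; it is not recomputed here.
        """
        self.file_index[region] = ExtentEntry(region, cache_location)
        # Keep the first region seen for a fingerprint as the indexed copy.
        self.fingerprint_index.setdefault(fingerprint, region)

    def lookup_region(self, region: Region) -> Optional[ExtentEntry]:
        """Answer: has this exact region of this file been cached before?"""
        return self.file_index.get(region)

    def lookup_fingerprint(self, fingerprint: bytes) -> Optional[ExtentEntry]:
        """Answer: is content with this fingerprint cached under any region?"""
        region = self.fingerprint_index.get(fingerprint)
        return self.file_index.get(region) if region is not None else None
```

In this sketch the fingerprint index never points at cache storage directly; it resolves to a region, and the file index resolves that region to a cache location, mirroring the abstract's statement that each fingerprint maps to a data region of a file indexed in the file index.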
22 Claims
1. A computer-implemented method for indexing content stored in a cache memory device, the method comprising:
maintaining a file index having a plurality of extent entries, each extent entry corresponding to one of a plurality of file extents stored in a cache memory device that caches data stored in a persistent storage device of a deduplicated storage system, wherein each extent entry maps a particular data region of a particular file to a storage location of the cache memory device storing a corresponding file extent;
maintaining a fingerprint index having a plurality of fingerprint entries, each mapping a fingerprint to a data region of a file indexed in the file index, wherein each fingerprint indexed in the fingerprint index is retrieved from metadata stored in the persistent storage device of the storage system when one or more corresponding data chunks were accessed; and
deduplicating and accessing the file extents stored in the cache memory device using the file index and the fingerprint index, wherein the file index is used to determine whether the particular data region of the particular file has been previously stored in the cache memory device, and wherein the fingerprint index is used to determine whether the particular data region is shared by another data region of the particular file or shared by another file and the particular data region has been stored in the cache memory device during access of another data region of the particular file or another file.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A storage system, comprising:
one or more storage units to store a plurality of files;
a cache memory device to cache at least some data blocks of at least some of the files; and
a cache manager executed by the processor configured to
maintain a file index having a plurality of extent entries, each extent entry corresponding to one of a plurality of file extents stored in the cache memory device that caches data stored in the storage units, wherein each extent entry maps a particular data region of a particular data object to a storage location of the cache memory device storing a corresponding file extent,
maintain a fingerprint index having a plurality of fingerprint entries, each mapping a fingerprint to a data region of a file indexed in the file index, wherein each fingerprint indexed in the fingerprint index is retrieved from metadata stored in the storage units when one or more corresponding data chunks were accessed, and
deduplicate and access the file extents stored in the cache memory device using the file index and the fingerprint index, wherein the file index is used to determine whether the particular data region of the particular file has been previously stored in the cache memory device, and wherein the fingerprint index is used to determine whether the particular data region is shared by another data region of the particular file or shared by another file and the particular data region has been stored in the cache memory device during access of another data region of the particular file or another file.
Dependent claims: 12, 13, 14, 15, 16
17. A non-transitory computer-readable storage medium having instructions stored therein, which when executed by a processor, cause the processor to perform operations for indexing content stored in a cache memory device, the operations comprising:
maintaining a file index having a plurality of extent entries, each extent entry corresponding to one of a plurality of file extents stored in a cache memory device that caches data stored in a persistent storage device of a deduplicated storage system, wherein each extent entry maps a particular data region of a particular data object to a storage location of the cache memory device storing a corresponding file extent;
maintaining a fingerprint index having a plurality of fingerprint entries, each mapping a fingerprint to a data region of a file indexed in the file index, wherein each fingerprint indexed in the fingerprint index is retrieved from metadata stored in the persistent storage device of the storage system when one or more corresponding data chunks were accessed; and
deduplicating and accessing the file extents stored in the cache memory device using the file index and the fingerprint index, wherein the file index is used to determine whether the particular data region of the particular file has been previously stored in the cache memory device, and wherein the fingerprint index is used to determine whether the particular data region is shared by another data region of the particular file or shared by another file and the particular data region has been stored in the cache memory device during access of another data region of the particular file or another file.
Dependent claims: 18, 19, 20, 21, 22
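The "deduplicating and accessing" step recited in the independent claims distinguishes three situations on a read: the region is already in the file index, the region is new but its content is already cached under another region (found through the fingerprint index), or neither. The following is a rough sketch of that flow under stated assumptions, not the claimed implementation: `BackingStore`, `DedupCache`, and their methods are hypothetical names, regions are simplified to `(file name, offset)` pairs, the cache device is modeled as a Python list, and the fingerprint is assumed to be readable from storage metadata without reading the chunk data itself.

```python
from typing import Dict, List, Tuple

# Assumption: a region is (file name, offset of a fixed-size region).
Region = Tuple[str, int]


class BackingStore:
    """Stand-in for the deduplicated storage system (hypothetical)."""

    def __init__(self, chunks: Dict[Region, Tuple[bytes, bytes]]) -> None:
        # region -> (fingerprint held in storage metadata, chunk data)
        self._chunks = chunks
        self.data_reads = 0  # counts reads that had to fetch chunk data

    def fingerprint_of(self, region: Region) -> bytes:
        # Metadata lookup only; no chunk data is read.
        return self._chunks[region][0]

    def read_data(self, region: Region) -> bytes:
        self.data_reads += 1
        return self._chunks[region][1]


class DedupCache:
    """Cache front end that consults the file index, then the fingerprint index."""

    def __init__(self, store: BackingStore) -> None:
        self.store = store
        self.extents: List[bytes] = []            # cache device; list slot = storage location
        self.file_index: Dict[Region, int] = {}   # region -> cache slot
        self.fp_index: Dict[bytes, Region] = {}   # fingerprint -> indexed region

    def read(self, region: Region) -> Tuple[bytes, str]:
        # 1. File index: was this exact region cached before?
        slot = self.file_index.get(region)
        if slot is not None:
            return self.extents[slot], "file-index hit"

        # 2. Fingerprint index: is the same content cached under another region?
        fp = self.store.fingerprint_of(region)
        shared = self.fp_index.get(fp)
        if shared is not None:
            slot = self.file_index[shared]
            self.file_index[region] = slot  # reuse the existing extent, store nothing new
            return self.extents[slot], "fingerprint-index hit"

        # 3. Miss in both indexes: fetch the data and index the new extent.
        data = self.store.read_data(region)
        self.extents.append(data)
        slot = len(self.extents) - 1
        self.file_index[region] = slot
        self.fp_index[fp] = region
        return data, "miss"
```

For example, if `a.txt` and `b.txt` share a chunk, reading the region of `a.txt` first populates both indexes; a later read of the matching region of `b.txt` is then served from the cache through the fingerprint index, and only one copy of the extent occupies the cache. Eviction and partial-region overlap are left out of the sketch.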
Specification