Enhancing data processing performance by cache management of fingerprint index
First Claim
1. A method for improving hash index key lookup caching performance in a computing environment by a processor, comprising:
- for a cached fingerprint map having a plurality of entries corresponding to a plurality of data fingerprints used by a data deduplication system, using reference count information obtained from a data deduplication engine to determine a length of time to retain the plurality of entries of the fingerprint map in cache, by:
examining the reference count information of the plurality of entries of the fingerprint map in the cache and a storage policy related to the plurality of entries to establish a retention duration for the plurality of entries, and
querying whether the reference count information for a data segment has been incremented, the incremented reference count information indicating a frequency of times the data segment has been accessed, or whether a predetermined time interval has expired;
if the reference count information for a data segment has not been incremented or if the predetermined time interval has not expired, reiterating the step of querying;
if the reference count information for a data segment has been incremented or if the predetermined time interval has expired in which no physical activity has been observed on a physical block, re-determining a new appropriate duration of retention in the cache; and
when the cache is full, retaining in the cache the plurality of entries of the fingerprint map having higher reference counts and removing from the cache the plurality of entries of the fingerprint map having lower reference counts.
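The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `FingerprintCache`, the linear retention rule, and the parameters `base_ttl`, `ttl_per_ref`, and `poll_interval` are all assumptions chosen for illustration, standing in for the storage policy and the predetermined time interval.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    ref_count: int       # reference count reported by the deduplication engine
    retain_until: float  # time until which the entry should stay cached
    checked_at: float    # time the retention duration was last determined


class FingerprintCache:
    """Hypothetical fingerprint map with reference-count-based retention."""

    def __init__(self, capacity, base_ttl=60.0, ttl_per_ref=10.0, poll_interval=30.0):
        self.capacity = capacity
        self.base_ttl = base_ttl            # assumed storage-policy parameter
        self.ttl_per_ref = ttl_per_ref      # assumed storage-policy parameter
        self.poll_interval = poll_interval  # assumed "predetermined time interval"
        self.entries = {}

    def _retention(self, ref_count):
        # Assumed rule: a higher reference count earns a longer retention.
        return self.base_ttl + self.ttl_per_ref * ref_count

    def put(self, fingerprint, ref_count, now):
        self.entries[fingerprint] = Entry(
            ref_count, now + self._retention(ref_count), now
        )
        # When the cache is full, drop the entry with the lowest reference count.
        if len(self.entries) > self.capacity:
            victim = min(self.entries, key=lambda fp: self.entries[fp].ref_count)
            del self.entries[victim]

    def poll(self, fingerprint, engine_ref_count, now):
        """Re-determine retention if the count rose or the interval expired."""
        entry = self.entries[fingerprint]
        incremented = engine_ref_count > entry.ref_count
        expired = now - entry.checked_at >= self.poll_interval
        if not (incremented or expired):
            return False  # nothing to do; the caller queries again later
        entry.ref_count = engine_ref_count
        entry.retain_until = now + self._retention(engine_ref_count)
        entry.checked_at = now
        return True


cache = FingerprintCache(capacity=2)
cache.put("fp_a", ref_count=5, now=0.0)
cache.put("fp_b", ref_count=1, now=0.0)
cache.put("fp_c", ref_count=3, now=0.0)  # cache is full: "fp_b" is removed
print(sorted(cache.entries))             # ['fp_a', 'fp_c']
```

In this sketch the caller supplies the clock value `now` explicitly, which keeps the behaviour deterministic; a real system would presumably read the time and the reference counts from the deduplication engine itself.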
Abstract
Various embodiments for improving hash index key lookup caching performance in a computing environment are provided. In one embodiment, for a cached fingerprint map having a plurality of entries corresponding to a plurality of data fingerprints, reference count information is used to determine a length of time to retain the plurality of entries in cache. Those of the plurality of entries having higher reference counts are retained longer than those having lower reference counts.
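As a small illustration of "retained longer", the following Python sketch maps a reference count to a retention window and lists which entries are still within their window at a given time. The linear rule and the values of `base`, `per_ref`, and `ceiling` are assumptions, not figures from the patent.

```python
def retention_seconds(ref_count, base=60.0, per_ref=30.0, ceiling=3600.0):
    """Assumed rule: retention grows linearly with reference count, up to a ceiling."""
    return min(ceiling, base + per_ref * ref_count)


def surviving(ref_counts, cached_at, now):
    """Return the fingerprints whose retention window has not yet elapsed."""
    return [
        fp
        for fp, count in ref_counts.items()
        if now < cached_at + retention_seconds(count)
    ]


# Hypothetical fingerprints: one heavily referenced, one referenced once.
ref_counts = {"fp_hot": 8, "fp_cold": 1}
print(surviving(ref_counts, cached_at=0.0, now=120.0))  # ['fp_hot']
```

With these assumed parameters the heavily referenced entry is kept for 300 seconds and the lightly referenced one for 90 seconds, so only the former remains at the 120-second mark.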
5 Claims
Specification