DEDUPLICATING STORAGE WITH ENHANCED FREQUENT-BLOCK DETECTION
1 Assignment
0 Petitions
Abstract
Detecting data duplication comprises maintaining a fingerprint directory including one or more entries, each entry including a data fingerprint and a data location for a data chunk. Each entry is associated with a seen-count attribute which is an indication of how often the fingerprint has been seen in arriving data chunks. Higher-frequency entries in the directory are retained, while also taking into account recency of data accesses. A data duplication detector detects that the data fingerprint for a new chunk is the same as the data fingerprint contained in an entry in the fingerprint directory.
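The scheme summarized in the abstract can be illustrated with a minimal sketch. The text here does not say how frequency and recency are combined, so the eviction rule below (drop the entry with the lowest seen-count, breaking ties by the oldest last arrival) is only an assumption, as are the names FingerprintDirectory, Entry, observe, the fixed capacity, and the use of SHA-256 as the fingerprint.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Entry:
    location: int    # where the stored copy of the chunk lives
    seen_count: int  # how often this fingerprint has been seen in arriving chunks
    last_seen: int   # logical clock value of the most recent arrival


class FingerprintDirectory:
    """Hypothetical bounded directory mapping fingerprints to entries."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.entries = {}  # fingerprint -> Entry
        self.clock = 0     # incremented once per arriving chunk

    @staticmethod
    def fingerprint(chunk: bytes) -> str:
        # Assumption: SHA-256 stands in for whatever fingerprint is used.
        return hashlib.sha256(chunk).hexdigest()

    def observe(self, chunk: bytes, location: int):
        """Return the location of an existing copy if the chunk is a
        duplicate; otherwise record the chunk and return None."""
        self.clock += 1
        fp = self.fingerprint(chunk)
        entry = self.entries.get(fp)
        if entry is not None:
            # Duplicate detected: update frequency and recency.
            entry.seen_count += 1
            entry.last_seen = self.clock
            return entry.location
        if len(self.entries) >= self.capacity:
            self._evict()
        self.entries[fp] = Entry(location, 1, self.clock)
        return None

    def _evict(self):
        # Assumed policy: retain higher-frequency entries; among entries
        # with equal seen-counts, drop the least recently seen one.
        victim = min(
            self.entries,
            key=lambda fp: (self.entries[fp].seen_count,
                            self.entries[fp].last_seen),
        )
        del self.entries[victim]
```

With a capacity of two, a chunk that arrives twice survives the arrival of two further distinct chunks, while a chunk seen only once is the one dropped. Other ways of weighing recency, such as periodically aging the seen-counts, would fit the same description equally well.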
35 Citations
20 Claims
1. A method for detecting data duplication, comprising:
maintaining a fingerprint directory comprising one or more entries, each entry including a data fingerprint and a data location for a data chunk;
associating each said entry with a seen-count attribute which is an indication of how often the fingerprint has been seen in arriving data chunks;
retaining higher-frequency entries, while also taking into account recency of data accesses; and
detecting that the data fingerprint for a new chunk is the same as the data fingerprint contained in an entry in the fingerprint directory.
Dependent claims: 2, 3, 4, 5, 6, 7, 8.
9. A computer program product for detecting data duplication, the computer program product comprising:
a tangible storage medium readable by a computer system and storing instructions for execution by the computer system for performing a method comprising:
maintaining a fingerprint directory comprising one or more entries, each entry including a data fingerprint and a data location for a data chunk;
associating each said entry with a seen-count attribute which is an indication of how often the fingerprint has been seen in arriving data chunks;
retaining higher-frequency entries, while also taking into account recency of data accesses; and
detecting that the data fingerprint for a new chunk is the same as the data fingerprint contained in an entry in the fingerprint directory.
Dependent claims: 10, 11, 12.
13. A system for detecting data duplication, comprising:
a fingerprint controller that maintains a fingerprint directory comprising one or more entries, each entry including a data fingerprint and a data location for a data chunk in a storage device;
wherein each entry is associated with a seen-count attribute which is an indication of how often the fingerprint has been seen in arriving data chunks, and
wherein the fingerprint controller retains higher-frequency entries, while also taking into account recency of data accesses; and
a duplicate detector that detects if the data fingerprint for a new chunk is the same as the data fingerprint contained in an entry in the fingerprint directory.
Dependent claims: 14, 15, 16, 17, 18, 19, 20.
Specification