Deduplication of Data on Disk Devices Based on a Threshold Number of Sequential Blocks
Abstract
Deduplication of data on disk devices based on a threshold number (THN) of sequential blocks is described herein, the threshold number being two or greater. Deduplication may be performed when a series of THN or more received blocks (THN series) matches a sequence of THN or more stored blocks (THN sequence), where a sequence comprises blocks stored on the same track of a disk device. Deduplication may be performed using a block-comparison mechanism comprising metadata entries of stored blocks and a mapping mechanism containing mappings of deduplicated blocks to their matching blocks. The mapping mechanism may be used to service later read requests received for the deduplicated blocks. The deduplication described herein may reduce read latency because the number of seeks between tracks may be reduced. Also, when a seek to a different track is performed, the seek time cost is spread over THN or more blocks.
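The matching rule in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `DedupLayer`, the `track_size` parameter, the use of SHA-256 fingerprints as metadata entries, and the single `mapping` dictionary (which here records the location of every written block, not only deduplicated ones) are all assumptions made for illustration. A run of received blocks is deduplicated only when it matches THN or more blocks stored in consecutive slots of the same track.

```python
import hashlib
from collections import defaultdict


def fingerprint(block: bytes) -> str:
    """Content fingerprint used as the metadata entry (assumption: SHA-256)."""
    return hashlib.sha256(block).hexdigest()


class DedupLayer:
    """Toy model: a disk is a list of tracks, a track is a list of blocks."""

    def __init__(self, thn: int = 2, track_size: int = 8):
        if thn < 2:
            raise ValueError("THN must be 2 or greater")
        self.thn = thn
        self.track_size = track_size          # hypothetical blocks per track
        self.tracks = [[]]                    # tracks[t][s] -> block data
        self.index = defaultdict(list)        # fingerprint -> [(track, slot)]
        self.mapping = {}                     # logical address -> (track, slot)

    def _store(self, block: bytes):
        """Append a new block, opening a new track when the last one is full."""
        if len(self.tracks[-1]) >= self.track_size:
            self.tracks.append([])
        t = len(self.tracks) - 1
        self.tracks[t].append(block)
        loc = (t, len(self.tracks[t]) - 1)
        self.index[fingerprint(block)].append(loc)
        return loc

    def _match_run(self, blocks, i):
        """Longest run starting at blocks[i] that matches consecutive slots
        on a single track (the run never crosses a track boundary)."""
        best = []
        for t, s in self.index.get(fingerprint(blocks[i]), []):
            run, k = [], 0
            while (i + k < len(blocks)
                   and s + k < len(self.tracks[t])
                   and self.tracks[t][s + k] == blocks[i + k]):
                run.append((t, s + k))
                k += 1
            if len(run) > len(best):
                best = run
        return best

    def write(self, start_addr: int, blocks) -> int:
        """Write blocks at consecutive logical addresses.
        Returns the number of blocks that were deduplicated."""
        i, deduped = 0, 0
        while i < len(blocks):
            run = self._match_run(blocks, i)
            if len(run) >= self.thn:
                # THN series found: map each block to its stored match.
                for k, loc in enumerate(run):
                    self.mapping[start_addr + i + k] = loc
                deduped += len(run)
                i += len(run)
            else:
                # Run too short: store the block as new data.
                self.mapping[start_addr + i] = self._store(blocks[i])
                i += 1
        return deduped

    def read(self, addr: int) -> bytes:
        """Serve a read through the mapping."""
        t, s = self.mapping[addr]
        return self.tracks[t][s]
```

In this sketch, writing blocks A, B, C, D and then A, B, X, D with THN = 2 deduplicates only A and B in the second write. The lone D matches a stored block but forms a run of one, so it is stored again rather than mapped to a location that would cost a separate seek.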
22 Claims
1. A storage system for deduplicating blocks of data based on a predetermined threshold number (THN) of sequential blocks, the storage system comprising:
a set of one or more disk devices for storing a plurality of blocks, each disk device comprising a set of tracks for storing blocks; and
a deduplication layer configured for:
receiving a set of blocks;
determining whether a series of THN or more received blocks (THN series) matches a sequence of THN or more stored blocks (THN sequence), a series of blocks comprising a set of consecutive blocks and a sequence of blocks comprising a series of blocks stored on a same track of a disk device, THN having a value of 2 or greater; and
upon determining that a matching THN sequence is found, deduplicating the blocks of the THN series using the matching THN sequence.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A storage system for deduplicating blocks of data for storage based on a predetermined threshold number (THN) of blocks, the storage system comprising:
a set of one or more disk devices for storing a plurality of blocks, each block having an address location; and
a deduplication layer configured for:
receiving a set of blocks;
determining whether a series of THN or more received blocks (THN series) match a sequence of THN or more stored blocks (THN sequence), a series of blocks comprising a set of consecutive blocks and a sequence of blocks comprising blocks having consecutive address locations, THN having a value of 2 or greater; and
upon determining that a matching THN sequence is found, deduplicating the blocks of the THN series.
Dependent claims: 13, 14, 15, 16, 17, 18
19. A storage system for deduplicating blocks of data based on a predetermined threshold number (THN) of sequential blocks, the storage system comprising:
a set of one or more disk devices for storing a plurality of blocks, each disk device comprising a set of tracks for storing blocks;
a deduplication layer configured for:
receiving a set of blocks;
using a comparison mechanism, determining whether a series of THN or more received blocks (THN series) matches a sequence of THN or more stored blocks (THN sequence), a series of blocks comprising a set of consecutive blocks and a sequence of blocks comprising a series of blocks stored on a same track of a disk device, THN having a value of 2 or greater; and
upon determining that a matching THN sequence is found, deduplicating the blocks of the THN series using the matching THN sequence; and
the comparison mechanism for storing metadata entries for a plurality of THN sequences comprising full or partial THN sequences, a partial THN sequence comprising a subset of THN or more blocks of a full THN sequence, a set of zero or more partial THN sequences being derived from each full THN sequence, each partial THN sequence in the set having a different combination of block size and offset from the beginning of the full THN sequence.
Dependent claims: 20, 21, 22
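The derivation of partial THN sequences recited in claim 19 can be sketched as follows. This is one possible reading, not the claimed implementation: the function names `partial_thn_sequences`, `sequence_digest`, and `metadata_entries` are hypothetical, and the choice of a SHA-256 digest over per-block digests as the content of a metadata entry is an assumption. Each partial sequence is identified by a distinct (offset, size) pair, with size at least THN and the full sequence itself excluded.

```python
import hashlib


def partial_thn_sequences(full_len: int, thn: int):
    """Enumerate (offset, size) pairs for every contiguous sub-run of a full
    sequence that has at least `thn` blocks and is shorter than the full run."""
    if thn < 2:
        raise ValueError("THN must be 2 or greater")
    return [(offset, size)
            for size in range(thn, full_len)            # size < full_len
            for offset in range(0, full_len - size + 1)]


def sequence_digest(blocks) -> str:
    """Digest of a run of blocks (assumption: SHA-256 over per-block digests,
    so block boundaries are part of what is hashed)."""
    h = hashlib.sha256()
    for block in blocks:
        h.update(hashlib.sha256(block).digest())
    return h.hexdigest()


def metadata_entries(full_sequence, thn: int):
    """Build digest -> (offset, size) entries for the full sequence and each
    partial sequence derived from it. Sub-runs with identical content share
    one digest, so the later (offset, size) overwrites the earlier one."""
    n = len(full_sequence)
    entries = {}
    if n >= thn:
        entries[sequence_digest(full_sequence)] = (0, n)
    for offset, size in partial_thn_sequences(n, thn):
        run = full_sequence[offset:offset + size]
        entries[sequence_digest(run)] = (offset, size)
    return entries
```

Under these assumptions, a full sequence of four distinct blocks with THN = 2 yields five partial sequences, (0, 2), (1, 2), (2, 2), (0, 3), and (1, 3), while a full sequence of exactly THN blocks yields none, which is consistent with "a set of zero or more partial THN sequences". Looking up the digest of a received run then gives the offset at which it matches inside the stored full sequence.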
Specification