Accelerated deduplication
First Claim
1. A system comprising:
- a processor configured to maintain a state machine providing a scatter-gather vector, the scatter-gather vector designating a plurality of memory areas for chunk boundary identification and chunk fingerprinting for a data stream, wherein the processor does not perform deduplication; and
- a deduplication accelerator configured to access the memory areas designated in the scatter-gather vector to delineate a respective chunk boundary in a single pipeline stage and determine a respective chunk fingerprint for each of a plurality of data chunks in the same pipeline stage, the deduplication accelerator being a specially configured hardware accelerator separate from the processor and specifically configured for deduplication.
Abstract
Mechanisms are provided for accelerated data deduplication. A data stream is received at an input interface and maintained in memory. Chunk boundaries are detected and chunk fingerprints are calculated using a deduplication accelerator while a processor maintains a state machine. A deduplication dictionary is accessed using a chunk fingerprint to determine whether the associated data chunk has previously been written to persistent memory. If the data chunk has previously been written, reference counts may be updated but the data chunk need not be stored again. Otherwise, datastore suitcases, filemaps, and the deduplication dictionary may be updated to reflect storage of the data chunk. Direct memory access (DMA) addresses are provided to directly transfer a chunk to an output interface as needed.
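The dictionary lookup described in the abstract can be illustrated with a minimal software sketch. It is not the patented implementation: the class name DedupStore, the in-memory list standing in for a datastore suitcase, the list of locations standing in for a filemap, and the choice of SHA-1 as the fingerprint are all assumptions made for illustration.

```python
import hashlib


class DedupStore:
    """Toy model of a deduplication dictionary (hypothetical names throughout)."""

    def __init__(self):
        # fingerprint -> [location in datastore, reference count]
        self.dictionary = {}
        # Stand-in for persistent storage (a "datastore suitcase" in the abstract).
        self.datastore = []

    def write_chunk(self, chunk: bytes) -> int:
        """Store a chunk only if its fingerprint is new; return its location."""
        fingerprint = hashlib.sha1(chunk).digest()  # assumed fingerprint function
        entry = self.dictionary.get(fingerprint)
        if entry is not None:
            # Chunk was written before: bump the reference count, store nothing.
            entry[1] += 1
            return entry[0]
        # New chunk: store it and record it in the dictionary.
        self.datastore.append(chunk)
        location = len(self.datastore) - 1
        self.dictionary[fingerprint] = [location, 1]
        return location


def write_file(store: DedupStore, chunks) -> list:
    """Return a filemap: one datastore location per chunk, in stream order."""
    return [store.write_chunk(chunk) for chunk in chunks]


store = DedupStore()
filemap = write_file(store, [b"aaa", b"bbb", b"aaa"])
print(filemap)               # locations of the three chunks
print(len(store.datastore))  # number of chunks actually stored
```

In this sketch the repeated chunk is stored once and referenced twice, which mirrors the abstract's statement that reference counts may be updated without storing the chunk again.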
55 Citations
20 Claims
1. A system comprising:
- a processor configured to maintain a state machine providing a scatter-gather vector, the scatter-gather vector designating a plurality of memory areas for chunk boundary identification and chunk fingerprinting for a data stream, wherein the processor does not perform deduplication; and
- a deduplication accelerator configured to access the memory areas designated in the scatter-gather vector to delineate a respective chunk boundary in a single pipeline stage and determine a respective chunk fingerprint for each of a plurality of data chunks in the same pipeline stage, the deduplication accelerator being a specially configured hardware accelerator separate from the processor and specifically configured for deduplication.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A method comprising:
- maintaining, by a processor, a state machine providing a scatter-gather vector, the scatter-gather vector designating a plurality of memory areas for chunk boundary identification and chunk fingerprinting for a data stream, wherein the processor does not perform deduplication; and
- accessing the memory areas designated in the scatter-gather vector with a deduplication accelerator to delineate a respective chunk boundary in a single pipeline stage and determine a respective chunk fingerprint for each of a plurality of data chunks in the same pipeline stage, the deduplication accelerator being a specially configured hardware accelerator separate from the processor and specifically configured for deduplication.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. One or more computer readable media having instructions stored thereon for performing a method, the method comprising:
- maintaining, by a processor, a state machine providing a scatter-gather vector, the scatter-gather vector designating a plurality of memory areas for chunk boundary identification and chunk fingerprinting for a data stream, wherein the processor does not perform deduplication; and
- accessing the memory areas designated in the scatter-gather vector with a deduplication accelerator to delineate a respective chunk boundary in a single pipeline stage and determine a respective chunk fingerprint for each of a plurality of data chunks in the same pipeline stage, the deduplication accelerator being a specially configured hardware accelerator separate from the processor and specifically configured for deduplication.
Dependent claims: 20
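The independent claims share one idea: memory areas named in a scatter-gather vector are read once, and chunk boundaries and chunk fingerprints are produced together. The following is a software model of that idea only, not the claimed hardware. The (buffer, offset, length) entry layout, the gear-style rolling hash used for boundary detection, SHA-1 as the fingerprint, and all constants are assumptions chosen for illustration.

```python
import hashlib
from typing import List, Tuple

# Hypothetical scatter-gather entry: (buffer, offset, length).
SGEntry = Tuple[bytes, int, int]

# Illustrative parameters, not taken from the patent.
MASK = (1 << 6) - 1   # boundary when the low 6 bits of the rolling hash are zero
MIN_CHUNK = 32
MAX_CHUNK = 256

# Deterministic 256-entry table for a gear-style rolling hash (assumed technique).
GEAR = [int.from_bytes(hashlib.sha1(bytes([i])).digest()[:4], "big")
        for i in range(256)]


def chunk_and_fingerprint(sg_vector: List[SGEntry]) -> List[Tuple[int, str]]:
    """Walk the scatter-gather vector once, emitting (chunk size, fingerprint).

    Boundary detection and fingerprinting advance together, byte by byte,
    so a chunk's fingerprint is ready the moment its boundary is found.
    """
    results = []
    rolling, size, digest = 0, 0, hashlib.sha1()
    for buf, offset, length in sg_vector:
        for byte in buf[offset:offset + length]:
            digest.update(bytes([byte]))                        # fingerprint step
            rolling = ((rolling << 1) + GEAR[byte]) & 0xFFFFFFFF  # boundary step
            size += 1
            at_boundary = size >= MIN_CHUNK and (rolling & MASK) == 0
            if at_boundary or size >= MAX_CHUNK:
                results.append((size, digest.hexdigest()))
                rolling, size, digest = 0, 0, hashlib.sha1()
    if size:
        # Trailing bytes that never hit a boundary form the last chunk.
        results.append((size, digest.hexdigest()))
    return results


data = bytes((i * 7 + 3) % 251 for i in range(2000))
# The same stream described as one memory area or as two gives the same chunks.
whole = chunk_and_fingerprint([(data, 0, len(data))])
split = chunk_and_fingerprint([(data, 0, 700), (data, 700, 1300)])
print(whole == split)
```

Because the chunking state carries across scatter-gather entries, the chunk boundaries depend only on the stream contents and not on how the stream is laid out in memory. That is the property the sketch is meant to show.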
Specification