Apparatus and method for inline compression and deduplication
First Claim
1. An apparatus comprising:
a memory unit for storing data streams; and
a processor coupled to said memory unit, the processor configured to perform a compression operation and a deduplication operation in a single pass, said processor configured to:
use a subset of data from a data stream to generate a reference data block corresponding to said subset of data;
compute a first hash value for said subset of data and a second hash value for said reference block using a same function;
compare the first hash value computed for said subset of data to the second hash value computed for said reference data block, wherein said first hash value and said second hash value are stored in separate hash tables;
generate a compressed and deduplicated representation of said subset of data by at least modifying header data corresponding to said subset of data responsive to a detected match between said first hash value and said second hash value, wherein said compressed representation is generated using said reference data block responsive to the detection of the match between said first hash value and said second hash value, wherein said separate hash tables comprises a reference hash table and said second hash value is stored in said reference hash table; and
initiate decompression procedures upon storing said reference data block in a memory buffer and upon generation of said compressed representation.
Abstract
An apparatus for inline compression and deduplication includes a memory unit and a processor coupled to the memory unit. The processor is configured to receive a subset of data from a data stream and select a reference data block corresponding to the subset of data, in which the reference data block is stored in a memory buffer resident in the memory unit. The processor is also configured to compare a first hash value computed for the subset of data to a second hash value computed for the reference data block, in which the first hash value and the second hash value are stored in separate hash tables and generate a compressed representation of the subset of data by modifying header data corresponding to the subset of data responsive to a detected match between the first hash value and the second hash value in one of the separate hash tables.
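The flow described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the fixed 8-byte block size, the choice of SHA-256, the dictionary-based header records with a `kind` field, and the names `block_hash`, `reduce_stream`, `reference_table`, and `stream_table` are all hypothetical. The source specifies only that both hash values are computed with the same function, that they are kept in separate hash tables, and that a match leads to a header modification.

```python
import hashlib

BLOCK_SIZE = 8  # assumed block size, chosen only for illustration


def block_hash(data):
    """The one hash function applied to both stream subsets and reference blocks."""
    return hashlib.sha256(data).hexdigest()


def reduce_stream(stream, reference_blocks):
    """Single pass over `stream`; returns (header records, stream hash table)."""
    # Reference hash table: hash of a reference block -> its index in the buffer.
    reference_table = {}
    for index, block in enumerate(reference_blocks):
        reference_table.setdefault(block_hash(block), index)

    # Separate hash table for the hashes of subsets read from the stream.
    stream_table = {}
    records = []

    for offset in range(0, len(stream), BLOCK_SIZE):
        subset = stream[offset:offset + BLOCK_SIZE]
        digest = block_hash(subset)
        stream_table.setdefault(digest, offset)

        ref_index = reference_table.get(digest)
        # Byte comparison guards against a hash collision (an added safeguard,
        # not something the source describes).
        if ref_index is not None and reference_blocks[ref_index] == subset:
            # Match: only the header changes; the payload is not stored again.
            records.append({"kind": "ref", "ref_index": ref_index,
                            "length": len(subset)})
        else:
            records.append({"kind": "raw", "length": len(subset),
                            "payload": subset})
    return records, stream_table


if __name__ == "__main__":
    refs = [b"AAAAAAAA", b"BBBBBBBB"]
    recs, _ = reduce_stream(b"AAAAAAAACCCCCCCCBBBBBBBB", refs)
    for rec in recs:
        print(rec)
```

In this sketch a matched block costs only a small header record, while an unmatched block carries its payload unchanged. A real system would presumably apply a further compression step to unmatched payloads, which is omitted here.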
8 Claims
1. An apparatus comprising:
a memory unit for storing data streams; and
a processor coupled to said memory unit, the processor configured to perform a compression operation and a deduplication operation in a single pass, said processor configured to:
use a subset of data from a data stream to generate a reference data block corresponding to said subset of data;
compute a first hash value for said subset of data and a second hash value for said reference block using a same function;
compare the first hash value computed for said subset of data to the second hash value computed for said reference data block, wherein said first hash value and said second hash value are stored in separate hash tables;
generate a compressed and deduplicated representation of said subset of data by at least modifying header data corresponding to said subset of data responsive to a detected match between said first hash value and said second hash value, wherein said compressed representation is generated using said reference data block responsive to the detection of the match between said first hash value and said second hash value, wherein said separate hash tables comprises a reference hash table and said second hash value is stored in said reference hash table; and
initiate decompression procedures upon storing said reference data block in a memory buffer and upon generation of said compressed representation.
Dependent claims: 2, 3, 4
5. A computer-implemented method of performing data reduction operations on an input data stream during a single pass, said method comprising:
receiving a subset of data from a data stream;
selecting a reference data block corresponding to said subset of data, wherein said reference data block is stored in a memory buffer;
comparing a first hash value computed for said subset of data to a second hash value computed for said reference data block, wherein said first hash value and said second hash value are stored in separate hash tables, wherein said first hash value and said second hash value are computed using a same function;
generating a compressed representation of said subset of data by at least modifying header data corresponding to said subset of data responsive to a detected match between said first hash value and said second hash value, wherein said compressed representation is generated using said reference data block responsive to the detected match between said first hash value and said second hash value, wherein said separate hash tables comprises a reference hash table and said second hash value is stored in said reference table; and
initiating decompression procedures upon storing said reference data block in said memory buffer and upon generation of said compressed representation.
Dependent claims: 6, 7, 8
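The final step of the method, decompression once the reference data block is in the memory buffer and the compressed representation exists, can be illustrated as follows. This is a sketch under assumptions rather than the claimed procedure: the record layout (a `kind` field of `"ref"` or `"raw"`, plus `ref_index`, `length`, and `payload`), the list used as the memory buffer, and the name `restore_stream` are hypothetical and chosen only to make the idea concrete.

```python
def restore_stream(records, reference_buffer):
    """Rebuild the original bytes from header records and a reference buffer.

    `records` is a list of dicts in an assumed, illustrative layout:
      {"kind": "ref", "ref_index": int, "length": int}   -> copy from the buffer
      {"kind": "raw", "length": int, "payload": bytes}   -> copy the payload
    """
    out = bytearray()
    for header in records:
        if header["kind"] == "ref":
            # The header points at a reference block already held in memory.
            block = reference_buffer[header["ref_index"]]
            out += block[:header["length"]]
        elif header["kind"] == "raw":
            out += header["payload"][:header["length"]]
        else:
            raise ValueError("unknown record kind: %r" % header["kind"])
    return bytes(out)


if __name__ == "__main__":
    reference_buffer = [b"AAAAAAAA", b"BBBBBBBB"]
    records = [
        {"kind": "ref", "ref_index": 0, "length": 8},
        {"kind": "raw", "length": 8, "payload": b"CCCCCCCC"},
        {"kind": "ref", "ref_index": 1, "length": 8},
    ]
    print(restore_stream(records, reference_buffer))
```

Because a `ref` record can only be resolved when its reference block is available, the sketch reflects why the claim ties the start of decompression to the reference block being stored in the memory buffer.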
Specification