Method and apparatus for detecting the presence of subblocks in a reduced-redundancy storage system
First Claim
1. A method for representing the presence of a subblock in a storage system, comprising:
- hashing a subblock to obtain an index hash value for the subblock;
creating an array of bits using a function that inputs the index hash value for the subblock and outputs bits into the array of bits and maps the subblock to a position in the array of bits, where a bit in the array of bits is a predetermined bit value that indicates whether at least one subblock in the storage system maps to the bit position in the array of bits;
selectively storing the index hash value in a subblock index located on the storage system if the subblock is absent from the storage system;
selectively storing the index hash value in a subblock index entry write buffer located in memory if the subblock is absent from the storage system, where the index entry write buffer is divided into a plurality of buffer portions corresponding to a portion of the subblock index located on the storage system; and
transferring the contents of the index entry write buffer to the subblock index located on the storage system using a single sequential read and write operation of the storage system.
11 Assignments
0 Petitions
Accused Products
Abstract
Method and apparatus for rapidly determining whether a particular subblock of data is present in a reduced-redundancy storage system. An aspect of the invention achieves this by hashing each subblock in the storage system into a bitfilter that contains a ‘1’ bit for each position to which at least one subblock hashes. This bitfilter provides an extremely fast way to determine whether a subblock is in the storage system. In a further aspect of the invention, index entries for new subblocks may be buffered in a subblock index write buffer so as to convert a large number of random access read and write operations into a single sequential read and a single sequential write operation. The combination of the bitfilter and the write buffer yields a reduced-redundancy storage system that uses significantly less high speed random access memory than is used by systems that store the entire subblock index in memory.
43 Citations
18 Claims
-
1. A method for representing the presence of a subblock in a storage system, comprising:
-
hashing a subblock to obtain an index hash value for the subblock; creating an array of bits using a function that inputs the index hash value for the subblock and outputs bits into the array of bits and maps the subblock to a position in the array of bits, where a bit in the array of bits is a predetermined bit value that indicates whether at least one subblock in the storage system maps to the bit position in the array of bits; selectively storing the index hash value in a subblock index located on the storage system if the subblock is absent from the storage system; selectively storing the index hash value in a subblock index entry write buffer located in memory if the subblock is absent from the storage system, where the index entry write buffer is divided into a plurality of buffer portions corresponding to a portion of the subblock index located on the storage system; and transferring the contents of the index entry write buffer to the subblock index located on the storage system using a single sequential read and write operation of the storage system. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18)
-
Specification