Using double hashing schema to reduce short hash handle collisions and improve memory allocation in content-addressable storage systems
Abstract
Example embodiments of the present invention relate to a method and an apparatus for double hashing. The method includes receiving a hash signature, including a short hash handle, for a data block. The method then includes determining a bucket with which the hash signature should be associated and associating the hash signature with the bucket.
21 Claims
1. A method for content addressable storage of data blocks in a distributed data storage system comprising:
receiving a hash signature, including a short hash handle, for a data block, wherein the hash signature corresponds to a physical address of the data block within the content addressable storage, wherein the hash signature comprises a first identification of a first bucket and a second identification of a second bucket;
selecting one of the first bucket and the second bucket based at least in part on the first identification and the second identification as a bucket with which the hash signature should be associated; and
associating the hash signature with the selected one of the first bucket and the second bucket, wherein associating the hash signature includes modifying a bit of the short hash handle to provide an indication of which one of the first identification of the first bucket and the second identification of the second bucket in the hash signature corresponds to the selected one of the first bucket and the second bucket.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
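The three steps of claim 1 (receive, select, associate with a modified handle bit) can be pictured with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent text: the use of SHA-1, the byte offsets used for the handle and the two bucket identifications, the table size, the "less loaded bucket wins" selection policy, the choice of the lowest handle bit as the indicator, and all function and constant names.

```python
import hashlib

NUM_BUCKETS = 1024   # hypothetical table size
HANDLE_BYTES = 4     # hypothetical short hash handle width
CHOICE_BIT = 0x01    # hypothetical: lowest handle bit records the choice


def bucket_ids(signature: bytes) -> tuple[int, int]:
    """Read two candidate bucket identifications out of the signature."""
    first = int.from_bytes(signature[4:8], "big") % NUM_BUCKETS
    second = int.from_bytes(signature[8:12], "big") % NUM_BUCKETS
    return first, second


def associate(signature: bytes, buckets: list[list[int]]) -> int:
    """Store the short handle in one of its two buckets and return it."""
    first, second = bucket_ids(signature)
    # Assumed policy: prefer the less loaded bucket; ties go to the first.
    use_second = len(buckets[second]) < len(buckets[first])
    handle = int.from_bytes(signature[:HANDLE_BYTES], "big")
    # Modify one bit of the handle so it records which identification was used.
    handle = (handle | CHOICE_BIT) if use_second else (handle & ~CHOICE_BIT)
    buckets[second if use_second else first].append(handle)
    return handle


def selected_bucket(signature: bytes, handle: int) -> int:
    """Recover the chosen bucket from the indicator bit in the handle."""
    first, second = bucket_ids(signature)
    return second if handle & CHOICE_BIT else first


buckets = [[] for _ in range(NUM_BUCKETS)]
sig = hashlib.sha1(b"example data block").digest()
handle = associate(sig, buckets)
assert handle in buckets[selected_bucket(sig, handle)]
```

In this sketch the indicator bit lets a later reader of the handle pick the right one of the two bucket identifications without probing both buckets; whether the claimed system uses the bit in exactly this way is not stated in the claim.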
10. A system for content addressable storage of data blocks in a distributed data storage system comprising:
one or more processors; and
memory storing computer program code that when executed on the one or more processors causes the system to perform the operations of:
receiving a hash signature, including a short hash handle, for a data block, wherein the hash signature corresponds to a physical address of the data block within the content addressable storage, wherein the hash signature comprises a first identification of a first bucket and a second identification of a second bucket;
selecting one of the first bucket and the second bucket based at least in part on the first identification and the second identification as a bucket with which the hash signature should be associated; and
associating the hash signature with the selected one of the first bucket and the second bucket, wherein associating the hash signature includes modifying a bit of the short hash handle to provide an indication of which one of the first identification of the first bucket and the second identification of the second bucket in the hash signature corresponds to the selected one of the first bucket and the second bucket.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. A computer program product for content addressable storage of data blocks in a distributed data storage system including a non-transitory computer readable storage medium having computer program code thereon that when executed on a processor of a computer causes the computer to perform double hashing, the computer program code comprising:
computer program code for receiving a hash signature, including a short hash handle, for a data block, wherein the hash signature corresponds to a physical address of the data block within the content addressable storage, wherein the hash signature comprises a first identification of a first bucket and a second identification of a second bucket;
computer program code for selecting one of the first bucket and the second bucket based at least in part on the first identification and the second identification as a bucket with which the hash signature should be associated; and
computer program code for associating the hash signature with the selected one of the first bucket and the second bucket, wherein associating the hash signature includes modifying a bit of the short hash handle to provide an indication of which one of the first identification of the first bucket and the second identification of the second bucket in the hash signature corresponds to the selected one of the first bucket and the second bucket.
Dependent claims: 20, 21
Specification