Extent hashing technique for distributed storage architecture
Abstract
In one embodiment, a technique is provided for distributing data and associated metadata within a distributed storage architecture. A set of hash tables that embody mappings of cluster-wide identifiers associated with storage locations are stored for write data of write requests organized into extents. A hash value is generated from a hash function applied to each extent. The hash value is overloaded and used for multiple purposes within the distributed storage architecture, including (i) a remainder computation on the hash value to select a bucket of a plurality of buckets representative of the extents, (ii) a hash table selector of the hash value to select a hash table from the set of hash tables, and (iii) a hash table index computed from the hash value to select an entry from a plurality of entries of the selected hash table having a cluster-wide identifier identifying a storage location for the extent.
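The three uses of the single hash value described above can be illustrated with a minimal sketch. The abstract does not name the hash function, the hash width, the number of buckets or tables, or which bits serve as selector and index. SHA-256 truncated to 64 bits, the constants, and the names `extent_hash`, `split_hash`, `record_location`, and `lookup_location` are therefore hypothetical choices made only for illustration.

```python
import hashlib
from typing import List, NamedTuple, Optional

# All sizes below are illustrative assumptions, not values from the patent.
HASH_BITS = 64        # width of the hash value kept from the digest
NUM_BUCKETS = 1021    # number of buckets for the remainder computation
SELECTOR_BITS = 3     # 2**3 = 8 hash tables in the set
INDEX_BITS = 10       # 2**10 = 1024 entries per hash table


class HashFields(NamedTuple):
    bucket: int   # (i) remainder of the hash value
    table: int    # (ii) hash table selector
    index: int    # (iii) hash table index


def extent_hash(extent: bytes) -> int:
    """Hash an extent to a 64-bit integer (SHA-256 is an assumed stand-in)."""
    digest = hashlib.sha256(extent).digest()
    return int.from_bytes(digest[:HASH_BITS // 8], "big")


def split_hash(h: int) -> HashFields:
    """Derive all three values from the same hash value."""
    bucket = h % NUM_BUCKETS                   # remainder computation
    table = h >> (HASH_BITS - SELECTOR_BITS)   # assumed: top bits select the table
    index = h & ((1 << INDEX_BITS) - 1)        # assumed: low bits index the table
    return HashFields(bucket, table, index)


# The set of hash tables; each entry holds a cluster-wide identifier
# (modelled here as an int) for the storage location of an extent.
tables: List[List[Optional[int]]] = [
    [None] * (1 << INDEX_BITS) for _ in range(1 << SELECTOR_BITS)
]


def record_location(extent: bytes, location_id: int) -> HashFields:
    """Store the identifier in the entry chosen by the extent's hash value."""
    fields = split_hash(extent_hash(extent))
    tables[fields.table][fields.index] = location_id  # collisions overwrite here
    return fields


def lookup_location(extent: bytes) -> Optional[int]:
    """Return the identifier stored for the extent, or None if absent."""
    fields = split_hash(extent_hash(extent))
    return tables[fields.table][fields.index]
```

Calling `record_location(data, 42)` followed by `lookup_location(data)` returns `42`, and the returned `bucket` field is the value a cluster could use to assign the extent to one of its buckets. Collision handling within an entry is omitted from this sketch.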
20 Claims
1. A system comprising:
a central processing unit (CPU) of a node of a cluster having additional nodes, each node coupled to one or more solid state drives (SSDs); and
a memory coupled to the CPU and configured to store a set of hash tables embodying mappings of cluster-wide identifiers associated with storage locations on the SSDs for write data of write requests organized into extents, the memory further configured to store a storage input/output (I/O) stack having a plurality of layers that cooperate with components of the additional nodes to provide a distributed storage architecture of the cluster, the layers of the storage I/O stack implemented as one or more processes executable by the CPU to:
generate a hash value from a hash function applied to each extent; and
overload the hash value for multiple purposes within the distributed storage architecture, including (i) a remainder computation on the hash value to select a bucket of a plurality of buckets representative of the extents, (ii) a hash table selector of the hash value to select a hash table from the set of hash tables, and (iii) a hash table index computed from the hash value to select an entry from a plurality of entries of the selected hash table having a cluster-wide identifier identifying a SSD storage location for an extent.
Dependent claims: 2, 3, 4, 5, 6, 7.
8. A method comprising:
storing a set of hash tables that embody mappings of cluster-wide identifiers associated with storage locations on one or more solid state drives (SSDs) of a distributed storage architecture for write data of write requests organized into extents;
generating a hash value from a hash function applied to each extent; and
overloading the hash value for multiple purposes within the distributed storage architecture, including (i) a remainder computation on the hash value to select a bucket of a plurality of buckets representative of the extents, (ii) a hash table selector of the hash value to select a hash table from the set of hash tables, and (iii) a hash table index computed from the hash value to select an entry from a plurality of entries of the selected hash table having a cluster-wide identifier identifying a SSD storage location for an extent.
Dependent claims: 9, 10, 11, 12, 13, 14.
15. A non-transitory computer readable medium including program instructions for execution on one or more processors, the program instructions when executed operable to:
store a set of hash tables that embody mappings of cluster-wide identifiers associated with storage locations on one or more solid state drives (SSDs) of a distributed storage architecture for write data of write requests organized into extents;
generate a hash value from a hash function applied to each extent; and
overload the hash value for multiple purposes within the distributed storage architecture, including (i) a remainder computation on the hash value to select a bucket of a plurality of buckets representative of the extents, (ii) a hash table selector of the hash value to select a hash table from the set of hash tables, and (iii) a hash table index computed from the hash value to select an entry from a plurality of entries of the selected hash table having a cluster-wide identifier identifying a SSD storage location for an extent.
Dependent claims: 16, 17, 18, 19, 20.
Specification