Extent hashing technique for distributed storage architecture
First Claim
1. A method comprising:
- organizing write data of write requests into one or more extents, wherein each extent is a block of data that provides a unit of storage on one or more storage devices of a cluster, the write requests processed at a node of the cluster, the cluster having a plurality of nodes;
applying a hash function to each extent to generate a hash value;
dividing a hash space of the hash value into a plurality of buckets representative of the one or more extents and associated extent metadata, a number of the buckets being less than a number of values in the hash space; and
assigning the buckets to extent store instances of the nodes in the cluster based on results of a remainder computation, wherein the remainder computation divides a remainder of the hash value of each extent by the number of buckets using modulo arithmetic, and the results of the remainder calculation are bucket numbers that operate as indexes in a bucket mapping data structure having bucket number entries, wherein each bucket number entry maps to an extent store instance.
0 Assignments
0 Petitions
Accused Products
Abstract
In one embodiment, an extent hashing technique is used to efficiently distribute data and associated metadata substantially evenly among nodes of a cluster. The data may be write data associated with a write request issued by a host and received at a node of the cluster. The write data may be organized into one or more extents. A hash function may be applied to the extent to generate a result which may be truncated or trimmed to generate a hash value. A hash space of the hash value may be divided into a plurality of buckets representative of the write data, i.e., the extents, and the associated metadata, i.e., extent metadata. A number of buckets may be assigned to each extent store instance of the nodes to distribute ownership of the buckets, along with their extents and extent metadata, across all of the extent store instances of the nodes.
-
Citations
19 Claims
-
1. A method comprising:
-
organizing write data of write requests into one or more extents, wherein each extent is a block of data that provides a unit of storage on one or more storage devices of a cluster, the write requests processed at a node of the cluster, the cluster having a plurality of nodes; applying a hash function to each extent to generate a hash value; dividing a hash space of the hash value into a plurality of buckets representative of the one or more extents and associated extent metadata, a number of the buckets being less than a number of values in the hash space; and assigning the buckets to extent store instances of the nodes in the cluster based on results of a remainder computation, wherein the remainder computation divides a remainder of the hash value of each extent by the number of buckets using modulo arithmetic, and the results of the remainder calculation are bucket numbers that operate as indexes in a bucket mapping data structure having bucket number entries, wherein each bucket number entry maps to an extent store instance. - View Dependent Claims (2, 3, 4, 5, 6)
-
-
7. A method comprising:
-
forming an extent at a node of a cluster having a plurality of nodes, wherein the extent is a block of data that provides a unit of storage on one or more storage devices of the cluster; applying a hash function to the extent to generate a hash value; processing the hash value using a remainder computation to generate a bucket number of a bucket that includes a portion of a hash space of the hash function, wherein the remainder computation divides a remainder of the hash value by a number of buckets using modulo arithmetic; and indexing into a selected entry of a bucket mapping data structure having bucket number entries using the bucket number, each bucket number entry mapping to an extent store instance, the indexing to identify an extent store instance of the node that serves the extent. - View Dependent Claims (8, 9, 10, 11, 12, 13)
-
-
14. A system comprising:
-
a central processing unit (CPU) of a node of a cluster having a plurality of nodes; and a memory coupled to the CPU and configured to store one or more processes executable by the CPU, the one or more processes when executed operable to; organize write data of write requests into one or more extents, an extent being a block of data that provides a unit of storage on one or more storage devices of the cluster; apply a hash function to each extent to generate a hash value; divide a hash space of the hash value into a plurality of buckets representative of the one or more extents and associated extent metadata, a number of the buckets being less than a number of values in the hash space; and assign the buckets to extent store instances of the nodes in the cluster to thereby distribute ownership of the buckets across the extent store instances of the nodes, the assignment of buckets based on a remainder computation that divides a remainder of the hash value of each extent by the number of buckets using modulo arithmetic to generate bucket numbers, the bucket numbers used as indexes in a bucket mapping data structure having bucket number entries, wherein each bucket number entry maps to an extent store instance. - View Dependent Claims (15, 16)
-
-
17. A method comprising:
-
organizing write data of write requests into one or more extents, wherein each extent is a block of data that provides a unit of storage on one or more storage devices of a cluster, the write requests processed at a node of the cluster, the cluster having a plurality of nodes; applying a hash function to each extent to generate a hash value; dividing a hash space of the hash value into a plurality of buckets, a number of the buckets being less than a number of values in the hash space; assigning the buckets to extent store instances of the nodes in the cluster by performing a remainder computation, wherein the remainder computation divides a remainder of the hash value of each extent by the number of buckets using modulo arithmetic to generate bucket numbers, and using the bucket numbers as indexes in a bucket mapping data structure having bucket number entries, wherein each bucket number entry maps to an extent store instance. - View Dependent Claims (18, 19)
-
Specification