Methods and apparatus for optimizing resource utilization in distributed storage systems
First Claim
1. A distributed storage system, comprising:
a plurality of storage units each coupled to a network, wherein the plurality of storage units collectively store data for a plurality of clients, wherein the plurality of clients each read previously stored data from and store new data to the plurality of storage units via the network;
one or more devices coupled to the network configured to:
detect addition of one or more new storage units to the network;
in response to said detecting the addition of the one or more new storage units to the network, block new data from being stored to the one or more new storage units;
in response to said detecting the addition of the one or more new storage units to the network, migrate previously stored data units from the data collectively stored on the plurality of storage units to the one or more new storage units until determining that a storage load on each of the one or more new storage units is at a target level; and
in response to said determining that the storage load on each of the one or more new storage units is at the target level and after said blocking new data from being stored to the one or more new storage units, allow new data to be stored to the one or more new storage units.
Abstract
Methods and apparatus for optimizing resource utilization in distributed storage systems. A data migration technique is described that may operate in the background in a distributed storage data center to migrate data among a fleet of storage units to achieve a substantially even and randomized data storage distribution among all storage units in the fleet. When new storage units are added to the fleet and coupled to the data center network, the new storage units are detected. Instead of processing and storing new data to the newly added storage units, as in conventional distributed storage systems, the new units are blocked from general client I/O to allow the data migration technique to migrate data from other, previously installed storage hardware in the data center onto the new storage hardware. Once the storage load on the new storage units is balanced with the rest of the fleet, the new storage units are released for general client I/O.
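The block-migrate-release sequence described in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the `StorageUnit` and `Fleet` classes, the `accepting_new_data` flag, and the choice of the fleet-wide average fill ratio as the target level are all hypothetical, and source units and data units are picked uniformly at random only to mirror the "randomized" distribution the abstract mentions.

```python
import random
from dataclasses import dataclass, field


@dataclass
class StorageUnit:
    name: str
    capacity: int                       # maximum number of data units held
    data: set = field(default_factory=set)
    accepting_new_data: bool = True     # False while the unit is blocked

    @property
    def load(self) -> float:
        """Fraction of capacity in use."""
        return len(self.data) / self.capacity


class Fleet:
    """Hypothetical fleet manager; all names here are assumptions."""

    def __init__(self, units, seed=0):
        self.units = list(units)
        self.rng = random.Random(seed)

    def add_unit(self, unit):
        # On detecting a new unit, block new client data from landing on it.
        unit.accepting_new_data = False
        self.units.append(unit)

    def store(self, item):
        # New client data goes only to unblocked units that have free space.
        candidates = [u for u in self.units
                      if u.accepting_new_data and len(u.data) < u.capacity]
        target = self.rng.choice(candidates)
        target.data.add(item)
        return target

    def target_load(self) -> float:
        # Assumed target level: the fleet-wide average fill ratio.
        stored = sum(len(u.data) for u in self.units)
        capacity = sum(u.capacity for u in self.units)
        return stored / capacity

    def fill_new_unit(self, new_unit):
        # Migrate previously stored data units until the new unit reaches
        # the target level, then release it for general client I/O.
        target = self.target_load()
        while new_unit.load < target:
            sources = [u for u in self.units
                       if u.accepting_new_data and u.data]
            source = self.rng.choice(sources)
            item = self.rng.choice(sorted(source.data))
            source.data.remove(item)
            new_unit.data.add(item)
        new_unit.accepting_new_data = True
```

In this sketch, a caller would invoke `add_unit` when a new unit is detected, keep serving client writes through `store` while `fill_new_unit` runs, and rely on the flag flip at the end of `fill_new_unit` to readmit the unit. A real system would run the migration in the background and would likely weight source selection by load; both are omitted here for brevity.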
20 Claims
1. A distributed storage system, comprising:
a plurality of storage units each coupled to a network, wherein the plurality of storage units collectively store data for a plurality of clients, wherein the plurality of clients each read previously stored data from and store new data to the plurality of storage units via the network;
one or more devices coupled to the network configured to:
detect addition of one or more new storage units to the network;
in response to said detecting the addition of the one or more new storage units to the network, block new data from being stored to the one or more new storage units;
in response to said detecting the addition of the one or more new storage units to the network, migrate previously stored data units from the data collectively stored on the plurality of storage units to the one or more new storage units until determining that a storage load on each of the one or more new storage units is at a target level; and
in response to said determining that the storage load on each of the one or more new storage units is at the target level and after said blocking new data from being stored to the one or more new storage units, allow new data to be stored to the one or more new storage units.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A method, comprising:
detecting, by one or more computing devices coupled to a network, one or more new storage units added to a plurality of storage units coupled to the network, wherein the plurality of storage units collectively store data, wherein a plurality of clients each read previously stored data from and store new data to the plurality of storage units via the network;
in response to said detecting the one or more new storage units, blocking new data from being stored to the one or more new storage units;
in response to said detecting the one or more new storage units, migrating previously stored data units from the data collectively stored on the plurality of storage units to the one or more new storage units until determining that a storage load on each of the one or more new storage units is at a target level; and
in response to said determining that the storage load on each of the one or more new storage units is at the target level and after said blocking new data from being stored to the one or more new storage units, allowing new data to be stored to the one or more new storage units.
Dependent claims: 9, 10, 11, 12, 13, 14, 15
16. A non-transitory computer-accessible storage medium storing program instructions that, when executed on one or more processors, cause the one or more processors to implement:
detecting one or more new storage units added to a plurality of storage units coupled to a network, wherein the plurality of storage units collectively store data, wherein a plurality of clients each read previously stored data from and store new data to the plurality of storage units via the network;
in response to said detecting the one or more new storage units, blocking new data from being stored to the one or more new storage units;
in response to said detecting the one or more new storage units, migrating previously stored data units from the data collectively stored on the plurality of storage units to the one or more new storage units until determining that a storage load on each of the one or more new storage units is at a target level; and
allowing new data to be stored to the one or more new storage units subsequent to said determining that the storage load on each of the one or more new storage units is at the target level and after said preventing new data from being stored to the one or more new storage units.
Dependent claims: 17, 18, 19, 20
Specification