Methods and apparatus for optimizing resource utilization in distributed storage systems
Abstract
Methods and apparatus for optimizing resource utilization in distributed storage systems. A data migration technique is described that may operate in the background in a distributed storage data center to migrate data among a fleet of storage units to achieve a substantially even and randomized data storage distribution among all storage units in the fleet. When new storage units are added to the fleet and coupled to the data center network, the new storage units are detected. Instead of processing and storing new data to the newly added storage units, as in conventional distributed storage systems, the new units are blocked from general client I/O to allow the data migration technique to migrate data from other, previously installed storage hardware in the data center onto the new storage hardware. Once the storage load on the new storage units is balanced with the rest of the fleet, the new storage units are released for general client I/O.
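The gating behavior described in the abstract can be illustrated with a minimal sketch. It assumes a simple in-memory model; the names `StorageUnit`, `add_new_unit`, `release_if_balanced`, and the `tolerance` parameter are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass


@dataclass
class StorageUnit:
    name: str
    capacity: int                   # total space, in arbitrary units (assumed field)
    used: int = 0                   # space currently occupied
    accepts_client_io: bool = True  # False while the unit is blocked from client I/O

    @property
    def utilization(self) -> float:
        return self.used / self.capacity


def fleet_utilization(fleet):
    """Aggregate utilization: total used space over total capacity."""
    return sum(u.used for u in fleet) / sum(u.capacity for u in fleet)


def add_new_unit(fleet, unit):
    """Attach a newly detected unit, but keep it closed to general client I/O."""
    unit.accepts_client_io = False
    fleet.append(unit)


def release_if_balanced(fleet, unit, tolerance=0.05):
    """Open the unit to client I/O once its utilization is within
    `tolerance` of the fleet aggregate (an assumed balance criterion)."""
    if unit.utilization >= fleet_utilization(fleet) - tolerance:
        unit.accepts_client_io = True
    return unit.accepts_client_io
```

In this sketch a background migration process would fill the new unit from older hardware and call `release_if_balanced` periodically; how "balanced" is actually decided is not stated in the abstract, so the tolerance check is only one plausible reading.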
20 Claims
-
1. A distributed storage system, comprising:
a plurality of storage units configured for access by a plurality of clients and each coupled to a network, wherein the plurality of storage units collectively store data for the plurality of clients;
at least one hardware processor and associated memory coupled to the network that implement a distributed storage control system configured to manage data storage across the plurality of storage units, wherein to manage the data storage the distributed storage control system is configured to:
track storage space utilization among the plurality of storage units, including an aggregate storage space utilization for the plurality of storage units;
based at least in part on the tracked storage space utilization, select, from among the plurality of storage units, one or more source storage units and one or more destination storage units, wherein the storage space utilization of the one or more source storage units is higher than the aggregate storage space utilization, and wherein the storage space utilization of the one or more destination storage units is lower than the aggregate storage space utilization;
determine previously stored data on the one or more source storage units to migrate to the one or more destination storage units according to at least the tracked storage space utilization; and
migrate the determined previously stored data from the one or more selected source storage units to the one or more selected destination storage units, resulting in the storage space utilization across the plurality of storage units being more evenly balanced.
-
2. The distributed storage system as recited in claim 1, wherein the distributed storage control system is configured to perform said track, said select, said determine and said migrate as a background process while general client I/O traffic is performed at the plurality of storage units for the plurality of clients to read previously stored data from and store new data to the plurality of storage units via the network.
-
3. The distributed storage system as recited in claim 1, wherein to track storage space utilization among the plurality of storage units, the distributed storage control system is configured to track both storage space utilization on individual storage units and the aggregate storage space utilization across the plurality of storage units.
-
4. The distributed storage system as recited in claim 3, wherein to perform said select one or more source storage units and one or more destination storage units, the distributed storage control system is configured to compare the storage space utilization on individual storage units to an aggregate target based on the aggregate storage space utilization.
-
5. The distributed storage system as recited in claim 1, wherein to perform said migrate the determined previously stored data, the distributed storage control system is configured to migrate data from a plurality of selected source storage units to one destination storage unit.
-
6. The distributed storage system as recited in claim 1, wherein the policy further specifies one or more caps or thresholds on how much network bandwidth or processing capacity is allowed for migrating stored data from selected source storage units to destination storage units.
-
7. The distributed storage system as recited in claim 1, wherein the policy further specifies one or more data type criteria to avoid overloading a particular storage unit with a particular type of data based on data object size or activity level.
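The track, select, determine, and migrate steps recited in claim 1 (with the per-unit comparison of claims 3 and 4) can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the `Unit` class, the largest-object-first ordering, and the choice of the emptiest eligible destination are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    name: str
    capacity: int
    objects: dict = field(default_factory=dict)  # object id -> size (assumed model)

    @property
    def used(self) -> int:
        return sum(self.objects.values())

    @property
    def utilization(self) -> float:
        return self.used / self.capacity


def aggregate_utilization(units):
    """Track: aggregate utilization across all units."""
    return sum(u.used for u in units) / sum(u.capacity for u in units)


def select_units(units):
    """Select: sources sit above the aggregate, destinations below it."""
    target = aggregate_utilization(units)
    sources = [u for u in units if u.utilization > target]
    destinations = [u for u in units if u.utilization < target]
    return target, sources, destinations


def rebalance_once(units):
    """Determine and migrate: one pass that moves objects off over-target units."""
    target, sources, destinations = select_units(units)
    moves = []
    for src in sorted(sources, key=lambda u: u.utilization, reverse=True):
        # Snapshot the objects (largest first) so popping during the loop is safe.
        for obj_id, size in sorted(src.objects.items(), key=lambda kv: kv[1], reverse=True):
            if src.utilization <= target:
                break  # this source is no longer above the aggregate
            # Only destinations that stay at or below the target after the move.
            candidates = [d for d in destinations
                          if (d.used + size) / d.capacity <= target]
            if not candidates:
                continue  # object too large for any destination; try a smaller one
            dst = min(candidates, key=lambda d: d.utilization)
            dst.objects[obj_id] = src.objects.pop(obj_id)
            moves.append((obj_id, src.name, dst.name))
    return moves
```

Because several sources may pick the same emptiest destination, a single pass can move data from a plurality of source units to one destination unit, as in claim 5.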
-
8. A method, comprising:
performing, by one or more computing devices:
managing data storage across a plurality of storage units configured for access by a plurality of clients and coupled to a same network as the one or more computing devices, wherein the plurality of storage units collectively store data for the plurality of clients, and wherein said managing data storage access comprises:
tracking storage space utilization among the plurality of storage units, including an aggregate storage space utilization for the plurality of storage units;
based at least in part on the tracked storage space utilization, selecting, from among the plurality of storage units, one or more source storage units and one or more destination storage units, wherein the storage space utilization of the one or more source storage units is higher than the aggregate storage space utilization, and wherein the storage space utilization of the one or more destination storage units is lower than the aggregate storage space utilization;
determining previously stored data on the one or more source storage units to migrate to the one or more destination storage units according to at least the tracked storage space utilization; and
migrating the determined previously stored data from the one or more selected source storage units to the one or more selected destination storage units, resulting in the storage space utilization across the plurality of storage units being more evenly balanced.
-
9. The method of claim 8, wherein said tracking, said selecting, said determining and said migrating are performed as part of a background process while general client I/O traffic is performed at the plurality of storage units for the plurality of clients to read previously stored data from and store new data to the plurality of storage units via the network.
-
10. The method of claim 8, wherein said tracking storage space utilization among the plurality of storage units comprises tracking both storage space utilization on individual storage units and the aggregate storage space utilization across the plurality of storage units.
-
11. The method of claim 10, wherein said selecting one or more source storage units and one or more destination storage units comprises comparing the storage space utilization on individual storage units to an aggregate target based on the aggregate storage space utilization.
-
12. The method of claim 8, wherein said migrating the determined previously stored data comprises migrating data from a plurality of selected source storage units to one destination storage unit.
-
13. The method of claim 8, wherein the policy further specifies one or more caps or thresholds on how much network bandwidth or processing capacity is allowed for migrating stored data from selected source storage units to destination storage units.
-
14. The method of claim 8, wherein data items are replicated or erasure coded among the plurality of storage units for redundancy, wherein said determining previously stored data to migrate avoids migrating redundancy data for a data item from a source storage unit to a destination storage unit already storing redundancy data for that data item.
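The redundancy constraint of claim 14 can be illustrated with a short sketch: a replica or erasure-coded shard is not moved to a unit that already stores redundancy data for the same item. The `placement` index and both function names are hypothetical, introduced only for this example.

```python
def eligible_destinations(item_id, destinations, placement):
    """Return the destinations that hold no redundancy data for item_id.

    `placement` maps an item id to the set of unit names that already store
    a replica or shard of that item (an assumed index structure).
    """
    holders = placement.get(item_id, set())
    return [d for d in destinations if d not in holders]


def plan_migration(item_id, source, destinations, placement):
    """Pick a destination for one replica or shard of item_id.

    Returns the chosen unit name, or None when every candidate already
    stores redundancy data for the item, in which case nothing moves.
    """
    candidates = eligible_destinations(item_id, destinations, placement)
    if not candidates:
        return None
    dst = candidates[0]
    # Keep the index current: the shard leaves `source` and lands on `dst`.
    holders = placement.setdefault(item_id, set())
    holders.discard(source)
    holders.add(dst)
    return dst
```

Taking the first eligible candidate is an arbitrary choice for brevity; a fuller planner would presumably rank the eligible units by utilization first.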
-
15. A non-transitory, computer-readable storage medium, storing program instructions that when executed by one or more computing devices cause the one or more computing devices to implement a distributed storage control system that implements:
managing data storage across a plurality of storage units configured for access by a plurality of clients and coupled to a same network as the distributed storage control system, wherein the plurality of storage units collectively store data for the plurality of clients, and wherein the managing data storage access comprises:
tracking storage space utilization among the plurality of storage units, including an aggregate storage space utilization for the plurality of storage units;
based at least in part on the tracked storage space utilization, selecting, from among the plurality of storage units, one or more source storage units and one or more destination storage units, wherein the storage space utilization of the one or more source storage units is higher than the aggregate storage space utilization, and wherein the storage space utilization of the one or more destination storage units is lower than the aggregate storage space utilization;
determining previously stored data on the one or more source storage units to migrate to the one or more destination storage units according to at least the tracked storage space utilization; and
migrating the determined previously stored data from the one or more selected source storage units to the one or more selected destination storage units, resulting in the storage space utilization across the plurality of storage units being more evenly balanced.
-
16. The non-transitory, computer-readable storage medium of claim 15, wherein said tracking, said selecting, said determining and said migrating are performed as part of a background process while general client I/O traffic is performed at the plurality of storage units for the plurality of clients to read previously stored data from and store new data to the plurality of storage units via the network.
-
17. The non-transitory, computer-readable storage medium of claim 15, wherein, in said tracking storage space utilization among the plurality of storage units, the program instructions cause the distributed storage control system to implement tracking both storage space utilization on individual storage units and the aggregate storage space utilization across the plurality of storage units.
-
18. The non-transitory, computer-readable storage medium of claim 17, wherein, in said selecting one or more source storage units and one or more destination storage units, the program instructions cause the distributed storage control system to implement comparing the storage space utilization on individual storage units to an aggregate target based on the aggregate storage space utilization.
-
19. The non-transitory, computer-readable storage medium of claim 15, wherein, in said migrating the determined previously stored data, the program instructions cause the distributed storage control system to implement migrating data from a plurality of selected source storage units to one destination storage unit.
-
20. The non-transitory, computer-readable storage medium of claim 15, wherein, in said selecting one or more source storage units, the program instructions cause the distributed storage control system to implement applying an N-choices technique that randomly selects a subset of the plurality of storage units and then selects the one or more source storage units from among the storage units of the selected subset according to one or more selection criteria specified in the policy.
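The N-choices technique named in claim 20 can be sketched as a random sample followed by a ranked pick within that sample. The function name, its parameters, and the use of highest utilization as the selection criterion are assumptions; the claims leave the criteria to a policy that is not detailed in this section.

```python
import random


def n_choices_select(utilization, n, count=1, rng=None):
    """Randomly sample up to n units, then keep the `count` most utilized of them.

    `utilization` maps a unit name to its storage space utilization (0.0 to 1.0).
    Ranking by highest utilization is an assumed criterion for choosing sources.
    """
    rng = rng or random.Random()
    names = sorted(utilization)  # stable order so a seeded rng is reproducible
    subset = rng.sample(names, min(n, len(names)))
    ranked = sorted(subset, key=lambda name: utilization[name], reverse=True)
    return ranked[:count]
```

Sampling a subset first avoids scanning and ranking the whole fleet on every decision and keeps many concurrent selectors from all converging on the same single unit, at the cost of sometimes missing the globally fullest one.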
Specification