Consolidating data in storage host groupings
Abstract
A data storage service distributes a plurality of data fragments corresponding to a data object among one or more data storage host groupings in a manner that avoids a possibility of correlated loss of multiple data fragments by consolidation of data of a data storage host grouping onto a single data storage host. The data storage service selects a data storage host grouping and determines an amount of used capacity for the selected data storage host grouping. If the selected grouping satisfies an emptiness threshold, the data storage service selects a data storage host from the grouping and consolidates one or more data sets of the grouping onto the selected data storage host. Subsequently, the data storage service updates metadata for each data storage host of the selected data storage host grouping to specify a location of data stored therein.
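The consolidation flow described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the `Host` class, the `consolidate_grouping` function, the reading of the emptiness threshold as a maximum used fraction, and the policy of choosing the fullest host as the target are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Host:
    """Hypothetical model of a data storage host."""
    name: str
    capacity: int  # total capacity, arbitrary units
    data_sets: Dict[str, int] = field(default_factory=dict)  # data set id -> size
    metadata: Dict[str, str] = field(default_factory=dict)   # data set id -> host name

    @property
    def used(self) -> int:
        return sum(self.data_sets.values())


def consolidate_grouping(hosts: List[Host], emptiness_threshold: float) -> Optional[Host]:
    """Consolidate a grouping onto one host if the grouping is empty enough.

    Returns the host that now holds all data, or None if nothing was moved.
    """
    total_used = sum(h.used for h in hosts)
    total_capacity = sum(h.capacity for h in hosts)
    # Assumption: the grouping "satisfies the emptiness threshold" when its
    # used fraction is at or below the threshold.
    if total_capacity == 0 or total_used / total_capacity > emptiness_threshold:
        return None
    # Assumption: pick the host already holding the most data, so less data moves.
    target = max(hosts, key=lambda h: h.used)
    if total_used > target.capacity:
        return None
    for host in hosts:
        if host is not target:
            target.data_sets.update(host.data_sets)
            host.data_sets.clear()
    # Update metadata on every host of the grouping to record where data now lives.
    locations = {data_set: target.name for data_set in target.data_sets}
    for host in hosts:
        host.metadata = dict(locations)
    return target
```

For example, a grouping of two hosts with capacity 100 each, holding 10 and 5 units, has a used fraction of 0.075 and would be consolidated under a threshold of 0.25; a grouping holding 90 and 80 units would be left alone.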
22 Claims
1. A computer-implemented method comprising:
determining a plurality of data storage host pairs having a respective first and second storage host corresponding to a respective first and second physical storage location, the plurality of data storage host pairs determined based at least in part on physical characteristics of the respective first and second physical storage locations;
for each data object of a plurality of data objects, using a redundancy encoding scheme to generate a plurality of data fragments;
distributing the plurality of data fragments among at least a subset of the plurality of data storage host pairs according to one or more placement rules based at least in part on the physical characteristics of the respective first and second physical storage locations, wherein distribution of the plurality of data fragments is performed to avoid a possibility of correlated loss of multiple data fragments of the plurality of data fragments by consolidation of data of a data storage host pair onto a single data storage host;
selecting a data storage host pair from the plurality of data storage host pairs;
selecting a data storage host from the selected data storage host pair;
consolidating data from the selected data storage host pair onto the selected data storage host; and
updating metadata for each data storage host of the selected data storage host pair to specify a location of data stored therein.
Dependent claims: 2, 3, 4, 5, 6
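One way to picture the pairing and placement steps of claim 1 is the sketch below. It assumes, purely for illustration, that the relevant physical characteristic is a shared rack and that the placement rule is "at most one fragment of an object per host pair"; the claim itself does not fix either choice. The function names `pair_hosts_by_rack` and `place_fragments` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def pair_hosts_by_rack(hosts: Dict[str, str]) -> List[Tuple[str, str]]:
    """Pair hosts that share a rack. `hosts` maps host name -> rack id.

    Assumption: two hosts in the same rack already share a failure domain,
    so merging them onto one host does not reduce fragment independence.
    """
    by_rack: Dict[str, List[str]] = defaultdict(list)
    for host, rack in sorted(hosts.items()):
        by_rack[rack].append(host)
    pairs: List[Tuple[str, str]] = []
    for rack in sorted(by_rack):
        members = by_rack[rack]
        # Pair neighbouring hosts; an odd host out stays unpaired.
        for i in range(0, len(members) - 1, 2):
            pairs.append((members[i], members[i + 1]))
    return pairs


def place_fragments(fragments: List[str], pairs: List[Tuple[str, str]]) -> Dict[str, str]:
    """Place each fragment of ONE data object on a distinct host pair.

    Because no pair ever holds two fragments of the same object, consolidating
    a pair onto a single host cannot co-locate two of that object's fragments.
    """
    if len(fragments) > len(pairs):
        raise ValueError("not enough host pairs to keep fragments independent")
    placement: Dict[str, str] = {}
    for index, (fragment, pair) in enumerate(zip(fragments, pairs)):
        # Alternate between the first and second host of each pair.
        placement[fragment] = pair[index % 2]
    return placement
```

With five hosts spread over three racks (two, two, and one), this yields two pairs, so an object split into two fragments can be placed but one split into three cannot.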
7. A system, comprising at least one computing device configured to implement one or more services, wherein the one or more services are configured to:
for each data object of a plurality of data objects that has been redundancy encoded to generate a plurality of data fragments, distribute the plurality of data fragments among a plurality of data storage host groupings according to one or more placement rules based at least in part on the physical characteristics of a respective plurality of physical storage locations that define the plurality of data storage host groupings, such that consolidation of data of a data storage host grouping to fewer data storage hosts maintains compliance with one or more conditions for independent failure of the plurality of data fragments of the plurality of data objects;
select a data storage host grouping from a plurality of data storage host groupings, the plurality of data storage host groupings storing a plurality of data fragments, each data fragment of the plurality of data fragments;
determine to consolidate data of the data storage host grouping onto fewer data storage hosts;
consolidate the data of the data storage host grouping onto fewer data storage hosts; and
remove at least one data storage host from the selected data storage host grouping.
Dependent claims: 8, 9, 10, 11, 12, 13, 14
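The consolidate-and-remove steps of claim 7 could be sketched as below. The independence condition used here (no more than `max_per_object` fragments of any one object in the grouping being merged) is an assumed stand-in for the claim's "one or more conditions for independent failure", and `consolidate_onto_fewer` is a hypothetical name.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

Fragment = Tuple[str, int]  # (object id, fragment index)


def consolidate_onto_fewer(
    grouping: Dict[str, Set[Fragment]],
    keep: str,
    max_per_object: int = 1,
) -> List[str]:
    """Merge every host's fragments onto `keep` and drop the other hosts.

    Returns the names of the hosts removed from the grouping. Raises
    ValueError, leaving the grouping untouched, if the merge would break
    the assumed independence condition.
    """
    if keep not in grouping:
        raise KeyError(keep)
    merged: Set[Fragment] = set().union(*grouping.values())
    # Assumed condition: a single host may hold at most `max_per_object`
    # fragments of any one data object.
    per_object = Counter(obj for obj, _ in merged)
    if any(count > max_per_object for count in per_object.values()):
        raise ValueError("consolidation would co-locate fragments of one object")
    removed = [host for host in grouping if host != keep]
    grouping[keep] = merged
    for host in removed:
        del grouping[host]  # remove the emptied host from the grouping
    return removed
```

Checking the condition before moving anything keeps the grouping unchanged when consolidation is refused, which is the behaviour the placement rules are meant to make unnecessary in the first place.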
15. A non-transitory computer-readable storage medium having stored thereon executable instructions that, if executed by one or more processors of a computer system, cause the computer system to at least:
for each data object of a plurality of data objects that has been redundancy encoded to generate a plurality of data fragments, distribute the plurality of data fragments among a plurality of data storage host groupings according to one or more placement rules based at least in part on the physical characteristics of a respective plurality of physical storage locations that define the plurality of data storage host groupings, such that consolidation of data of a data storage host grouping to fewer data storage hosts maintains compliance with one or more conditions for independent failure of the plurality of data fragments of the plurality of data objects;
select, from the plurality of data storage host groupings, a data storage host grouping;
select fewer data storage hosts than a number of data storage hosts comprising the selected data storage host grouping;
consolidate the data of the selected data storage host grouping onto the fewer data storage hosts; and
remove at least one data storage host from the selected data storage host grouping.
Dependent claims: 16, 17, 18, 19, 20, 21, 22
Specification