Partition management in a partitioned, scalable, and available structured storage
First Claim
1. A method implemented by one or more computing devices within a structured storage system in which structured storage is represented by one or more tables, each table including a plurality of rows, each row representing a data entity stored by the structured storage system and including one or more keys for identifying the row, the plurality of rows being divided among a plurality of partitions, each partition including a range of rows from the plurality of rows within the table, wherein the plurality of partitions are stored on a plurality of table servers, and wherein a table master controls partition assignment to the plurality of table servers, the method for splitting a partition assigned to a table server of the plurality of table servers into child partitions and comprising:
- identifying the partition for the splitting of the partition based on load information for the partition, wherein the load information includes information specifying an amount of load tracked on the partition from actual read and write requests made on each of two or more portions of the partition;
determining, by the table master, a split ratio for the splitting of the partition based on a distribution of the amount of the load tracked on the partition such that the split ratio specifies a load distribution that represents a point in the partition in which a first portion of the partition includes a first portion of the amount of load tracked across the partition and a second portion of the partition includes a second portion of the amount of load tracked across the partition;
providing a request to the table server for key information indicating an actual location within the partition for the splitting that corresponds to the load distribution specified by the split ratio, the request includes the split ratio;
receiving the key information at the table master from the table server, the key information indicating the actual location within the partition;
sending a split request from the table master to the table server, the split request indicating to the table server to perform the splitting of the partition based on the key information;
performing the splitting of the partition at a location corresponding to the actual location to create the child partitions;
notifying the table master that the splitting has completed; and
updating a partition map based on the partition being split into the child partitions, the partition map storing mappings between the plurality of partitions and the plurality of table servers serving the plurality of partitions.
3 Assignments
0 Petitions
Accused Products
Abstract
Partition management for a scalable, structured storage system is provided. The storage system provides storage represented by one or more tables, each of which includes rows that represent data entities. A table is partitioned into a number of partitions, each partition including a contiguous range of rows. The partitions are served by table servers and managed by a table master. Load distribution information for the table servers and partitions is tracked, and the table master determines to split and/or merge partitions based on the load distribution information.
-
Citations
20 Claims
-
1. A method implemented by one or more computing devices within a structured storage system in which structured storage is represented by one or more tables, each table including a plurality of rows, each row representing a data entity stored by the structured storage system and including one or more keys for identifying the row, the plurality of rows being divided among a plurality of partitions, each partition including a range of rows from the plurality of rows within the table, wherein the plurality of partitions are stored on a plurality of table servers, and wherein a table master controls partition assignment to the plurality of table servers, the method for splitting a partition assigned to a table server of the plurality of table servers into child partitions and comprising:
-
identifying the partition for the splitting of the partition based on load information for the partition, wherein the load information includes information specifying an amount of load tracked on the partition from actual read and write requests made on each of two or more portions of the partition; determining, by the table master, a split ratio for the splitting of the partition based on a distribution of the amount of the load tracked on the partition such that the split ratio specifies a load distribution that represents a point in the partition in which a first portion of the partition includes a first portion of the amount of load tracked across the partition and a second portion of the partition includes a second portion of the amount of load tracked across the partition; providing a request to the table server for key information indicating an actual location within the partition for the splitting that corresponds to the load distribution specified by the split ratio, the request includes the split ratio; receiving the key information at the table master from the table server, the key information indicating the actual location within the partition; sending a split request from the table master to the table server, the split request indicating to the table server to perform the splitting of the partition based on the key information; performing the splitting of the partition at a location corresponding to the actual location to create the child partitions; notifying the table master that the splitting has completed; and updating a partition map based on the partition being split into the child partitions, the partition map storing mappings between the plurality of partitions and the plurality of table servers serving the plurality of partitions. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
-
-
10. One or more computer-storage media storing computer-usable instructions for performing a method for managing a structured storage system represented by one or more tables, each table including a plurality of rows, each row representing a data entity stored by the structured storage system and including one or more keys for identifying the row, the plurality of rows being divided among a plurality of partitions, each partition including a range of rows from the plurality of rows within the table, wherein the plurality of partitions are stored on a plurality of table servers, and wherein a table master controls partition assignment to the plurality of table servers, the method for merging at least two partitions of the plurality of partitions of the table into a merged partition and comprising:
-
tracking load information for the plurality of partitions on the plurality of table servers, wherein the load information indicates an amount of load tracked across the partition for each of the plurality of partitions from actual traffic read and write access requests made on the partitions on the table servers; determining, by the table master, the at least two partitions to merge into the merged partition based on a distribution of the amount of load tracked across the partition for each of the at least two partitions compared to the amount of load for others of the plurality of partitions; and performing the merging based on the determining of the at least two partitions, the merging comprising; creating, by the table master, a metadata stream for the merged partition; offloading the at least two partitions from at least one table server serving the at least two partitions based on the determining by the table master; assigning, by the table master, the merged partition to a selected table server from the plurality of table servers; and loading and serving the merged partition at the selected table server. - View Dependent Claims (11, 12, 13, 14)
-
-
15. One or more computer-storage media storing computer-usable instructions for performing a method for managing a structured storage system represented by one or more tables, each table including a plurality of rows, each row representing a data entity stored by the structured storage system and including one or more keys
for identifying the row, the plurality of rows being divided among a plurality of partitions, each partition including a contiguous range of rows from the plurality of rows within the table, wherein the plurality of partitions are stored on a plurality of table servers, and wherein a table master controls partition assignment to the plurality of table servers, the method for splitting a partition of the plurality of partitions of the table into at least two child partitions and comprising: -
tracking load information for the plurality of partitions on the plurality of table servers, wherein the load information comprises an amount of load tracked on each of the plurality of partitions from actual traffic read and write access requests made on the plurality of partitions; determining by the table master, to perform the splitting of the partition by analyzing the amount of load tracked on the partition from the load information, wherein the load information for the partition identifies different loads on different portions of the partition; determining, at the table master, a split ratio for the splitting of the partition, wherein the split ratio is determined based on a distribution of the amount of load tracked across the partition such that the split ratio specifies a load distribution that represents a point in the partition in which a first portion of the partition includes a first portion of the amount of load tracked across the partition and a second portion of the partition includes a second portion of the amount of load tracked across the partition; sending a request from the table master to the table server, the request being for key information identifying an actual location in the partition for the splitting that corresponds to the load distribution specified by the split ratio, the split ratio being included in the request; determining based on the split ratio specified in the request, by the table server, the actual location in the partition for the splitting by attempting to match the load distribution specified by the split ratio; communicating the key information from the table server to the table master for the splitting; and constructing, at the table master, a metadata stream for each of the child partitions; sending a split request from the table master to the table server, the split request indicating to the table server to perform the splitting of the partition based on the key information; performing the splitting in response to the table server receiving the split request, the splitting comprising; building, at the table server, the child partitions from the partition; ceasing to serve the partition at the table server; and loading and serving the child partitions at the table server; sending a split completion notification from the table server to the table master that indicates completion of the splitting; and updating a partition map based on the partition being split into the child partitions by the splitting, the partition map storing mappings between the plurality of partitions and the plurality of table servers serving the plurality of partitions. - View Dependent Claims (16, 17, 18, 19, 20)
-
Specification