Distributed system and method for replicated storage of structured data records
First Claim
1. A system, comprising:
- one or more computing devices configured to implement a network-based data storage service configured to store data tables for a plurality of distinct clients, wherein the network-based data storage service comprises:
a plurality of storage hosts each configured to store and retrieve structured data records; and
a data store manager configured to, for each of the plurality of distinct clients:
receive a request from the distinct client to store a structured data record within a table;
in response to receiving said request:
map said structured data record to a block, wherein to map said structured data record to a block, the data store manager is configured to:
apply a hash function to a value of a partition key field of said structured data record to compute a hash value for said structured data record,
based on the hash value and an identifier of the table, identify a block to which said structured data record is to be mapped, such that a block identifier corresponding to said block is determined,
map said structured data record to said block, and
map said block to a subset of storage hosts, wherein to map said block to a subset of storage hosts, the data store manager is configured to:
compute a set of hash values including a respective hash value for each of the plurality of storage hosts, wherein each respective hash value of the set of hash values is distinct from the hash value for said structured data record and is based on the hash value for said structured data record and a respective storage host identifier, wherein each respective hash value of the set of hash values is a result of applying a hash function to a concatenation of said block identifier and an identifier of a respective one of a plurality of data centers,
apply a selection criterion to the set of hash values to select a subset of storage hosts of the plurality of storage hosts, wherein said subset of storage hosts comprises at least two of said plurality of storage hosts, and
map said block directly to each of the subset of storage hosts of said plurality of storage hosts, wherein to map said block to said subset of storage hosts, said data store manager is further configured to map said block to two or more of said plurality of data centers, wherein members of said subset of storage hosts are distributed among said two or more data centers; and
upon successful storage of said structured data record to said block within said subset of storage hosts, return to said distinct client an indication that said request is complete.
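The two mapping steps recited in the claim (record to block, then block to storage hosts spread over data centers) can be illustrated with a short sketch. This is not the patented implementation: the hash function (SHA-256), the block count `NUM_BLOCKS`, the `|` separator used for concatenation, the helper names, and the selection criterion (lowest hash value wins) are all assumptions chosen for illustration.

```python
import hashlib

NUM_BLOCKS = 1024  # hypothetical number of blocks per table


def h(text: str) -> int:
    """Hash a string to a 64-bit integer (SHA-256 is an assumed choice)."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:8], "big")


def block_id_for(table_id: str, partition_key: str) -> str:
    """Map a record to a block from its partition key hash and the table identifier."""
    record_hash = h(partition_key)
    return f"{table_id}:{record_hash % NUM_BLOCKS}"


def hosts_for_block(block_id, hosts_by_dc, num_dcs=2, hosts_per_dc=1):
    """Select hosts for a block so that they span `num_dcs` data centers.

    hosts_by_dc maps a data center identifier to a list of host identifiers.
    Assumed selection criterion: the lowest hash values are selected.
    """
    # Rank data centers by hashing the block identifier concatenated with the DC identifier.
    ranked_dcs = sorted(hosts_by_dc, key=lambda dc: h(block_id + "|" + dc))
    chosen = []
    for dc in ranked_dcs[:num_dcs]:
        # Within each selected data center, rank hosts by a per-host hash value.
        ranked_hosts = sorted(
            hosts_by_dc[dc], key=lambda host: h(block_id + "|" + dc + "|" + host)
        )
        chosen.extend((dc, host) for host in ranked_hosts[:hosts_per_dc])
    return chosen


if __name__ == "__main__":
    topology = {"dc-a": ["a1", "a2"], "dc-b": ["b1", "b2"], "dc-c": ["c1"]}
    block = block_id_for("orders", "customer-42")
    print(block, hosts_for_block(block, topology))
```

Because every hash depends only on the block identifier and the data center and host identifiers, any node holding the same topology computes the same placement without a lookup table, which is the property this kind of hash-based selection is usually chosen for.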
Abstract
A distributed system and method for replicated storage of structured data records. According to one embodiment, a system may include storage hosts each configured to store and retrieve structured data records, and a data store manager configured to receive a request from a client to store a structured data record within a table. In response to receiving the request, the data store manager may be further configured to map the structured data record to a block according to a partition key value of the structured data record and an identifier of the table and to map the block to a subset comprising at least two of the plurality of storage hosts. Upon successfully storing the structured data record to the block within at least two storage hosts within the subset, the data store manager may be further configured to return to the client an indication that said request is complete.
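The completion rule in the abstract (acknowledge the client only after the record is stored on at least two hosts of the subset) can be sketched as follows. The function name `store_record`, the `write_to_host` callback, and the `min_acks` parameter are hypothetical, and the sequential loop is a simplification; the source does not say how writes are issued or how failures are detected.

```python
from typing import Callable, Iterable


def store_record(
    record: dict,
    hosts: Iterable[str],
    write_to_host: Callable[[str, dict], bool],
    min_acks: int = 2,
) -> bool:
    """Write a record to every host in the subset and report completion.

    Returns True (request complete) only if at least `min_acks` hosts
    confirmed the write. `write_to_host` is a caller-supplied function
    that returns True on success; an OSError is counted as a failed write.
    """
    acks = 0
    for host in hosts:
        try:
            if write_to_host(host, record):
                acks += 1
        except OSError:
            # An unreachable host does not abort the request; it just does not count.
            continue
    return acks >= min_acks


if __name__ == "__main__":
    stored = {}

    def fake_write(host: str, record: dict) -> bool:
        # Hypothetical in-memory stand-in for a storage host; "b1" is down.
        if host == "b1":
            raise OSError("host unreachable")
        stored.setdefault(host, []).append(record)
        return True

    print(store_record({"id": 1}, ["a1", "b1", "c1"], fake_write))
```

In this sketch a single failed host does not fail the request as long as two replicas were written, which matches the "at least two storage hosts" wording of the abstract.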
86 Citations
40 Claims
1. A system, comprising:
- one or more computing devices configured to implement a network-based data storage service configured to store data tables for a plurality of distinct clients, wherein the network-based data storage service comprises:
a plurality of storage hosts each configured to store and retrieve structured data records; and
a data store manager configured to, for each of the plurality of distinct clients:
receive a request from the distinct client to store a structured data record within a table;
in response to receiving said request:
map said structured data record to a block, wherein to map said structured data record to a block, the data store manager is configured to:
apply a hash function to a value of a partition key field of said structured data record to compute a hash value for said structured data record,
based on the hash value and an identifier of the table, identify a block to which said structured data record is to be mapped, such that a block identifier corresponding to said block is determined,
map said structured data record to said block, and
map said block to a subset of storage hosts, wherein to map said block to a subset of storage hosts, the data store manager is configured to:
compute a set of hash values including a respective hash value for each of the plurality of storage hosts, wherein each respective hash value of the set of hash values is distinct from the hash value for said structured data record and is based on the hash value for said structured data record and a respective storage host identifier, wherein each respective hash value of the set of hash values is a result of applying a hash function to a concatenation of said block identifier and an identifier of a respective one of a plurality of data centers,
apply a selection criterion to the set of hash values to select a subset of storage hosts of the plurality of storage hosts, wherein said subset of storage hosts comprises at least two of said plurality of storage hosts, and
map said block directly to each of the subset of storage hosts of said plurality of storage hosts, wherein to map said block to said subset of storage hosts, said data store manager is further configured to map said block to two or more of said plurality of data centers, wherein members of said subset of storage hosts are distributed among said two or more data centers; and
upon successful storage of said structured data record to said block within said subset of storage hosts, return to said distinct client an indication that said request is complete.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
21. A computer implemented method, comprising:
- receiving a request from a client to store a structured data record within a table;
in response to receiving said request:
mapping said structured data record to a block, wherein mapping said structured data record to a block comprises:
applying a hash function to a value of a partition key field of said structured data record to compute a hash value for said structured data record,
based on the hash value and an identifier of the table, identifying a block to which said structured data record is to be mapped, such that a block identifier corresponding to said block is determined,
mapping said structured data record to said block, and
mapping said block to a subset of storage hosts, wherein mapping said block to the subset of storage hosts comprises:
computing a set of hash values including a respective hash value for each of the plurality of storage hosts, wherein each respective hash value of the set of hash values is distinct from the hash value for said structured data record and is based on the hash value for said structured data record and a respective storage host identifier, wherein each respective hash value of the set of hash values is a result of applying a hash function to a concatenation of said block identifier and an identifier of a respective one of a plurality of data centers,
applying a selection criterion to the set of hash values to select a subset of storage hosts of the plurality of storage hosts, wherein said subset of storage hosts comprises at least two of said plurality of storage hosts, and
mapping said block directly to each of the subset of storage hosts of said plurality of storage hosts, wherein each of said plurality of storage hosts is configured to store and retrieve structured data records, wherein mapping said block to said subset of storage hosts further comprises mapping said block to two or more of said plurality of data centers, wherein members of said subset of storage hosts are distributed among said two or more data centers; and
upon successful storage of said structured data record to said block within said subset of storage hosts, returning to said client an indication that said request is complete.
- View Dependent Claims (22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40)
Specification