ATOMIC WRITES FOR MULTIPLE-EXTENT OPERATIONS
First Claim
1. A system, comprising:
- one or more computing devices configured to;
select a particular component of a multi-tenant file storage service as a coordinator of a distributed transaction;
generate, at the coordinator, a transaction preparation message comprising at least;
(a) an indication of a respective proposed page modification to be performed at each page of a set of physical data pages managed by the service, and (b) an order in which respective page-level commit decisions associated with the proposed page modifications are to be made;
propagate the transaction preparation message sequentially among nodes of a storage node chain collectively storing the set of physical data pages, wherein the storage node chain ends at a decider node designated by the coordinator as responsible for a decision to commit the distributed transaction;
in response to receiving the transaction preparation message at a particular storage node of the chain other than the decider node,store, at a persistent repository, subsequent to a commit analysis of a proposed modification to a physical data page of the particular storage node, a record of an intent to perform the proposed modification, wherein the commit analysis includes an examination of a logical timestamp corresponding to a read of the physical data page;
lock the physical data page of the particular storage node; and
transmit the transaction preparation message to a next storage node of the storage node chain;
in response to receiving the transaction preparation message at the decider node,determine, subsequent to a commit analysis of a proposed modification to a physical data page of the decider node, that the distributed transaction is to be committed;
initiate the proposed modification to the physical data page of decider node; and
initiate a propagation of a transaction commit message to other nodes of the chain; and
in response to receiving the transaction commit message at a particular storage node of the chain,initiate a proposed modification corresponding to the record of the intent to commit; and
release the lock on the physical data page.
1 Assignment
0 Petitions
Accused Products
Abstract
A node of a storage service is selected as a coordinator of a distributed transaction involving multiple page-level modifications. The coordinator identifies other nodes as members of a node chain collectively storing physical data pages at which proposed modifications are to be performed, including a decider node responsible for a decision to commit the transaction. The coordinator generates a transaction preparation message comprising a representation of an order of respective commit decisions associated with the proposed modifications, and transmits the message to a selected node of the chain for a sequential propagation along the chain. Each chain node performs a local commit analysis for its changes and stores a record of its intent to commit. If a decision to commit is reached at the decider, the proposed modifications are completed.
-
Citations
20 Claims
-
1. A system, comprising:
one or more computing devices configured to; select a particular component of a multi-tenant file storage service as a coordinator of a distributed transaction; generate, at the coordinator, a transaction preparation message comprising at least;
(a) an indication of a respective proposed page modification to be performed at each page of a set of physical data pages managed by the service, and (b) an order in which respective page-level commit decisions associated with the proposed page modifications are to be made;propagate the transaction preparation message sequentially among nodes of a storage node chain collectively storing the set of physical data pages, wherein the storage node chain ends at a decider node designated by the coordinator as responsible for a decision to commit the distributed transaction; in response to receiving the transaction preparation message at a particular storage node of the chain other than the decider node, store, at a persistent repository, subsequent to a commit analysis of a proposed modification to a physical data page of the particular storage node, a record of an intent to perform the proposed modification, wherein the commit analysis includes an examination of a logical timestamp corresponding to a read of the physical data page; lock the physical data page of the particular storage node; and transmit the transaction preparation message to a next storage node of the storage node chain; in response to receiving the transaction preparation message at the decider node, determine, subsequent to a commit analysis of a proposed modification to a physical data page of the decider node, that the distributed transaction is to be committed; initiate the proposed modification to the physical data page of decider node; and initiate a propagation of a transaction commit message to other nodes of the chain; and in response to receiving the transaction commit message at a particular storage node of the chain, initiate a proposed modification corresponding to the record of the intent to commit; and release the lock on the physical data page. - View Dependent Claims (2, 3, 4, 5)
-
6. A method, comprising:
performing, by one or more computing devices; generating, at a particular component of a multi-tenant storage service, wherein the particular component is designated as a coordinator of a distributed transaction, a transaction preparation message comprising at least;
(a) an indication of a respective proposed page modification to be performed at each page of a set of physical data pages managed by the service, and (b) an order in which respective commit decisions associated with the proposed page modifications are to be made;propagating the transaction preparation message sequentially among nodes of a storage node chain collectively storing the set of physical data pages; in response to receiving the transaction preparation message at a particular storage node of the chain other than a terminal node of the chain, storing, at a persistent repository, subsequent to a commit analysis of a proposed modification to a first physical data page of the particular storage node, a record of an intent to commit the proposed modification; and transmitting the transaction preparation message to a next storage node of the storage node chain; in response to receiving the transaction preparation message at the terminal node, determining, subsequent to a commit analysis of a proposed modification to a second physical data page of the terminal node, that the distributed transaction is to be committed; initiating the proposed modification to the second physical data page of the decider node; and propagating a transaction commit message to other nodes of the chain; and in response to receiving the transaction commit message at the particular storage node of the chain, initiating a proposed modification corresponding to the record of the intent to commit. - View Dependent Claims (7, 8, 9, 10, 11, 12, 13, 14, 15)
-
16. A non-transitory computer-accessible storage medium storing program instructions that when executed on one or more processors implement a first node of a distributed storage service configured to:
-
receive an indication that the first node has been selected as a coordinator of a distributed transaction to be implemented in response to a work request directed to the service; identify one or more other nodes of the distributed storage service as members of a storage node chain collectively storing a group of physical data pages corresponding to proposed modifications to be performed as part of the distributed transaction, including a decider node designated as responsible for a decision to commit the distributed transaction; generate a transaction preparation message comprising at least a representation of an order in which respective commit decisions associated with the proposed modifications are to be made; and transmit the transaction preparation message to a selected node of the chain, for a sequential propagation among the nodes of the storage node chain, wherein the sequential propagation ends at the decider node. - View Dependent Claims (17, 18, 19, 20)
-
Specification