FLOW FOR MULTI-MASTER REPLICATION IN DISTRIBUTED STORAGE
Abstract
Embodiments are directed to replicating data in distributed storage. A replication message may be retrieved from a message queue associated with a source table. The replication message may include a row identifier. One or more target storages within a same replication group as the source table may be identified. A row from each of the one or more target storages may be obtained corresponding to the row identifier. A winning row may be determined from the obtained rows based on a latest timestamp of the row. A replication operation may be created based on the winning row. The replication operation may be performed on the obtained rows from each of the target storages.
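The winning-row step described in the abstract can be illustrated with a minimal sketch. The `Row` type, its field names, and the sample values are hypothetical and chosen only for illustration; the abstract says only that the winner is determined from the latest timestamp, so tie-breaking here (first row encountered) is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Row:
    """Hypothetical row shape: one copy of a row as held by one target storage."""
    row_id: str
    value: str
    timestamp: datetime


def determine_winning_row(rows):
    """Return the obtained row carrying the latest timestamp."""
    if not rows:
        raise ValueError("no rows were obtained for this row identifier")
    # max() keeps the first row it sees among equal timestamps (assumed tie-break).
    return max(rows, key=lambda row: row.timestamp)


# Three copies of the same row, fetched from three target storages.
obtained = [
    Row("user-42", "alice@old.example", datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)),
    Row("user-42", "alice@new.example", datetime(2024, 1, 1, 12, 5, tzinfo=timezone.utc)),
    Row("user-42", "alice@old.example", datetime(2024, 1, 1, 11, 0, tzinfo=timezone.utc)),
]
winner = determine_winning_row(obtained)
```

Here `winner` is the copy written at 12:05, which would then be replicated to the other two storages.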
20 Claims
1. One or more computer-storage media storing computer-executable instructions that, when executed by a computing device having a processor, cause the computing device to perform a method of replicating data in distributed storage, the method comprising:
retrieving a replication message from a message queue associated with a source table, the source table being one of a plurality of source tables in a replication group, the replication message comprising a row identifier;
identifying one or more target storages within the replication group, the target storages including tables within the replication group;
obtaining a table row corresponding to the row identifier and a first entity tag (eTag) for the table row from each of the one or more target storages, an eTag comprises an identifier for a specific version of the table row;
determining a winning row from the obtained rows based on a latest client timestamp of the obtained rows, the winning row being a table row with a latest version of the table row;
creating replication operations based on the winning row, wherein a replication operation comprises instructions on inserting data from the winning row in to one or more target storages; and
performing batch execution of the replication operations to the one or more target storages.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11.
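The last two steps of claim 1 (creating replication operations from the winning row and executing them as a batch) can be sketched as follows. Everything named here is hypothetical: `TargetStorage` is an in-memory stand-in for a table, and `put_if_match` models one plausible use of the fetched eTag, namely rejecting a write when the target row changed after it was read. The claim itself states only that an eTag identifies a specific row version, so the conditional-write behavior is an assumption, not the claimed method.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredRow:
    data: dict
    client_timestamp: float
    etag: str


class TargetStorage:
    """Hypothetical in-memory stand-in for one table in the replication group."""

    def __init__(self, name):
        self.name = name
        self.rows = {}

    def get(self, row_id):
        return self.rows.get(row_id)

    def put(self, row_id, data, client_timestamp):
        # Every write yields a fresh eTag, i.e. a new version identifier.
        self.rows[row_id] = StoredRow(dict(data), client_timestamp, uuid.uuid4().hex)

    def put_if_match(self, row_id, data, client_timestamp, expected_etag):
        # Assumed semantics: write only if the row is still the version we fetched.
        current = self.rows.get(row_id)
        current_etag = current.etag if current else None
        if current_etag != expected_etag:
            return False
        self.put(row_id, data, client_timestamp)
        return True


@dataclass
class ReplicationOp:
    target: TargetStorage
    row_id: str
    data: dict
    client_timestamp: float
    expected_etag: Optional[str]


def create_replication_ops(row_id, targets):
    """Build one operation per target whose copy differs from the winning row."""
    fetched = [(target, target.get(row_id)) for target in targets]
    present = [row for _, row in fetched if row is not None]
    if not present:
        return []
    winner = max(present, key=lambda row: row.client_timestamp)
    ops = []
    for target, row in fetched:
        up_to_date = (row is not None and row.data == winner.data
                      and row.client_timestamp == winner.client_timestamp)
        if not up_to_date:
            ops.append(ReplicationOp(target, row_id, winner.data,
                                     winner.client_timestamp,
                                     row.etag if row else None))
    return ops


def execute_batch(ops):
    """Apply all operations; return one success flag per operation."""
    return [op.target.put_if_match(op.row_id, op.data, op.client_timestamp,
                                   op.expected_etag) for op in ops]
```

In this sketch an operation whose eTag no longer matches is reported as failed rather than retried; how a real system handles that case is not specified in the claim.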
12. A computer-implemented method of processing replication messages in distributed storage, the method comprising:
receiving a replication message from a message queue associated with a first table among a plurality of tables, at least two tables of the plurality of tables being located at different locations;
obtaining a row identifier identifying a row to be updated from the replication message;
fetching a row from each of the plurality of tables corresponding to the row identifier;
determining a winning row based on a latest timestamp of the fetched row from each of the plurality of tables, the winning row being the obtained row with the latest timestamp; and
replicating the winning row among the plurality of tables.
Dependent claims: 13, 14, 15, 16, 17.
18. A system for guaranteeing replication and eventual consistency in multi-master configuration, comprising:
a message queue;
a plurality of tables; and
a replication worker component configured to:
receive a replication message from a message queue associated with a first table among the plurality of tables, at least two tables of the plurality of tables being located at different locations;
obtain row identifiers from the replication message;
determine if a matching row corresponding to the row identifiers exists in each of the plurality of tables;
when a matching row exists in a table of the plurality of tables, return the matching row and client timestamp and first entity tag (eTag) corresponding to the matching row;
when a matching row does not exist in a table of the plurality of tables, return a placeholder row instance and first eTag corresponding to the placeholder row instance;
select a winning row from the returned rows with a latest client timestamp, the winning row being the returned row with the latest client timestamp indicating a latest version of the returned row;
create replication operations based on the winning row;
perform batch execution of the replication operations to the target storages; and
remove the replication message from the message queue.
Dependent claims: 19, 20.
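Two details specific to claim 18 are the placeholder row returned for a table with no matching row, and the removal of the message from the queue once replication has run. The sketch below illustrates both under stated assumptions: tables are plain dictionaries, a message is a single row identifier string, a placeholder carries a timestamp of negative infinity and an eTag of `None`, and the message is peeked first and dequeued only after the writes. None of these representations come from the claim; they are hypothetical choices made to keep the example small.

```python
import uuid
from collections import deque
from dataclasses import dataclass, replace
from typing import Optional

# Assumed: a placeholder must lose to every real row, so it sorts earliest.
PLACEHOLDER_TIMESTAMP = float("-inf")


@dataclass(frozen=True)
class Row:
    row_id: str
    data: Optional[dict]
    client_timestamp: float
    etag: Optional[str]
    is_placeholder: bool = False


def fetch_or_placeholder(table, row_id):
    """Return the matching row, or a placeholder row instance if none exists."""
    row = table.get(row_id)
    if row is not None:
        return row
    return Row(row_id, None, PLACEHOLDER_TIMESTAMP, None, is_placeholder=True)


def process_next_message(queue, tables):
    """Process one replication message; return False if the queue is empty."""
    if not queue:
        return False
    row_id = queue[0]  # peek: the message stays queued until replication has run
    returned = [fetch_or_placeholder(table, row_id) for table in tables]
    winner = max(returned, key=lambda row: row.client_timestamp)
    if not winner.is_placeholder:
        for table, row in zip(tables, returned):
            if row.client_timestamp < winner.client_timestamp:
                # Each table assigns its own eTag to the newly written version.
                table[row_id] = replace(winner, etag=uuid.uuid4().hex)
    queue.popleft()  # remove the replication message from the message queue
    return True
```

Dequeuing only after the writes means a worker that fails midway leaves the message in place to be processed again, which is one simple way to approach the eventual consistency named in the claim preamble.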
Specification