Flow for multi-master replication in distributed storage
First Claim
1. One or more computer-storage media storing computer-executable instructions that, when executed by a computing device having a processor, cause the computing device to perform a method of replicating data in distributed storage, the method comprising:
retrieving a replication message from a message queue associated with a source table, the source table being one of a plurality of source tables in a replication group, the replication message comprising a row identifier;
identifying one or more target storages within the replication group, the target storages including tables within the replication group;
obtaining a table row corresponding to the row identifier and a first entity tag (eTag) for the table row from each of the one or more target storages, wherein an eTag comprises an identifier for a specific version of the table row, and wherein the obtaining comprises, for each target storage:
determining if a matching row corresponding to the row identifier exists in the target storage; and
when a matching row exists in the target storage, returning the matching row and the first entity tag (eTag) corresponding to the matching row;
determining whether at least one of the obtained rows has a later client timestamp than the table row corresponding to the row identifier from the source table;
determining a winning row from the obtained rows based on a latest client timestamp of the obtained rows, the winning row being a table row with a latest version of the table row;
creating replication operations based on the winning row, wherein a replication operation comprises instructions on inserting data from the winning row into one or more target storages; and
performing batch execution of the replication operations to the one or more target storages.
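The steps of the claim above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: tables are modelled as in-memory dictionaries, and the names `Row`, `ReplicationOp`, `plan_replication`, and `execute_batch` are hypothetical. Treating the source row as a candidate for the winning row, and skipping a write when the eTag has changed since the read, are also assumptions; the claim itself only says that an eTag identifies a specific version of a row.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Row:
    row_id: str
    data: dict
    client_timestamp: float  # assumed: numeric client-supplied timestamp
    etag: str                # identifier for this specific row version


@dataclass(frozen=True)
class ReplicationOp:
    target: str                   # name of the target storage
    row: Row                      # winning row to write
    expected_etag: Optional[str]  # eTag seen at read time, None if no row existed


Table = Dict[str, Row]


def plan_replication(
    source_row: Row, targets: Dict[str, Table], row_id: str
) -> Tuple[Row, List[ReplicationOp]]:
    # Obtain the matching row (and its eTag) from each target, if one exists.
    fetched = {name: table.get(row_id) for name, table in targets.items()}
    # Assumption: the source row competes with the rows found in the targets.
    candidates = [source_row] + [r for r in fetched.values() if r is not None]
    # The winning row is the one with the latest client timestamp.
    winner = max(candidates, key=lambda r: r.client_timestamp)
    # One operation per target whose row is missing or older than the winner.
    ops = [
        ReplicationOp(name, winner, row.etag if row else None)
        for name, row in fetched.items()
        if row is None or row.client_timestamp < winner.client_timestamp
    ]
    return winner, ops


def execute_batch(ops: List[ReplicationOp], targets: Dict[str, Table]) -> List[str]:
    applied = []
    for op in ops:
        current = targets[op.target].get(op.row.row_id)
        current_etag = current.etag if current else None
        if current_etag != op.expected_etag:
            continue  # assumed policy: skip if the row changed since it was read
        targets[op.target][op.row.row_id] = op.row
        applied.append(op.target)
    return applied
```

In this sketch a target that already holds the winning version receives no operation, and when a target row wins, the source table is left untouched because it is not itself listed among `targets`.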
Abstract
Embodiments are directed to replicating data in distributed storage. A replication message may be retrieved from a message queue associated with a source table. The replication message may include a row identifier. One or more target storages within a same replication group as the source table may be identified. A row from each of the one or more target storages may be obtained corresponding to the row identifier. A winning row may be determined from the obtained rows based on a latest timestamp of the row. A replication operation may be created based on the winning row. The replication operation may be performed on the obtained rows from each of the target storages.
33 Citations
20 Claims
1. One or more computer-storage media storing computer-executable instructions that, when executed by a computing device having a processor, cause the computing device to perform a method of replicating data in distributed storage, the method comprising:
retrieving a replication message from a message queue associated with a source table, the source table being one of a plurality of source tables in a replication group, the replication message comprising a row identifier;
identifying one or more target storages within the replication group, the target storages including tables within the replication group;
obtaining a table row corresponding to the row identifier and a first entity tag (eTag) for the table row from each of the one or more target storages, wherein an eTag comprises an identifier for a specific version of the table row, and wherein the obtaining comprises, for each target storage:
determining if a matching row corresponding to the row identifier exists in the target storage; and
when a matching row exists in the target storage, returning the matching row and the first entity tag (eTag) corresponding to the matching row;
determining whether at least one of the obtained rows has a later client timestamp than the table row corresponding to the row identifier from the source table;
determining a winning row from the obtained rows based on a latest client timestamp of the obtained rows, the winning row being a table row with a latest version of the table row;
creating replication operations based on the winning row, wherein a replication operation comprises instructions on inserting data from the winning row into one or more target storages; and
performing batch execution of the replication operations to the one or more target storages.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A computer-implemented method of processing replication messages in distributed storage, the method comprising:
receiving a replication message from a message queue associated with a first table among a plurality of tables, at least two tables of the plurality of tables being located at different locations;
obtaining a row identifier identifying a row to be updated from the replication message;
fetching a row from each of the plurality of tables corresponding to the row identifier, the fetching comprising, for each table:
determining if a matching row corresponding to the row identifier exists in the table; and
when a matching row exists in the table, returning the matching row and a first entity tag (eTag) corresponding to the matching row, wherein an eTag comprises an identifier for a specific version of the row;
determining whether at least one fetched row from the plurality of tables has a later timestamp than the identified row from the first table;
determining a winning row based on a latest timestamp of the fetched row from each of the plurality of tables, the winning row being the fetched row with the latest timestamp; and
replicating the winning row among the plurality of tables.
Dependent claims: 12, 13, 14, 15
16. A computer-implemented system comprising:
one or more processors; and
one or more computer-storage media storing computer-useable instructions that, when executed by the one or more processors, cause the one or more processors to perform a method comprising:
receiving a replication message from a message queue associated with a first table among a plurality of tables, at least two tables of the plurality of tables being located at different locations;
obtaining a row identifier identifying a row to be updated from the replication message;
fetching a row from each of the plurality of tables corresponding to the row identifier, the fetching comprising, for each table:
determining if a matching row corresponding to the row identifier exists in the table; and
when a matching row exists in the table, returning the matching row and a first entity tag (eTag) corresponding to the matching row, wherein an eTag comprises an identifier for a specific version of the row;
determining whether at least one fetched row from the plurality of tables has a later timestamp than the identified row from the first table;
determining a winning row based on a latest timestamp of the fetched row from each of the plurality of tables, the winning row being the fetched row with the latest timestamp; and
replicating the winning row among the plurality of tables.
Dependent claims: 17, 18, 19, 20
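The message-processing method recited in claims 11 and 16 might look roughly like the sketch below. It is illustrative only: the message is assumed to be a dictionary with a `row_id` key, the queue is Python's standard `queue.Queue`, each table is an in-memory dictionary standing in for a table at a separate location, and `Versioned`, `fetch_row`, and `process_message` are hypothetical names. The first table is assumed to be one of the plurality of tables passed in.

```python
import queue
from typing import Dict, List, NamedTuple, Optional, Tuple


class Versioned(NamedTuple):
    data: dict
    timestamp: float
    etag: str  # identifier for this specific version of the row


Table = Dict[str, Versioned]


def fetch_row(table: Table, row_id: str) -> Optional[Versioned]:
    # Return the matching row together with its eTag, or None if absent.
    return table.get(row_id)


def process_message(
    messages: "queue.Queue[dict]", first_table: Table, tables: List[Table]
) -> Optional[Tuple[str, bool]]:
    # Receive one replication message and read the row identifier from it.
    row_id = messages.get_nowait()["row_id"]
    source = fetch_row(first_table, row_id)
    # Fetch the row from every table, keeping only the tables that have it.
    fetched = [r for r in (fetch_row(t, row_id) for t in tables) if r is not None]
    if not fetched:
        return None  # nothing to replicate for this identifier
    # Does any fetched row carry a later timestamp than the first table's row?
    source_is_stale = source is not None and any(
        r.timestamp > source.timestamp for r in fetched
    )
    # The winning row is the fetched row with the latest timestamp.
    winner = max(fetched, key=lambda r: r.timestamp)
    # Replicate the winner among all tables (unconditional write in this sketch).
    for table in tables:
        table[row_id] = winner
    return row_id, source_is_stale
```

A real store would normally issue a new eTag on each write and could use the eTag read earlier to make the write conditional; this sketch copies the winning row as-is to stay short.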
Specification