Replication processes in a distributed storage environment
Abstract
Embodiments of the present invention relate to systems, methods, and computer storage media for replicating data in a distributed computing environment utilizing a combination of replication methodologies. A full-object replication may be utilized to replicate a full state of an object from a primary data store to a secondary data store. A checkpoint created after initiating the full-object replication may be parsed to identify changes to the object that have been entered since initiating the full-object replication. This replication process is referred to as a delta-checkpoint replication methodology. Additionally, in an embodiment, a log-based replication methodology may be utilized. The log-based replication may communicate data from a log of the primary data store to the secondary data store. It is also contemplated in an exemplary embodiment that when the log-based replication fails to maintain a throughput threshold, one of the other replication methodologies may be initiated, at least temporarily.
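The delta-checkpoint idea in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each checkpoint entry carries a monotonically increasing sequence number; the `CheckpointEntry` layout and the `delta_entries` helper are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckpointEntry:
    """One record in a checkpoint (hypothetical layout, not from the patent)."""
    key: str
    value: str
    seq: int  # assumed monotonically increasing commit sequence number


def delta_entries(checkpoint, full_replication_seq):
    """Return the entries committed after the full-object replication began."""
    return [entry for entry in checkpoint if entry.seq > full_replication_seq]


# A checkpoint created after a full-object replication that started at seq 10.
later_checkpoint = [
    CheckpointEntry("blob/a", "v1", seq=4),
    CheckpointEntry("blob/b", "v2", seq=12),
    CheckpointEntry("blob/c", "v1", seq=15),
]
delta = delta_entries(later_checkpoint, full_replication_seq=10)
```

Only the entries for `blob/b` and `blob/c` would need to be sent to the secondary data store in this example, since `blob/a` was already covered by the full-object pass.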
32 Citations
20 Claims
1. A computer-implemented method in a distributed storage environment utilizing a processor and memory for replicating data utilizing a combination of replication processes in the distributed computing environment, the method comprising:
- identifying a first one or more checkpoints that describes complete contents of an object for a data partition range at a primary data store, wherein the data partition range maintains the object;
- initiating replicating the first one or more checkpoints to the secondary data store utilizing a full-object replication process that replicates the complete contents of the object from the first one or more checkpoints to the secondary data store;
- identifying a second one or more checkpoints for the data partition range at the primary data store;
- initiating replicating the second one or more checkpoints to the secondary data store utilizing a delta-checkpoint replication process that updates the complete contents of the object at the secondary data store with changes made to the object at the primary data store after the full-object replication process; and
- initiating replicating data from the data partition range at the primary data store to the secondary data store utilizing a log-based replication process.
Dependent claims: 2-12.
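The ordering of the three processes in claim 1 can be sketched as follows. This is a simplified model, not the claimed implementation: checkpoints are assumed to be plain dictionaries, log records are assumed to be `(key, value)` tuples, and the `replicate_partition` function name is hypothetical.

```python
def replicate_partition(full_checkpoints, delta_checkpoints, log_records):
    """Apply the three phases in order and return the secondary's state.

    Assumption: each checkpoint is a dict of key -> value and each log
    record is a (key, value) tuple. Real checkpoints and logs are richer.
    """
    secondary = {}

    # Phase 1: full-object replication copies the complete contents.
    for checkpoint in full_checkpoints:
        secondary.update(checkpoint)

    # Phase 2: delta-checkpoint replication applies only the entries that
    # differ from what the secondary already holds.
    for checkpoint in delta_checkpoints:
        for key, value in checkpoint.items():
            if secondary.get(key) != value:
                secondary[key] = value

    # Phase 3: log-based replication replays records from the primary's log.
    for key, value in log_records:
        secondary[key] = value

    return secondary


state = replicate_partition(
    full_checkpoints=[{"a": 1, "b": 2}],
    delta_checkpoints=[{"b": 3, "c": 4}],
    log_records=[("a", 5)],
)
```

In this toy run the secondary ends with the latest value for every key, with each phase overwriting only what changed since the previous one.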
13. One or more computer storage media having computer-executable instructions embodied thereon, that when executed by a computing system having a processor and memory, cause the computing system to perform a method for replicating data in a distributed computing environment utilizing a plurality of replication processes, the method comprising:
- initiating replicating data from a primary data store to a secondary data store utilizing a log-based replication process, wherein the data is maintained in a log of the primary data store;
- determining the log-based replication process is performing below a predefined threshold of replication throughput from the primary data store to the secondary data store; and
- in response to determining the log-based replication process is below the predefined threshold, switching from the log-based replication process for replicating data from the primary data store to the secondary data store to a checkpoint replication process.
Dependent claims: 14-19.
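The threshold-based switch in claim 13 can be sketched as a small decision function. The throughput unit (records per second), the `THRESHOLD` value, and the names `Mode`, `measure_throughput`, and `choose_mode` are all assumptions made for illustration; the claim itself does not specify how throughput is measured.

```python
from enum import Enum


class Mode(Enum):
    LOG_BASED = "log-based"
    CHECKPOINT = "checkpoint"


def measure_throughput(records_replicated, elapsed_seconds):
    """Return replication throughput in records per second (assumed unit)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return records_replicated / elapsed_seconds


def choose_mode(current, throughput, threshold):
    """Switch from log-based to checkpoint replication below the threshold."""
    if current is Mode.LOG_BASED and throughput < threshold:
        return Mode.CHECKPOINT
    return current


THRESHOLD = 100.0  # hypothetical predefined threshold, records per second

slow = measure_throughput(records_replicated=300, elapsed_seconds=10)
mode = choose_mode(Mode.LOG_BASED, slow, THRESHOLD)
```

Here the measured rate of 30 records per second falls below the assumed threshold, so the sketch selects checkpoint replication. A rate at or above the threshold leaves log-based replication in place.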
20. One or more computer storage media having computer-executable instructions embodied thereon, that when executed by a computing system having a processor and memory, cause the computing system to perform a method for replicating data in a distributed computing environment utilizing a plurality of replication processes, the method comprising:
- replicating a first data of a first object from a primary data store to a secondary data store utilizing a full-object replication process based on a first checkpoint that describes contents of the first object, wherein the full-object replication replicates complete contents of the first object;
- replicating a second data of the first object from the primary data store to the secondary data store utilizing a delta-checkpoint replication process based on a second checkpoint that describes changes made to the first object at the primary data store, wherein the delta-checkpoint replication process replicates delta differences of the first object that are determined from the changes described by the second checkpoint to the secondary data store, wherein the second checkpoint was generated after the full-object replication process initiated replicating the first data; and
- replicating a third data of the first object from the primary data store to the secondary data store utilizing a log-based replication process, wherein the log-based replication process replicates the third data from a log of the primary data store, wherein the third data is not maintained in the first checkpoint or the second checkpoint.
Specification