PUSH-BASED PIGGYBACK SYSTEM FOR SOURCE-DRIVEN LOGICAL REPLICATION IN A STORAGE ENVIRONMENT
Abstract
The disclosed techniques enable push-based piggybacking in a source-driven logical replication system. Logical replication of a data set (e.g., a snapshot) from a source node to a destination node can be achieved in a source-driven system while preserving the effects of storage efficiency operations (e.g., deduplication) applied at the source node. However, if missing data extents are detected at the destination, the destination faces an extent pulling problem: it may lack knowledge of the physical layout on the source side, mechanisms for requesting extents, or both. The techniques overcome this problem by introducing specific protocols for obtaining missing extents within an existing replication environment, piggybacking on data pushes from the source.
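The exchange described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class names, the `drop` parameter used to simulate extents lost from the first stream, and the use of string extent identifiers are all assumptions made for illustration. The only behavior taken from the text is the ordering: the source pushes a stream, the source sends a push inquiry, the destination answers with the missing extents, and the source pushes those extents in a further stream.

```python
from dataclasses import dataclass, field


@dataclass
class Destination:
    """Hypothetical destination node; it never pulls data, it only answers inquiries."""
    expected: set                      # extent ids the replicated data set references
    received: dict = field(default_factory=dict)

    def process_stream(self, stream):
        # Store every extent carried by a stream pushed from the source.
        self.received.update(stream)

    def answer_push_inquiry(self):
        # Push inquiry response: ids of extents that are referenced but absent.
        return sorted(self.expected - set(self.received))


@dataclass
class Source:
    """Hypothetical source node; it drives every transfer."""
    extents: dict                      # extent id -> extent data

    def replicate(self, dest, drop=frozenset()):
        # First data stream; `drop` simulates extents that never arrive.
        first = {k: v for k, v in self.extents.items() if k not in drop}
        dest.process_stream(first)
        # Push inquiry, sent by the source; the response names missing extents.
        missing = dest.answer_push_inquiry()
        if missing:
            # Second data stream, again initiated by the source.
            dest.process_stream({k: self.extents[k] for k in missing})
        return missing
```

Under these assumptions, a run in which extent `"e2"` is lost from the first stream ends with `replicate` returning `["e2"]` and the destination holding all three extents. The destination never needs to know where `"e2"` lives on the source; it only reports the identifier.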
15 Citations
20 Claims
1. A non-transitory computer readable storage medium having instructions stored thereon, which when executed by one or more processors of a destination node of a source-driven replication system, cause the destination node to:
- identify a missing extent of a plurality of extents associated with a replicated data set, wherein when the replicated data set is reconstructed at the destination node with the same logical layout but a different physical layout than the original data set at a source node; and
- in response to a push inquiry, send, for delivery to the source node, a push inquiry response indicating the missing extent;
- process a data stream pushed from the source node, the data stream including the missing extent, wherein the push inquiry response causes the source node to initiate a push of the missing extent.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A method comprising:
- sending, by a source node of a source-driven replication system for delivery to a destination node, a first data stream including a plurality of extents associated with a data set, wherein the data set is deduplicated at the source node of the source-driven replication system and, when the data set is reconstructed at the destination node, the data set maintains the deduplication;
- sending, by the source node for delivery to the destination node, a push inquiry to identify if an extent of the plurality of extents associated with the data set is missing during logical reconstruction at the destination node, wherein when reconstructed at the destination node, the data set has a same logical layout but a different physical layout than the data set at the source node;
- in response to sending the push inquiry, receiving, by the source node, a push inquiry response initiated by the destination node, the push inquiry response identifying the missing extent,
- in response to the push inquiry response identifying the missing extent, sending, by the source node for delivery to the destination node, a second data stream including the missing extent.
Dependent claims: 11, 12, 13
14. A source node of a source-driven replication system, the node comprising:
- one or more processors;
- a storage interface, operatively coupled to the one or more processors, through which to access a plurality of mass storage devices;
- a communication interface, operatively coupled to the one or more processors, through which to communicate with a destination node;
- a source replication module operatively coupled to the one or more processors and configured to send a first data stream including a plurality of extents associated with a data set, wherein the data set is deduplicated at the source node of the source-driven replication system and, when the data set is reconstructed at the destination node, the data set maintains the deduplication; and
- a source piggyback module operatively coupled to the one or more processors and the source replication module and configured to:
  - send a push inquiry to identify if an extent of the plurality of extents associated with the data set is missing during logical reconstruction at the destination node, wherein when reconstructed at the destination node, the data set has a same logical layout but a different physical layout than the data set at the source node,
  - in response to sending the push inquiry, receive a push inquiry response initiated by the destination node, the push inquiry response identifying the missing extent, and
  - in response to the push inquiry response identifying the missing extent, send, for delivery to the destination node, a second data stream including the missing extent.
Dependent claims: 15, 16, 17, 18, 19, 20
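The independent claims repeatedly rely on two properties of the reconstructed data set: it keeps the same logical layout while using a different physical layout, and it maintains the source's deduplication. The following sketch illustrates what those properties mean in a toy model. Everything in it is an assumption for illustration: the function names, the representation of a logical layout as an ordered list of extent ids, and the use of consecutive integers as physical block numbers are hypothetical and are not taken from the specification.

```python
def reconstruct(logical_layout, extents, first_free_block):
    """Place each distinct extent once, starting at a node-specific block number.

    logical_layout: extent ids in logical order; an id repeats when the
                    data set is deduplicated.
    extents:        extent id -> extent data.
    Returns (placement, blocks): id -> physical block, and block -> data.
    """
    placement = {}
    blocks = {}
    next_block = first_free_block
    for ext_id in logical_layout:
        if ext_id not in placement:    # store a shared extent only once
            placement[ext_id] = next_block
            blocks[next_block] = extents[ext_id]
            next_block += 1
    return placement, blocks


def read_back(logical_layout, placement, blocks):
    """Read the data set in logical order through the physical placement."""
    return b"".join(blocks[placement[ext_id]] for ext_id in logical_layout)
```

In this model, two nodes that call `reconstruct` with different `first_free_block` values end up with different physical placements, yet `read_back` returns identical bytes on both, and each node stores a repeated extent only once.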
Specification