Systems and methods for performing data replication
Abstract
Preparing source data to be replicated in a continuous data replication environment. Certain systems and methods populate a file name database with entries that each hold a unique file identifier descriptor (FID), a short name, and the FID of the parent directory for each directory or file on a source storage device. Such information is advantageously gathered while scanning a live file system, without requiring a snapshot of the source storage device. The database can further be used to generate absolute file names associated with data operations to be replayed on a destination storage device. Based on the obtained FIDs, certain embodiments can further combine write operations to be replayed on the destination storage device and/or avoid replicating temporary files to the destination system.
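The file name database described in the abstract can be pictured as a table keyed by FID, where each row holds a short name and the FID of the parent directory. The Python sketch below is illustrative only: the `NameEntry` record, the use of `None` as the root's parent, the sample FIDs, and the `absolute_name` helper are assumptions made for this example and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class NameEntry:
    """One row of a hypothetical file name database."""
    short_name: str
    parent_fid: Optional[int]  # None marks the root directory (assumed convention)


def absolute_name(db: Dict[int, NameEntry], fid: int, sep: str = "/") -> str:
    """Build an absolute name by walking parent FIDs up to the root."""
    parts: List[str] = []
    seen: Set[int] = set()
    current: Optional[int] = fid
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in parent chain at FID {current}")
        seen.add(current)
        entry = db[current]  # raises KeyError if the FID is not in the database
        parts.append(entry.short_name)
        current = entry.parent_fid
    # parts runs leaf -> root; the root's empty short name yields the leading separator
    return sep.join(reversed(parts)) or sep


# Sample database contents (made-up FIDs and names)
db = {
    1: NameEntry("", None),       # root directory
    2: NameEntry("users", 1),
    3: NameEntry("reports", 2),
    4: NameEntry("q1.doc", 3),
}

print(absolute_name(db, 4))  # prints /users/reports/q1.doc
```

Because each row stores only a short name and a parent FID, renaming a directory in this sketch touches one row, and every descendant resolves to the new path on the next lookup.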
20 Claims
1. A method for performing data replication, the method comprising:
monitoring with one or more processors of a source storage system a plurality of journal entries associated with writing data to a source storage device associated with the source storage system;
identifying a first journal entry of the plurality of journal entries, the first journal entry comprising a first data write operation, a first file identifier descriptor (FID) of a file to be modified by the first data write operation on the source storage device, and a first location of a first portion of the file to be modified, wherein a FID identifies a file or directory of a file system on the source storage system and is usable to construct an absolute file name for transmitting data to a replication system;
identifying a second journal entry of the plurality of journal entries, the second journal entry comprising a second data write operation, a second FID of a file to be modified by the second data write operation on the source storage device, and a second location of a second portion of the file to be modified, wherein the first journal entry and the second journal entry are from the same journal;
determining with the one or more processors that the first and second data write operations can be combined into a single write operation based on a determination that the first and second FIDs both correspond to a first value;
combining the first and second data write operations based on said determination;
constructing with the one or more processors an absolute file name by associating the first value with a short name and at least one directory name, wherein neither the first nor second journal entries comprises the absolute file name; and
transmitting the single write operation and the absolute file name to a destination storage device to replay on the destination storage device the data modifications associated with the first and second write operations, wherein the destination storage device stores a replicated version of data written to the source storage device.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
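The combining step of claim 1 can be sketched as grouping journal entries by FID. The Python below is a minimal illustration under stated assumptions: the `JournalEntry` and `CombinedWrite` records, their field names, and the choice to represent the single write operation as an ordered list of (offset, data) extents are all hypothetical; the claim itself only requires that the two operations be combined when their FIDs correspond to the same value.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class JournalEntry:
    """Hypothetical journal record: a write to part of a file identified by FID."""
    fid: int
    offset: int   # location of the portion of the file to be modified
    data: bytes


@dataclass
class CombinedWrite:
    """Hypothetical single write operation carrying several extents for one file."""
    fid: int
    extents: List[Tuple[int, bytes]] = field(default_factory=list)


def combine_writes(entries: List[JournalEntry]) -> List[CombinedWrite]:
    """Group writes from one journal by FID, keeping journal order within each file."""
    combined: Dict[int, CombinedWrite] = {}
    for entry in entries:
        if entry.fid not in combined:
            combined[entry.fid] = CombinedWrite(entry.fid)
        combined[entry.fid].extents.append((entry.offset, entry.data))
    # dicts preserve insertion order, so files appear in first-seen order
    return list(combined.values())


journal = [
    JournalEntry(fid=42, offset=0, data=b"AAAA"),
    JournalEntry(fid=7, offset=100, data=b"zz"),
    JournalEntry(fid=42, offset=4096, data=b"BBBB"),
]
ops = combine_writes(journal)
print(len(ops))  # prints 2: one operation for FID 42, one for FID 7
```

Note that grouping moves the second write to FID 42 ahead of the write to FID 7. This sketch assumes that reordering writes across different files is acceptable; a real replication system may need to enforce stricter ordering.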
12. A system for performing data replication, the system comprising:
at least one computer application executing on a computing device and configured to generate operations associated with data on a source storage device associated with a source storage system;
a filter module disposed between the at least one computer application and the source storage device, the filter module configured to:
  monitor a plurality of journal entries associated with writing data to the source storage device;
  identify a first journal entry of the plurality of journal entries, the first journal entry comprising a first data modification operation, a first file identifier descriptor (FID) of a file to be modified by the first data modification operation, and a first location of a first portion of the file to be modified, wherein a FID identifies a file or directory of a file system on the source storage system and is usable to construct an absolute file name for transmitting data to a replication system;
  identify a second journal entry of the plurality of journal entries, the second journal entry comprising a second data modification operation, a second FID of a file to be modified by the second data modification operation, and a second location of a second portion of the file to be modified, wherein the first journal entry and the second journal entry are from the same journal;
a processing module configured to:
  determine that the first and second data modification operations can be combined into a single modification operation based on a determination that the first and second FIDs both correspond to a first value; and
  combine the first and second data modification operations based on said determination; and
at least one database thread configured to construct an absolute file name for replaying the single modification operation on replication data of a destination storage device by associating the first value with a short name and at least one directory name, wherein neither the first nor second journal entries comprises the absolute file name,
wherein the destination storage device stores a replicated version of data written to the source storage device.
Dependent claims: 13, 14, 15, 16, 17, 18
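The hand-off from the processing module to the database thread in claim 12 can be sketched with a worker thread fed by a queue. Everything named here is an assumption for illustration: the `NAMES` table, the `resolve` and `database_thread` functions, the tuple layout of a combined operation, and the use of `None` as a shutdown signal are not drawn from the specification.

```python
import queue
import threading
from typing import Dict, List, Optional, Tuple

# Hypothetical name table: FID -> (short name, parent FID or None for the root)
NAMES: Dict[int, Tuple[str, Optional[int]]] = {
    1: ("", None),
    2: ("data", 1),
    3: ("log.txt", 2),
}


def resolve(fid: int) -> str:
    """Join short names from the given FID up to the root."""
    parts: List[str] = []
    current: Optional[int] = fid
    while current is not None:
        name, current = NAMES[current]
        parts.append(name)
    return "/".join(reversed(parts)) or "/"


def database_thread(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Attach an absolute file name to each combined operation until told to stop."""
    while True:
        item = inbox.get()
        if item is None:          # shutdown signal (assumed convention)
            outbox.put(None)
            break
        fid, extents = item
        outbox.put((resolve(fid), extents))


inbox: queue.Queue = queue.Queue()
outbox: queue.Queue = queue.Queue()
worker = threading.Thread(target=database_thread, args=(inbox, outbox))
worker.start()

# A combined operation as the processing module might emit it: (FID, extents)
inbox.put((3, [(0, b"ab"), (10, b"cd")]))
inbox.put(None)
worker.join()

results = []
while (item := outbox.get()) is not None:
    results.append(item)
print(results[0][0])  # prints /data/log.txt
```

In this arrangement the journal entries and the combined operation carry only the FID; the absolute name is attached at the last stage, just before the operation would be sent for replay.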
19. A system for performing data replication, the system comprising:
means for monitoring a plurality of journal entries associated with writing data to a source storage device associated with a source storage system;
means for identifying a first journal entry of the plurality of journal entries, the first journal entry comprising a first data write operation, a first file identifier descriptor (FID) of a file to be modified on the source storage device, and a first location of a first portion of the file to be modified, and for identifying a second journal entry of the plurality of journal entries, the second journal entry comprising a second data write operation, a second FID of a file to be modified on the source storage device, and a second location of a second portion of the file to be modified, wherein a FID identifies a file or directory of a file system on the source storage system and is usable to construct an absolute file name for transmitting data to a replication system, and wherein the first journal entry and the second journal entry are from the same journal;
means for determining that the first and second data write operations can be combined into a single write operation based on a determination that the first and second FIDs both correspond to a first value, and for combining the first and second data write operations based on said determination;
means for constructing an absolute file name by associating the first value with a short name and at least one directory name, wherein neither the first nor second journal entries comprises the absolute file name; and
means for transmitting the single write operation and the absolute file name to a destination storage device to replay on the destination storage device the data modifications associated with the first and second write operations, wherein the destination storage device stores a replicated version of data written to the source storage device.
Dependent claims: 20
Specification