Systems and methods for performing data replication
First Claim
1. A method for identifying data to be copied in a data replication system, the method comprising:
obtaining with a scanning module executing on a computing device a first file identifier descriptor (FID) of a first directory on a live source file system, the first FID being one of a plurality of unique identifiers corresponding to a plurality of directories and files on the source file system;
adding the first FID to a queue;
storing a current journal sequence number from a file system filter driver identifying a first time;
following said storing, accessing a current directory of the plurality of directories on the source file system that corresponds to a next FID stored in the queue;
obtaining additional FIDs for each immediate child directory and immediate child file in the current directory;
if no changes have been made to the current directory since the first time, populating a file name database with the additional FIDs of each immediate child directory and immediate child file in the current directory, adding the additional FIDs of each immediate child directory of the current directory to the queue, and removing the next FID from the queue; and
if changes have been made to the first directory since the first time, repeating said storing, said accessing and said obtaining the additional FIDs.
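The steps of claim 1 can be read as a breadth-first scan that re-reads a directory whenever it changed during the read. The following is a minimal sketch under that reading, not the patented implementation. The callables `list_children`, `current_seq`, and `changed_since` are hypothetical stand-ins for the file system and the filter driver's journal, and storing a short name and parent FID per entry follows the abstract rather than the claim itself.

```python
from collections import deque


def scan_live_tree(root_fid, list_children, current_seq, changed_since):
    """Scan a live directory tree without a snapshot (illustrative only).

    Hypothetical callables supplied by the caller:
      list_children(fid)      -> iterable of (child_fid, short_name, is_dir)
      current_seq()           -> current journal sequence number
      changed_since(fid, seq) -> True if directory `fid` changed after `seq`
    """
    name_db = {}               # child FID -> (short name, parent FID)
    queue = deque([root_fid])  # FIDs of directories still to be scanned

    while queue:
        fid = queue[0]                        # next FID; stays queued until done
        seq = current_seq()                   # store the "first time"
        children = list(list_children(fid))   # obtain FIDs of immediate children

        if changed_since(fid, seq):
            # Directory was modified during the read: repeat the storing,
            # accessing and obtaining steps for the same FID.
            continue

        for child_fid, short_name, is_dir in children:
            name_db[child_fid] = (short_name, fid)
            if is_dir:
                queue.append(child_fid)       # scan child directories later
        queue.popleft()                       # remove the finished FID

    return name_db
```

Leaving the FID at the head of the queue until its directory has been read without interference is what allows the retry to be expressed as a plain `continue`.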
Abstract
Preparing source data to be replicated in a continuous data replication environment. Certain systems and methods populate a file name database with entries having a unique file identifier descriptor (FID), short name and a FID of the parent directory of each directory or file on a source storage device. Such information is advantageously gathered during scanning of a live file system without requiring a snapshot of the source storage device. The database can be further used to generate absolute file names associated with data operations to be replayed on a destination storage device. Based on the obtained FIDs, certain embodiments can further combine write operations to be replayed on the destination storage device and/or avoid replicating temporary files to the destination system.
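Because each database entry holds a short name and the FID of its parent, an absolute file name can be rebuilt by walking parent links up to the root. The sketch below illustrates that idea only. The dictionary layout, the `root_fid` argument, and the `/` separator are assumptions made for the example and are not taken from the patent.

```python
def absolute_name(fid, name_db, root_fid, sep="/"):
    """Rebuild an absolute name from FID -> (short name, parent FID) entries.

    `name_db`, `root_fid` and `sep` are hypothetical; the source only states
    that each entry carries a FID, a short name and the parent's FID.
    """
    parts = []
    seen = set()
    while fid != root_fid:
        if fid in seen:
            # Guard against a corrupted database with a parent-link cycle.
            raise ValueError("cycle in parent links")
        seen.add(fid)
        short_name, parent_fid = name_db[fid]  # KeyError if FID is unknown
        parts.append(short_name)
        fid = parent_fid
    # Names were collected leaf-first, so reverse them before joining.
    return sep + sep.join(reversed(parts))
```

For example, with entries `{2: ("docs", 1), 4: ("b.txt", 2)}` and root FID 1, FID 4 resolves to `/docs/b.txt`.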
17 Claims
11. A system for preparing data for replication from a source computing device in a network, the system comprising:
a queue configured to store a plurality of file identifier descriptors (FIDs) each comprising a unique identifier that corresponds to one of a plurality of directories and files on a source file system;
a scanning module executing on a computing device and configured to scan the source file system while in a live state and to populate the queue with the plurality of FIDs;
a database comprising file name data that associates each of the plurality of FIDs with a short name and a parent FID, wherein the scanning module is further configured to populate the database with the file name data based on said scan of the source file system in the live state; and
at least one database thread configured to receive a data entry identifying a data management operation associated with at least one of the plurality of directories and files on the source file system and to construct from the FID associated with the at least one directory or file an absolute file name for transmission to a destination system along with a copy of the data management operation for replaying on the destination system.
Specification