Systems and methods for performing data replication
Abstract
Preparing source data to be replicated in a continuous data replication environment. Certain systems and methods populate a file name database with entries having a unique file identifier descriptor (FID), short name and a FID of the parent directory of each directory or file on a source storage device. Such information is advantageously gathered during scanning of a live file system without requiring a snapshot of the source storage device. The database can be further used to generate absolute file names associated with data operations to be replayed on a destination storage device. Based on the obtained FIDs, certain embodiments can further combine write operations to be replayed on the destination storage device and/or avoid replicating temporary files to the destination system.
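As an illustration of how a database of FID, short name, and parent FID entries could be used to generate an absolute file name, the following is a minimal sketch. The `Entry` type, the `absolute_name` function, the use of `None` as the root's parent, and the integer FIDs are all assumptions made for the example; the abstract does not specify a schema or a lookup routine.

```python
from typing import Dict, NamedTuple, Optional


class Entry(NamedTuple):
    """One row of a hypothetical file name database."""
    short_name: str
    parent_fid: Optional[int]  # assumption: None marks the root directory


def absolute_name(db: Dict[int, Entry], fid: int, sep: str = "/") -> str:
    """Walk parent FIDs up to the root and join the short names."""
    parts = []
    seen = set()
    current: Optional[int] = fid
    while current is not None:
        if current in seen:
            # Guard against a corrupted database with a parent cycle.
            raise ValueError(f"cycle detected at FID {current}")
        seen.add(current)
        entry = db[current]
        parts.append(entry.short_name)
        current = entry.parent_fid
    # Names were collected leaf-first, so reverse before joining.
    return sep.join(reversed(parts)) or sep


# Illustrative data only: FID 1 is the root, with an empty short name.
db = {
    1: Entry("", None),
    2: Entry("users", 1),
    3: Entry("report.txt", 2),
}
print(absolute_name(db, 3))
```

Storing only the short name and parent FID per entry means a directory rename touches one row, while every descendant's absolute name is still recoverable by the walk above.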
20 Claims
1. A method for identifying data to be copied in a data replication system, the method comprising:
using a computing device, adding a first file identifier descriptor (FID) of a first directory on a live source file system to a queue, the first FID being one of a plurality of unique identifiers corresponding to a plurality of directories and files on the source file system;
storing a current journal sequence number from a file system filter driver identifying a first time;
following said storing, accessing a current directory of the plurality of directories on the source file system that corresponds to a next FID stored in the queue;
obtaining additional FIDs for each immediate child directory and immediate child file in the current directory; and
in response to determining that no changes have been made to the current directory since the first time:
populating a file name database with the additional FIDs of each immediate child directory and immediate child file in the current directory;
adding the additional FIDs of each immediate child directory of the current directory to the queue; and
removing the next FID from the queue.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A system for preparing data for replication from a source computing device in a network, the system comprising:
one or more memory devices containing:
a queue including a plurality of file identifier descriptors (FIDs) each comprising a unique identifier that corresponds to one of a plurality of directories and files on a source file system; and
a database comprising file name data that associates each of the plurality of FIDs with a short name and a parent FID;
a computing system comprising one or more computing devices comprising computer hardware, the computing system configured to:
scan the source file system while in a live state and to populate the queue with the plurality of FIDs;
access a current directory of the plurality of directories on the source file system that corresponds to a next FID in the queue; and
obtain additional FIDs for each immediate child directory and immediate child file in the current directory; and
populate the database with the file name data based on said scan of the source file system in the live state.
Dependent claims: 14, 15, 16, 17, 18, 19
20. A system for identifying data to be copied in a data replication system, the system comprising:
one or more computing devices comprising computer hardware and configured to:
add a first file identifier descriptor (FID) of a first directory on a live source file system to a queue, the first FID being one of a plurality of unique identifiers corresponding to a plurality of directories and files on the source file system;
store a current journal sequence number from a file system filter driver identifying a first time;
following said storing, access a current directory of the plurality of directories on the source file system that corresponds to a next FID stored in the queue;
obtain additional FIDs for each immediate child directory and immediate child file in the current directory; and
in response to determining that no changes have been made to the current directory since the first time:
populate a file name database with the additional FIDs of each immediate child directory and immediate child file in the current directory;
add the additional FIDs of each immediate child directory of the current directory to the queue; and
remove the next FID from the queue.
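The queue-driven scan recited in claims 1 and 20 can be sketched roughly as follows. This is an illustrative reading, not the patented implementation: the callables `list_children`, `current_sequence`, and `changed_since` are hypothetical stand-ins for the source file system and the filter driver's journal, and two behaviors are assumptions that the independent claims do not spell out, namely re-reading a directory when a change is detected and bounding the retries with `max_retries`.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple

# Hypothetical child record: (fid, short_name, is_directory).
Child = Tuple[int, str, bool]


def scan_live_tree(
    root_fid: int,
    list_children: Callable[[int], List[Child]],
    current_sequence: Callable[[], int],
    changed_since: Callable[[int, int], bool],
    max_retries: int = 3,
) -> Dict[int, Tuple[str, int]]:
    """Breadth-first scan that records (short_name, parent_fid) per child FID."""
    name_db: Dict[int, Tuple[str, int]] = {}
    queue = deque([root_fid])          # add the first directory's FID
    retries = 0
    while queue:
        next_fid = queue[0]            # peek; removed only after success
        first_time = current_sequence()  # store the journal sequence number
        children = list_children(next_fid)
        if changed_since(next_fid, first_time):
            # Assumption: on a detected change, re-read the same directory.
            retries += 1
            if retries > max_retries:
                raise RuntimeError(f"directory {next_fid} keeps changing")
            continue
        retries = 0
        for fid, short_name, is_dir in children:
            name_db[fid] = (short_name, next_fid)  # populate the database
            if is_dir:
                queue.append(fid)      # child directories are scanned later
        queue.popleft()                # remove the processed FID
    return name_db


# Illustrative in-memory tree: FID 1 is the root directory.
TREE = {
    1: [(2, "docs", True), (3, "a.txt", False)],
    2: [(4, "b.txt", False)],
}
name_db = scan_live_tree(
    root_fid=1,
    list_children=lambda fid: TREE.get(fid, []),
    current_sequence=lambda: 0,
    changed_since=lambda fid, seq: False,
)
print(name_db)
```

Because a FID leaves the queue only after its directory was read without an intervening change, the sketch needs no snapshot of the source: a directory that is modified mid-read is simply read again. Note that the root directory itself gets no entry here, since only child FIDs are written to the database.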
Specification