Systems and methods for selective data replication
First Claim
1. A method for performing data replication, the method comprising:
using one or more computer processors, performing a first level assessment on files in first data stored on a first storage device that is associated with a source system and on corresponding files in second data stored on a second storage device that is associated with a destination system in networked communication with the source system, at least a portion of the second data previously having been replicated from the first data, the first level assessment comprising:
comparing one or more attributes of the files in the first data with those of the corresponding files in the second data, and
identifying a file having at least one attribute of the one or more attributes different in the first and second data;
comparing the size of the identified file with a selected threshold value;
if the size of the identified file is less than or equal to the selected threshold value, replicating the identified file from the first storage device to the second storage device regardless of whether a checksum for the identified file in the first data matches a checksum for the corresponding file in the second data; and
if the size of the identified file is greater than the selected threshold value, performing a second level assessment on the identified file using one or more computer processors, the second level assessment comprising:
obtaining checksums for the identified file in the first data and its corresponding file in the second data;
comparing the checksums;
if the checksums are different, replicating the identified file from the first storage device to the second storage device; and
if the checksums are the same, synchronizing the at least one attribute of the identified file in the first data and the corresponding file in the second data, and not replicating the identified file from the first storage device to the second storage device.
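The decision flow recited in claim 1 can be illustrated with a minimal Python sketch. The names FileView, Action and decide are hypothetical and do not come from the patent. Size and modification time stand in for the "one or more attributes", and the SKIP outcome for files whose attributes already match is an assumption, since the claim only addresses files identified as differing. Checksums are passed as callables so that the sketch computes them only when the second level assessment actually runs.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Action(Enum):
    SKIP = "skip"                        # assumption: attributes match, nothing to do
    REPLICATE = "replicate"              # copy the source file to the destination
    SYNC_ATTRIBUTES = "sync_attributes"  # content intact; update attributes only


@dataclass(frozen=True)
class FileView:
    """Hypothetical view of one copy of a file (source or destination)."""
    size: int                    # bytes
    mtime: float                 # example attribute compared at the first level
    checksum: Callable[[], str]  # evaluated lazily, only at the second level


def decide(source: FileView, replica: FileView, threshold: int) -> Action:
    # First level assessment: compare inexpensive attributes.
    if (source.size, source.mtime) == (replica.size, replica.mtime):
        return Action.SKIP
    # At or below the threshold: replicate without consulting checksums.
    if source.size <= threshold:
        return Action.REPLICATE
    # Second level assessment: only for files above the threshold.
    if source.checksum() != replica.checksum():
        return Action.REPLICATE
    return Action.SYNC_ATTRIBUTES
```

Under these assumptions, a large file whose timestamp changed but whose content did not yields SYNC_ATTRIBUTES, so only the attribute is brought into agreement and the file itself is not transferred.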
Abstract
Systems and methods for performing data replication are disclosed. Determining whether to update replicated data typically involves comparison of readily obtainable attributes of a given source file and its corresponding replicated file. Such attributes can be obtained from, for example, metadata. In certain situations, an additional assessment of the source and replicated files can be beneficial. For example, if integrity of an existing replicated file's content is maintained, one may not want to re-replicate the corresponding source file. For large source files, such a decision can provide substantial reductions in expenditures of available computing and network resources. In certain embodiments, a threshold for identifying such large files can be based on one or more operating parameters such as network type and available bandwidth. In certain embodiments, a replication file's integrity can be checked by calculating and comparing checksums for the replication file and its corresponding source file.
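The checksum comparison mentioned in the abstract might look like the following Python sketch. The abstract does not name a checksum algorithm, so the use of SHA-256 here is an assumption, as are the function names file_checksum and content_matches and the 1 MiB chunk size. Reading in chunks keeps memory use bounded for the large files that the checksum assessment targets.

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return a hex digest of the file, read in fixed-size chunks.

    SHA-256 is an assumed choice; the source does not specify an algorithm.
    """
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def content_matches(source: Path, replica: Path) -> bool:
    """True when both files produce the same digest."""
    return file_checksum(source) == file_checksum(replica)
```

This sketch reads both files from local paths for simplicity. How the digests are obtained across the source and destination systems is not stated in the abstract.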
15 Claims
1. A method for performing data replication (recited in full under First Claim above). Dependent claims: 2, 3, 4, 5, 6, 7, 8.
9. A data replication system, comprising:
a data storage system comprising a destination storage device configured to store replication of at least a portion of data from a client system, the client system comprising a source storage device and capable of communicating with the data storage system to facilitate transfer of data therebetween; and
a replication agent executing in one or more computer processors, in communication with the client system and the data storage system, and configured to perform a first level assessment of an identified file stored on the source storage device of the client system to determine that the identified file has at least one metadata attribute that is different from that of an existing replicated copy of the identified file, the replicated copy stored on the destination storage device of the data storage system, the replication agent further configured to:
obtain a size of the identified file,
compare the size of the identified file with a threshold value,
if the size is less than or equal to the threshold value, replicate the identified file so as to replace or update the existing replicated copy of the identified file, without determining whether a checksum for the identified file matches a checksum for the replicated copy of the identified file, and
if the size is greater than the threshold value, perform a second level assessment on the identified file, the second level assessment comprising:
(1) obtaining and comparing checksums of the identified file and the replicated file, and
(2) replicating the identified file so as to replace or update the existing replicated copy of the identified file if the checksums are different.
Dependent claims: 10, 11, 12, 13, 14.
15. A non-transitory computer readable medium configured to store software code that is readable by a computing system, wherein the software code is executable on the computing system in order to cause the computing system to perform operations comprising:
using one or more computer processors, performing a first level assessment on files in first data stored on a first storage device that is associated with a source system and on corresponding files in second data stored on a second storage device that is associated with a destination system in networked communication with the source system, at least a portion of the second data previously having been replicated from the first data, the first level assessment comprising:
comparing one or more attributes of the files in the first data with those of the corresponding files in the second data, and
identifying a file having at least one attribute of the one or more attributes different in the first and second data;
comparing the size of the identified file with a selected threshold value;
if the size of the identified file is less than or equal to the selected threshold value, replicating the identified file from the first storage device to the second storage device without determining whether a checksum for the identified file in the first data matches a checksum for the corresponding file in the second data; and
if the size of the identified file is greater than the selected threshold value, performing a second level assessment on the identified file using one or more computer processors, the second level assessment comprising:
obtaining checksums for the identified file in the first data and its corresponding file in the second data;
comparing the checksums;
if the checksums are different, replicating the identified file from the first storage device to the second storage device; and
if the checksums are the same, synchronizing the at least one attribute of the identified file in the first data and the corresponding file in the second data, and not replicating the identified file from the first storage device to the second storage device.
Specification