Methods and systems for file replication utilizing differences between versions of files
First Claim
1. A system for performing file store synchronization across multiple servers, comprising:
- a processor; and
a memory coupled with and readable by the processor and storing therein a set of instructions which, when executed by the processor, causes the processor to initiate a file store synchronization process by;
receiving a first file from a first computer server, the first file comprising a plurality of physical blocks of data located at various allocated positions on a memory device of the first computer server;
receiving a second file from a second computer server, the second file comprising a second plurality of physical blocks of data located at various allocated positions on the memory device of the second computer server;
performing, in logical order for each particular physical block of the first plurality of the physical blocks of the first file;
(a) determining a first signature parameter for the particular physical block of the first file by retrieving a predetermined number of bits from a predetermined location within the particular physical block of the first file, wherein the first signature parameter for the particular physical block of the first file is determined based upon a subset of the data within the particular physical block that is less than all of the data in the particular physical block;
(b) determining a first signature parameter for a corresponding physical block of the second file by retrieving the same predetermined number of bits from the same predetermined location within the corresponding physical block of the second file, wherein the first signature parameter for the corresponding physical block of the second file is determined based upon a subset of the data within the corresponding physical block that is less than all of the data in the corresponding physical block;
(c) determining whether the retrieved bits comprising the first signature parameter for the particular physical block of the first file match the retrieved bits comprising the first signature parameter for the corresponding physical block of the second file; and
(d) in response to determining that the first signature parameter for the particular physical block of the first file matches the first signature parameter of the corresponding physical block of the second file, performing the following additional steps for the particular physical block of the first file;
(i) determining a second signature parameter for the particular physical block of the first file, by executing at least one of a cyclic redundancy check (CRC) algorithm or an MD5 algorithm, using as input all bits of the particular physical block of the first file;
(ii) determining a second signature parameter for the corresponding physical block of the second file, by executing at least one of a cyclic redundancy check (CRC) algorithm or an MD5 algorithm, using as input all bits of the corresponding physical block of the second file;
(iii) determining whether the second signature parameter comprising the output of the one or more algorithms for the particular physical block of the first file matches the second signature parameter comprising the output of the one or more algorithms for the corresponding physical block of the second file; and
(iv) in response to determining that the second signature parameter of the particular physical block of the first file matches the second signature parameter of the corresponding physical block of the second file, creating a delta file using the second signature parameters of the at least one physical block of the second file by generating primitive commands and corresponding logical data parameters for the primitive commands, wherein the logical data parameters comprise logical offset addresses for the base file and logical data lengths for the base file; and
initiating a file store synchronization process between the first computer server and the second computer server, using the delta file as input to the file store synchronization process.
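Step (iv) of the claim describes a delta file made of primitive commands, each carrying logical offsets and lengths into the base file. The following is a minimal sketch of that idea, not the patented implementation. The command names `COPY` and `INSERT`, the 8-byte block size, the 2-byte sample used for the first signature, and the helper names are all assumptions chosen for illustration. A CRC match is treated here as block equality, although CRC collisions are possible.

```python
import zlib
from typing import List, Tuple, Union

BLOCK_SIZE = 8  # hypothetical, tiny block size for illustration only

Copy = Tuple[str, int, int]   # ("COPY", base_offset, length): hypothetical command
Insert = Tuple[str, bytes]    # ("INSERT", literal_data): hypothetical command
Command = Union[Copy, Insert]


def split_blocks(data: bytes) -> List[bytes]:
    """Split data into fixed-size blocks in logical order."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def same_block(a: bytes, b: bytes) -> bool:
    """Cheap sampled check first; full-block CRC only if the sample matches."""
    if a[:2] != b[:2]:  # first signature: assumed 2-byte sample at offset 0
        return False
    return len(a) == len(b) and zlib.crc32(a) == zlib.crc32(b)


def make_delta(base: bytes, revised: bytes) -> List[Command]:
    """Emit COPY for blocks matching the base file, INSERT for the rest."""
    base_blocks = split_blocks(base)
    delta: List[Command] = []
    for index, block in enumerate(split_blocks(revised)):
        offset = index * BLOCK_SIZE
        last = delta[-1] if delta else None
        if index < len(base_blocks) and same_block(base_blocks[index], block):
            # Merge with a preceding COPY when the base ranges are contiguous.
            if last and last[0] == "COPY" and last[1] + last[2] == offset:
                delta[-1] = ("COPY", last[1], last[2] + len(block))
            else:
                delta.append(("COPY", offset, len(block)))
        elif last and last[0] == "INSERT":
            delta[-1] = ("INSERT", last[1] + block)
        else:
            delta.append(("INSERT", block))
    return delta


def apply_delta(base: bytes, delta: List[Command]) -> bytes:
    """Rebuild the revised file from the base file and the delta commands."""
    out = bytearray()
    for command in delta:
        if command[0] == "COPY":
            _, offset, length = command
            out += base[offset:offset + length]
        else:
            out += command[1]
    return bytes(out)
```

In this sketch a receiving server that already holds the base file needs only the delta: `apply_delta(base, make_delta(base, revised))` reproduces the revised file, and only the `INSERT` payloads carry literal data.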
2 Assignments
0 Petitions
Abstract
Methods and systems for efficient file replication are provided. In some embodiments, one or more coarse signatures for blocks in a base file are compared with the coarse signatures for blocks of a revised file until a match is found. A fine signature is then generated for the matching block of the revised file and compared to a fine signature of the base file. Thus, fine signatures are not computed unless a coarse signature match has been found, thereby minimizing unneeded, time-consuming fine signature calculations. Methods are also provided for determining whether to initiate a delta file generation algorithm, or whether to utilize a more efficient replication method, based upon system and/or file parameters. In accordance with additional embodiments, the lengths of valid data on physical blocks are obtained from physical block mappings for the files, and these lengths and mappings are utilized for delta file generation, to minimize unnecessary signature computations.
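The coarse-then-fine comparison described in the abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the method as implemented by the patentee: the sampled location (`COARSE_OFFSET`), the sample length (`COARSE_LEN`), the choice of CRC-32 as the fine signature, and the `stats` counter are all hypothetical.

```python
import zlib
from typing import Dict, Optional

COARSE_OFFSET = 0  # assumed predetermined location within a block
COARSE_LEN = 4     # assumed number of bytes sampled for the coarse signature


def coarse_signature(block: bytes) -> bytes:
    """Cheap signature: a small, fixed slice of the block (less than all data)."""
    return block[COARSE_OFFSET:COARSE_OFFSET + COARSE_LEN]


def fine_signature(block: bytes) -> int:
    """Expensive signature: CRC-32 computed over every byte of the block."""
    return zlib.crc32(block)


def blocks_match(base_block: bytes, revised_block: bytes,
                 stats: Optional[Dict[str, int]] = None) -> bool:
    """Compute the fine signatures only after the coarse signatures match."""
    if coarse_signature(base_block) != coarse_signature(revised_block):
        return False  # rejected cheaply; no fine signature is computed
    if stats is not None:
        stats["fine"] += 1  # count how often the expensive path runs
    return fine_signature(base_block) == fine_signature(revised_block)
```

The `stats` counter exists only to make the saving visible: blocks whose sampled bytes differ are rejected without any full-block computation.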
69 Citations
19 Claims
1. A system for performing file store synchronization across multiple servers (set out in full under First Claim above).
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer-readable memory device comprising a set of instructions stored therein which, when executed by a processor, causes the processor to perform file store synchronization across multiple servers, by:
receiving a first file from a first computer server, the first file comprising a first plurality of physical blocks of data located at various allocated positions on a memory device of the first computer server;
receiving a second file from a second computer server, the second file comprising a plurality of physical blocks of data located at various allocated positions on the memory device of the second computer server;
performing the following steps for each particular physical block of the first plurality of physical blocks in the first file;
(a) determining a first signature parameter for the particular physical block of the first file by retrieving a predetermined number of bits from a predetermined location within the particular physical block of the first file, wherein the first signature parameter for the particular physical block of the first file is determined based upon a subset of the data within the particular physical block that is less than all of the data in the particular physical block;
(b) determining a first signature parameter for a corresponding physical block of the second file by retrieving the same predetermined number of bits from the same predetermined location within the corresponding physical block of the second file, wherein the first signature parameter for the corresponding physical block of the second file is determined based upon a subset of the data within the corresponding physical block that is less than all of the data in the corresponding physical block of the second file;
(c) determining whether the retrieved bits comprising the first signature parameter for the particular physical block of the first file match the retrieved bits comprising the first signature parameter for the corresponding physical block of the second file; and
(d) in response to determining that the first signature parameter for the particular physical block of the first file matches the first signature parameter of the corresponding physical block of the second file, performing the following additional steps for the particular physical block of the first file;
(i) determining a second signature parameter for the particular physical block of the first file, by executing at least one of a cyclic redundancy check (CRC) algorithm or an MD5 algorithm, using as input all bits of the particular physical block of the first file;
(ii) determining a second signature parameter for the corresponding physical block of the second file, by executing at least one of a cyclic redundancy check (CRC) algorithm or an MD5 algorithm, using as input all bits of the corresponding physical block of the second file;
(iii) determining whether the second signature parameter comprising the output of the one or more algorithms for the particular physical block of the first file matches the second signature parameter comprising the output of the one or more algorithms for the corresponding physical block of the second file; and
(iv) in response to determining that the second signature parameter of the particular physical block of the first file matches the second signature parameter of the corresponding physical block of the second file, creating a delta file using the second signature parameters of the at least one physical block of the second file by generating primitive commands and corresponding logical data parameters for the primitive commands, wherein the logical data parameters comprise logical offset addresses for the base file and logical data lengths for the base file; and
initiating a file store synchronization process between the first computer server and the second computer server, using the delta file as input to the file store synchronization process.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A method of performing file store synchronization across multiple servers, the method comprising:
receiving a first file from a first computer server, the first file comprising a first plurality of physical blocks of data located at various allocated positions on a memory device of the first computer server;
receiving a second file from a second computer server, the second file comprising a second plurality of physical blocks of data located at various allocated positions on a memory device of the second computer server;
for each particular physical block of the first plurality of physical blocks in the first file, performing the following steps;
(a) determining a first signature parameter for the particular physical block of the first file by retrieving a predetermined number of bits from a predetermined location within the particular physical block of the first file, wherein the first signature parameter for the particular physical block of the first file is determined based upon a subset of the data within the particular physical block that is less than all of the data in the particular physical block;
(b) determining a first signature parameter for a corresponding physical block of the second file by retrieving the same predetermined number of bits from the same predetermined location within the corresponding physical block of the second file, wherein the first signature parameter for the corresponding physical block of the second file is determined based upon a subset of the data within the corresponding physical block that is less than all of the data in the corresponding physical block of the second file;
(c) determining whether the retrieved bits comprising the first signature parameter for the particular physical block of the first file match the retrieved bits comprising the first signature parameter for the corresponding physical block of the second file; and
(d) in response to determining that the first signature parameter for the particular physical block of the first file matches the first signature parameter of the corresponding physical block of the second file, performing the following additional steps for the particular physical block of the first file;
(i) determining a second signature parameter for the particular physical block of the first file, by executing at least one of a cyclic redundancy check (CRC) algorithm or an MD5 algorithm, using as input all bits of the particular physical block of the first file;
(ii) determining a second signature parameter for the corresponding physical block of the second file, by executing at least one of a cyclic redundancy check (CRC) algorithm or an MD5 algorithm, using as input all bits of the corresponding physical block of the second file;
(iii) determining whether the second signature parameter comprising the output of the one or more algorithms for the particular physical block of the first file matches the second signature parameter comprising the output of the one or more algorithms for the corresponding physical block of the second file; and
(iv) in response to determining that the second signature parameter of the particular physical block of the first file matches the second signature parameter of the corresponding physical block of the second file, creating a delta file using the second signature parameters of the at least one physical block of the second file by generating primitive commands and corresponding logical data parameters for the primitive commands, wherein the logical data parameters comprise logical offset addresses for the base file and logical data lengths for the base file; and
initiating a file store synchronization process between the first computer server and the second computer server, using the delta file as input to the file store synchronization process.
Dependent claims: 16, 17, 18, 19
Specification