Methods and systems for file replication utilizing differences between versions of files
Abstract
Methods and systems for efficient file replication are provided. In some embodiments, one or more coarse signatures for blocks in a base file are compared with coarse signatures for blocks of a revised file until a match is found. A fine signature is then generated for the matching block of the revised file and compared to a fine signature of the base file. Thus, fine signatures are not computed unless a coarse signature match has been found, which avoids unnecessary, time-consuming fine signature calculations. Methods are also provided for determining whether to initiate a delta file generation algorithm, or whether to utilize a more efficient replication method, based upon system and/or file parameters. In accordance with additional embodiments, the lengths of valid data on physical blocks are obtained from physical block mappings for the files, and these lengths and mappings are utilized for delta file generation, to minimize unnecessary signature computations.
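The coarse-then-fine ordering described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the block size, the use of Adler-32 as a coarse signature, the use of SHA-256 as a fine signature, and all function names are assumptions chosen for illustration, and blocks are compared by position only.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed block size in bytes; the abstract does not fix one


def coarse_signature(block: bytes) -> int:
    """Cheap checksum standing in for a coarse signature (Adler-32 is an assumption)."""
    return zlib.adler32(block)


def fine_signature(block: bytes) -> bytes:
    """Slower, stronger hash standing in for a fine signature (SHA-256 is an assumption)."""
    return hashlib.sha256(block).digest()


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list:
    """Segment data into consecutive blocks of `size` bytes (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def changed_blocks(base: bytes, revised: bytes, size: int = BLOCK_SIZE) -> list:
    """Return indices of revised blocks that differ from the same-index base block."""
    base_blocks = split_blocks(base, size)
    changed = []
    for index, rev_block in enumerate(split_blocks(revised, size)):
        if index >= len(base_blocks):
            changed.append(index)  # block has no counterpart in the base file
            continue
        base_block = base_blocks[index]
        if coarse_signature(rev_block) != coarse_signature(base_block):
            changed.append(index)  # coarse mismatch: the fine signature is skipped
            continue
        # Coarse match: only now pay for the fine signature to confirm it.
        if fine_signature(rev_block) != fine_signature(base_block):
            changed.append(index)
    return changed
```

In this sketch the fine signature is computed only for blocks whose coarse signatures already agree, which is the saving the abstract describes. A practical tool would typically precompute and store the base-file signatures rather than recompute them on every comparison.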
33 Claims
1. A computer implemented method for comparing data of a first and a revised version of a file to determine differences between the versions, comprising:
segmenting the first and the revised versions into blocks of digital data of equal size;
moving a reference frame of a set resolution across adjacent portions of one of the blocks of the revised version, the set resolution of the reference frame being a number of bits less than a number of bits defining the equal size of the blocks of digital data;
obtaining plural coarse signatures of the one of the blocks of the revised version based on said moving the reference frame across adjacent portions of the one of the blocks of the revised version, the set resolution of the reference frame defining the coarse signatures;
comparing each of the plural coarse signatures of the one of the blocks of the revised version to a coarse signature of a comparable one of the blocks of the first version;
determining whether any of the plural coarse signatures of the one of the blocks of the revised version matches the coarse signature of the comparable one of the blocks of the first version based on said comparing; and
in response to determining none of the plural coarse signatures of the one of the blocks of the revised version match the coarse signature of the comparable one of the blocks of the first version, repeating said moving the reference frame across adjacent portions of one of the blocks of the revised version, obtaining plural coarse signatures of the one of the blocks of the revised version based on said moving, comparing each of the plural coarse signatures of the one of the blocks of the revised version to the coarse signature of the comparable one of the blocks of the first version, and determining whether any of the plural coarse signatures of the one of the blocks of the revised version matches the coarse signature of the comparable one of the blocks of the first version based on said comparing.
Dependent claims: 2 through 11.
12. A system comprising:
a processor; and
a memory coupled with and readable by the processor and having stored therein a sequence of instructions which, when executed by the processor, causes the processor to compare data of a first and a revised version of a file to determine differences between the versions by:
segmenting the first and the revised versions into blocks of digital data of equal size,
moving a reference frame of a set resolution across adjacent portions of one of the blocks of the revised version, the set resolution of the reference frame being a number of bits less than a number of bits defining the equal size of the blocks of digital data,
obtaining plural coarse signatures of the one of the blocks of the revised version based on said moving the reference frame across adjacent portions of the one of the blocks of the revised version, the set resolution of the reference frame defining the coarse signatures,
comparing each of the plural coarse signatures of the one of the blocks of the revised version to a coarse signature of a comparable one of the blocks of the first version,
determining whether any of the plural coarse signatures of the one of the blocks of the revised version matches the coarse signature of the comparable one of the blocks of the first version based on said comparing, and
in response to determining none of the plural coarse signatures of the one of the blocks of the revised version match the coarse signature of the comparable one of the blocks of the first version, repeating said moving the reference frame across adjacent portions of one of the blocks of the revised version, obtaining plural coarse signatures of the one of the blocks of the revised version based on said moving, comparing each of the plural coarse signatures of the one of the blocks of the revised version to the coarse signature of the comparable one of the blocks of the first version, and determining whether any of the plural coarse signatures of the one of the blocks of the revised version matches the coarse signature of the comparable one of the blocks of the first version based on said comparing.
Dependent claims: 13 through 22.
23. A computer-readable memory having stored therein a sequence of instructions which, when executed by a processor, causes the processor to compare data of a first and a revised version of a file to determine differences between the versions by:
segmenting the first and the revised versions into blocks of digital data of equal size,
moving a reference frame of a set resolution across adjacent portions of one of the blocks of the revised version, the set resolution of the reference frame being a number of bits less than a number of bits defining the equal size of the blocks of digital data,
obtaining plural coarse signatures of the one of the blocks of the revised version based on said moving the reference frame across adjacent portions of the one of the blocks of the revised version, the set resolution of the reference frame defining the coarse signatures,
comparing each of the plural coarse signatures of the one of the blocks of the revised version to a coarse signature of a comparable one of the blocks of the first version,
determining whether any of the plural coarse signatures of the one of the blocks of the revised version matches the coarse signature of the comparable one of the blocks of the first version based on said comparing, and
in response to determining none of the plural coarse signatures of the one of the blocks of the revised version match the coarse signature of the comparable one of the blocks of the first version, repeating said moving the reference frame across adjacent portions of one of the blocks of the revised version, obtaining plural coarse signatures of the one of the blocks of the revised version based on said moving, comparing each of the plural coarse signatures of the one of the blocks of the revised version to the coarse signature of the comparable one of the blocks of the first version, and determining whether any of the plural coarse signatures of the one of the blocks of the revised version matches the coarse signature of the comparable one of the blocks of the first version based on said comparing.
Dependent claims: 24 through 33.
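The steps shared by the independent claims (segmenting, moving a reference frame, obtaining and comparing coarse signatures, and repeating when nothing matches) can be sketched as follows. This is one possible reading, not the claimed implementation. Several choices are assumptions made only for illustration: the frame is measured in bytes rather than bits, the coarse signature is an rsync-style rolling checksum, the base block's coarse signature is taken over its first `frame_size` bytes, the "comparable" block is the one at the same index, and the repetition is modeled as moving on to the next block. All names are hypothetical.

```python
MOD = 1 << 16  # both checksum halves are kept to 16 bits


def weak_checksum(data: bytes) -> tuple:
    """Return the (a, b) halves of a simple rolling checksum over `data`."""
    a = sum(data) % MOD
    b = sum((len(data) - i) * byte for i, byte in enumerate(data)) % MOD
    return a, b


def frame_signatures(block: bytes, frame_size: int):
    """Yield (offset, signature) for each position of the frame inside the block."""
    if len(block) < frame_size:
        return
    a, b = weak_checksum(block[:frame_size])
    yield 0, (b << 16) | a
    for offset in range(1, len(block) - frame_size + 1):
        out_byte = block[offset - 1]
        in_byte = block[offset + frame_size - 1]
        # Roll the frame forward by one byte without rescanning its contents.
        a = (a - out_byte + in_byte) % MOD
        b = (b - frame_size * out_byte + a) % MOD
        yield offset, (b << 16) | a


def find_match(base_block: bytes, rev_block: bytes, frame_size: int):
    """Return the first frame offset in rev_block matching the base signature, else None."""
    a, b = weak_checksum(base_block[:frame_size])
    target = (b << 16) | a
    for offset, signature in frame_signatures(rev_block, frame_size):
        if signature == target:
            return offset
    return None


def scan(base: bytes, revised: bytes, block_size: int, frame_size: int) -> dict:
    """Map each block index to the matching frame offset, or None when no frame matches."""
    results = {}
    count = min(len(base), len(revised)) // block_size
    for index in range(count):
        start = index * block_size
        base_block = base[start:start + block_size]
        rev_block = revised[start:start + block_size]
        # When nothing matches, the loop simply repeats the steps on the next block.
        results[index] = find_match(base_block, rev_block, frame_size)
    return results
```

A coarse match found this way is only a candidate. Under the approach described in the abstract, it would still be confirmed with a fine signature before the data is treated as unchanged.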
Specification