Efficient algorithm and protocol for remote differential compression
First Claim
1. A system for updating objects over a network between a local device and a remote device, comprising:
- a means for computing a first fingerprint function at every byte offset of a first object on the remote device;
a means for chunking the first object on the remote device based on the first fingerprint function;
a means for computing a remote signature for each chunk associated with the first object on the remote device;
a means for generating a remote signature and chunk length list on the remote device, wherein the remote signature and chunk length list is associated with the first object;
a means for computing a second fingerprint function at every byte offset of a second object on the local device, where the first and second objects are associated with one another, and where the first fingerprint function is matched to the second fingerprint function;
a means for chunking the second object on the local device based on the second fingerprint function, wherein the means for chunking the first object on the remote device is matched to the means for chunking the second object on the local device;
a means for computing a local signature for each chunk associated with the second object on the local device, wherein the means for computing the local signature is matched to the means for computing the remote signature;
a means for generating a local signature and chunk length list on the local device, wherein the local signature and chunk length list is associated with the second object;
a means for negotiating a chunked transmission of the remote signature and chunk length list from the remote device to the local device over the network such that bandwidth use is minimized for the transfer of the remote signature and chunk length list to the local device;
a means for identifying differences between the first object and the second object by comparing the local signature and chunk length list to the remote signature and chunk length list on the local device;
a means for requesting transmission of at least one updated object chunk from the remote device when differences between the first object and the second object are identified by the local device;
a means for transmitting the at least one updated object chunk from the remote device to the local device over the network; and
a means for reassembling a copy of the first object on the local device with the at least one updated object chunk.
2 Assignments
0 Petitions
Accused Products
Abstract
A method and system are related to updating objects over limited bandwidth networks. Objects are updated between two or more computing devices using remote differential compression (RDC) techniques such that required data transfers are minimized. In one aspect, efficient large object transfers are achieved by recursively applying the RDC algorithm to its own metadata; a single or multiple recursion step(s) may be used in this case to reduce the amount of metadata sent over the network by the RDC algorithm. Objects and/or signature and chunk length lists can be chunked by locating boundaries at dynamically determined locations. A mathematical function evaluates hash values associated within a horizon window relative to potential chunk boundary. The described method and system is useful in a variety of networked applications, such as peer-to-peer replicators, email clients and servers, client-side caching systems, general-purpose copy utilities, database replicators, portals, software update services, file/data synchronization, and others.
-
Citations
77 Claims
-
1. A system for updating objects over a network between a local device and a remote device, comprising:
-
a means for computing a first fingerprint function at every byte offset of a first object on the remote device;
a means for chunking the first object on the remote device based on the first fingerprint function;
a means for computing a remote signature for each chunk associated with the first object on the remote device;
a means for generating a remote signature and chunk length list on the remote device, wherein the remote signature and chunk length list is associated with the first object;
a means for computing a second fingerprint function at every byte offset of a second object on the local device, where the first and second objects are associated with one another, and where the first fingerprint function is matched to the second fingerprint function;
a means for chunking the second object on the local device based on the second fingerprint function, wherein the means for chunking the first object on the remote device is matched to the means for chunking the second object on the local device;
a means for computing a local signature for each chunk associated with the second object on the local device, wherein the means for computing the local signature is matched to the means for computing the remote signature;
a means for generating a local signature and chunk length list on the local device, wherein the local signature and chunk length list is associated with the second object;
a means for negotiating a chunked transmission of the remote signature and chunk length list from the remote device to the local device over the network such that bandwidth use is minimized for the transfer of the remote signature and chunk length list to the local device;
a means for identifying differences between the first object and the second object by comparing the local signature and chunk length list to the remote signature and chunk length list on the local device;
a means for requesting transmission of at least one updated object chunk from the remote device when differences between the first object and the second object are identified by the local device;
a means for transmitting the at least one updated object chunk from the remote device to the local device over the network; and
a means for reassembling a copy of the first object on the local device with the at least one updated object chunk. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48)
-
-
49. A computer readable medium having computer-executable instructions for updating objects over a communication medium between a local device and a remote device, comprising:
-
chunking a first object on the remote device;
computing a signature for each chunk associated with the first object on the remote device to provide remote signatures;
assembling a remote signature and chunk length list on the remote device from the remote signatures;
generating a recursive remote signature and chunk length list on the remote device by;
chunking the remote signature and chunk length list on the remote device;
computing a signature for each chunk associated with the chunked remote signature and chunk length list on the remote device to provide recursive remote signatures; and
assembling a recursive remote signature and chunk length list on the remote device with the recursive remote signatures;
chunking a second object on the local device;
computing a signature for each chunk associated with the second object on the local device to provide local signatures;
assembling a local signature and chunk length list on the local device from the local signatures, such that the local signature and chunk length list is matched to the remote signature and chunk length list when the first object is matched to the second object;
generating a recursive local signature and chunk length list on the local device by;
chunking the local signature and chunk length list;
computing a signature for each chunk associated with the chunked local signature and chunk length list to provide recursive local signatures; and
assembling a recursive local signature and chunk length list with the recursive local signatures;
negotiating transmission of the recursive remote signature and chunk length list from the remote device to the local device over the communication medium such that bandwidth use is minimized for the transfer of the recursive remote signature and chunk length list to the local device;
identifying at least one difference between the first object and the second object by;
comparing the recursive remote signature and chunk length list and the recursive local signature and chunk length list on the local device to identify a difference; and
mapping the difference to at least one chunk associated with the second object; and
updating the first object on the local device by;
requesting transmission of at least one chunk from the remote device;
receiving a transmission from the remote device over the communication medium, wherein the transmission includes the at least one chunk; and
assembling an object with the at least one chunk. - View Dependent Claims (50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63)
-
-
64. A computer implemented method for updating objects over a communication channel between a local device and a remote device, comprising:
-
chunking a first object on the remote device;
computing a signature for each chunk associated with the first object on the remote device to provide remote signatures;
assembling a remote signature and chunk length list on the remote device from the remote signatures;
chunking a second object on the local device based on the computed fingerprint function;
computing a signature for each chunk associated with the second object on the local device to provide local signatures;
assembling a local signature and chunk length list on the local device from the local signatures, such that the local signature and chunk length list is matched to the remote signature and chunk length list when the first object is matched to the second object;
providing a recursive procedure on both the local device and the remote device, wherein the recursive procedure is arranged to process a designated signature and chunk length list by;
chunking the designated signature and chunk length list to provide a chunked signature and chunk length list;
computing a recursive signature for each chunk associated with the chunked signature and chunk length list;
generating a recursive signature and chunk length list with the recursive signatures and the chunked signature and chunk length list;
initializing the designated signature and chunk length list to the recursive signature and chunk length list when additional iterations are required for recursive processing; and
returning the recursive signature and chunk length list when the recursive procedure has completed the required number of iterations;
generating a recursive remote signature and chunk length list on the remote device by passing the remote signature and chunk length list to the recursive procedure as the designated signature and chunk length list, and by returning the recursive remote signature and chunk length list from the recursive procedure;
generating a recursive local signature and chunk length list on the local device by passing the local signature and chunk length list to the recursive procedure as the designated signature and chunk length list, and by returning the recursive local signature and chunk length list from the recursive procedure;
sending the recursive remote signature and chunk length list from the remote device to the local device over the communication channel;
identifying at least one difference between the first object and the second object by comparing the received recursive remote signature and chunk length list to the recursive local signature and chunk length list;
identifying at least one updated chunk associated with the second object based on the at least one difference; and
updating the first object on the local device by;
requesting transmission of the at least one updated chunk from the remote device;
receiving a transmission from the remote device over the communication channel, wherein the transmission includes the at least one updated chunk; and
assembling an object with the at least one updated chunk. - View Dependent Claims (65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77)
-
Specification