Systems and methods for expedited data transfer in a communication system using hash segmentation
Abstract
The present invention provides an improved method and system for determining differences between datasets or data files, and for expedited data transfer and data reconciliation in a communication network, using hash segmentation processing. The system and method provide an efficient means of communicating updated files or new revisions, or of verifying files, between a source host and a target host. Hash segmentation processing, which in many embodiments is iterative, isolates the updates within the files so as to minimize the amount of data communicated from the source host to the target host. The system and methods also support the transfer of data between two hosts when neither host is aware of the revision that exists on the other. The hash segmentation process may implement a logarithmic hash approach or a sliding linear hash approach.
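The iterative isolation of updates described in the abstract can be illustrated with a minimal sketch. It assumes one plausible reading of the "logarithmic" approach: a range whose hashes differ is repeatedly split in half until the changed segment is small enough to send. The function names, the SHA-256 hash, the equal-length datasets, and the `min_size` threshold are all assumptions made for illustration and are not taken from the patent.

```python
import hashlib


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()  # assumed hash function


def changed_ranges(source: bytes, target: bytes, lo: int, hi: int,
                   min_size: int = 4) -> list[tuple[int, int]]:
    """Return [lo, hi) byte ranges where source and target differ.

    Assumes equal-length datasets. Each digest comparison stands in for one
    hash exchange between the two hosts.
    """
    if digest(source[lo:hi]) == digest(target[lo:hi]):
        return []                      # identical range: nothing to transfer
    if hi - lo <= min_size:
        return [(lo, hi)]              # small enough: transfer this segment
    mid = (lo + hi) // 2
    return (changed_ranges(source, target, lo, mid, min_size)
            + changed_ranges(source, target, mid, hi, min_size))


source = b"AAAABBBBCCCCDDDD"
target = b"AAAABBBBCCxCDDDD"
print(changed_ranges(source, target, 0, len(source)))  # [(8, 12)]
```

In this sketch a single changed byte in a 16-byte dataset is narrowed to one 4-byte range, so only that range would need to be communicated.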
41 Claims
1. A method for determining differences between datasets residing on separate hosts in a communication network, the method comprising the steps of:
creating first hash values, at a first host, corresponding to a plurality of segments of a first dataset;
creating second hash values, at a second host, corresponding to a plurality of segments of a second dataset; and
comparing one or more first hash values to the second hash values to determine which segments of the datasets differ.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
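The three steps of claim 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the fixed segment size, the choice of SHA-256, and the names `segment_hashes` and `differing_segments` are assumptions.

```python
import hashlib

SEGMENT_SIZE = 4  # hypothetical segment length in bytes, kept tiny for illustration


def segment_hashes(data: bytes, size: int = SEGMENT_SIZE) -> list[str]:
    """Hash each fixed-size segment of a dataset (SHA-256 is an assumed choice)."""
    return [
        hashlib.sha256(data[i:i + size]).hexdigest()
        for i in range(0, len(data), size)
    ]


def differing_segments(first: list[str], second: list[str]) -> list[int]:
    """Return indices of segments whose hashes differ or exist on one side only."""
    longest = max(len(first), len(second))
    return [
        i for i in range(longest)
        if i >= len(first) or i >= len(second) or first[i] != second[i]
    ]


first_hashes = segment_hashes(b"AAAABBBBCCCCDDDD")   # computed at the first host
second_hashes = segment_hashes(b"AAAABxBBCCCCDDDD")  # computed at the second host
print(differing_segments(first_hashes, second_hashes))  # [1]
```

Only the hash lists need to cross the network for the comparison; the datasets themselves stay on their hosts.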
15. A method for expedited data transfer and data reconciliation in a communication network, the method comprising the steps of:
creating, at a first host, first hash values corresponding to segments of a first dataset;
communicating the first hash values to a second host having a second dataset residing thereon;
creating, at the second host, second hash values corresponding to segments of the second dataset;
comparing, at the second host, the first and second hash values to determine if a segment difference exists between corresponding first dataset segments and second dataset segments;
communicating to the second host one or more segments of the first dataset that have been determined to differ from the second dataset; and
compiling a third dataset that includes the one or more segments of the first dataset determined to differ from the second dataset and one or more segments of the second dataset determined not to differ from the first dataset.
Dependent claims: 16, 17, 18, 19, 20, 21, 22, 23, 24
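The transfer and compile steps of claim 15 can be sketched in a single process, with comments marking which host would perform each step. The segment size, the SHA-256 hash, and the helper names `split`, `digest`, and `reconcile` are hypothetical; a real system would exchange the hashes and segments over the network.

```python
import hashlib

SEGMENT_SIZE = 4  # hypothetical segment length in bytes


def split(data: bytes, size: int = SEGMENT_SIZE) -> list[bytes]:
    return [data[i:i + size] for i in range(0, len(data), size)]


def digest(segment: bytes) -> str:
    return hashlib.sha256(segment).hexdigest()  # assumed hash function


def reconcile(first: bytes, second: bytes) -> tuple[bytes, int]:
    """Build the third dataset; also return how many segments were transferred."""
    first_segments = split(first)            # held by the first host
    second_segments = split(second)          # held by the second host
    first_hashes = [digest(s) for s in first_segments]    # sent to the second host
    second_hashes = [digest(s) for s in second_segments]  # computed locally

    # Second host compares hashes position by position and notes mismatches.
    needed = [
        i for i, h in enumerate(first_hashes)
        if i >= len(second_hashes) or h != second_hashes[i]
    ]
    # First host sends only the segments that were found to differ.
    transferred = {i: first_segments[i] for i in needed}

    # Second host compiles: transferred segments where they differ, local otherwise.
    third = [
        transferred[i] if i in transferred else second_segments[i]
        for i in range(len(first_segments))
    ]
    return b"".join(third), len(transferred)


third, sent = reconcile(b"AAAABBBBCCCCDDDD", b"AAAAXXXXCCCCDDDD")
print(third, sent)  # b'AAAABBBBCCCCDDDD' 1
```

Here one of four segments crosses the network, and the compiled third dataset equals the first dataset.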
25. A method for determining differences between datasets residing on separate hosts in a communication network, the method comprising the steps:
creating first dataset hash values, at a first host, corresponding to segments of a first dataset; and
searching, at a second host, for segments of a second dataset that have matching hash values to the first dataset hash values using a slide function of a sliding hash algorithm.
Dependent claims: 26, 27, 28, 29, 30, 31, 32, 33, 34
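The search step of claim 25 can be illustrated with a polynomial rolling hash in the style of Rabin-Karp. The claim does not name a particular sliding hash algorithm, so the hash itself, the constants `BASE` and `MOD`, and the function names are assumptions. The point of the sketch is the `slide` function, which updates the window hash in constant time as the window moves one byte.

```python
BASE = 257            # assumed polynomial base
MOD = (1 << 61) - 1   # assumed large prime modulus


def window_hash(block: bytes) -> int:
    """Polynomial hash of a whole block."""
    h = 0
    for byte in block:
        h = (h * BASE + byte) % MOD
    return h


def slide(h: int, outgoing: int, incoming: int, top: int) -> int:
    """Slide the window one byte: drop `outgoing`, append `incoming`.

    `top` is BASE ** (window_length - 1) % MOD.
    """
    return ((h - outgoing * top) * BASE + incoming) % MOD


def find_matches(first_hashes: set[int], second: bytes, size: int) -> dict[int, int]:
    """Map each byte offset in `second` whose window hash matches to that hash."""
    matches: dict[int, int] = {}
    if len(second) < size:
        return matches
    top = pow(BASE, size - 1, MOD)
    h = window_hash(second[:size])
    for offset in range(len(second) - size + 1):
        if h in first_hashes:
            matches[offset] = h
        if offset + size < len(second):
            h = slide(h, second[offset], second[offset + size], top)
    return matches


first = b"AAAABBBBCCCC"                       # first dataset, 4-byte segments
first_hashes = {window_hash(first[i:i + 4]) for i in range(0, len(first), 4)}
matches = find_matches(first_hashes, b"xxAAAABBBBCCCC", 4)  # two bytes inserted
print(sorted(matches))  # [2, 6, 10]
```

Because the window slides byte by byte, the segments are still found after an insertion shifts them off their original boundaries. A practical system would likely confirm each candidate match with a stronger hash, since a rolling hash alone can collide.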
35. A system for expedited data transfer and data reconciliation in a communication network, the system comprising:
a first processor residing in a first host, the first processor implements a hash algorithm to create first hash values corresponding to segments of a first dataset; and
a second processor residing in a second host and in network communication with the first processor, the second processor implements the first hash algorithm to create second hash values corresponding to segments of a second dataset;
wherein the first hash values are compared to the second hash values to determine which segments of the datasets differ and wherein the first host communicates to the second host one or more segments of the first dataset if a determination is made that one or more segments of the first dataset differ from the second dataset.
Dependent claims: 36, 37, 38, 39, 40, 41
Specification