Method and system for reflecting differences between two files
Abstract
A method and system for reflecting differences between two files. The method includes generating a base signature file having a plurality of base bit patterns, each base bit pattern being generated as a function of a portion of data in a first file. A revised signature file containing a plurality of revised bit patterns is generated from a second file. Each revised bit pattern is compared to and matches at least one of the base bit patterns. A delta file reflecting the differences between the first file and the second file is created based on the base signature file, the revised signature file, and the second file.
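As an illustration of the first step described in the abstract, the following is a minimal sketch of how a base signature might be built. The fixed block size, the use of a truncated SHA-256 digest as the "bit pattern", and the names `BLOCK_SIZE` and `base_signature` are all assumptions made for this example; the text above does not specify them.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical value, kept tiny so the example is easy to follow


def base_signature(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Return one bit pattern per consecutive block of the first file.

    Assumption: the bit pattern is the first 8 bytes of a SHA-256 digest.
    """
    return [
        hashlib.sha256(data[i:i + block_size]).digest()[:8]
        for i in range(0, len(data), block_size)
    ]


first_file = b"ABCDEFGHIJKL"
signature = base_signature(first_file)  # three patterns: ABCD, EFGH, IJKL
```

Each entry stands in for one base bit pattern; a real signature file would also need to be serialized to disk, which this sketch omits.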
247 Citations
20 Claims
-
1. A method for reflecting differences between two versions of a file, comprising:
generating a base signature file comprising a plurality of base bit patterns, each base bit pattern being generated as a function of a portion of data in a first file;
processing the second file by reading the second file only once and generating from the second file a revised signature file comprising a plurality of revised bit patterns, each revised bit pattern matching at least one of the base bit patterns; and
generating a delta file reflecting differences between the first file and the second file based on the base signature file, the revised signature file and the second file.
a) generating a bit pattern as a function of a respective block of data of the second file;
b) comparing the bit pattern to the plurality of base bit patterns, and if the bit pattern matches one of the base bit patterns, writing the bit pattern to the revised signature file, and if the bit pattern does not match any of the plurality of base bit patterns:
i) generating another bit pattern as a function of a next block of data of the second file; and
ii) repeating step b until either the bit pattern matches one of the plurality of base bit patterns, or until all the data in the second file has been processed.
-
-
4. A method according to claim 3, wherein the second file is processed as a stream of data, and the next block of data is obtained by removing the most significant byte of data from the respective block of data, and adding the next byte of data in the stream to the first block of data.
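The byte-out, byte-in window update in claim 4 lends itself to a rolling checksum, where the pattern for the next block is derived from the previous one without rehashing the whole block. The sketch below uses a two-part weak checksum of the kind popularized by rsync; this choice, and the names `weak_sum` and `roll`, are assumptions for illustration and are not taken from the claim.

```python
MOD = 1 << 16  # hypothetical modulus for the two checksum halves


def weak_sum(block: bytes) -> tuple[int, int]:
    """Compute the checksum of a block from scratch."""
    a = sum(block) % MOD
    # Earlier bytes carry larger weights: len, len-1, ..., 1.
    b = sum((len(block) - k) * byte for k, byte in enumerate(block)) % MOD
    return a, b


def roll(a: int, b: int, out_byte: int, in_byte: int, block_len: int) -> tuple[int, int]:
    """Update (a, b) after dropping the oldest byte and appending the next one."""
    a_new = (a - out_byte + in_byte) % MOD
    b_new = (b - block_len * out_byte + a_new) % MOD
    return a_new, b_new


stream = b"hello world"
a, b = weak_sum(stream[:4])
# Slide the 4-byte window one position: drop stream[0], add stream[4].
a, b = roll(a, b, stream[0], stream[4], 4)
```

After the call to `roll`, `(a, b)` equals `weak_sum(stream[1:5])`, so each slide costs constant time regardless of block size.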
-
5. A method according to claim 3, wherein the second file is processed as a stream of bytes of data, the respective block of data comprising stream bytes n_i through n_x, and the next block of data comprising bytes n_{i+1} through n_{x+1}.
-
6. A method according to claim 3, wherein the second file is processed as a stream of bytes of data, the respective block of data comprising stream bytes n_i through n_x, and wherein if the bit pattern matches one of the plurality of base bit patterns, the next block of data comprises bytes n_{x+1} through n_{x+1+blocksize}.
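Claims 5 and 6 together describe how the window advances during the single pass over the second file: by one byte after a miss, and past the matched block after a hit. The following is a rough sketch of that loop under stated assumptions: the bit pattern is a truncated SHA-256 digest, the block size is fixed, and a trailing partial block is ignored. The names `pattern` and `scan` are hypothetical.

```python
import hashlib


def pattern(block: bytes) -> bytes:
    # Assumption: an 8-byte truncated SHA-256 digest serves as the bit pattern.
    return hashlib.sha256(block).digest()[:8]


def scan(second: bytes, base_patterns: set[bytes], block_size: int) -> list[tuple[int, bytes]]:
    """Walk the second file once, returning (offset, pattern) for each matching block."""
    matches = []
    i = 0
    while i + block_size <= len(second):
        p = pattern(second[i:i + block_size])
        if p in base_patterns:
            matches.append((i, p))
            i += block_size  # match: the next block starts just past this one
        else:
            i += 1  # no match: slide the window forward by one byte
    return matches


first = b"ABCDEFGHIJKL"
base = {pattern(first[k:k + 4]) for k in range(0, len(first), 4)}
hits = scan(b"ABCDxxEFGHIJKL", base, 4)
offsets = [off for off, _ in hits]  # [0, 6, 10]
```

For clarity this version rehashes every window from scratch; a rolling function would avoid that cost but is not required to show the control flow.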
-
7. A method according to claim 1, wherein the delta file comprises a plurality of command primitives, each command primitive identifying a modification to the first file.
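To make the idea of command primitives in claim 7 concrete, here is a small sketch with two hypothetical primitives, `COPY` (reuse a block of the first file) and `INSERT` (literal bytes from the second file). The primitive names, the tuple encoding, and the helper functions are assumptions for illustration; the claim does not enumerate the primitives. The sketch also assumes that a pattern match implies identical block contents.

```python
import hashlib


def pattern(block: bytes) -> bytes:
    # Assumption: an 8-byte truncated SHA-256 digest serves as the bit pattern.
    return hashlib.sha256(block).digest()[:8]


def make_delta(first: bytes, second: bytes, block_size: int) -> list[tuple]:
    """Build a list of COPY/INSERT primitives that turns `first` into `second`."""
    base = {}
    for off in range(0, len(first) - block_size + 1, block_size):
        base.setdefault(pattern(first[off:off + block_size]), off)
    delta, literal, i = [], bytearray(), 0
    while i < len(second):
        block = second[i:i + block_size]
        off = base.get(pattern(block)) if len(block) == block_size else None
        if off is not None:
            if literal:  # flush pending unmatched bytes first
                delta.append(("INSERT", bytes(literal)))
                literal.clear()
            delta.append(("COPY", off, block_size))
            i += block_size
        else:
            literal.append(second[i])
            i += 1
    if literal:
        delta.append(("INSERT", bytes(literal)))
    return delta


def apply_delta(first: bytes, delta: list[tuple]) -> bytes:
    """Rebuild the second file from the first file and the delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "COPY":
            _, off, length = op
            out += first[off:off + length]
        else:
            out += op[1]
    return bytes(out)


first = b"ABCDEFGHIJKL"
second = b"ABCDxxEFGHIJKL"
delta = make_delta(first, second, 4)
rebuilt = apply_delta(first, delta)
```

Here the delta carries only the two inserted bytes plus three short copy commands, which is the point of sending a delta instead of the whole second file.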
-
8. A method according to claim 1, wherein the delta file is created on a client computer, and forwarded to a backup computer over a network.
-
9. A method according to claim 1, wherein the second file is communicated from a client computer to a backup computer, and the delta file is created on the backup computer.
-
10. A method according to claim 1, wherein the delta file is generated on a first computer and is communicated to a second computer to synchronize a copy of the first file being maintained on the second computer.
-
11. A method according to claim 1, wherein the delta file is created on a client computer, and forwarded to another computer over a network for file replication.
-
12. A method for representing differences between two files, comprising:
processing a first file and generating a plurality of first file patterns, each first file pattern being associated with a respective portion of the first file;
processing a second file by a single read of the second file and generating a plurality of second file patterns, each second file pattern being associated with a portion of the second file and being compared to and matching one of the first file patterns; and
processing the second file patterns and the first file patterns, and accessing the second file and generating a delta file containing primitives reflecting the differences between the first file and the second file.
a) obtaining a block of data;
b) generating a bit pattern as a function of the block of data;
c) comparing the bit pattern to the plurality of first file patterns;
d) if the bit pattern matches one of the plurality of first file patterns, saving the bit pattern as a second file pattern, and if the bit pattern does not match one of the plurality of first file patterns, repeating steps a) through d) until the bit pattern matches one of the plurality of first file patterns, or the second file has been completely processed.
-
-
15. A method according to claim 14, wherein at least two blocks of data are obtained, the first block of data being bytes n_i through bytes n_x of the stream of data, and the second block of data being bytes n_{i+1} through bytes n_{x+1}.
-
16. A method according to claim 12, wherein the second file is processed in a single pass to generate the plurality of second file patterns.
-
17. A system according to claim 16, further comprising a longest common subsequence table being operative to allow the third generator to traverse the plurality of revised bit patterns in longest common subsequence order.
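One way to picture the longest common subsequence table in claim 17 is the classic dynamic-programming table over the two pattern sequences, which can then be walked to visit matching patterns in order. The sketch below is a textbook LCS, offered only as an assumption about what such a table could look like; the name `lcs_pairs` and the use of short strings in place of bit patterns are hypothetical.

```python
def lcs_pairs(base: list, revised: list) -> list[tuple[int, int]]:
    """Return (base_index, revised_index) pairs along one longest common subsequence."""
    n, m = len(base), len(revised)
    # table[i][j] holds the LCS length of base[i:] and revised[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if base[i] == revised[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the top-left corner, collecting matches in order.
    pairs, i, j = [], 0, 0
    while i < n and j < m:
        if base[i] == revised[j]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


pairs = lcs_pairs(["A", "B", "C", "D"], ["A", "X", "C", "D"])  # [(0, 0), (2, 2), (3, 3)]
```

Patterns that fall outside the returned pairs correspond to portions that were inserted, deleted, or moved between the two files.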
-
18. A system for maintaining differences between a first file and a second file, comprising:
a first generator being operative to generate a base signature file comprising a plurality of base bit patterns, each base bit pattern being generated as a function of a portion of data in a first file;
a second generator being operative to generate by a single read through a second file a revised signature file comprising a plurality of revised bit patterns, each revised bit pattern matching at least one of the base bit patterns; and
a third generator being operative to generate a delta file reflecting differences between the first file and the second file based on the base signature file, the revised signature file and the second file.
-
Specification