Method and system for data backup
First Claim
1. A method for data backup, wherein, there is original backup data and current data to be backed up, the method comprising:
performing first chunking on the current data by using the same chunking method as that used by the original backup data to obtain a current chunk, wherein the original backup data is content-defined chunking data;
calculating a hash value of the current chunk, wherein whether a number of continuous matched chunks exceeds a threshold is determined based on the calculated hash value, wherein the threshold is a preset value, wherein, if the number of continuous matched chunks exceeds the preset threshold, time for chunking the data is saved, and wherein the data block of the current chunk is then equal to the matched block of the original backup data;
in response to a number of continuous matched chunks exceeding the threshold, the length of a data block that corresponds to an identifier of a next chunk of the matched chunk is acquired;
acquiring, from a hash value table of the original backup data, an identifier of a matched chunk whose hash value is the same as the calculated hash value of the current chunk, and incrementing the number of continuous matched chunks, based on the exceeded threshold; and
clearing the number of continuous matched chunks, whereby the hash value table of the original backup data and the identifier of a matched chunk whose hash value is the same as the calculated hash value of the current chunk are returned in response to the number of continuous matched chunks exceeding the threshold.
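The claim's threshold step can be read as a fast path: once enough consecutive chunks have matched, the length of the next original chunk is reused instead of running the chunking method again. The Python sketch below is one possible reading of that step, not the patented implementation. The names `digest`, `newline_chunker`, `index_original`, and `match_current` are hypothetical, SHA-256 is an assumed hash, and the newline-based chunker is only a stand-in for a real content-defined chunking method.

```python
import hashlib

def digest(chunk: bytes) -> str:
    # Assumed hash function; the claim does not name one.
    return hashlib.sha256(chunk).hexdigest()

def newline_chunker(data: bytes, pos: int) -> int:
    """Stand-in chunker: a chunk ends right after the next newline byte."""
    end = data.find(b"\n", pos)
    return (len(data) if end == -1 else end + 1) - pos

def index_original(data: bytes, chunker):
    """Return ([(hash, length)] indexed by chunk identifier, {hash: identifier})."""
    chunks, table, pos = [], {}, 0
    while pos < len(data):
        length = chunker(data, pos)
        h = digest(data[pos:pos + length])
        table.setdefault(h, len(chunks))  # first identifier wins for duplicates
        chunks.append((h, length))
        pos += length
    return chunks, table

def match_current(current: bytes, orig_chunks, table, chunker, threshold: int):
    """Return (identifier or None per current chunk, number of fast-path hits)."""
    idents, fast_hits = [], 0
    pos, run, last_id = 0, 0, None
    while pos < len(current):
        next_id = None if last_id is None else last_id + 1
        if run > threshold and next_id is not None and next_id < len(orig_chunks):
            # Fast path: acquire the length of the next original chunk
            # and skip the chunking method for this block.
            next_hash, length = orig_chunks[next_id]
            if digest(current[pos:pos + length]) == next_hash:
                idents.append(next_id)
                fast_hits += 1
                run += 1
                last_id = next_id
                pos += length
                continue
            run = 0  # prediction failed: clear the counter and chunk normally
        length = chunker(current, pos)
        ident = table.get(digest(current[pos:pos + length]))
        if ident is None:
            run, last_id = 0, None   # no match: clear the counter
        else:
            run, last_id = run + 1, ident  # match: increment the counter
        idents.append(ident)
        pos += length
    return idents, fast_hits
```

With a threshold of 2, the first three chunks of an unchanged backup go through the chunker and every later chunk takes the fast path. A modified block makes the predicted hash differ, so the counter is cleared and normal chunking resumes.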
Abstract
The present invention relates to a method, system, and computer program product for data backup, the method comprising: performing first chunking on current data by using the same chunking method as that used by original backup data to obtain a current chunk; calculating a hash value of the current chunk; and acquiring, from a hash value table of the original backup data, an identifier of a matched chunk whose hash value is the same as the calculated hash value of the current chunk, and incrementing the number of continuous matched chunks by one. Because the correlation between the original backup data and the current data is exploited as fully as possible, the performance of the de-duplication method can be improved.
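The flow in the abstract (chunk the current data the same way as the original, hash each chunk, look the hash up in the original's hash value table, and count continuous matches) can be illustrated with a small Python sketch. This is an illustration under stated assumptions, not the patent's implementation: the function names, the toy boundary rule based on a byte-window sum, the constants `WINDOW`, `MASK`, `MIN_SIZE`, and `MAX_SIZE`, and the choice of SHA-256 are all hypothetical.

```python
import hashlib

WINDOW = 4     # hypothetical: number of trailing bytes inspected for a boundary
MASK = 0x0F    # hypothetical: cut when (window sum & MASK) == 0
MIN_SIZE = 8   # hypothetical minimum chunk size in bytes
MAX_SIZE = 64  # hypothetical maximum chunk size in bytes

def cdc_chunks(data: bytes):
    """Toy content-defined chunking: boundaries depend on the bytes themselves."""
    chunks, start = [], 0
    for i in range(len(data)):
        size = i - start + 1
        if size < MIN_SIZE:
            continue
        window_sum = sum(data[i - WINDOW + 1:i + 1])
        if (window_sum & MASK) == 0 or size >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks

def build_hash_table(original: bytes):
    """Map each chunk hash of the original backup to a chunk identifier."""
    table = {}
    for ident, chunk in enumerate(cdc_chunks(original)):
        table.setdefault(hashlib.sha256(chunk).hexdigest(), ident)
    return table

def match_chunks(current: bytes, table):
    """Return (identifier or None per chunk, longest run of continuous matches)."""
    idents, run, longest = [], 0, 0
    for chunk in cdc_chunks(current):
        ident = table.get(hashlib.sha256(chunk).hexdigest())
        if ident is None:
            run = 0                    # mismatch: the run of matches ends
        else:
            run += 1                   # match: increment the counter by one
            longest = max(longest, run)
        idents.append(ident)
    return idents, longest
```

Because both data sets go through the same `cdc_chunks` function, unchanged regions produce identical chunk boundaries and therefore identical hashes, which is what makes the table lookup effective.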
29 Citations
5 Claims
Specification