System and method for efficient backup using hashes
First Claim
1. A computer implemented method for data backup executed on a processor, the method comprising:
- (a) forming an image of a storage device, wherein contents of blocks of the storage device can be restored from the image;
(b) for each block stored in the image, storing a hash function value for contents of each block;
(c) for each block of the storage device to be backed up to an image, generating a hash function value corresponding to contents of that block;
(d) comparing the hash function values to identify, out of blocks of the storage device, candidate blocks that might have identical contents with contents of blocks stored in the image;
(e) comparing contents of candidate blocks with contents of corresponding blocks stored in the image;
(f) for blocks of the storage device with identical contents, storing links in the image instead of the contents of the blocks, wherein links for multiple blocks of the storage device with identical contents point to a single block in the image; and
(g) backing up, to the image, contents of unidentified blocks and blocks that do not have identical contents, wherein;
the image of the storage device contains a bitmap of the storage device backup;
the bitmap contains indicators for use of the links, such that an indicator defines if a block contains the content or if the block points to another block,and the bitmap contains indicators that reflect used and unused blocks such that an indicator of one setting represents a used block whose contents are shared with another block in the image, and a bit of an alternate setting corresponds to an unused block whose contents are unique to every other block in the storage image;
and the unused blocks do not require backing up of their contents.
12 Assignments
0 Petitions
Accused Products
Abstract
A method, system and computer program product for data backup such that: for each block of a storage device to be backed up to an image, generating a hash function value corresponding to contents of that block; generating a map of links between blocks in the image and corresponding blocks the storage device; using the hash function values to identify blocks of the storage device with identical contents, such that links for the blocks in the storage device with identical contents point to a single block in the image; and modifying the link in the map when a block in the storage is moved (for example, due to defragmentation) but its contents is not altered, so that the link points to the same backed up block.
96 Citations
11 Claims
-
1. A computer implemented method for data backup executed on a processor, the method comprising:
-
(a) forming an image of a storage device, wherein contents of blocks of the storage device can be restored from the image; (b) for each block stored in the image, storing a hash function value for contents of each block; (c) for each block of the storage device to be backed up to an image, generating a hash function value corresponding to contents of that block; (d) comparing the hash function values to identify, out of blocks of the storage device, candidate blocks that might have identical contents with contents of blocks stored in the image; (e) comparing contents of candidate blocks with contents of corresponding blocks stored in the image; (f) for blocks of the storage device with identical contents, storing links in the image instead of the contents of the blocks, wherein links for multiple blocks of the storage device with identical contents point to a single block in the image; and (g) backing up, to the image, contents of unidentified blocks and blocks that do not have identical contents, wherein; the image of the storage device contains a bitmap of the storage device backup; the bitmap contains indicators for use of the links, such that an indicator defines if a block contains the content or if the block points to another block, and the bitmap contains indicators that reflect used and unused blocks such that an indicator of one setting represents a used block whose contents are shared with another block in the image, and a bit of an alternate setting corresponds to an unused block whose contents are unique to every other block in the storage image; and the unused blocks do not require backing up of their contents. - View Dependent Claims (2, 3, 4, 5, 6, 7)
-
-
8. A system for data backup implemented on a computer system having a memory and a processor, the system comprising:
-
(a) an image of a storage device, wherein contents of blocks of the storage device can be restored from the image; (b) for each block of the storage device to be backed up to an image, a stored hash function value corresponding to contents of that block; (c) a map of links between blocks in the storage device and corresponding blocks in the image; and (d) means for using the hash function values to identify blocks of the storage device with identical contents, such that links for the blocks in the storage device with identical contents point to a single block in the image, wherein the link in the map is modified when a block in the storage is moved but its contents is not altered, so that the link points to the same backed up block, and wherein; the image of the storage device contains a bitmap of the storage device backup; the bitmap contains indicators for use of the links, such that an indicator defines if a block contains the content or if the block points to another block, and the bitmap contains indicators that reflect used and unused blocks such that an indicator of one setting represents a used block whose contents are shared with another block in the image, and a bit of an alternate setting corresponds to an unused block whose contents are unique to every other block in the storage image; and the unused blocks do not require backing up of their contents. - View Dependent Claims (9)
-
-
10. A system for data backup implemented on a computer system having a memory and a processor, the system comprising:
-
(a) means for backing up a current state of a storage device to an image, wherein contents of blocks of the storage device can be restored from the image; (b) means for generating incremental backups for the storage device; (c) for each block of the storage device to be backed up to the image, means for storing a hash function value corresponding to contents of that block; (d) means for generating a map of links between blocks in the storage device and corresponding blocks in the image; and (e) after a defragmentation of the storage device, means for modifying the links in the map when a block in the storage is moved but its contents is not altered, so that the link points to the same block in the image as previously, wherein; the image of the storage device contains a bitmap of the storage device backup; the bitmap contains indicators for use of the links, such that an indicator defines if a block contains the content or if the block points to another block, and the bitmap contains indicators that reflect used and unused blocks such that an indicator of one setting represents a used block whose contents are shared with another block in the image, and a bit of an alternate setting corresponds to an unused block whose contents are unique to every other block in the storage image; and the unused blocks do not require backing up of their contents.
-
-
11. A computer useable storage medium having computer program logic stored thereon for executing on a processor for data backup, the computer program logic comprising:
-
(a) computer program code means for each block of a storage device to be backed up to an image, wherein contents of blocks of the storage device can be restored from the image; (b) computer program code means for generating a hash function value corresponding to contents of that block; (c) computer program code means for generating a map of links between blocks in the image and corresponding blocks the storage device; (d) computer program code means for using the hash function values to identify blocks of the storage device with identical contents, such that links for the blocks in the storage device with identical contents point to a single block in the image; and (e) computer program code means for modifying the link in the map when a block in the storage is moved but its contents is not altered, so that the link points to the same backed up block, wherein; the image of the storage device contains a bitmap of the storage device backup; the bitmap contains indicators for use of the links, such that an indicator defines if a block contains the content or if the block points to another block, and the bitmap contains indicators that reflect used and unused blocks such that an indicator of one setting represents a used block whose contents are shared with another block in the image, and a bit of an alternate setting corresponds to an unused block whose contents are unique to every other block in the storage image; and the unused blocks do not require backing up of their contents.
-
Specification