Combining Hash-Based Duplication with Sub-Block Differencing to Deduplicate Data
First Claim
1. A method comprising, by one or more computer systems:
accessing data;
partitioning the data into a plurality of sub-blocks;
determining whether a first one of the sub-blocks is identical to another one of the sub-blocks or similar to another one of the sub-blocks;
if the first one of the sub-blocks is identical to another one of the sub-blocks, applying hash-based deduplication to storage of the first one of the sub-blocks with respect to the other one of the sub-blocks; and
if the first one of the sub-blocks is similar to another one of the sub-blocks, applying sub-block differencing to storage of the first one of the sub-blocks with respect to the other one of the sub-blocks.
Abstract
In one embodiment, a method includes accessing data; partitioning the data into sub-blocks; determining whether a first one of the sub-blocks is identical to another one of the sub-blocks or similar to another one of the sub-blocks; if the first one of the sub-blocks is identical to another one of the sub-blocks, applying by the one or more computer systems hash-based deduplication to storage of the first one of the sub-blocks with respect to the other one of the sub-blocks; and, if the first one of the sub-blocks is similar to another one of the sub-blocks, applying by the one or more computer systems sub-block differencing to storage of the first one of the sub-blocks with respect to the other one of the sub-blocks.
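The flow in the abstract can be illustrated with a minimal sketch. It is not the patent's implementation and rests on several assumptions that do not come from the source: fixed-size partitioning (`BLOCK_SIZE`), SHA-256 as the identity hash, a `difflib` similarity ratio with a hypothetical `SIMILARITY_THRESHOLD` as the test for "similar", and a simple copy/insert delta as the form of sub-block differencing. All function names and the `("ref", ...)` / `("delta", ...)` recipe layout are illustrative.

```python
import hashlib
from difflib import SequenceMatcher

BLOCK_SIZE = 16              # hypothetical sub-block size in bytes
SIMILARITY_THRESHOLD = 0.75  # hypothetical cutoff for treating blocks as "similar"


def partition(data: bytes, size: int = BLOCK_SIZE) -> list:
    """Split data into fixed-size sub-blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def make_delta(base: bytes, target: bytes) -> list:
    """Encode target as copy/insert operations against base."""
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))           # reuse bytes from the base block
        elif tag in ("replace", "insert"):
            ops.append(("insert", target[j1:j2]))  # store only the new bytes
        # "delete": the base bytes are simply not copied
    return ops


def deduplicate(data: bytes):
    """Return (store, recipe): stored full sub-blocks and one entry per sub-block."""
    store = {}   # digest -> full sub-block bytes
    recipe = []  # ("ref", digest) or ("delta", base_digest, ops)
    for block in partition(data):
        digest = hashlib.sha256(block).hexdigest()
        if digest in store:
            # Identical sub-block: hash-based deduplication, store a reference only.
            recipe.append(("ref", digest))
            continue
        # Otherwise look for the most similar stored sub-block (linear scan for clarity).
        best_digest, best_ratio = None, 0.0
        for cand_digest, cand in store.items():
            ratio = SequenceMatcher(None, cand, block, autojunk=False).ratio()
            if ratio > best_ratio:
                best_digest, best_ratio = cand_digest, ratio
        if best_digest is not None and best_ratio >= SIMILARITY_THRESHOLD:
            # Similar sub-block: store only its difference from the base block.
            recipe.append(("delta", best_digest, make_delta(store[best_digest], block)))
        else:
            # Neither identical nor similar: store the sub-block in full.
            store[digest] = block
            recipe.append(("ref", digest))
    return store, recipe


data = (b"ABCDEFGHIJKLMNOP"    # stored in full
        + b"ABCDEFGHIJKLMNOP"  # identical to the first sub-block
        + b"ABCDEFGHIJKLMNOX"  # similar to the first sub-block
        + b"0123456789abcdef") # unrelated, stored in full
store, recipe = deduplicate(data)
```

In this example only two sub-blocks are stored in full; the second is recorded as a reference and the third as a one-byte difference against the first. The linear scan over stored sub-blocks is chosen for readability; a practical system would need a faster way to find similar candidates.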
22 Claims
1. A method comprising, by one or more computer systems:
accessing data;
partitioning the data into a plurality of sub-blocks;
determining whether a first one of the sub-blocks is identical to another one of the sub-blocks or similar to another one of the sub-blocks;
if the first one of the sub-blocks is identical to another one of the sub-blocks, applying hash-based deduplication to storage of the first one of the sub-blocks with respect to the other one of the sub-blocks; and
if the first one of the sub-blocks is similar to another one of the sub-blocks, applying sub-block differencing to storage of the first one of the sub-blocks with respect to the other one of the sub-blocks.
(Dependent claims: 2, 3, 4, 5, 6, 7)
8. One or more computer-readable storage media embodying instructions that are operable when executed by one or more computer systems to:
access data;
partition the data into a plurality of sub-blocks;
determine whether a first one of the sub-blocks is identical to another one of the sub-blocks or similar to another one of the sub-blocks;
if the first one of the sub-blocks is identical to another one of the sub-blocks, apply hash-based deduplication to storage of the first one of the sub-blocks with respect to the other one of the sub-blocks; and
if the first one of the sub-blocks is similar to another one of the sub-blocks, apply sub-block differencing to storage of the first one of the sub-blocks with respect to the other one of the sub-blocks.
(Dependent claims: 9, 10, 11, 12, 13, 14)
15. A method comprising:
accessing by one or more computer systems a deduplicated version of data; and
re-creating by the one or more computer systems the data from the deduplicated version using:
results of hash-based deduplication applied to storage of sub-blocks of the data that are identical to other sub-blocks of the data; and
results of sub-block differencing applied to storage of sub-blocks of the data that are similar to other ones of the sub-blocks.
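The re-creation step of claim 15 can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the deduplicated version is assumed to consist of a hypothetical `store` of full sub-blocks keyed by an identifier and a `recipe` with one entry per original sub-block, where `("ref", key)` is the result of hash-based deduplication and `("delta", key, ops)` is the result of sub-block differencing expressed as copy/insert operations. These names and layouts are invented for the example.

```python
def apply_delta(base: bytes, ops) -> bytes:
    """Rebuild a sub-block from a base sub-block and copy/insert operations."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, start, end = op
            out += base[start:end]   # bytes shared with the base sub-block
        elif op[0] == "insert":
            out += op[1]             # bytes unique to the similar sub-block
        else:
            raise ValueError(f"unknown delta operation: {op[0]!r}")
    return bytes(out)


def recreate(store: dict, recipe: list) -> bytes:
    """Re-create the original data from the deduplicated version."""
    out = bytearray()
    for entry in recipe:
        if entry[0] == "ref":
            # Identical sub-block: resolve the reference to the stored copy.
            out += store[entry[1]]
        elif entry[0] == "delta":
            # Similar sub-block: apply the stored difference to its base.
            _, base_key, ops = entry
            out += apply_delta(store[base_key], ops)
        else:
            raise ValueError(f"unknown recipe entry: {entry[0]!r}")
    return bytes(out)


# Hand-built deduplicated version: one stored sub-block, one duplicate
# reference, and one similar sub-block that differs in its last byte.
store = {"k0": b"ABCDEFGHIJKLMNOP"}
recipe = [
    ("ref", "k0"),
    ("ref", "k0"),
    ("delta", "k0", [("copy", 0, 15), ("insert", b"X")]),
]
restored = recreate(store, recipe)
```

Here three sub-blocks of output are rebuilt from a single stored sub-block plus one inserted byte, which is the storage saving the two techniques are combined to achieve.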
16. One or more computer-readable storage media embodying instructions that are operable when executed by one or more computer systems to:
access a deduplicated version of data; and
re-create the data from the deduplicated version using:
results of hash-based deduplication applied to storage of sub-blocks of the data that are identical to other sub-blocks of the data; and
results of sub-block differencing applied to storage of sub-blocks of the data that are similar to other ones of the sub-blocks.
17. One or more computer-readable storage media embodying data that was stored on the media at least in part by:
accessing the data by one or more computer systems;
partitioning by the one or more computer systems the data into a plurality of sub-blocks;
determining by the one or more computer systems whether a first one of the sub-blocks is identical to another one of the sub-blocks or similar to another one of the sub-blocks;
if the first one of the sub-blocks is identical to another one of the sub-blocks, applying by the one or more computer systems hash-based deduplication to storage of the first one of the sub-blocks with respect to the other one of the sub-blocks; and
if the first one of the sub-blocks is similar to another one of the sub-blocks, applying by the one or more computer systems sub-block differencing to storage of the first one of the sub-blocks with respect to the other one of the sub-blocks.
(Dependent claims: 18, 19, 20, 21, 22)
Specification