Content aligned block-based deduplication
Abstract
A content alignment system according to certain embodiments aligns a sliding window at the beginning of a data segment. The content alignment system performs a block alignment function on the data within the sliding window. A deduplication block is established if the output of the block alignment function meets a predetermined criteria. At least part of a gap is established if the output of the block alignment function does not meet the predetermined criteria. The predetermined criteria is changed if a threshold number of outputs fail to meet the predetermined criteria.
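The last sentence of the abstract, changing the criteria after a threshold number of failures, can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the class name `RelaxingCriterion`, the "low bits of the output are zero" test, counting consecutive failures, and relaxing by dropping one mask bit. The abstract says only that the criteria is changed, not how.

```python
class RelaxingCriterion:
    """Hypothetical criterion: satisfied when the low bits of the output are zero.

    After `threshold` consecutive failures the mask loses one bit, which makes
    the criterion easier to satisfy (an assumed form of "changing the criteria").
    """

    def __init__(self, mask_bits: int = 4, threshold: int = 3):
        self.mask = (1 << mask_bits) - 1  # e.g. 4 bits -> 0b1111
        self.threshold = threshold
        self.failures = 0                 # consecutive failures seen so far

    def check(self, output: int) -> bool:
        """Return True if `output` satisfies the current criterion."""
        if output & self.mask == 0:
            self.failures = 0
            return True
        self.failures += 1
        if self.failures >= self.threshold:
            self.mask >>= 1               # relax: test one fewer bit
            self.failures = 0
        return False


criterion = RelaxingCriterion(mask_bits=2, threshold=2)
results = [criterion.check(2) for _ in range(3)]
```

With a 2-bit mask and a threshold of 2, the output value 2 fails twice, the mask shrinks from `0b11` to `0b1`, and the third check succeeds.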
22 Claims
1. A method for defining deduplication block alignments within a data segment, the method comprising:
  iteratively performing a block alignment function on data within a sliding window in a data segment, and following each iterative performance of the block alignment function:
    in response to determining that the output of the block alignment function performed on a current window of data satisfies a predetermined criteria:
      establishing with one or more computer processors a deduplication data block having a predetermined block size; and
      moving the sliding window relative to the data segment by an amount based on the predetermined block size before a next iterative performance of the block alignment function; and
    in response to determining that the output of the block alignment function performed on the current window of data does not satisfy the predetermined criteria:
      moving the sliding window relative to the data segment by a predetermined incremental amount that is different from the predetermined block size before the next iterative performance of the block alignment function and without establishing a deduplication data block,
  wherein gaps of data not belonging to any deduplication data block exist between at least some established deduplication data blocks following performance of the block alignment function across the data segment, and
  wherein the size of the gaps of data is based at least on a number of successive outputs of the block alignment function that do not satisfy the predetermined criteria.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
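The loop recited in claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `zlib.crc32` stands in for the block alignment function, "low bits of the output are zero" stands in for the predetermined criteria, the window advances by exactly the block size after a block is established, and the function name `align_blocks` and all default parameter values are hypothetical.

```python
import zlib


def align_blocks(segment: bytes, window: int = 8, block_size: int = 16,
                 step: int = 1, mask: int = 0x7):
    """Return (blocks, gaps) as lists of (start, end) byte offsets.

    Assumptions: CRC-32 of the window is the alignment function, and the
    criterion is satisfied when `output & mask == 0`.
    """
    blocks, gaps = [], []
    pos = 0        # current start of the sliding window
    gap_start = 0  # first byte not yet assigned to a block
    while pos + block_size <= len(segment):
        output = zlib.crc32(segment[pos:pos + window])
        if output & mask == 0:
            # Criterion satisfied: close any pending gap, establish a block,
            # and move the window by the block size.
            if pos > gap_start:
                gaps.append((gap_start, pos))
            blocks.append((pos, pos + block_size))
            pos += block_size
            gap_start = pos
        else:
            # Criterion not satisfied: move by the smaller increment without
            # establishing a block; the gap grows by `step`.
            pos += step
    if gap_start < len(segment):
        gaps.append((gap_start, len(segment)))  # trailing unaligned data
    return blocks, gaps


data = bytes(range(256)) * 4
blocks, gaps = align_blocks(data)
```

Because block starts depend on the window contents rather than on fixed offsets, each gap's length reflects how many successive windows failed the test before the next block was established.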
11. A deduplication system configured to define deduplication block alignments within a data segment, the system comprising:
  a deduplication block alignment module executing in one or more processors and configured to iteratively perform a deduplication block alignment function on data within a sliding window in a data segment and which, for each iterative performance of the block alignment function, is configured to:
    establish a deduplication block having a predetermined block size in response to determining that the output of the deduplication block alignment function performed on the data within the sliding window satisfies a predetermined criteria; and
    define at least a portion of data having a predetermined incremental size that is different from the predetermined block size as not belonging to a deduplication block in response to determining that the output of the block alignment function performed on the data within the sliding window does not satisfy the predetermined criteria,
  wherein the predetermined incremental size is determined prior to determining that the output of the block alignment function performed on the data within the sliding window does not satisfy the predetermined criteria.
Dependent claims: 12, 13, 14, 15, 16
17. A method for defining deduplication block alignments within a data segment, the method comprising:
  iteratively performing a deduplication block alignment function on data within a sliding window in a data segment and, for each iterative performance of the deduplication block alignment function:
    establishing with one or more computer processors a deduplication data block having a predetermined block size in response to determining that the output of the deduplication block alignment function performed on the data within the sliding window satisfies a predetermined criteria; and
    defining at least a portion of a gap of data having a predetermined incremental size that is different from the predetermined block size as not belonging to a deduplication data block in response to determining that the output of the deduplication block alignment function performed on the data within the sliding window does not satisfy the predetermined criteria,
  wherein the predetermined incremental size is determined prior to determining that the output of the block alignment function performed on the data within the sliding window does not satisfy the predetermined criteria.
Dependent claims: 18, 19, 20, 21
22. A deduplication system configured to define deduplication block alignments within a data segment, the system comprising:
  a deduplication block alignment module executing in one or more processors and configured to iteratively perform a deduplication block alignment function on data within a sliding window in a data segment and which, for each iterative performance of the block alignment function, is configured to:
    in response to a determination that the output of the block alignment function performed on a current window of data satisfies a predetermined criteria:
      establish a deduplication block having a predetermined block size, and
      move the sliding window relative to the data segment by an amount based on the predetermined block size before a next iterative performance of the block alignment function; and
    in response to a determination that the output of the block alignment function performed on the current window of data does not satisfy the predetermined criteria:
      move the sliding window relative to the data segment by a predetermined incremental amount that is different from the predetermined block size before the next iterative performance of the block alignment function and without establishing a deduplication data block,
  wherein gaps of data not belonging to any deduplication data block exist between at least some established deduplication data blocks following performance of the block alignment function across the data segment, and
  wherein the size of the gaps of data is based at least on a number of successive outputs of the block alignment function that do not satisfy the predetermined criteria.
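The gap-size limitation at the end of this claim can be made concrete under a simplifying assumption. The symbols below are introduced here for illustration and do not appear in the claims: $B$ is the predetermined block size, $s$ is the predetermined incremental amount, $k$ is the number of successive outputs that fail the criteria, and $p$ is the start offset of the most recently established block. The sketch further assumes that the "amount based on the predetermined block size" is exactly $B$ and that the window moves by exactly $s$ on every failure.

```latex
\begin{aligned}
g  &= k \cdot s          && \text{(size of the gap between two consecutive blocks)} \\
p' &= p + B + k \cdot s  && \text{(start offset of the next established block)}
\end{aligned}
```

With hypothetical values $B = 4096$ bytes, $s = 1$ byte, and $k = 37$ failing windows, the gap is $g = 37$ bytes and the next block starts at $p' = p + 4133$. The claim language says only that the gap size is "based at least on" the number of failures, so other relationships are not excluded.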
Specification