Content aligned block-based deduplication
First Claim
1. A method for refining criteria for determining deduplication block alignments within a data segment, the method comprising:
selecting a first range of output values for a deduplication block alignment function which indicate that a block alignment has been found; and
iteratively performing the block alignment function using one or more computer processors on data within a sliding window in the data segment, wherein the sliding window comprises a sliding boundary, and for each iterative performance of the block alignment function:
  in response to determining with the one or more computer processors that an output of the block alignment function performed on a current window of data of the data segment falls within the first range:
    establishing a deduplication data block having a predetermined block size;
    moving the sliding window in a first direction relative to the data segment by an amount based on the predetermined block size before performing a next iteration; and
  in response to determining that the output of the block alignment function performed on the current window of data does not fall within the first range for a threshold number of iterations:
    selecting a second range of output values for the block alignment function which indicate that a block alignment has been found; and
    moving the sliding window relative to the data segment in a second direction opposite to the first direction before performing the next iteration using the second range, wherein the next iteration is performed on data on which the block alignment function was previously performed using the first range.
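The control flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made only for illustration: the name `align_blocks`, the toy alignment function `toy_align_fn` (an Adler-32 checksum reduced modulo 1024), the one-byte advance after a miss, the choice to move back to the start of the current run of misses, and the choice to keep using the wider range once it has been selected. The claim itself does not specify any of these.

```python
import zlib
from typing import Callable, List, Sequence, Tuple


def toy_align_fn(window: bytes) -> int:
    """Hypothetical block alignment function: a checksum folded to 0..1023."""
    return zlib.adler32(window) % 1024


def align_blocks(
    data: bytes,
    window_size: int,
    block_size: int,
    ranges: Sequence[range],
    threshold: int,
    align_fn: Callable[[bytes], int] = toy_align_fn,
) -> List[Tuple[int, int]]:
    """Return (offset, range_index) for every deduplication block established.

    ranges[0] plays the role of the "first range" and ranges[1] the
    "second range"; further entries are an extension of the same idea.
    """
    blocks: List[Tuple[int, int]] = []
    pos = 0      # left edge of the sliding window
    level = 0    # index of the range currently in use
    misses = 0   # consecutive iterations whose output fell outside the range
    span = max(window_size, block_size)
    while pos + span <= len(data):
        output = align_fn(data[pos:pos + window_size])
        if output in ranges[level]:
            # Alignment found: establish a fixed-size block and move the
            # window forward (first direction) by the block size.
            blocks.append((pos, level))
            pos += block_size
            misses = 0
        else:
            misses += 1
            if misses >= threshold and level + 1 < len(ranges):
                # Select the next range and move the window backward
                # (second direction) to where this run of misses began, so
                # the next iteration re-examines already scanned data.
                level += 1
                pos -= misses - 1
                misses = 0
            else:
                pos += 1  # assumption: slide one byte after an ordinary miss
    return blocks
```

For example, with a stand-in alignment function that returns the first byte of the window, `align_blocks(bytes([1, 2, 2, 2, 2, 2, 2, 2]), 1, 4, [range(0, 1), range(0, 2)], 3, lambda w: w[0])` misses three times under the first range, switches to the second range, moves back to offset 0, and establishes a block there.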
Abstract
A content alignment system according to certain embodiments aligns a sliding window at the beginning of a data segment. The content alignment system performs a block alignment function on the data within the sliding window. A deduplication block is established if the output of the block alignment function meets a predetermined criteria. At least part of a gap is established if the output of the block alignment function does not meet the predetermined criteria. The predetermined criteria is changed if a threshold number of outputs fail to meet the predetermined criteria.
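As a rough illustration of how blocks and gaps might be recorded under this scheme, the following Python sketch partitions a data segment into labeled spans. It is a sketch under stated assumptions rather than the embodiment itself: the function name `partition_segment`, the representation of the criteria as an ordered list of predicates, the window being equal to the block size, and the one-byte advance after a failed test are all hypothetical choices that the abstract does not specify.

```python
from typing import Callable, List, Sequence, Tuple

Span = Tuple[str, int, int]  # (kind, start, end) with end exclusive


def partition_segment(
    data: bytes,
    block_size: int,
    criteria: Sequence[Callable[[int], bool]],
    threshold: int,
    align_fn: Callable[[bytes], int],
) -> List[Span]:
    """Split data into "block" and "gap" spans.

    criteria[0] is the initial predetermined criteria; each later entry is
    the criteria adopted after `threshold` consecutive failures (assumption).
    """
    spans: List[Span] = []
    pos = 0            # window starts at the beginning of the segment
    level = 0          # which criteria are currently in force
    fails = 0          # consecutive outputs that failed the criteria
    gap_start = None   # start of the gap being accumulated, if any
    while pos + block_size <= len(data):
        output = align_fn(data[pos:pos + block_size])
        if criteria[level](output):
            if gap_start is not None:
                spans.append(("gap", gap_start, pos))  # close the open gap
                gap_start = None
            spans.append(("block", pos, pos + block_size))
            pos += block_size
            fails = 0
        else:
            if gap_start is None:
                gap_start = pos  # this byte becomes part of a gap
            pos += 1             # assumption: advance one byte on failure
            fails += 1
            if fails >= threshold and level + 1 < len(criteria):
                level += 1       # change the criteria after repeated failures
                fails = 0
    # Whatever remains at the end of the segment is treated as a gap.
    tail_start = gap_start if gap_start is not None else pos
    if tail_start < len(data):
        spans.append(("gap", tail_start, len(data)))
    return spans
```

This variant only moves forward; it is meant to show the block and gap bookkeeping and the criteria change, not any repositioning of the window.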
19 Claims
14. A deduplication system for determining deduplication block alignments within a data segment, the system comprising:
one or more computer processors;
a block alignment module being executed by the one or more computer processors and configured to:
  select a first range of output values for a deduplication block alignment function which indicates that a block alignment has been found; and
  iteratively perform the block alignment function using one or more computer processors on data within a sliding window in the data segment, wherein the sliding window comprises a sliding boundary, and for each iterative performance of the block alignment function, the block alignment module is configured to:
    determine whether an output of the block alignment function performed on a current window of data falls within the first range,
    establish a deduplication data block having a predetermined block size in response to determining that the output of the block alignment function performed on the current window of data falls within the first range;
    move the sliding window in a first direction relative to the data segment by an amount based on the predetermined block size before performing a next iteration; and
a criteria adjustment module configured to:
  select a second range of output values for the block alignment function which indicates that a block alignment has been found in response to determining that the output of the block alignment function performed on the current window of data does not fall within the first range for a threshold number of iterations; and
  move the sliding window relative to the data segment in a second direction opposite to the first direction before performing the next iteration using the second range, wherein the next iteration is performed on data on which the block alignment function was previously performed using the first range.
Specification