CONTENT ALIGNED BLOCK-BASED DEDUPLICATION
Abstract
A content alignment system according to certain embodiments aligns a sliding window at the beginning of a data segment. The content alignment system performs a block alignment function on the data within the sliding window. A deduplication block is established if the output of the block alignment function meets predetermined criteria. At least part of a gap is established if the output of the block alignment function does not meet the predetermined criteria. The predetermined criteria are changed if a threshold number of outputs fail to meet them.
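The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name `align_blocks`, the use of `zlib.crc32` reduced modulo `modulus` as the block alignment function, the one-byte step after a miss, and all default parameter values are assumptions made for the example.

```python
import zlib


def align_blocks(segment: bytes, window: int = 16, block_size: int = 64,
                 modulus: int = 256, first_range=range(0, 4),
                 second_range=range(0, 32), threshold: int = 128):
    """Tile `segment` with ("block" | "gap", start, end) tuples.

    Hypothetical sketch: CRC32 modulo `modulus` stands in for the
    block alignment function; the source does not name a function.
    Assumes window <= block_size.
    """
    accept = first_range          # current acceptance criteria
    misses = 0                    # consecutive outputs that failed the criteria
    pos = 0                       # start of the sliding window
    gap_start = None              # start of the gap currently being built
    out = []
    while pos + block_size <= len(segment):
        value = zlib.crc32(segment[pos:pos + window]) % modulus
        if value in accept:
            # Criteria met: close any open gap, then establish a block.
            if gap_start is not None:
                out.append(("gap", gap_start, pos))
                gap_start = None
            out.append(("block", pos, pos + block_size))
            pos += block_size
            misses = 0
        else:
            # Criteria not met: this position becomes part of a gap.
            if gap_start is None:
                gap_start = pos
            pos += 1
            misses += 1
            if misses >= threshold:
                accept = second_range   # change the criteria
    # Whatever is left over cannot hold a full block; treat it as a gap.
    tail_start = gap_start if gap_start is not None else pos
    if tail_start < len(segment):
        out.append(("gap", tail_start, len(segment)))
    return out
```

Calling `align_blocks(data)` returns spans that cover the whole segment in order, so the gaps between fixed-size blocks are explicit. Once the criteria change, this sketch keeps using the second range for the rest of the segment.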
20 Claims
1. A method for refining criteria for determining deduplication block alignments within a data segment, the method comprising:
  selecting a first range of output values for a deduplication block alignment function which indicate that a block alignment has been found; and
  iteratively performing a block alignment function on data within a sliding window in a data segment and, for each iterative performance of the block alignment function:
    in response to determining with one or more computer processors that the output of the block alignment function performed on a current window of data falls within the first range:
      establishing a deduplication data block having a predetermined block size;
      moving the sliding window in a first direction relative to the data segment by an amount based on the predetermined block size before performing the next iteration; and
    in response to determining that the output of the block alignment function performed on the current window of data does not fall within the first range for a threshold number of iterations:
      selecting a second range of output values for the block alignment function which indicate that a block alignment has been found; and
      moving the sliding window over data relative to the data segment in a second direction opposite the first direction before performing the next iteration using the second range, wherein the next iteration is performed on data on which the block alignment function was previously performed using the first range.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13.
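What distinguishes claim 1 is the backward move: after a threshold number of misses, the window is moved opposite to its scan direction so that data already examined with the first range is re-examined with the second. A hedged sketch of that step follows. The name `align_with_backtrack`, the CRC32-based stand-in for the alignment function, and the choice to rewind exactly to the start of the run of misses are assumptions; the claim does not say how far the window moves back.

```python
import zlib


def align_with_backtrack(segment: bytes, window: int = 16,
                         block_size: int = 64, modulus: int = 256,
                         first_range=range(0, 4),
                         second_range=range(0, 32),
                         threshold: int = 128):
    """Return (start, end) block spans; rewinds once when criteria relax."""
    accept = first_range
    relaxed = False               # True once the second range is in use
    misses = 0                    # consecutive misses under current range
    pos = 0
    blocks = []
    while pos + block_size <= len(segment):
        # Hypothetical alignment function: CRC32 of the window, reduced.
        value = zlib.crc32(segment[pos:pos + window]) % modulus
        if value in accept:
            blocks.append((pos, pos + block_size))
            pos += block_size     # first direction, by the block size
            misses = 0
            continue
        pos += 1
        misses += 1
        if not relaxed and misses >= threshold:
            accept = second_range
            relaxed = True
            pos -= misses         # second direction: back over scanned data
            misses = 0
    return blocks
```

Because the rewind happens only once (guarded by `relaxed`), the loop always terminates. With an empty first range the window rewinds to offset 0 and the first block starts there, instead of after the run of misses.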
14. A deduplication system configured for determining deduplication block alignments within a data segment, the system comprising:
  a block alignment module executing in one or more processors and configured to:
    select a first range of output values for a deduplication block alignment function which indicate that a block alignment has been found; and
    iteratively perform the deduplication block alignment function on data within a sliding window in a data segment and, for each iterative performance of the deduplication block alignment function, the block alignment module configured to:
      determine whether the output of the deduplication block alignment function performed on data within the sliding window falls within the first range; and
      establish a deduplication data block with a predetermined block size in response to determining that the output of the deduplication block alignment function performed on the data within the sliding window falls within the first range; and
  a criteria adjustment module configured to select a second range of output values for the block alignment function which indicate that a block alignment has been found, the selection of the second range in response to the block alignment module determining, for a threshold number of iterations, that the output of the block alignment performed on the data within the sliding window does not fall within the first range, wherein the second range is used instead of the first range for subsequent iterations of the block alignment function.
Dependent claims: 15, 16, 17, 18, 19.
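The two-module split in claim 14 could be organized along these lines. The class names mirror the claim language, but the fields, method names (`record`, `run`), the CRC32-based alignment function, and the one-byte step after a miss are all assumptions made for illustration.

```python
import zlib
from dataclasses import dataclass


@dataclass
class CriteriaAdjustmentModule:
    """Tracks consecutive misses and decides when to switch ranges."""
    second_range: range
    threshold: int
    misses: int = 0

    def record(self, hit: bool, current: range) -> range:
        # Return the range the alignment module should use next.
        self.misses = 0 if hit else self.misses + 1
        if self.misses >= self.threshold:
            return self.second_range
        return current


@dataclass
class BlockAlignmentModule:
    """Slides the window and establishes fixed-size blocks on a hit."""
    accept: range
    adjuster: CriteriaAdjustmentModule
    window: int = 16
    block_size: int = 64
    modulus: int = 256

    def run(self, segment: bytes) -> list:
        blocks, pos = [], 0
        while pos + self.block_size <= len(segment):
            # Hypothetical alignment function, not specified by the source.
            chunk = segment[pos:pos + self.window]
            hit = zlib.crc32(chunk) % self.modulus in self.accept
            if hit:
                blocks.append((pos, pos + self.block_size))
                pos += self.block_size
            else:
                pos += 1
            self.accept = self.adjuster.record(hit, self.accept)
        return blocks
```

Keeping the miss counter inside the adjustment module means the alignment loop only reports hits and misses, so the policy for relaxing the criteria can change without touching the scan itself.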
20. A method of determining deduplication block alignments within a data segment, the method comprising:
  selecting a first range of possible output values of a deduplication block alignment function which indicate that a block alignment has been found; and
  iteratively performing the deduplication block alignment function on data within a sliding window in a data segment and, for each iterative performance of the deduplication block alignment function:
    determining with one or more computer processors whether the output of the deduplication block alignment function performed on the data within the sliding window falls within the first range;
    establishing with one or more computer processors a deduplication data block having a predetermined block size in response to determining that the output of the block alignment function falls within the first range; and
    selecting a second range of output values for the block alignment function which indicate that a block alignment has been found, the selection of the second range performed in response to determining, for a threshold number of iterations, that the output of the block alignment does not fall within the first range, wherein the second range is used instead of the first range for subsequent iterations of the block alignment function.
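To see why switching to a second range after repeated misses can help, consider an idealized model that the claims do not state: the alignment function's output is uniform over $M$ possible values and independent across window positions. The symbols $M$, $R$, $p$, and $T$ are introduced here for illustration only.

```latex
% Assumed model: uniform, independent outputs over M values.
% R is the accepted range, T the threshold number of iterations.
\begin{align*}
  p &= \frac{|R|}{M}
    && \text{probability that one iteration finds an alignment} \\
  \Pr[\text{$T$ consecutive misses}] &= (1 - p)^{T} \\
  \mathbb{E}[\text{iterations until first hit}] &= \frac{1}{p} = \frac{M}{|R|}
\end{align*}
```

Under this model, a second range that is larger than the first raises $p$, which shortens the expected run of misses and therefore the expected gap. Real data is not uniform, so these expressions are a rough guide rather than a guarantee.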
Specification