Efficient data storage using two level delta resemblance
Abstract
Storage using resemblance of data segments is disclosed. It is determined that a new segment resembles a second prior stored segment, wherein the second prior stored segment is represented as a first stored delta and a first prior stored segment. A second delta between the new segment and the second prior stored segment is determined. A representation of the new segment, based at least in part on the second delta, is stored.
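The two-level arrangement described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `SegmentStore` class, the `make_delta` and `apply_delta` helpers, and the copy/insert delta format built on Python's `difflib` are all assumptions chosen for illustration. The point it shows is that a new segment can be stored as a delta against a segment that is itself stored as a delta against a fully stored segment.

```python
from difflib import SequenceMatcher


def make_delta(base: bytes, target: bytes) -> list:
    """Encode target as copy/insert operations against base (hypothetical format)."""
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))         # reuse bytes base[i1:i2]
        elif tag in ("replace", "insert"):
            ops.append(("data", target[j1:j2]))  # literal bytes absent from base
        # "delete" emits nothing: those base bytes are simply skipped
    return ops


def apply_delta(base: bytes, ops: list) -> bytes:
    """Rebuild the target segment from base and the delta operations."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += base[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


class SegmentStore:
    """Hypothetical store: each entry is a full segment or (base id, delta)."""

    def __init__(self):
        self.entries = {}

    def put_full(self, seg_id, data):
        self.entries[seg_id] = ("full", data)

    def put_delta(self, seg_id, base_id, data):
        # The base may itself be stored as a delta; get() resolves the chain.
        delta = make_delta(self.get(base_id), data)
        self.entries[seg_id] = ("delta", base_id, delta)

    def get(self, seg_id):
        entry = self.entries[seg_id]
        if entry[0] == "full":
            return entry[1]
        _, base_id, ops = entry
        return apply_delta(self.get(base_id), ops)


store = SegmentStore()
first = b"the quick brown fox jumps over the lazy dog"
second = b"the quick brown fox leaps over the lazy dog"
new = b"the quick brown fox leaps over the sleepy dog"
store.put_full("first", first)              # first prior stored segment
store.put_delta("second", "first", second)  # first stored delta
store.put_delta("new", "second", new)       # second delta, two levels deep
```

In this sketch, reading back the new segment requires resolving both deltas in turn, which is the cost traded for storing only the differences.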
31 Claims
1. A method of storage using resemblance of data segments comprising:
breaking up a new input data stream into a plurality of data segments, wherein a new segment comprises one of the plurality of data segments, and wherein the breaking up of the new input data stream is based at least in part on a hash of content of the data stream;
determining that the new segment resembles a second prior stored segment, wherein the second prior stored segment is represented as a first stored delta and a first prior stored segment, and wherein resembling is based at least in part on a comparison between a subset of values of a set of values calculated by applying one or more functions to at least a portion of the new segment and a plurality of corresponding values calculated by applying the one or more functions to at least a portion of the second prior stored segment, wherein the subset of values comprises one of the following:
a lowest n values of the set of values, a highest m values of the set of values, or a lowest k values of the set of values and a highest l values of the set of values, wherein n, m, k, and l are integers;
determining a second delta between the new segment and the second prior stored segment; and
storing a representation of the new segment based at least in part on the second delta.
2. A non-transitory computer readable storage medium storing a computer program product comprising computer instructions for:
breaking up a new input data stream into a plurality of data segments, wherein a new segment comprises one of the plurality of data segments, and wherein the breaking up of the new input data stream is based at least in part on a hash of content of the data stream;
determining that the new segment resembles a second prior stored segment, wherein the second prior stored segment is represented as a first stored delta and a first prior stored segment, and wherein resembling is based at least in part on a comparison between a subset of values of a set of values calculated by applying one or more functions to at least a portion of the new segment and a plurality of corresponding values calculated by applying the one or more functions to at least a portion of the second prior stored segment, wherein the subset of values comprises one of the following:
a lowest n values of the set of values, a highest m values of the set of values, or a lowest k values of the set of values and a highest l values of the set of values, wherein n, m, k, and l are integers;
determining a second delta between the new segment and the second prior stored segment; and
storing a representation of the new segment based at least in part on the second delta.
3. A system for storage using resemblance of data segments comprising:
a processor; and
a memory coupled with the processor, wherein the memory is configured to provide the processor with instructions which when executed cause the processor to:
break up a new input data stream into a plurality of data segments, wherein a new segment comprises one of the plurality of data segments, and wherein the breaking up of the new input data stream is based at least in part on a hash of content of the data stream;
determine that the new segment resembles a second prior stored segment, wherein the second prior stored segment is represented as a first stored delta and a first prior stored segment, and wherein resembling is based at least in part on a comparison between a subset of values of a set of values calculated by applying one or more functions to at least a portion of the new segment and a plurality of corresponding values calculated by applying the one or more functions to at least a portion of the second prior stored segment, wherein the subset of values comprises one of the following:
a lowest n values of the set of values, a highest m values of the set of values, or a lowest k values of the set of values and a highest l values of the set of values, wherein n, m, k, and l are integers;
determine a second delta between the new segment and the second prior stored segment; and
store a representation of the new segment based at least in part on the second delta.
Dependent claims: 4 through 31.
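The segmentation and resemblance steps recited in the independent claims can be sketched as follows. This is an illustrative reading only, not the claimed implementation: the function names (`split_segments`, `sketch`, `resembles`), the BLAKE2b hash, the window, mask, and shingle sizes, and the 0.5 overlap threshold are all assumptions. Segment boundaries are placed where a hash of the trailing content window meets a condition, and two segments are treated as resembling each other when their lowest-n hash values overlap sufficiently.

```python
import hashlib


def _h(chunk: bytes) -> int:
    """64-bit hash of a byte string (assumed function; any good hash would do)."""
    return int.from_bytes(hashlib.blake2b(chunk, digest_size=8).digest(), "big")


def split_segments(data: bytes, window: int = 16, mask: int = 0x3F,
                   min_size: int = 32) -> list:
    """Cut the stream wherever the hash of the trailing window has its low bits zero.

    Boundaries depend on content rather than on fixed offsets, so an insertion
    early in the stream does not shift every later boundary.
    Assumes min_size >= window so the window never reaches before the segment start.
    """
    segments, start = [], 0
    for i in range(len(data)):
        if i + 1 - start < min_size:
            continue
        if _h(data[i + 1 - window:i + 1]) & mask == 0:
            segments.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        segments.append(data[start:])  # trailing remainder
    return segments


def sketch(segment: bytes, n: int = 8, shingle: int = 8) -> list:
    """Hash every overlapping shingle and keep the lowest n distinct values."""
    count = max(1, len(segment) - shingle + 1)
    values = {_h(segment[i:i + shingle]) for i in range(count)}
    return sorted(values)[:n]


def resembles(a: bytes, b: bytes, n: int = 8, threshold: float = 0.5) -> bool:
    """Compare the lowest-n subsets; the threshold is an arbitrary assumption."""
    sa, sb = set(sketch(a, n)), set(sketch(b, n))
    return len(sa & sb) / len(sa | sb) >= threshold


# Deterministic sample stream, for demonstration only.
data = bytes((i * i * 7 + i) % 251 for i in range(4000))
segments = split_segments(data)
```

A highest-m variant would take `sorted(values)[-m:]` instead, and the combined variant would concatenate the lowest k and highest l values; only the lowest-n case is shown here.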
Specification