Single-ended deduplication using cloud storage protocol
First Claim
1. A method of transferring data to a remote data storage, the method comprising:
dividing, at a first network location, at least a portion of a first data stream into segments;
identifying a first segment in the first data stream that is a duplicate of a second segment in a second data stream, wherein the second data stream is stored in a remote data storage at a second network location;
removing, at the first network location, at least the first segment from the first data stream to form an optimized data stream;
recording, at the first network location, an identity and location of at least the first segment within the first data stream;
transferring the optimized data stream from the first network location to the remote data storage at the second network location;
generating, at the first network location, a copy command that at least identifies the second data stream, a source location of the second segment within the second data stream, a source length of the second segment, and a destination location of the removed first segment within the optimized data stream, wherein the recorded identity and location of the first segment within the first data stream is used to generate the copy command; and
sending the copy command from the first network location to the remote data storage at the second network location, wherein the remote data storage executes the copy command which causes the remote data storage to copy the second segment from the source location in the second data stream to the destination location in the optimized data stream, thereby reconstructing the first data stream in the remote data storage at the second network location without transferring any portion of the optimized data stream or the first segment back to the first network location, and wherein the remote data storage does not require an optimization device to reconstruct the first data stream.
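The steps of claim 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: fixed-size segmentation, SHA-256 fingerprints with no byte-for-byte check, and the names `CopyCommand`, `build_index`, `optimize`, and `execute` are all hypothetical. The sketch also assumes the destination location is the offset at which the segment is re-inserted, with commands applied in ascending order, so the recorded location in the first data stream can be reused directly.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class CopyCommand:
    source_stream: str  # identifies the second (already stored) data stream
    source_offset: int  # where the duplicate segment starts in that stream
    length: int         # source length of the segment
    dest_offset: int    # where the removed segment is re-inserted


def segment(data: bytes, size: int):
    """Yield (offset, chunk) pairs; fixed-size chunking is an assumption."""
    for off in range(0, len(data), size):
        yield off, data[off:off + size]


def build_index(stream_id: str, data: bytes, size: int) -> dict:
    """Map segment fingerprint -> (stream id, offset, length) for a stored stream."""
    index = {}
    for off, chunk in segment(data, size):
        index.setdefault(hashlib.sha256(chunk).digest(), (stream_id, off, len(chunk)))
    return index


def optimize(first_stream: bytes, index: dict, size: int):
    """Drop segments the remote side already holds; return (optimized, commands)."""
    kept, commands = bytearray(), []
    for off, chunk in segment(first_stream, size):
        hit = index.get(hashlib.sha256(chunk).digest())
        if hit is None:
            kept += chunk  # unique data stays in the optimized stream
        else:
            src_id, src_off, length = hit
            # The recorded location (off) becomes the destination of the copy.
            commands.append(CopyCommand(src_id, src_off, length, off))
    return bytes(kept), commands


def execute(optimized: bytes, commands, stored: dict) -> bytes:
    """Stand-in for the remote storage: apply commands in ascending dest order."""
    out = bytearray(optimized)
    for cmd in sorted(commands, key=lambda c: c.dest_offset):
        src = stored[cmd.source_stream]
        piece = src[cmd.source_offset:cmd.source_offset + cmd.length]
        out[cmd.dest_offset:cmd.dest_offset] = piece  # insert, do not overwrite
    return bytes(out)


second = b"AAAABBBBCCCC"        # stream already held by the remote storage
first = b"XXXXBBBBYYYYAAAA"     # stream to transfer
index = build_index("stream-2", second, 4)
optimized, commands = optimize(first, index, 4)
rebuilt = execute(optimized, commands, {"stream-2": second})
```

In this example only the eight unique bytes travel over the network; the two duplicate segments are restored remotely from `stream-2`, and nothing is sent back to the first network location.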
Abstract
A single-ended optimized storage protocol enables storage clients or other devices to direct a remote data storage to copy data. In response to commands via the protocol, a remote data storage can copy portions of a data stream at the remote data storage to destination storage locations within the same or a different data stream. The protocol may be utilized for optimized transfer of data via a network to a remote data storage. An initial data stream is divided into segments. Redundant segments are removed from the data stream to form an optimized data stream, which is transferred to the remote data storage. Commands are issued to the remote data storage using the protocol to direct the remote data storage to reconstruct the initial data stream at the remote data storage using the optimized data stream and optionally segments from other data streams previously transferred to the remote data storage.
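To make the remote side of the protocol concrete, the following toy in-memory store accepts a copy command and copies a byte range from one stored stream into the same or a different stream. Everything here is an assumption for illustration: the `RemoteStore` class, its `put`, `copy`, and `get` methods, and the choice to insert at the destination rather than overwrite are hypothetical and are not taken from the patent or from any real cloud storage API.

```python
class RemoteStore:
    """Toy in-memory stand-in for a remote data storage (hypothetical API)."""

    def __init__(self):
        self.streams = {}  # stream id -> bytearray

    def put(self, stream_id: str, data: bytes) -> None:
        """Store a transferred (possibly optimized) data stream."""
        self.streams[stream_id] = bytearray(data)

    def get(self, stream_id: str) -> bytes:
        return bytes(self.streams[stream_id])

    def copy(self, src_id: str, src_offset: int, length: int,
             dst_id: str, dst_offset: int) -> None:
        """Copy a range from one stream into the same or a different stream."""
        src = self.streams[src_id]
        dst = self.streams[dst_id]
        if src_offset < 0 or length < 0 or src_offset + length > len(src):
            raise ValueError("source range lies outside the source stream")
        if not 0 <= dst_offset <= len(dst):
            raise ValueError("destination offset lies outside the stream")
        # Snapshot the range first so a same-stream copy reads stable bytes.
        piece = bytes(src[src_offset:src_offset + length])
        dst[dst_offset:dst_offset] = piece  # insert at the destination


store = RemoteStore()
store.put("backup-mon", b"headerPAYLOADfooter")  # previously transferred stream
store.put("backup-tue", b"hdr2footer")           # optimized stream, PAYLOAD removed

# Rebuild the full stream remotely from data the store already holds.
store.copy("backup-mon", 6, 7, "backup-tue", 4)
after_cross_copy = store.get("backup-tue")

# A copy within the same stream is also possible.
store.copy("backup-tue", 0, 4, "backup-tue", 17)
after_self_copy = store.get("backup-tue")
```

The store needs no deduplication logic of its own: it only moves bytes it already holds, which is what lets the client drive reconstruction without a matching optimization device at the remote end.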
26 Claims
25. A non-transitory computer-readable storage medium storing instructions that, when executed by a computer, cause the computer to perform a method for transferring data to a remote data storage, the method comprising:
dividing, at a first network location, at least a portion of a first data stream into segments;
identifying a first segment in the first data stream that is a duplicate of a second segment in a second data stream, wherein the second data stream is stored in a remote data storage at a second network location;
removing, at the first network location, at least the first segment from the first data stream to form an optimized data stream;
recording, at the first network location, an identity and location of at least the first segment within the first data stream;
transferring the optimized data stream from the first network location to the remote data storage at a second network location;
generating, at the first network location, a copy command that at least identifies the second data stream, a source location of the second segment within the second data stream, a source length of the second segment, and a destination location of the removed first segment within the optimized data stream, wherein the recorded identity and location of the first segment within the first data stream is used to generate the copy command; and
sending the copy command from the first network location to the remote data storage at the second network location, wherein the remote data storage executes the copy command which causes the remote data storage to copy the second segment from the source location in the second data stream to the destination location in the optimized data stream, thereby reconstructing the first data stream in the remote data storage at the second network location without transferring any portion of the optimized data stream or the first segment back to the first network location, and wherein the remote data storage does not require an optimization device to reconstruct the first data stream.
26. An apparatus for transferring data to a remote data storage, the apparatus comprising:
a processor; and
a memory storing instructions executable by the processor, the instructions comprising:
instructions to divide, at a first network location, at least a portion of a first data stream into segments;
instructions to identify a first segment in the first data stream that is a duplicate of a second segment in a second data stream, wherein the second data stream is stored in a remote data storage at a second network location;
instructions to remove, at the first network location, at least the first segment from the first data stream to form an optimized data stream;
instructions to record, at the first network location, an identity and location of at least the first segment within the first data stream;
instructions to transfer the optimized data stream from the first network location to the remote data storage at the second network location;
instructions to generate, at the first network location, a copy command that at least identifies the second data stream, a source location of the second segment within the second data stream, a source length of the second segment, and a destination location within the optimized data stream, wherein the recorded identity and location of the first segment within the first data stream is used to generate the copy command; and
instructions to send the copy command from the first network location to the remote data storage, wherein the remote data storage executes the copy command which causes the remote data storage to copy the second segment from the source location in the second data stream to the destination location in the optimized data stream, thereby reconstructing the first data stream in the remote data storage at the second network location without transferring any portion of the optimized data stream or the first segment back to the first network location, and wherein the remote data storage does not require an optimization device to reconstruct the first data stream.
Specification