Technique selection in a deduplication aware client environment
Abstract
Techniques and mechanisms described herein facilitate the transmission of a data stream to a networked storage system. According to various embodiments, a determination may be made as to whether an amount of available computing resources at a client device meets or exceeds a computing resource availability threshold at the client device. A processing operation on a data stream may be performed at the client device to produce a pre-processed data stream when the amount of available computing resources meets or exceeds the computing resource availability threshold. The pre-processed data stream may be transmitted to a networked storage system for storage via a network. The networked storage system may be operable to store deduplicated data for retrieval via the network.
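The threshold decision described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `ResourceSnapshot` fields, the `meets_threshold` and `prepare_stream` names, and the choice to pass the stream through unprocessed when the threshold is not met are all hypothetical. The abstract only says that pre-processing occurs when available resources meet or exceed the threshold.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class ResourceSnapshot:
    """Hypothetical measure of spare client capacity, as fractions in [0, 1]."""
    cpu_idle: float
    memory_free: float


def meets_threshold(available: ResourceSnapshot, threshold: ResourceSnapshot) -> bool:
    # "Meets or exceeds": every measured resource must be at least the threshold.
    return (available.cpu_idle >= threshold.cpu_idle
            and available.memory_free >= threshold.memory_free)


def prepare_stream(
    data: bytes,
    available: ResourceSnapshot,
    threshold: ResourceSnapshot,
    preprocess: Callable[[bytes], bytes],
) -> Tuple[str, bytes]:
    """Return a (label, payload) pair ready for transmission.

    Assumption: when the client is too busy, the stream is passed through
    unchanged and labelled "raw" so the storage system can do the work itself.
    """
    if meets_threshold(available, threshold):
        return ("pre-processed", preprocess(data))
    return ("raw", data)
```

In this sketch the label travels with the payload so that the receiving side can tell which work has already been done; how a real system signals that is not specified in the text above.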
17 Claims
1. A method comprising:
determining whether an amount of available computing resources at a client device comprising a processor and memory meets or exceeds a computing resource availability threshold at the client device;
performing a processing operation on a data stream at the client device to produce a pre-processed data stream when it is determined that the amount of available computing resources meets or exceeds the computing resource availability threshold, wherein the processing operation comprises parsing the data stream to identify one or more data chunks, and wherein the one or more data chunks are identified via a designated rolling hash parsing technique operable to identify at least some identical chunks when parsing different but overlapping data streams, wherein the one or more data chunks are identified by:
determining, using the rolling hash parsing technique, a hash value for a first chunk of the at least some identical chunks;
determining that the hash value for the first chunk qualifies as a chunk boundary; and
in response to determining that the hash value for the first chunk qualifies as the chunk boundary, determining that the data stream has reached a chunk boundary; and
transmitting the pre-processed data stream for storage to a networked storage system via a network, the networked storage system operable to store deduplicated data for retrieval via the network, the networked storage system operable to parse data streams via the designated rolling hash parsing technique.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A device comprising:
memory operable to store a data stream;
a processor operable to determine whether an amount of available computing resources at a client device meets or exceeds a computing resource availability threshold at the client device and perform a processing operation on the data stream to produce a pre-processed data stream when it is determined that the amount of available computing resources meets or exceeds the computing resource availability threshold, wherein the processing operation comprises parsing the data stream to identify one or more data chunks, and wherein the one or more data chunks are identified via a designated rolling hash parsing technique operable to identify at least some identical chunks when parsing different but overlapping data streams, wherein the one or more data chunks are identified by:
determining, using the rolling hash parsing technique, a hash value for a first chunk of the at least some identical chunks;
determining that the hash value for the first chunk qualifies as a chunk boundary; and
in response to determining that the hash value for the first chunk qualifies as the chunk boundary, determining that the data stream has reached a chunk boundary; and
a communications interface operable to transmit the pre-processed data stream for storage to a networked storage system via a network, the networked storage system operable to store deduplicated data for retrieval via the network, the networked storage system operable to parse data streams via the designated rolling hash parsing technique.
Dependent claims: 13, 14, 15, 16
17. One or more non-transitory computer readable media having instructions stored thereon to perform operations comprising:
determining whether an amount of available computing resources at a client device comprising a processor and memory meets or exceeds a computing resource availability threshold at the client device;
performing a processing operation on a data stream at the client device to produce a pre-processed data stream when it is determined that the amount of available computing resources meets or exceeds the computing resource availability threshold, wherein the processing operation comprises parsing the data stream to identify one or more data chunks, and wherein the one or more data chunks are identified via a designated rolling hash parsing technique operable to identify at least some identical chunks when parsing different but overlapping data streams, wherein the one or more data chunks are identified by:
determining, using the rolling hash parsing technique, a hash value for a first chunk of the at least some identical chunks;
determining that the hash value for the first chunk qualifies as a chunk boundary; and
in response to determining that the hash value for the first chunk qualifies as the chunk boundary, determining that the data stream has reached a chunk boundary; and
transmitting the pre-processed data stream for storage to a networked storage system via a network, the networked storage system operable to store deduplicated data for retrieval via the network, the networked storage system operable to parse data streams via the designated rolling hash parsing technique.
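The rolling hash parsing recited in the independent claims can be illustrated with a short content-defined chunking sketch. Everything specific here is an assumption made for illustration: the claims do not name a hash function, window size, or boundary criterion, so the polynomial hash, the `WINDOW`, `BASE`, `MOD`, and `MASK` constants, and the `split_chunks` and `shared_chunks` names are hypothetical. The point it shows is that boundaries depend only on the bytes inside a sliding window, so two different but overlapping streams tend to produce some identical chunks.

```python
from typing import List, Set

WINDOW = 16                # bytes covered by the rolling hash (assumed)
BASE = 257                 # polynomial base (assumed)
MOD = (1 << 61) - 1        # large prime modulus (assumed)
MASK = (1 << 6) - 1        # boundary when low 6 bits are zero, ~64-byte chunks


def split_chunks(data: bytes) -> List[bytes]:
    """Split data at positions where the rolling hash qualifies as a boundary."""
    chunks: List[bytes] = []
    start = 0
    h = 0
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    for i, byte in enumerate(data):
        if i >= WINDOW:
            # Slide the window: drop the contribution of the oldest byte.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + byte) % MOD
        # Only test once the window is full, so the hash depends on exactly
        # the last WINDOW bytes and not on the position in the stream.
        if i + 1 >= WINDOW and (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last boundary
    return chunks


def shared_chunks(stream_a: bytes, stream_b: bytes) -> Set[bytes]:
    """Chunks that appear in both streams and so need to be stored only once."""
    return set(split_chunks(stream_a)) & set(split_chunks(stream_b))
```

Because the client and the storage system would both apply the same designated technique, chunks produced on either side line up and can be compared directly. A production chunker would normally also enforce minimum and maximum chunk sizes, which this sketch omits.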
Specification