DISTRIBUTED DATA DEDUPLICATION IN ENTERPRISE NETWORKS
First Claim
1. A method of providing distributed data deduplication in enterprise network, comprising:
receiving a byte stream by a controller of a byte cache, the byte cache being one of a plurality of byte caches in the enterprise network;
encoding the byte stream by the controller by generating one or more hash values associated with one or more regions of the byte stream;
storing the one or more hash values and associated one or more regions in a storage of the byte cache if the one or more hash values and associated one or more regions do not exist in the storage of the byte cache;
querying a container logic associated with the byte cache to determine which of the one or more hash values to send;
responsive to a response from the container indicating that the one or more hash values do not exist in other byte caches in the enterprise network, attaching all of the one or more hash values and the associated one or more regions to an output stream;
responsive to a response from the container including a hash value and byte cache identifier pair indicating that the hash value exists in a byte cache identified by the byte cache identifier, attaching the hash value and byte cache identifier pair received from the container in the output stream along with non-redundant data of the byte stream and said one or more hash values not identified in the response from the container;
creating a transmission control protocol connection to a receiving byte cache in the enterprise network; and
transmitting the output stream to the receiving byte cache.
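The encoding steps recited above can be illustrated with a minimal sketch. It is not the patented implementation: the fixed region size, the use of SHA-1, the record layout ("ref" or "data" tuples), and the plain dictionary standing in for the container logic are all assumptions made for illustration, and the TCP transmission step is omitted.

```python
import hashlib

REGION_SIZE = 4  # hypothetical fixed region size, kept tiny for illustration


def split_regions(data: bytes, size: int = REGION_SIZE) -> list:
    """Cut the byte stream into fixed-size regions (an assumed chunking scheme)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def encode(data: bytes, local_store: dict, container_map: dict) -> list:
    """Build an output stream of ("ref", hash, cache_id) or ("data", hash, region) records.

    local_store   : hash -> region, the storage of this byte cache
    container_map : hash -> byte cache identifier, a stand-in for the container's reply
    """
    output = []
    for region in split_regions(data):
        digest = hashlib.sha1(region).hexdigest()
        # Store the hash and region locally only if they are not already present.
        local_store.setdefault(digest, region)
        # Ask the (stand-in) container whether another byte cache holds this hash.
        cache_id = container_map.get(digest)
        if cache_id is not None:
            # Known elsewhere: send only the hash and byte cache identifier pair.
            output.append(("ref", digest, cache_id))
        else:
            # Not known elsewhere: send the hash together with its region.
            output.append(("data", digest, region))
    return output
```

In this sketch, a region whose hash the container maps to another byte cache is replaced in the output by a short reference, while every other region travels with its data.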
Abstract
Distributed data deduplication may include or utilize containers attached to nodes or byte caches in a cluster or enterprise network. The containers may store a mapping between byte caches and the hashes those byte caches hold. An encoding byte cache may communicate with its attached container to determine which nodes should send which hash values, and may encode an output stream accordingly. A decoding byte cache decompresses the output stream by communicating with its attached container to receive hash values and associated content from one or more byte caches specified in the output stream.
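The decoding side described in the abstract can be sketched in the same spirit. This is a hedged illustration only: the record layout ("ref" or "data" tuples) and the nested dictionary peer_caches, which stands in for fetching content through the attached container, are assumptions and do not come from the patent text.

```python
def decode(stream: list, peer_caches: dict) -> bytes:
    """Rebuild the original byte stream from an encoded output stream.

    stream      : records of ("data", hash, region) or ("ref", hash, cache_id)
    peer_caches : cache_id -> {hash: region}, a stand-in for content that the
                  decoding byte cache would obtain via its container
    """
    parts = []
    for kind, digest, payload in stream:
        if kind == "data":
            # The region travelled with the stream; use it directly.
            region = payload
        else:
            # The payload names the byte cache that holds the region.
            region = peer_caches[payload][digest]
        parts.append(region)
    return b"".join(parts)


# Usage: one region is resolved from a peer cache, the other is carried inline.
peers = {"cache-B": {"h1": b"abcd"}}
encoded = [("ref", "h1", "cache-B"), ("data", "h2", b"efgh")]
restored = decode(encoded, peers)
```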
20 Claims
8. A computer readable storage medium storing a program of instructions executable by a machine to perform a method of providing distributed data deduplication in enterprise network, the method comprising:
receiving a byte stream by a controller of a byte cache, the byte cache being one of a plurality of byte caches in the enterprise network;
encoding the byte stream by the controller by generating one or more hash values associated with one or more regions of the byte stream;
storing the one or more hash values and associated one or more regions in a storage of the byte cache if the one or more hash values and associated one or more regions do not exist in the storage of the byte cache;
querying a container logic associated with the byte cache to determine which of the one or more hash values to send;
responsive to a response from the container indicating that the one or more hash values do not exist in other byte caches in the enterprise network, attaching all of the one or more hash values and the associated one or more regions to an output stream;
responsive to a response from the container including a hash value and byte cache identifier pair indicating that the hash value exists in a byte cache identified by the byte cache identifier, attaching the hash value and byte cache identifier pair received from the container in the output stream along with non-redundant data of the byte stream and said one or more hash values not identified in the response from the container;
creating a transmission control protocol connection to a receiving byte cache in the enterprise network; and
transmitting the output stream to the receiving byte cache.
15. A system of providing distributed data deduplication in enterprise network, comprising:
a byte cache comprising a controller logic and memory, the byte cache being one of a plurality of byte caches in the enterprise network, the controller logic of the byte cache operable to receive a byte stream and encode the byte stream by generating one or more hash values associated with one or more regions of the byte stream, the controller logic of the byte cache further operable to store the one or more hash values and associated one or more regions in the memory if the one or more hash values and associated one or more regions do not exist in the memory; and
a container connected to the byte cache, the container comprising container logic and container memory, the container memory operable to store a map containing hash value to byte cache identifier mappings indicating which byte caches of the enterprise network store which hash values and associated content, the container operable to receive a query from the byte cache controller requesting which of the one or more hash values to send, the container further operable to send to the byte cache a reply to the query based on the map,
responsive to a response from the container indicating that the one or more hash values do not exist in other byte caches in the enterprise network, the controller logic of the byte cache further operable to attach all of the one or more hash values and the associated one or more regions to an output stream;
responsive to a response from the container including a hash value and byte cache identifier pair indicating that the hash value exists in another byte cache identified by the byte cache identifier, the controller logic of the byte cache further operable to attach the hash value and byte cache identifier pair received from the container in the output stream along with non-redundant data of the byte stream and said one or more hash values not identified in the response from the container,
wherein the output stream is transmitted via a transmission control protocol connection to a receiving byte cache in the enterprise network.
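The container's map and query handling recited in the system claim might look roughly like the following sketch. The class name Container, the methods register and query, and the rule of returning the alphabetically first holder are hypothetical choices made for illustration; the claim only requires a map of hash values to byte cache identifiers and a reply based on that map.

```python
class Container:
    """Stand-in for a container holding hash -> byte cache identifier mappings."""

    def __init__(self):
        # hash value -> set of identifiers of byte caches storing that hash
        self._map = {}

    def register(self, cache_id: str, digest: str) -> None:
        """Record that the byte cache `cache_id` stores the content for `digest`."""
        self._map.setdefault(digest, set()).add(cache_id)

    def query(self, requester_id: str, digests: list) -> list:
        """Return (hash, cache_id) pairs for hashes held by a cache other than the requester.

        An empty list corresponds to the case where none of the hashes exist
        in other byte caches, so the requester must send every hash and region.
        """
        pairs = []
        for digest in digests:
            holders = sorted(self._map.get(digest, set()) - {requester_id})
            if holders:
                # Assumed tie-break: pick the alphabetically first holder.
                pairs.append((digest, holders[0]))
        return pairs


# Usage: cache "A" asks which of its hashes already live in other caches.
container = Container()
container.register("A", "h1")
container.register("B", "h1")
container.register("A", "h2")
reply = container.query("A", ["h1", "h2", "h3"])
```

Here only "h1" is reported back, paired with cache "B"; "h2" is held solely by the requester and "h3" by no cache, so both would be sent with their regions.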
Specification