Higher Efficiency Storage Replication Using Compression
Abstract
An improved scalable object storage system includes methods and systems allowing multiple clusters to work together. In one embodiment, there is a multi-cluster synchronization system between two or more clusters. The multi-cluster synchronization system uses variable compression to optimize the transfer of information between the clusters. Compression is used not only to minimize the total number of bytes sent between the two clusters, but also to dynamically vary the size of the objects sent across the wire, optimizing for higher throughput after considering packet loss, TCP windows, and block sizes. This includes both the packaging of multiple small files together into one larger compressed file, which saves on TCP and header overhead, and the chunking of large files into multiple smaller files that are less likely to have difficulties due to intermittent network congestion or errors.
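The two packaging strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `bundle_small`, `chunk_large`, and `plan_transfers`, the thresholds `SMALL_LIMIT` and `CHUNK_SIZE`, and the choice of gzip-compressed tar archives and zlib are all assumptions made for illustration. A real system would presumably tune the thresholds from observed packet loss, TCP window, and block size, which this sketch does not attempt.

```python
import io
import tarfile
import zlib

# Hypothetical thresholds; the source does not specify any values.
SMALL_LIMIT = 64 * 1024      # files smaller than this are bundled together
CHUNK_SIZE = 1024 * 1024     # large files are split into pieces of this size


def bundle_small(files):
    """Pack (name, bytes) pairs into one gzip-compressed tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def chunk_large(data, chunk_size=CHUNK_SIZE):
    """Split one payload into independently compressed chunks."""
    return [zlib.compress(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]


def plan_transfers(files, small_limit=SMALL_LIMIT, chunk_size=CHUNK_SIZE):
    """Return transfer units: one bundle for small files, chunks for large ones."""
    small = [(n, d) for n, d in files if len(d) < small_limit]
    large = [(n, d) for n, d in files if len(d) >= small_limit]
    units = []
    if small:
        units.append(("bundle", None, bundle_small(small)))
    for name, data in large:
        for index, piece in enumerate(chunk_large(data, chunk_size)):
            units.append(("chunk", (name, index), piece))
    return units
```

Because each chunk is compressed on its own, a chunk lost to congestion can be resent without retransmitting the rest of the file, while the bundle amortizes per-transfer overhead across many small files.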
18 Claims
1. A multi-cluster synchronization system, comprising:
a first cluster including a first cluster-internal network, the first cluster further including a first structured information repository and a first object storage, wherein the first structured information repository contains metadata corresponding to stored information objects in the first object storage, and wherein the first structured information repository and the first object storage are coupled via the first cluster-internal network;
an intercluster network coupling the first cluster and a remote cluster;
an intercluster repository synchronizer including a compression module, the compression module adapted to identify one or more files to compress and transmit to the remote cluster in compressed form.
Dependent claims: 2, 3, 4, 5, 6, 7, 17, 18
8. A method of synchronizing objects in an object storage system, the method comprising:
identifying a set of stored information objects at a first location to be transferred to a second location;
analyzing the set of stored information objects to determine a compression scheme;
selectively compressing the set of stored information objects according to the compression scheme; and
transmitting the selectively compressed objects to the second location;
wherein transmitting the selectively compressed objects results in the duplication of the set of stored information objects at the second location.
Dependent claims: 9, 10, 11, 12, 13, 14, 15, 16
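The steps of the method in claim 8 (identify, analyze, selectively compress, transmit) can be sketched in Python as follows. This is one possible reading, not the claimed implementation: the per-object analysis that trial-compresses a leading sample, the `min_saving` cutoff, the in-memory dictionaries standing in for the two locations, and the names `choose_scheme`, `encode`, and `receive` are all assumptions introduced for illustration.

```python
import zlib


def choose_scheme(data, sample_size=4096, min_saving=0.1):
    """Analyze an object by trial-compressing a leading sample.

    Returns "zlib" when the sample shrinks by at least min_saving,
    otherwise "raw" (for example, data that is already compressed).
    """
    sample = data[:sample_size]
    if not sample:
        return "raw"
    ratio = len(zlib.compress(sample)) / len(sample)
    return "zlib" if ratio <= 1.0 - min_saving else "raw"


def encode(objects):
    """Selectively compress a dict of name -> bytes into transfer messages."""
    messages = []
    for name, data in objects.items():
        scheme = choose_scheme(data)
        payload = zlib.compress(data) if scheme == "zlib" else data
        messages.append((name, scheme, payload))
    return messages


def receive(messages, store):
    """Apply messages at the second location, restoring the original bytes."""
    for name, scheme, payload in messages:
        store[name] = zlib.decompress(payload) if scheme == "zlib" else payload
    return store
```

Calling `receive(encode(first_location), {})` yields a dictionary equal to `first_location`, which corresponds to the duplication of the set of objects at the second location; the network transfer itself is omitted from the sketch.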
Specification