Storage of data in a distributed storage system
First Claim
1. A distributed storage system for storing blobs, comprising:
a plurality of instances, wherein each respective instance includes a plurality of server computers having memory and one or more processors storing one or more programs for execution by the one or more processors, and wherein at least a subset of the instances are at physically distinct geographic locations;
wherein a respective instance stores data for a plurality of blobs; and
wherein each respective blob has corresponding metadata that includes information that identifies a set of one or more instances where the respective blob is stored and includes a blob placement policy that specifies the desired number of copies of the respective blob as well as the desired locations for copies of the respective blob; and
a location assignment module configured to:
compare the desired number of copies of each respective blob and the desired locations for copies of the respective blob, as specified in the respective blob placement policy for the respective blob, to a current number of copies of the respective blob and current locations of copies of the respective blob; and
issue commands to delete a copy of the respective blob or to replicate the respective blob to another instance in response to determining that the current number of copies of the respective blob is inconsistent with the desired number of copies of the respective blob or the current locations of the respective blob are inconsistent with the desired locations for copies of the respective blob, as specified in the respective blob placement policy for the respective blob.
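The compare-and-issue step recited above can be illustrated with a minimal sketch. It is not the patented implementation: the names `PlacementPolicy`, `BlobMetadata`, and `plan_commands` are hypothetical, and the sketch assumes the policy's desired locations form a set of allowed instance names from which the desired number of copies is drawn. Replication commands are listed before deletions so that a blob is never left without a copy while the plan is carried out.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PlacementPolicy:
    # Hypothetical shape: a copy count plus the set of allowed instance names.
    desired_copies: int
    desired_locations: frozenset


@dataclass
class BlobMetadata:
    blob_id: str
    policy: PlacementPolicy
    current_locations: set = field(default_factory=set)


def plan_commands(meta):
    """Compare desired and current placement; return (action, blob_id, instance) tuples."""
    desired = meta.policy.desired_locations
    current = set(meta.current_locations)
    wanted = meta.policy.desired_copies

    well_placed = sorted(current & desired)   # copies already at desired locations
    misplaced = sorted(current - desired)     # copies at locations the policy does not list
    candidates = sorted(desired - current)    # desired locations that hold no copy yet

    commands = []
    # Too few copies at desired locations: replicate to fill the shortfall.
    shortfall = max(wanted - len(well_placed), 0)
    for instance in candidates[:shortfall]:
        commands.append(("replicate", meta.blob_id, instance))
    # Copies at locations outside the policy are deleted.
    for instance in misplaced:
        commands.append(("delete", meta.blob_id, instance))
    # Too many copies at desired locations: delete the surplus.
    for instance in well_placed[wanted:]:
        commands.append(("delete", meta.blob_id, instance))
    return commands


if __name__ == "__main__":
    meta = BlobMetadata(
        blob_id="b1",
        policy=PlacementPolicy(2, frozenset({"us", "eu", "asia"})),
        current_locations={"us", "sa"},
    )
    for command in plan_commands(meta):
        print(command)
```

In this example the blob has one copy at a desired location and one at a location the policy does not list, so the plan replicates to one more desired instance and deletes the misplaced copy. A blob whose placement already matches its policy yields an empty plan.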
Abstract
A distributed storage system has multiple instances. There is a plurality of local instances, and at least some of the local instances are at physically distinct geographic locations. Each local instance is configured to store data for a non-empty set of blobs in a plurality of data stores having a plurality of distinct data store types. In addition, each local instance stores metadata for the respective set of blobs in a metadata store distinct from the data stores. There is also a plurality of global instances. Each global instance is configured to store data for zero or more blobs in zero or more data stores and store metadata for all blobs stored at any local or global instance. The system selects one global instance to run a replication module that replicates blobs between instances according to blob policies.
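The arrangement described in the abstract can be pictured with a small data-model sketch. All class and function names here (`LocalInstance`, `GlobalInstance`, `select_replication_runner`) are hypothetical. The abstract does not say how the one global instance is selected, so the smallest-name rule below is purely an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LocalInstance:
    name: str
    # Data stores keyed by store type (e.g. "disk", "tape"); each maps blob_id -> bytes.
    data_stores: dict = field(default_factory=dict)
    # Metadata store, kept separate from the data stores; maps blob_id -> metadata.
    metadata: dict = field(default_factory=dict)

    def put(self, store_type, blob_id, data, meta):
        self.data_stores.setdefault(store_type, {})[blob_id] = data
        self.metadata[blob_id] = meta


@dataclass
class GlobalInstance:
    name: str
    # A global instance may hold zero or more data stores.
    data_stores: dict = field(default_factory=dict)
    # Metadata for blobs stored at any local or global instance.
    metadata: dict = field(default_factory=dict)

    def sync_from(self, instance):
        # Copy another instance's blob metadata into this global view.
        self.metadata.update(instance.metadata)


def select_replication_runner(global_instances):
    # Assumption: pick the instance with the smallest name. The source only
    # states that one global instance is selected, not how.
    return min(global_instances, key=lambda g: g.name)


if __name__ == "__main__":
    local = LocalInstance("local-a")
    local.put("disk", "b1", b"payload", {"desired_copies": 2})
    globals_ = [GlobalInstance("global-2"), GlobalInstance("global-1")]
    for g in globals_:
        g.sync_from(local)
    print(select_replication_runner(globals_).name)
```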
12 Claims
5. A method, comprising:
at a server having one or more processors and memory storing programs configured for execution by the one or more processors to manage storage of blobs in a distributed storage system, wherein the distributed storage system includes a plurality of instances, at least a subset of the instances are at physically distinct geographic locations, each respective instance stores data for a plurality of blobs, and each respective blob has corresponding metadata that includes information that identifies a set of one or more instances where the respective blob is stored and includes a blob placement policy that specifies the desired number of copies of the respective blob as well as the desired locations for copies of the respective blob:
comparing the desired number of copies of each respective blob and the desired locations for copies of the respective blob, as specified in the respective blob placement policy for the respective blob, to a current number of copies of the respective blob and current locations of copies of the respective blob; and
issuing commands to delete a copy of the respective blob or to replicate the respective blob to another instance in response to determining that the current number of copies of the respective blob is inconsistent with the desired number of copies of the respective blob or the current locations of the respective blob are inconsistent with the desired locations for copies of the respective blob, as specified in the respective blob placement policy for the respective blob.
9. A non-transitory computer readable storage medium storing one or more programs configured for execution by a server having one or more processors and memory, the one or more programs comprising instructions for:
managing storage of blobs in a distributed storage system, wherein the distributed storage system includes a plurality of instances, at least a subset of the instances are at physically distinct geographic locations, each respective instance stores data for a plurality of blobs, and each respective blob has corresponding metadata that includes information that identifies a set of one or more instances where the respective blob is stored and includes a blob placement policy that specifies the desired number of copies of the respective blob as well as the desired locations for copies of the respective blob; and
wherein managing storage of the blobs includes:
comparing the desired number of copies of each respective blob and the desired locations for copies of the respective blob, as specified in the respective blob placement policy for the respective blob, to a current number of copies of the respective blob and current locations of copies of the respective blob; and
issuing commands to delete a copy of the respective blob or to replicate the respective blob to another instance in response to determining that the current number of copies of the respective blob is inconsistent with the desired number of copies of the respective blob or the current locations of the respective blob are inconsistent with the desired locations for copies of the respective blob, as specified in the respective blob placement policy for the respective blob.
Specification