Storage of data in a distributed storage system
First Claim
1. A distributed storage system for storing blobs, comprising:
a plurality of local instances, wherein each respective local instance includes a plurality of server computers, having memory and one or more processors storing one or more programs for execution by the one or more processors, and wherein at least a subset of the local instances are at physically distinct geographic locations; and
a plurality of global instances distinct from the local instances, wherein each respective global instance includes a plurality of server computers, having memory and one or more processors storing one or more programs for execution by the one or more processors;
wherein each blob stored in the distributed storage system has corresponding metadata, and wherein the metadata includes information identifying a set of one or more local instances or global instances where the respective blob is stored;
wherein each respective local instance:
stores data for a respective set of blobs in a plurality of data stores having a plurality of distinct data store types, wherein the respective set of blobs is non-empty and is a subset of all blobs stored in the distributed storage system; and
stores the metadata for the respective set of blobs in a metadata store distinct from the data stores;
wherein each respective global instance stores the metadata for all blobs stored at all local instances; and
wherein one global instance has a location assignment daemon, and wherein for each blob the location assignment daemon:
determines locations of respective local instances currently storing replicas of the blob;
determines a number of target replica locations for the blob according to a respective placement policy assigned to the blob; and
upon determining the respective local instances currently storing replicas of the blob are fewer than the target number of replica locations, issuing a replication command to a first local instance to create a new replica of the blob at a second instance in accordance with the respective placement policy.
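The three steps attributed to the location assignment daemon (find current replica locations, determine the target number from the placement policy, issue replication commands when there is a shortfall) can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the names `PlacementPolicy`, `BlobMetadata`, `ReplicationCommand`, and `plan_replication` are hypothetical, and the choice of source and destination instances (alphabetical order) is an assumption made to keep the example deterministic.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass(frozen=True)
class PlacementPolicy:
    # Hypothetical policy: only the target replica count is modeled here.
    target_replicas: int


@dataclass
class BlobMetadata:
    blob_id: str
    policy: PlacementPolicy
    # Names of the local instances currently storing a replica of the blob.
    locations: Set[str] = field(default_factory=set)


@dataclass(frozen=True)
class ReplicationCommand:
    blob_id: str
    source: str       # "first local instance" that receives the command
    destination: str  # "second instance" where the new replica is created


def plan_replication(meta: BlobMetadata,
                     all_instances: Iterable[str]) -> List[ReplicationCommand]:
    """Return the replication commands needed to reach the policy target."""
    # Step 1: locations of instances currently storing replicas.
    current = sorted(meta.locations)
    # Step 2: target number of replica locations from the placement policy.
    deficit = meta.policy.target_replicas - len(current)
    if deficit <= 0 or not current:
        return []  # nothing to do, or no existing replica to copy from
    # Step 3: pick destinations that do not yet hold the blob (assumed order).
    candidates = [name for name in sorted(all_instances)
                  if name not in meta.locations]
    source = current[0]
    return [ReplicationCommand(meta.blob_id, source, dest)
            for dest in candidates[:deficit]]
```

For example, a blob held only at a hypothetical instance `us-east` with a target of three replicas would yield two commands, each sent to `us-east`, one per missing location.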
Abstract
A distributed storage system has multiple instances. There is a plurality of local instances, and at least some of the local instances are at physically distinct geographic locations. Each local instance is configured to store data for a non-empty set of blobs in a plurality of data stores having a plurality of distinct data store types. In addition, each local instance stores metadata for the respective set of blobs in a metadata store distinct from the data stores. There is also a plurality of global instances. Each global instance is configured to store data for zero or more blobs in zero or more data stores and store metadata for all blobs stored at any local or global instance. The system selects one global instance to run a replication module that replicates blobs between instances according to blob policies. Some systems also include dynamic replication based on user needs.
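The split between local and global instances described above can be pictured with a small data model. The sketch below is an assumption-laden illustration, not the system's actual design: the class names, the `put` and `absorb` methods, and the store type labels `"disk"` and `"tape"` are all hypothetical. It shows only that each local instance keeps blob data in several typed data stores plus a separate metadata store, while a global instance accumulates metadata for every blob.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class LocalInstance:
    name: str
    # Data stores keyed by store type; each maps blob_id -> blob bytes.
    data_stores: Dict[str, Dict[str, bytes]]
    # Metadata store, kept separate from the data stores:
    # blob_id -> names of instances known to hold the blob.
    metadata_store: Dict[str, Set[str]] = field(default_factory=dict)

    def put(self, blob_id: str, data: bytes, store_type: str) -> None:
        """Store blob data in one typed data store and record its metadata."""
        if store_type not in self.data_stores:
            raise KeyError(f"unknown data store type: {store_type}")
        self.data_stores[store_type][blob_id] = data
        self.metadata_store.setdefault(blob_id, set()).add(self.name)


@dataclass
class GlobalInstance:
    name: str
    # Metadata for all blobs at all local instances.
    metadata_store: Dict[str, Set[str]] = field(default_factory=dict)

    def absorb(self, local: LocalInstance) -> None:
        """Merge one local instance's metadata into the global view."""
        for blob_id, locations in local.metadata_store.items():
            self.metadata_store.setdefault(blob_id, set()).update(locations)
```

After two local instances each store a different blob and a global instance absorbs both, the global metadata store lists both blobs, whereas each local metadata store lists only its own.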
15 Claims
6. A method, comprising:
at a server having one or more processors and memory storing programs configured for execution by the one or more processors:
managing storage of blobs in a distributed storage system;
wherein the distributed storage system includes a plurality of local instances, wherein each respective local instance includes a plurality of server computers, each having memory and one or more processors storing one or more programs for execution by the one or more processors, and wherein at least a subset of the local instances are at physically distinct geographic locations;
wherein the distributed storage system includes a plurality of global instances distinct from the local instances, wherein each respective global instance includes a plurality of server computers, having memory and one or more processors storing one or more programs for execution by the one or more processors;
wherein each blob stored in the distributed storage system has corresponding metadata, and wherein the metadata includes information identifying a set of one or more local instances or global instances where the respective blob is stored;
wherein each respective local instance:
stores data for a respective set of blobs in a plurality of data stores having a plurality of distinct data store types, wherein the respective set of blobs is non-empty and is a subset of all blobs stored in the distributed storage system; and
stores the metadata for the respective set of blobs in a metadata store distinct from the data stores;
wherein each respective global instance stores the metadata for all blobs stored at all local instances; and
wherein one global instance has a location assignment daemon, and wherein for each blob the location assignment daemon:
determines locations of respective local instances currently storing replicas of the blob;
determines a number of target replica locations for the blob according to a respective placement policy assigned to the blob; and
upon determining the respective local instances currently storing replicas of the blob are fewer than the target number of replica locations, issuing a replication command to a first local instance to create a new replica of the blob at a second instance in accordance with the respective placement policy.
11. A non-transitory computer readable storage medium storing one or more programs configured for execution by a server having one or more processors and memory, the one or more programs comprising instructions to perform:
managing storage of blobs in a distributed storage system;
wherein the distributed storage system includes a plurality of local instances, wherein each respective local instance includes a plurality of server computers, each having memory and one or more processors storing one or more programs for execution by the one or more processors, and wherein at least a subset of the local instances are at physically distinct geographic locations;
wherein the distributed storage system includes a plurality of global instances distinct from the local instances, wherein each respective global instance includes a plurality of server computers, having memory and one or more processors storing one or more programs for execution by the one or more processors;
wherein each blob stored in the distributed storage system has corresponding metadata, and wherein the metadata includes information identifying a set of one or more local instances or global instances where the respective blob is stored;
wherein each respective local instance:
stores data for a respective set of blobs in a plurality of data stores having a plurality of distinct data store types, wherein the respective set of blobs is non-empty and is a subset of all blobs stored in the distributed storage system; and
stores the metadata for the respective set of blobs in a metadata store distinct from the data stores;
wherein each respective global instance stores the metadata for all blobs stored at all local instances; and
wherein one global instance has a location assignment daemon, and wherein for each blob the location assignment daemon:
determines locations of respective local instances currently storing replicas of the blob;
determines a number of target replica locations for the blob according to a respective placement policy assigned to the blob; and
upon determining the respective local instances currently storing replicas of the blob are fewer than the target number of replica locations, issuing a replication command to a first local instance to create a new replica of the blob at a second instance in accordance with the respective placement policy.
Specification