Storage of Data In A Distributed Storage System
Abstract
A distributed storage system has multiple instances. There is a plurality of local instances, and at least some of the local instances are at physically distinct geographic locations. Each local instance is configured to store data for a non-empty set of blobs in a plurality of data stores having a plurality of distinct data store types. In addition, each local instance stores metadata for the respective set of blobs in a metadata store distinct from the data stores. There is also a plurality of global instances. Each global instance is configured to store data for zero or more blobs in zero or more data stores and store metadata for all blobs stored at any local or global instance. The system selects one global instance to run a replication module that replicates blobs between instances according to blob policies. Some systems also include dynamic replication based on user needs.
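The split between local and global instances can be sketched as a small data model. This is only an illustration under stated assumptions: the class names, the store type names ("inline", "file"), and the `store_blob` helper are hypothetical, and the abstract does not say how location metadata reaches the global instances, so the sketch simply registers each write at every global instance.

```python
from dataclasses import dataclass, field


@dataclass
class LocalInstance:
    """One local instance: several typed data stores plus a separate metadata store."""
    name: str
    # Maps a data store type (names are illustrative) to {blob_id: content}.
    data_stores: dict = field(default_factory=lambda: {"inline": {}, "file": {}})
    # Metadata is kept apart from the data stores and covers only local blobs.
    metadata: dict = field(default_factory=dict)

    def put(self, blob_id: str, content: bytes, store_type: str) -> None:
        self.data_stores[store_type][blob_id] = content
        self.metadata[blob_id] = {"store_type": store_type, "size": len(content)}


@dataclass
class GlobalInstance:
    """One global instance: optional data stores, metadata for every blob."""
    name: str
    data_stores: dict = field(default_factory=dict)  # may legitimately stay empty
    metadata: dict = field(default_factory=dict)     # blob_id -> set of instance names

    def record_location(self, blob_id: str, instance_name: str) -> None:
        self.metadata.setdefault(blob_id, set()).add(instance_name)


def store_blob(local, global_instances, blob_id, content, store_type):
    """Write a blob locally, then register its location at every global instance.

    The propagation step is an assumption made for this sketch.
    """
    local.put(blob_id, content, store_type)
    for g in global_instances:
        g.record_location(blob_id, local.name)
```

In this model a local instance knows only about its own blobs, while any global instance can answer where a given blob currently lives.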
15 Claims
1. A distributed storage system for storing electronic data, comprising:
a plurality of local instances, wherein a respective local instance includes a plurality of server computers, having memory and one or more processors storing one or more programs for execution by the one or more processors, and wherein at least a subset of the local instances are at physically distinct geographic locations; and
wherein each respective local instance is configured to:
store data for a respective non-empty set of blobs in a plurality of data stores having a plurality of distinct data store types; and
store metadata for the respective set of blobs in a metadata store distinct from the data stores; and
a plurality of global instances, wherein each respective global instance includes a plurality of server computers, having memory and one or more processors storing one or more programs for execution by the one or more processors; and
wherein each respective global instance is configured to:
store data for zero or more blobs in zero or more data stores; and
store metadata for all blobs stored at any local or global instance; and
wherein one global instance has a first replication module that replicates blobs between instances according to blob policies.
Dependent claims: 2, 3, 4, 5
6. A distributed storage system for storing electronic data, comprising:
a plurality of local instances, wherein a respective local instance includes a plurality of server computers, having memory and one or more processors storing one or more programs for execution by the one or more processors, and wherein at least a subset of the local instances are at physically distinct geographic locations; and
wherein each respective local instance is configured to:
store data for a respective non-empty set of blobs in a plurality of data stores having a plurality of distinct data store types; and
store metadata for the respective set of blobs in a metadata store distinct from the data stores; and
a plurality of global instances, wherein a respective global instance includes a plurality of server computers, having memory and one or more processors storing one or more programs for execution by the one or more processors; and
wherein each respective global instance is configured to:
store data for zero or more blobs in zero or more data stores; and
store metadata for all blobs stored at any local or global instance; and
wherein a respective local or global instance has a first replication module that dynamically replicates blobs from one local or global instance to another local or global instance based on user requests to access blobs that are not stored at a local or global instance near the user.
Dependent claims: 9, 10, 11
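The dynamic replication element of claim 6 can be pictured as a per-request decision. The following is a minimal sketch, not the claimed implementation: `plan_dynamic_replication`, the command dictionary, and the rule for picking a source instance are all hypothetical, and "near the user" is reduced here to a single precomputed nearest instance.

```python
def plan_dynamic_replication(blob_id, nearest_instance, locations):
    """Decide where to serve a read from and whether to replicate.

    locations maps blob_id -> set of instance names that hold a copy
    (the kind of information a global instance's metadata would provide).
    Returns (serving_instance, replication_command_or_None).
    """
    holders = locations[blob_id]
    if nearest_instance in holders:
        # A copy already exists near the user: no replication needed.
        return nearest_instance, None
    # Assumption: any holder can act as the source; pick one deterministically.
    source = sorted(holders)[0]
    command = {"op": "replicate", "blob": blob_id,
               "src": source, "dst": nearest_instance}
    return source, command
```

A real system would presumably weigh request frequency and cost before copying; this sketch replicates on the first remote request purely to keep the example short.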
7. A distributed storage system for storing electronic data, comprising:
a plurality of instances, wherein a respective instance includes a plurality of server computers having memory and one or more processors storing one or more programs for execution by the one or more processors, and wherein at least a subset of the instances are at physically distinct geographic locations;
wherein a respective instance stores data for a plurality of blobs; and
wherein each blob has an associated blob policy that specifies the desired number of copies of the blob as well as the desired locations for copies of the blob; and
a location assignment module configured to:
compare a desired number of copies of each blob and desired location constraints for each blob to a current number of copies of each blob and current locations of copies of each blob; and
issue commands to delete a copy of a respective blob or to replicate a respective blob to another instance when the current number of copies of a respective blob and/or current locations of the respective blob are inconsistent with the desired number of copies of the respective blob or the desired location constraints of the respective blob.
Dependent claims: 12, 13, 14, 15
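The compare-and-issue-commands behavior of the location assignment module in claim 7 can be sketched as a small reconciliation function. This is an illustration under assumptions: `BlobPolicy`, `plan_commands`, the tuple command format, and the modeling of location constraints as a set of allowed instances are hypothetical simplifications, not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlobPolicy:
    copies: int          # desired number of copies
    allowed: frozenset   # instances where copies may live (simplified constraint)


def plan_commands(blob_id, policy, current):
    """Compare desired state with current copy locations; return commands."""
    current = set(current)
    if not current:
        return []  # no existing copy to replicate from
    kept = sorted(current & policy.allowed)       # copies in permitted places
    misplaced = sorted(current - policy.allowed)  # copies violating the policy
    commands, targets = [], []
    need = policy.copies - len(kept)
    if need > 0:
        # Too few valid copies: replicate to permitted instances lacking one.
        source = (kept or misplaced)[0]
        targets = sorted(policy.allowed - current)[:need]
        commands += [("replicate", blob_id, source, t) for t in targets]
    elif need < 0:
        # Too many valid copies: delete the surplus.
        commands += [("delete", blob_id, inst) for inst in kept[need:]]
    if kept or targets:
        # Remove misplaced copies only once a valid copy exists or is planned.
        commands += [("delete", blob_id, inst) for inst in misplaced]
    return commands
```

Replication commands are listed before deletions so that a blob whose only copies sit in disallowed locations is copied before those copies are removed.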
8. A method of reading a blob from a distributed storage system, comprising:
at a client, executing on a computer with one or more processors and memory storing one or more programs for execution by the one or more processors:
receiving a request from a user application for a blob;
locating an instance within the distributed storage system that is geographically close to the client;
contacting a blob access module at the located instance to request metadata for the requested blob, the request including user access credentials;
receiving from the blob access module a collection of metadata from the requested blob, and a set of one or more read tokens;
selecting an instance that has a copy of the requested blob based on the received collection of metadata;
contacting a data store module at the selected instance, including providing the data store module with the set of one or more read tokens;
receiving the content of the requested blob in one or more chunks;
assembling the one or more chunks to form the requested blob; and
returning the blob to the user application.
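The read steps of claim 8 can be traced end to end with in-memory stand-ins. Everything named here is an assumption made for illustration: the `BlobAccessModule` and `DataStoreModule` classes, the string token format, the distance table, and the metadata layout are hypothetical and do not reflect any actual token or wire format.

```python
class BlobAccessModule:
    """Stand-in for the module that serves metadata and issues read tokens."""

    def __init__(self, metadata, valid_credentials):
        self.metadata = metadata          # blob_id -> {"locations", "chunk_ids"}
        self.valid = set(valid_credentials)

    def get_metadata(self, blob_id, credentials):
        if credentials not in self.valid:
            raise PermissionError("access denied")
        meta = self.metadata[blob_id]
        # Placeholder tokens: one opaque string per chunk.
        tokens = [f"read:{chunk_id}" for chunk_id in meta["chunk_ids"]]
        return meta, tokens


class DataStoreModule:
    """Stand-in for the module that returns chunk content for valid tokens."""

    def __init__(self, chunks):
        self.chunks = chunks              # chunk_id -> bytes

    def read_chunks(self, chunk_ids, tokens):
        for chunk_id in chunk_ids:
            if f"read:{chunk_id}" not in tokens:
                raise PermissionError(f"no token for {chunk_id}")
        return [self.chunks[c] for c in chunk_ids]


def read_blob(blob_id, credentials, client_region, instances, distance):
    """Follow the claimed read steps for one blob request."""
    # Locate an instance that is geographically close to the client.
    nearest = min(instances, key=lambda n: distance[client_region][n])
    # Request metadata, presenting credentials; receive read tokens too.
    meta, tokens = instances[nearest]["access"].get_metadata(blob_id, credentials)
    # Select an instance holding a copy (here: the closest one).
    holder = min(meta["locations"], key=lambda n: distance[client_region][n])
    # Fetch the chunks with the tokens, then assemble and return the blob.
    chunks = instances[holder]["store"].read_chunks(meta["chunk_ids"], tokens)
    return b"".join(chunks)
```

Note that the instance answering the metadata request and the instance serving the content can differ, which is why the tokens are handed back to the client rather than used internally.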
Specification