MASSIVELY SCALABLE OBJECT STORAGE
First Claim
1. A method for storing data, comprising:
- providing a plurality of physical storage pools, each storage pool including a plurality of storage nodes coupled to a network, each storage node further providing a non-transient computer readable medium for data storage;
classifying a plurality of availability zones, wherein the storage nodes within an availability zone are subject to a correlated loss of access to stored data;
defining a plurality of abstract partitions, wherein each possible input data management request deterministically corresponds to one of the plurality of abstract partitions;
mapping the plurality of abstract partitions to the plurality of physical storage pools such that each mapped physical storage pool includes a replica of the data associated with the associated mapped abstract partition, and each replica for a particular abstract partition is mapped to a physical storage pool in a different availability zone;
receiving a data management request over the network, the data management request associated with a data object;
identifying a first partition corresponding to the received data management request; and
manipulating the data object in the physical storage pools mapped to the first partition in accordance with the data management request.
7 Assignments
0 Petitions
Accused Products
Abstract
Several different embodiments of a massively scalable object storage system are described. The object storage system is particularly useful for storage in a cloud computing installation whereby shared servers provide resources, software, and data to computers and other devices on demand. In several embodiments, the object storage system includes a ring implementation used to associate object storage commands with particular physical servers such that certain guarantees of consistency, availability, and performance can be met. In other embodiments, the object storage system includes a synchronization protocol used to order operations across a distributed system. In a third set of embodiments, the object storage system includes a metadata management system. In a fourth set of embodiments, the object storage system uses a structured information synchronization system. Features from each set of embodiments can be used to improve the performance and scalability of a cloud computing object storage system.
-
Citations
23 Claims
-
1. A method for storing data, comprising:
-
providing a plurality of physical storage pools, each storage pool including a plurality of storage nodes coupled to a network, each storage node further providing a non-transient computer readable medium for data storage; classifying a plurality of availability zones, wherein the storage nodes within an availability zone are subject to a correlated loss of access to stored data; defining a plurality of abstract partitions, wherein each possible input data management request deterministically corresponds to one of the plurality of abstract partitions; mapping the plurality of abstract partitions to the plurality of physical storage pools such that each mapped physical storage pool includes a replica of the data associated with the associated mapped abstract partition, and each replica for a particular abstract partition is mapped to a physical storage pool in a different availability zone; receiving a data management request over the network, the data management request associated with a data object; identifying a first partition corresponding to the received data management request; and manipulating the data object in the physical storage pools mapped to the first partition in accordance with the data management request. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
-
-
16. A massively scalable online data storage system, comprising:
-
a distributed storage coupled to a network, the distributed storage including a first storage pool and a second storage pool from a plurality of storage pools, the first storage pool in a first availability zone and the second storage pool in a second availability zone, each storage pool including at least one processor, a computer readable medium, and a communications interface; a director coupled to the network, the director including a processor, a computer readable medium, and a communications interface; a ring structure associated with the director, wherein the ring structure is adapted to associate a storage request with a first abstract partition from a plurality of abstract partitions, and wherein the ring structure is further adapted to selectively associate a first abstract partition with a first fault-tolerant multi-master replication target, the first replication target including the first storage pool and the second storage pool; wherein the director is adapted to route inbound storage requests to the replication target and outbound storage responses from the replication target. - View Dependent Claims (17, 18, 19, 20, 21, 22)
-
-
23. A non-transient computer readable medium containing executable instructions, which when executed on a processor:
-
at a first time, initialize a ring by retrieving a set of ring parameters, the ring parameters including a number of abstract partitions, a number of physical storage pools, and a set of performance constraints;
performing a consistent hashing function associating a first range of inputs with a first abstract partition and a second range of inputs with a second abstract partition; and
allocating the available physical storage pools by mapping each abstract partition to one or more storage pools in accordance with the set of performance constraints;at a second time, opaquely route an input request to a correct storage pool in accordance with the initialized ring; at a third time, rebalance the ring by retrieving the set of ring parameters, performing a consistent hashing function associating the range of inputs with the first abstract partition and the second range of inputs with the second abstract partition; and
allocating the available storage pools mapping each abstract partition to one or more storage pools in accordance with the set of performance constraints such that each abstract partition has zero or one changes in the physical storage pools allocated thereto.
-
Specification