Massively scalable object storage system
Abstract
Several different embodiments of a massively scalable object storage system are described. The object storage system is particularly useful for storage in a cloud computing installation whereby shared servers provide resources, software, and data to computers and other devices on demand. In several embodiments, the object storage system includes a ring implementation used to associate object storage commands with particular physical servers such that certain guarantees of consistency, availability, and performance can be met. In other embodiments, the object storage system includes a synchronization protocol used to order operations across a distributed system. In a third set of embodiments, the object storage system includes a metadata management system. In a fourth set of embodiments, the object storage system uses a structured information synchronization system. Features from each set of embodiments can be used to improve the performance and scalability of a cloud computing object storage system.
16 Claims
1. A method for managing data items in a distributed storage pool, comprising:
providing a plurality of physical storage pools, each storage pool including a plurality of storage nodes coupled to a network, and each storage node further providing a non-transitory computer readable medium for data storage;
storing a first replica of a data item in a first physical storage pool of the plurality of physical storage pools;
storing a second replica of the data item in a second physical storage pool of the plurality of physical storage pools;
receiving a modification instruction for the data item;
in response to receiving the modification instruction for the data item:
  modifying the first replica of the data item;
  creating a first modification sentinel file based on the modification instruction; and
  storing the first modification sentinel file in the first physical storage pool; and
in response to encountering the first modification sentinel file during a data item replication process of the first physical storage pool:
  modifying in accordance with the first modification sentinel file the second replica of the data item stored in the second physical storage pool;
  creating a second modification sentinel file;
  storing the second modification sentinel file in the second physical storage pool;
  waiting for a configurable time;
  deleting the first modification sentinel file in the first physical storage pool; and
  deleting the second modification sentinel file in the second physical storage pool,
wherein the configurable time is longer than a worst-case replication propagation time between the first physical storage pool and the second physical storage pool.
Dependent claims: 2, 3, 4, 5, 6, 7
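The sequence recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the `Pool` class, the in-memory dictionaries standing in for sentinel files, the `("put" | "delete", value)` instruction format, and all function names are assumptions made for illustration, and timestamps are passed in explicitly so the example is deterministic.

```python
class Pool:
    """Hypothetical in-memory stand-in for one physical storage pool."""

    def __init__(self, name):
        self.name = name
        self.objects = {}    # key -> stored data
        self.sentinels = {}  # key -> (instruction, creation time)


def apply_instruction(pool, key, instruction):
    """Apply an assumed (operation, value) instruction to one replica."""
    op, value = instruction
    if op == "delete":
        pool.objects.pop(key, None)
    elif op == "put":
        pool.objects[key] = value


def modify(pool, key, instruction, now):
    """Modify the first replica, then create and store a sentinel for it."""
    apply_instruction(pool, key, instruction)
    pool.sentinels[key] = (instruction, now)


def replicate(src, dst, now):
    """Replication pass: on encountering a sentinel in src, apply it to dst
    and store a second sentinel in dst."""
    for key, (instruction, _) in list(src.sentinels.items()):
        existing = dst.sentinels.get(key)
        if existing is None or existing[0] != instruction:
            apply_instruction(dst, key, instruction)
            dst.sentinels[key] = (instruction, now)


def reap(pools, grace, now):
    """Delete sentinels older than the configurable time `grace`."""
    for pool in pools:
        for key, (_, created) in list(pool.sentinels.items()):
            if now - created > grace:
                del pool.sentinels[key]


first, second = Pool("pool-1"), Pool("pool-2")
first.objects["item"] = second.objects["item"] = "v1"

modify(first, "item", ("delete", None), now=0.0)
replicate(first, second, now=5.0)
reap([first, second], grace=60.0, now=30.0)   # too early, sentinels are kept
reap([first, second], grace=60.0, now=120.0)  # both sentinels are removed
```

The sentinel has to outlive the slowest replication pass. If it were deleted sooner, a pool still holding the old replica could copy it back and undo the modification, which is presumably why the claim requires the configurable time to exceed the worst-case propagation time.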
8. A system for managing data items in a distributed storage pool, the system comprising:
a distributed storage system coupled to a network, the distributed storage system including a first storage pool and a second storage pool from a plurality of physical storage pools, the first and second storage pools each including at least one processor, a computer readable medium, and a communications interface;
an object service that stores a first replica of a data item in the first storage pool and receives a modification instruction for the data item, wherein in response to receiving the modification instruction, the object service modifies the first replica of the data item, creates a first modification sentinel file based on the modification instruction, and stores the first modification sentinel file in the first storage pool;
a replicator that stores a second replica of the data item in the second storage pool and encounters the first modification sentinel file during a data item replication process of the first storage pool, wherein in response to encountering the first modification sentinel file, the replicator modifies in accordance with the first modification sentinel file the second replica of the data item stored in the second storage pool, creates a second modification sentinel file, and stores the second modification sentinel file in the second storage pool; and
a timer triggering an action at a configurable time;
wherein the replicator includes a triggerable action that deletes the first modification sentinel file in the first storage pool and deletes the second modification sentinel file in the second storage pool, and
wherein the configurable time is based upon a measured worst-case replication time between the first and second storage pools.
Dependent claims: 9, 10, 11, 12, 13, 14
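The timer and triggerable action of claim 8 might be sketched as follows. This is only an illustration under stated assumptions: `SentinelStore` and `schedule_cleanup` are hypothetical names, a set of keys stands in for the sentinel files of each pool, and the standard-library `threading.Timer` plays the role of the claimed timer.

```python
import threading


class SentinelStore:
    """Hypothetical thread-safe record of the sentinels held by one pool."""

    def __init__(self):
        self._lock = threading.Lock()
        self.sentinels = set()

    def add(self, key):
        with self._lock:
            self.sentinels.add(key)

    def discard(self, key):
        with self._lock:
            self.sentinels.discard(key)


def schedule_cleanup(stores, key, delay_seconds):
    """Start a timer whose action deletes the sentinel for `key` from every
    store once the configurable time has elapsed."""

    def cleanup():
        for store in stores:
            store.discard(key)

    timer = threading.Timer(delay_seconds, cleanup)
    timer.daemon = True  # do not keep the process alive just for cleanup
    timer.start()
    return timer


first_pool, second_pool = SentinelStore(), SentinelStore()
first_pool.add("item")
second_pool.add("item")

# 0.2 seconds is an arbitrary demo value; a real delay would be derived from
# the measured worst-case replication time between the two pools.
cleanup_timer = schedule_cleanup([first_pool, second_pool], "item", 0.2)
cleanup_timer.join()  # wait for the timer action to run
```

A single timer action that clears both pools mirrors the claim language, in which one triggerable action deletes both the first and the second sentinel file.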
15. A non-transitory machine-readable medium comprising a plurality of machine-readable instructions that when executed by one or more processors causes the one or more processors to perform a method comprising:
storing a first replica of a data item in a first physical storage pool;
storing a second replica of the data item in a second physical storage pool;
in response to receiving a modification instruction for the data item, modifying the first replica of the data item, creating a first modification sentinel file based on the modification instruction, and storing the first modification sentinel file in the first physical storage pool;
in response to encountering the first modification sentinel file during a data item replication process of the first physical storage pool, modifying in accordance with the first modification sentinel file the second replica of the data item stored in the second physical storage pool, creating a second modification sentinel file, and storing the second modification sentinel file in the second physical storage pool;
waiting for a configurable time; and
deleting the first modification sentinel file in the first physical storage pool; and
deleting the second modification sentinel file in the second physical storage pool,
wherein the configurable time is based upon a measurement of a replication delay between the first and second physical storage pools.
Dependent claims: 16
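One way to base the configurable time on a measured replication delay, as claim 15 recites, is to take the largest observed delay and scale it by a safety margin. The claims do not specify a formula, so the function below, its name, and the `margin` and `floor_seconds` parameters are assumptions made for illustration.

```python
def cleanup_delay(samples_seconds, margin=2.0, floor_seconds=0.0):
    """Return a sentinel retention time derived from measured delays.

    samples_seconds: observed replication delays between the two pools.
    margin: multiplier above 1 so the result exceeds the worst observed delay.
    floor_seconds: optional lower bound on the returned delay.
    """
    if not samples_seconds:
        raise ValueError("at least one measured delay is required")
    if margin <= 1.0:
        raise ValueError("margin must be greater than 1")
    worst_case = max(samples_seconds)
    return max(floor_seconds, worst_case * margin)


# Hypothetical measurements in seconds, for illustration only.
measured = [1.0, 4.0, 2.5]
delay = cleanup_delay(measured, margin=2.0)
```

With a margin above 1 and a nonzero worst observed delay, the result is strictly longer than that delay, which is the relationship claim 1 requires between the configurable time and the worst-case propagation time.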
Specification