RECOVERY AND REPLICATION OF A FLASH MEMORY-BASED OBJECT STORE
First Claim
1. A method for adding a new node to a cluster of nodes, comprising:
- at a surviving node, in the cluster of nodes, replicating, to a recovering node in the cluster of nodes, all requests to modify data stored in a first data store thereon that are received by the surviving node; and
the surviving node performing a bulk copy operation to copy data, stored in the first data store, to a second data store maintained on the recovering node, wherein the surviving node (a) replicates all requests to modify data received by the surviving node and (b) performs a bulk copy operation in parallel.
Abstract
Approaches for recovering nodes and adding new nodes to object stores maintained on one or more solid state devices. At a surviving node in a cluster of nodes, all requests to modify data stored in a first data store thereon that are received by the surviving node are replicated to a recovering node in the cluster of nodes. The surviving node also performs a bulk copy operation to copy data stored in the first data store to a second data store maintained on the recovering node. The surviving node (a) replicates all requests to modify data received by the surviving node and (b) performs the bulk copy operation in parallel.
20 Claims
1. A method for adding a new node to a cluster of nodes, comprising:
at a surviving node, in the cluster of nodes, replicating, to a recovering node in the cluster of nodes, all requests to modify data stored in a first data store thereon that are received by the surviving node; and
the surviving node performing a bulk copy operation to copy data, stored in the first data store, to a second data store maintained on the recovering node,
wherein the surviving node (a) replicates all requests to modify data received by the surviving node and (b) performs a bulk copy operation in parallel.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
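The parallel behaviour recited in claim 1 can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `Node` and `SurvivingNode` classes, the in-memory dictionaries standing in for the data stores, and in particular the per-key version number, which is an assumed mechanism for keeping a late bulk-copied entry from overwriting a newer replicated write.

```python
import threading


class Node:
    """Hypothetical in-memory node; the store maps key -> (version, value)."""

    def __init__(self):
        self.store = {}
        self.lock = threading.Lock()

    def apply(self, key, version, value):
        # Assumption: keep only the newest version, so a stale bulk-copied
        # entry never replaces a fresher replicated write.
        with self.lock:
            current = self.store.get(key)
            if current is None or version > current[0]:
                self.store[key] = (version, value)


class SurvivingNode(Node):
    def __init__(self):
        super().__init__()
        self.recovering = None
        self.next_version = 0

    def write(self, key, value):
        # (a) Apply each modification locally and replicate it to the
        # recovering node once one has been attached.
        with self.lock:
            self.next_version += 1
            version = self.next_version
            self.store[key] = (version, value)
            target = self.recovering
        if target is not None:
            target.apply(key, version, value)

    def bulk_copy(self):
        # (b) Copy the existing contents of the first data store.
        with self.lock:
            snapshot = list(self.store.items())
        for key, (version, value) in snapshot:
            self.recovering.apply(key, version, value)

    def recover(self, recovering):
        # Attach the recovering node first, then start the bulk copy in a
        # background thread so it runs in parallel with write replication.
        with self.lock:
            self.recovering = recovering
        worker = threading.Thread(target=self.bulk_copy)
        worker.start()
        return worker
```

In this sketch a caller would invoke `recover()`, keep issuing `write()` calls while the returned thread runs, and `join()` the thread at the end; at that point both stores hold the same entries.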
10. A method for recovering a node in a distributed object store comprising a plurality of nodes, comprising:
dividing a key space into a number of segments, wherein the key space is a range of values returned from a hash function applied to a set of keys that are used to identify objects stored within the distributed object store, wherein each node of the distributed object store is assigned a set of tokens in the key space;
determining a subset of keys, in a particular segment stored on a first node of the distributed object store, that are to be moved to a second node of the distributed object store; and
transferring any object identified by the subset of keys from the first node to the second node.
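The dividing, determining, and transferring steps of claim 10 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the 32-bit key space, the use of MD5 as the hash function, the segment count of 16, and the function names are all hypothetical, and token assignment is left out for brevity.

```python
import hashlib

KEY_SPACE = 2 ** 32   # assumed size of the hash range
NUM_SEGMENTS = 16     # assumed number of equal-width segments


def key_hash(key: str) -> int:
    """Map a key to a value in [0, KEY_SPACE) using the first 4 MD5 bytes."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def segment_of(key: str) -> int:
    """Return the index of the segment that contains the key's hash."""
    return key_hash(key) * NUM_SEGMENTS // KEY_SPACE


def keys_to_move(store: dict, segment: int) -> list:
    """Determine the subset of keys on a node that fall in the segment."""
    return [k for k in store if segment_of(k) == segment]


def transfer(first: dict, second: dict, segment: int) -> int:
    """Move every object in the segment from the first node to the second."""
    moved = keys_to_move(first, segment)
    for k in moved:
        second[k] = first.pop(k)
    return len(moved)
```

Because the subset is computed per segment, only the keys in the segment being reassigned are touched; objects in every other segment stay where they are.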
11. The method of claim 10, wherein the steps of dividing, determining, and transferring are performed without interrupting service on any of the nodes of the distributed object store.
12. The method of claim 10, further comprising:
in response to the first node receiving, from a client, a request that references any key in the subset of keys, the first node responding to the client with an error code and an IP address of the second node.
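The redirect behaviour of claim 12 might look like the short sketch below. The status constants, the response dictionary shape, and the function name are hypothetical; the claim does not specify an error code value or a wire format.

```python
MOVED = "MOVED"  # hypothetical error code; the claim names no specific value
OK = "OK"        # hypothetical success status


def handle_request(key, local_store, moved_keys, second_node_ip):
    """Answer locally, or redirect the client if the key has been moved."""
    if key in moved_keys:
        # The key now belongs to the second node: return an error code
        # together with that node's IP address so the client can retry there.
        return {"status": MOVED, "redirect_ip": second_node_ip}
    return {"status": OK, "value": local_store.get(key)}
```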
13. The method of claim 10, further comprising:
prior to transferring, the first node broadcasting a virtual IP address associated with the subset of keys to all clients of the distributed object store; and
upon completion of transferring, the first node reassigning the virtual IP address to identify the second node.
14. The method of claim 10, wherein transferring any object identified by the subset of keys from the first node to the second node comprises:
in response to the first node receiving, from a client, a request that references any key in the subset of keys, the first node forwarding the request to the second node; and
in response to the first node receiving, from the second node, a response to the request, the first node forwarding the response to the client.
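The forwarding alternative in claim 14 can be sketched with two small classes. The class names, the request and response dictionaries, and the direct method call standing in for a network hop are all assumptions made for illustration.

```python
class SecondNode:
    """Hypothetical destination node that already holds the moved objects."""

    def __init__(self, store):
        self.store = store

    def handle(self, request):
        key = request["key"]
        return {"key": key, "value": self.store.get(key)}


class FirstNode:
    """Hypothetical source node that proxies requests for moved keys."""

    def __init__(self, store, moved_keys, second):
        self.store = store
        self.moved_keys = moved_keys
        self.second = second

    def handle(self, request):
        key = request["key"]
        if key in self.moved_keys:
            # Forward the request to the second node, then relay its
            # response back to the client unchanged.
            response = self.second.handle(request)
            return response
        return {"key": key, "value": self.store.get(key)}
```

Compared with a redirect, this keeps the move invisible to the client at the cost of an extra hop through the first node.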
15. A method for recovering a node in a distributed object store comprising a plurality of nodes, comprising:
dividing a key space into a number of equal size segments, wherein the key space is a range of values returned from a hash function applied to a set of keys that are used to identify objects stored within the distributed object store, and wherein the number of segments is greater than the number of nodes in the plurality of nodes;
assigning, to each node of the distributed object store, a set of tokens in the key space; and
storing, in a set of one or more nodes assigned to a next set of N tokens that satisfy a suitability condition, N copies of objects in each key space segment, wherein the suitability condition limits common failure domains among the plurality of nodes; and
recovering a failed node, in the plurality of nodes, by retrieving, in parallel, copies of the objects previously stored on the failed node from nodes of the plurality of nodes.
Dependent claims: 16, 17
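The storing step of claim 15 selects the next N tokens that satisfy a suitability condition. The sketch below shows one way such a selection could work and covers only replica placement, not the parallel recovery step. The sorted token ring, the `domain_of` mapping, and the rule "no two replicas in the same failure domain" are assumptions chosen for illustration; the claim only requires that the condition limit common failure domains.

```python
import bisect


def place_replicas(segment_start, ring, domain_of, n):
    """Pick up to n nodes for a segment by walking the token ring.

    ring      -- list of (token, node) pairs sorted by token (hypothetical)
    domain_of -- maps node -> failure domain, e.g. a rack (hypothetical)
    """
    tokens = [token for token, _ in ring]
    start = bisect.bisect_left(tokens, segment_start)
    chosen, used_domains = [], set()
    for i in range(len(ring)):
        # Walk clockwise from the first token at or after the segment
        # start, wrapping around the end of the ring.
        _, node = ring[(start + i) % len(ring)]
        domain = domain_of[node]
        # Assumed suitability condition: skip any token whose node shares
        # a failure domain with a replica that was already chosen.
        if domain not in used_domains:
            chosen.append(node)
            used_domains.add(domain)
            if len(chosen) == n:
                break
    return chosen
```

With four tokens owned by nodes `a` and `b` on one rack and `c` and `d` on two others, a segment starting just before `a`'s token is placed on `a` and `c` for N = 2, skipping `b` because it shares a rack with `a`.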
18. A method for performing replication from one solid state device to another solid state device, comprising:
a first node, maintaining, in volatile memory, a set of write operations to be performed on a second node, wherein each of the first node and the second node persistently store data using one or more solid state devices;
the first node, examining the set of write operations maintained in volatile memory to identify a set of related write operations which write to contiguous data blocks at the second node; and
at the first node, sending a single write operation to the second node to request the performance of the set of related write operations to the contiguous data blocks of the one or more solid state devices used by the second node to persistently store data.
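The examining step of claim 18 amounts to finding runs of pending writes that target contiguous blocks. A minimal sketch follows, assuming each pending write is a `(block_number, data)` tuple covering exactly one block and that at most one write per block is pending; the function name and tuple layout are hypothetical.

```python
def coalesce(writes):
    """Group pending single-block writes into runs of contiguous blocks.

    writes -- list of (block_number, data) tuples held in memory
    Returns a list of (start_block, [data, ...]) entries, one per run,
    each of which could be sent to the second node as a single write.
    """
    merged = []
    for block, data in sorted(writes, key=lambda w: w[0]):
        # A write extends the current run when its block number is exactly
        # one past the last block already covered by that run.
        if merged and block == merged[-1][0] + len(merged[-1][1]):
            merged[-1][1].append(data)
        else:
            merged.append((block, [data]))
    return merged
```

For example, pending writes to blocks 7, 3, 4, and 9 collapse into three operations: one covering blocks 3 and 4, and one each for blocks 7 and 9.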
19. A computer readable storage medium for storing one or more sequences of instructions, which when executed by one or more processors, causes:
at a surviving node, in the cluster of nodes, replicating, to a recovering node in the cluster of nodes, all requests to modify data stored in a first data store thereon that are received by the surviving node; and
the surviving node performing a bulk copy operation to copy data, stored in the first data store, to a second data store maintained on the recovering node,
wherein the surviving node (a) replicates all requests to modify data received by the surviving node and (b) performs a bulk copy operation in parallel.
Dependent claims: 20
Specification