Executing prioritized replication requests for objects in a distributed storage system
Abstract
A system and method for executing replication requests for objects in a distributed storage system is provided. A replication queue is identified from a plurality of replication queues corresponding to a replication key. The replication key includes information related to at least a source storage device in a distributed storage system at which objects are located and a destination storage device in the distributed storage system to which the objects are to be replicated. A distributed database is scanned using an identifier of the replication queue to produce a list of replication requests corresponding to the replication queue. The records of the distributed database are distributed across a plurality of nodes of the distributed database. The replication requests in the list of replication requests are executed in priority order. Replication requests are deleted from the distributed database only when the replication requests are complete.
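The flow summarized in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the in-memory dict stands in for the distributed database, and the names `queue_id_for`, `scan`, `drain_queue`, the `source->destination` queue identifier format, and the convention that a lower priority number means higher urgency are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for the distributed database: a dict keyed by
# (queue_id, priority, request_id). A real store would spread rows across nodes.
RowKey = Tuple[str, int, str]


@dataclass
class ReplicationRequest:
    object_id: str
    source: str
    destination: str


def queue_id_for(source: str, destination: str) -> str:
    """Derive a queue identifier from the replication key (assumed format)."""
    return f"{source}->{destination}"


def scan(db: Dict[RowKey, ReplicationRequest], queue_id: str) -> List[RowKey]:
    """Return the row keys of one queue, lowest priority number first (assumed order)."""
    return sorted(key for key in db if key[0] == queue_id)


def drain_queue(
    db: Dict[RowKey, ReplicationRequest],
    source: str,
    destination: str,
    replicate: Callable[[ReplicationRequest], bool],
) -> List[str]:
    """Execute requests in priority order; delete a row only after it completes."""
    completed = []
    for key in scan(db, queue_id_for(source, destination)):
        request = db[key]
        if replicate(request):  # True means the copy finished
            del db[key]         # removed only on completion
            completed.append(request.object_id)
    return completed
```

Because a row is deleted only after `replicate` reports success, a request that fails or is interrupted stays in the database and is picked up again by the next scan.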
117 Citations
29 Claims
1. A computer-implemented method for executing replication requests for objects in a distributed storage system, comprising:
at a computer system including one or more processors and memory storing one or more programs, for execution by the one or more processors:
identifying a replication queue from a plurality of replication queues corresponding to a replication key, wherein the replication key includes information related to at least a source storage device in a distributed storage system at which objects are located and a destination storage device in the distributed storage system to which the objects are to be replicated;
scanning a distributed database using an identifier of the replication queue to produce a list of replication requests corresponding to the replication queue, wherein the list of replication requests is sorted by globally-determined priorities of the replication requests that are included in row keys corresponding to records of the distributed database for the replication requests in the list of replication requests, wherein a respective replication request includes a globally-determined profit value corresponding to the respective replication request, wherein the globally-determined profit value is based on a metric corresponding to a benefit of performing the respective replication request minus a metric corresponding to a cost of performing the respective replication request, wherein the records of the distributed database are distributed across a plurality of nodes of the distributed database, and wherein a location assignment daemon is configured to generate replication requests globally across instances of the distributed storage system based at least in part on a current state of the distributed storage system and replication policies for objects in the distributed storage system;
executing replication requests in the list of replication requests in priority order; and
deleting replication requests from the distributed database only when the replication requests are complete.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A system for executing replication requests for objects in a distributed storage system, comprising:
one or more processors;
memory; and
one or more programs stored in the memory, the one or more programs comprising instructions to:
identify a replication queue from a plurality of replication queues corresponding to a replication key, wherein the replication key includes information related to at least a source storage device in a distributed storage system at which objects are located and a destination storage device in the distributed storage system to which the objects are to be replicated;
scan a distributed database using an identifier of the replication queue to produce a list of replication requests corresponding to the replication queue, wherein the list of replication requests is sorted by globally-determined priorities of the replication requests that are included in row keys corresponding to records of the distributed database for the replication requests in the list of replication requests, wherein a respective replication request includes a globally-determined profit value corresponding to the respective replication request, wherein the globally-determined profit value is based on a metric corresponding to a benefit of performing the respective replication request minus a metric corresponding to a cost of performing the respective replication request, wherein the records of the distributed database are distributed across a plurality of nodes of the distributed database, and wherein a location assignment daemon is configured to generate replication requests globally across instances of the distributed storage system based at least in part on a current state of the distributed storage system and replication policies for objects in the distributed storage system;
execute replication requests in the list of replication requests in priority order; and
delete replication requests from the distributed database only when the replication requests are complete.
Dependent claims: 15, 16, 17, 18, 19, 20, 21
22. A non-transitory computer readable storage medium storing one or more programs configured for execution by a computer, the one or more programs comprising instructions to:
identify a replication queue from a plurality of replication queues corresponding to a replication key, wherein the replication key includes information related to at least a source storage device in a distributed storage system at which objects are located and a destination storage device in the distributed storage system to which the objects are to be replicated;
scan a distributed database using an identifier of the replication queue to produce a list of replication requests corresponding to the replication queue, wherein the list of replication requests is sorted by globally-determined priorities of the replication requests that are included in row keys corresponding to records of the distributed database for the replication requests in the list of replication requests, wherein a respective replication request includes a globally-determined profit value corresponding to the respective replication request, wherein the globally-determined profit value is based on a metric corresponding to a benefit of performing the respective replication request minus a metric corresponding to a cost of performing the respective replication request, wherein the records of the distributed database are distributed across a plurality of nodes of the distributed database, and wherein a location assignment daemon is configured to generate replication requests globally across instances of the distributed storage system based at least in part on a current state of the distributed storage system and replication policies for objects in the distributed storage system;
execute replication requests in the list of replication requests in priority order; and
delete replication requests from the distributed database only when the replication requests are complete.
Dependent claims: 23, 24, 25, 26, 27, 28, 29
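The independent claims recite priorities that are included in row keys and a profit value defined as a benefit metric minus a cost metric. One way such a key could make a plain key-ordered scan return requests in priority order is sketched below. Everything here is an assumption made for illustration: the `#` separator, the fixed `PRIORITY_WIDTH` zero-padding, the convention that priority 0 sorts first, and the helper names `make_row_key`, `profit`, and `prefix_scan` do not come from the patent text.

```python
from bisect import bisect_left
from typing import Dict, List

# Assumed fixed width so that string order matches numeric priority order.
PRIORITY_WIDTH = 4


def make_row_key(queue_id: str, priority: int, request_id: str) -> str:
    """Build a row key that sorts by queue, then priority (0 first, assumed)."""
    return f"{queue_id}#{priority:0{PRIORITY_WIDTH}d}#{request_id}"


def profit(benefit: float, cost: float) -> float:
    """Profit value as recited in the claims: benefit metric minus cost metric."""
    return benefit - cost


def prefix_scan(rows: Dict[str, dict], queue_id: str) -> List[str]:
    """Return the keys of one queue in key order, emulating a sorted range scan."""
    prefix = f"{queue_id}#"
    keys = sorted(rows)                # a sorted key-value store keeps this order itself
    start = bisect_left(keys, prefix)  # first key at or after the prefix
    result = []
    for key in keys[start:]:
        if not key.startswith(prefix):
            break                      # keys sharing a prefix are contiguous
        result.append(key)
    return result
```

Zero-padding matters in this sketch: without it, priority 10 would sort before priority 2 as a string. The trailing `#` in the prefix keeps queue `q1` from matching rows that belong to queue `q10`.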
Specification