Manifest-based snapshots in distributed computing environments
First Claim
1. A computer implemented method of creating a manifest-based snapshot of a data object in a distributed cloud-computing platform, the method comprising:
- responsive to receiving a request to create a snapshot of the data object, identifying, by a master node of a distributed database system, multiple slave nodes of the distributed database system on which a data object is stored in the cloud-computing platform, wherein the distributed database system includes the master node and the multiple slave nodes, wherein each slave node implements a region server and includes a data node associated with the region server;
creating a snapshot manifest representing the snapshot of the data object, wherein the snapshot manifest comprises a file including a listing of multiple file names in the snapshot manifest and reference information for locating the multiple files in the distributed database system, wherein creating the snapshot manifest further comprises directing each region server to create a portion of the snapshot manifest corresponding to the partition of the data on the data node with which the region server is associated, wherein directing each region server to create the portion of the snapshot manifest corresponding to the partition of the data on the data node with which the region server is associated comprises:
generating, by the master node, a request to each region server for the respective portions of the snapshot manifest corresponding to the partition of the data on the data node with which the region servers are associated; and
sending, by the master node, the requests to the corresponding region servers.
Abstract
Scalable architectures, systems, and services are provided herein for creating manifest-based snapshots in distributed computing environments. In some embodiments, responsive to receiving a request to create a snapshot of a data object, a master node identifies multiple slave nodes on which the data object is stored in the cloud-computing platform and creates a snapshot manifest representing the snapshot of the data object. The snapshot manifest comprises a file including a listing of multiple file names in the snapshot manifest and reference information for locating the multiple files in the distributed database system. The snapshot can be created without disrupting I/O operations, e.g., in an online mode by various region servers as directed by the master node. A log roll approach to creating the snapshot is also disclosed, in which log files are marked. The replaying of log entries can reduce the probability of causal inconsistency in the snapshot.
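To make the idea of a manifest concrete, the following is a minimal sketch of a manifest as a file that lists file names together with reference information for locating them. It is illustrative only: the names `FileReference`, `build_manifest`, and `serialize_manifest`, the field layout, and the choice of JSON are assumptions and are not taken from the patent.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class FileReference:
    """One manifest entry (hypothetical fields, not the patent's format)."""
    file_name: str   # name of a file that belongs to the snapshot
    region: str      # partition of the data object that holds the file
    location: str    # reference information used to locate the file


def build_manifest(snapshot_name, data_object, entries):
    """Assemble a manifest that references files instead of copying them."""
    ordered = sorted(entries, key=lambda e: (e.region, e.file_name))
    return {
        "snapshot": snapshot_name,
        "data_object": data_object,
        "files": [asdict(e) for e in ordered],
    }


def serialize_manifest(manifest):
    """Render the manifest as text that could be written to a single file."""
    return json.dumps(manifest, indent=2, sort_keys=True)


if __name__ == "__main__":
    entries = [
        FileReference("hfile-b", "region-2", "/data/t1/region-2/hfile-b"),
        FileReference("hfile-a", "region-1", "/data/t1/region-1/hfile-a"),
    ]
    print(serialize_manifest(build_manifest("snap1", "t1", entries)))
```

Because the manifest holds only names and locations, producing it does not require copying the underlying data files, which is consistent with the abstract's statement that a snapshot can be created without disrupting I/O operations.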
22 Claims
1. A computer implemented method of creating a manifest-based snapshot of a data object in a distributed cloud-computing platform, the method comprising:
responsive to receiving a request to create a snapshot of the data object, identifying, by a master node of a distributed database system, multiple slave nodes of the distributed database system on which a data object is stored in the cloud-computing platform, wherein the distributed database system includes the master node and the multiple slave nodes, wherein each slave node implements a region server and includes a data node associated with the region server;
creating a snapshot manifest representing the snapshot of the data object, wherein the snapshot manifest comprises a file including a listing of multiple file names in the snapshot manifest and reference information for locating the multiple files in the distributed database system, wherein creating the snapshot manifest further comprises directing each region server to create a portion of the snapshot manifest corresponding to the partition of the data on the data node with which the region server is associated, wherein directing each region server to create the portion of the snapshot manifest corresponding to the partition of the data on the data node with which the region server is associated comprises:
generating, by the master node, a request to each region server for the respective portions of the snapshot manifest corresponding to the partition of the data on the data node with which the region servers are associated; and
sending, by the master node, the requests to the corresponding region servers.
Dependent Claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
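The steps recited in claim 1 (identify the region servers holding the data object, generate one request per region server, send the requests) can be sketched roughly as follows. This is a single-process illustration under stated assumptions: `MasterNode`, `RegionServer`, `ManifestRequest`, and the `partitions` layout are hypothetical names, a direct method call stands in for whatever remote call a real system would use, and merging the returned portions into one manifest is an assumed final step rather than something the claim spells out.

```python
from dataclasses import dataclass, field


@dataclass
class ManifestRequest:
    """Request from the master for one region server's manifest portion."""
    snapshot_name: str
    data_object: str
    region_server_id: str


@dataclass
class RegionServer:
    server_id: str
    # data object name -> file names on the associated data node (assumed layout)
    partitions: dict = field(default_factory=dict)

    def create_manifest_portion(self, request):
        """List the local files of the data object with a locating reference."""
        files = self.partitions.get(request.data_object, [])
        return [
            {
                "file_name": name,
                "location": f"{self.server_id}:/{request.data_object}/{name}",
            }
            for name in files
        ]


class MasterNode:
    def __init__(self, region_servers):
        self.region_servers = {rs.server_id: rs for rs in region_servers}

    def identify_servers(self, data_object):
        """Find the region servers that store part of the data object."""
        return [
            rs for rs in self.region_servers.values()
            if data_object in rs.partitions
        ]

    def create_snapshot(self, snapshot_name, data_object):
        # Generate one request per identified region server.
        requests = [
            ManifestRequest(snapshot_name, data_object, rs.server_id)
            for rs in self.identify_servers(data_object)
        ]
        files = []
        for request in requests:
            # "Send" the request; a direct call stands in for a remote call.
            target = self.region_servers[request.region_server_id]
            files.extend(target.create_manifest_portion(request))
        return {"snapshot": snapshot_name, "data_object": data_object, "files": files}
```

A short usage example: with `rs1 = RegionServer("rs1", {"t1": ["f1", "f2"]})` and `rs2 = RegionServer("rs2", {"t1": ["f3"]})`, calling `MasterNode([rs1, rs2]).create_snapshot("s1", "t1")` returns a manifest whose `files` list references `f1`, `f2`, and `f3` without copying any of them.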
14. A distributed database system having a master node comprising:
one or more processors;
a memory unit having instructions stored thereon which, when executed by the one or more processors, cause the master node to create a manifest-based snapshot of a data object in a distributed cloud-computing platform by:
identifying multiple slave nodes of the distributed database system on which a data object is stored in the distributed cloud-computing platform, wherein the distributed database system includes the master node and the multiple slave nodes, wherein each slave node implements a region server and includes a data node associated with the region server;
creating a snapshot manifest representing the snapshot of the data object, wherein the snapshot manifest includes a file including a listing of multiple file names in the snapshot manifest and reference information for locating the multiple files in the distributed database system, wherein creating the snapshot manifest further comprises directing each region server to create a portion of the snapshot manifest corresponding to the partition of the data on the data node with which the region server is associated, wherein directing each region server to create the portion of the snapshot manifest corresponding to the partition of the data on the data node with which the region server is associated comprises:
generating, by the master node, a request to each region server for the respective portions of the snapshot manifest corresponding to the partition of the data on the data node with which the region servers are associated; and
sending, by the master node, the request to the corresponding region servers.
Dependent Claims: 15, 16, 17, 18, 19, 20, 21, 22
Specification