Chaining log operations in data replication groups
Abstract
Data replication groups may be used to store data in a distributed computing environment. The data replication groups may include a set of nodes executing a consensus protocol to maintain data durably. In order to monitor and debug the operation of the data replication group, a chaining mechanism may be utilized during log creation by the set of nodes of the data replication group. The chaining mechanism may cause entries in the log to indicate an operation being performed and an operation performed immediately prior to the operation being performed. In various embodiments, an outside observer receives the logs and checks the logs for errors indicated by the chaining mechanism.
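The chaining mechanism described in the abstract can be illustrated with a minimal sketch. It assumes integer operation numbers; the names `LogEntry`, `ChainedLog`, and `record_commit` are hypothetical and are not taken from the patent, which does not prescribe a log layout.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LogEntry:
    """One chained log entry (hypothetical layout, not from the patent)."""
    op_number: int                 # operation being committed
    prev_op_number: Optional[int]  # operation committed immediately before; None for the first entry


class ChainedLog:
    """Append-only log in which every entry names its predecessor."""

    def __init__(self) -> None:
        self.entries: List[LogEntry] = []
        self._last_committed: Optional[int] = None

    def record_commit(self, op_number: int) -> LogEntry:
        # Chain the new entry to whatever this node committed last.
        entry = LogEntry(op_number, self._last_committed)
        self.entries.append(entry)
        self._last_committed = op_number
        return entry


log = ChainedLog()
for op in (1, 2, 5):  # this node never commits operations 3 and 4
    log.record_commit(op)

for e in log.entries:
    print(e.op_number, e.prev_op_number)
```

Because the last entry pairs operation 5 with previous operation 2, a reader of the log can see that operations 3 and 4 are missing on this node without consulting any other source.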
21 Claims
1. A computer-implemented method, comprising:
receiving, at a first node of a data replication group, a message from a second node of the data replication group, where the data replication group comprises a plurality of nodes implementing a consensus protocol;
determining to commit an operation to the first node, where the operation is indicated in the message;
obtaining an operation number corresponding to the operation;
obtaining a previous operation number corresponding to a previous operation committed immediately prior to the operation;
recording the operation number and the previous operation number in a log entry such that the log entry indicates that the first node committed the previous operation immediately prior to the operation in error, wherein the error indicates that a gap exists between the operation number and the previous operation number in the log entry; and
determining validity of the gap and other gaps in a log that comprises the log entry and other log entries by using state information of the node and state information of the data replication group.
Dependent claims: 2, 3, 4

5. A system, comprising:
one or more processors; and
memory that includes instructions that, when executed by the one or more processors, cause the system to:
receive a message from a node of a data replication group, the data replication group including one or more nodes implementing a consensus protocol;
obtain an operation number associated with a performed operation and a previous operation number associated with a previously performed operation;
determine whether the operation number and the previous operation number in a log entry of a log is valid based at least in part on analyzing whether one or more gaps in the log entries in the data replication group exists; and
mitigate errors to the one or more gaps in the log entries in the data replication group.
Dependent claims: 6, 7, 8, 9, 10, 11, 12

13. A non-transitory computer-readable storage medium comprising thereon executable instructions that, when executed by one or more processors of a computer system, cause the computer system to at least:
obtain a log generated by a node of a data replication group, the data replication group comprising one or more nodes implementing a consensus protocol;
determine a set of operations performed by the node based at least in part on the log;
detect a gap in the set of operations, where the gap in the set of operations is indicated by a chaining mechanism utilized to generate log entries of the log;
determine a validity of the gap based at least in part on a first operation performed by the node indicated by a log entry of the log and a previous operation performed by the node indicated by the log entry of the log by at least comparing state information of the node with state information of the data replication group;
detect a second gap in the set of operations, where the second gap in the set of operations is detected based at least in part on a chaining mechanism utilized, by the node, to generate log entries of the log;
determine that the detected second gap is a valid operation of the data replication group based at least in part on the log entries of the log; and
transmit a notification indicating information about the gap and the second gap.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21
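
The gap detection and validity check recited in the independent claims can be sketched from an outside observer's point of view. This is only an illustration under stated assumptions, not the claimed method: operation numbers are taken to be consecutive integers, "state information of the node" is modeled as a hypothetical list of catch-up ranges the node reports having skipped, and "state information of the data replication group" is modeled as the set of operation numbers the group committed. The validity rule itself is an assumption, since the claims do not define one.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class LogEntry:
    op_number: int
    prev_op_number: Optional[int]


@dataclass(frozen=True)
class Gap:
    prev_op_number: int
    op_number: int
    valid: bool


def find_gaps(entries: List[LogEntry]) -> List[Tuple[int, int]]:
    """Return (prev, op) pairs whose chain skips at least one operation number."""
    return [
        (e.prev_op_number, e.op_number)
        for e in entries
        if e.prev_op_number is not None and e.op_number - e.prev_op_number > 1
    ]


def is_valid_gap(prev: int, op: int,
                 node_catch_ups: List[Tuple[int, int]],
                 group_committed: Set[int]) -> bool:
    # ASSUMPTION (not from the patent): a gap is valid when the node's state
    # records a catch-up range covering every skipped operation AND the group
    # state shows that each skipped operation was committed by the group.
    skipped = set(range(prev + 1, op))
    covered = any(start <= prev + 1 and op - 1 <= end for start, end in node_catch_ups)
    return covered and skipped <= group_committed


def audit(entries: List[LogEntry],
          node_catch_ups: List[Tuple[int, int]],
          group_committed: Set[int]) -> List[Gap]:
    """Build the notification payload: every gap with its validity verdict."""
    return [
        Gap(prev, op, is_valid_gap(prev, op, node_catch_ups, group_committed))
        for prev, op in find_gaps(entries)
    ]


entries = [LogEntry(1, None), LogEntry(2, 1), LogEntry(5, 2),
           LogEntry(6, 5), LogEntry(9, 6)]
report = audit(entries,
               node_catch_ups=[(3, 4)],          # node reports skipping operations 3-4
               group_committed=set(range(1, 10)))
for gap in report:
    print(gap)
```

In this example the first gap (2 to 5) is explained by the node's catch-up range and is reported as valid, while the second gap (6 to 9) has no supporting node state and is reported as invalid. A real observer would substitute whatever state the replication group actually exposes.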
Specification