Method and apparatus for efficient propagation of large datasets under failure conditions
Abstract
A network of nodes caches replicated datasets in which dataset changes are efficiently propagated as a set of changes even under failure conditions. A master node and a plurality of subordinate nodes in the network each maintain a copy of the dataset and a change log storing change events in the dataset in that node. The change log further includes a rename chain having a plurality of linked rename records created in response to a new master gaining control of the dataset. The master node computes and propagates dataset changes to the subordinate nodes as a set of change events. If the master node fails, one of the subordinate nodes becomes temporary master and continues to propagate dataset changes using its dataset and its change log in response to update requests from other nodes where the update request contains information from the change log of the requestor node.
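The change log with its rename chain can be pictured as a single append-only list in which rename records point back to one another. The following is only a minimal sketch under stated assumptions: the class names `ChangeEvent`, `RenameRecord`, and `ChangeLog`, the `op`/`item` fields, and the idea that a rename record carries a dataset name are all hypothetical, since the abstract does not say what a rename record contains.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class ChangeEvent:
    op: str    # hypothetical operation label: "add" or "remove"
    item: str  # hypothetical identifier of the affected dataset item


@dataclass
class RenameRecord:
    name: str            # hypothetical dataset name assigned by the new master
    prev: Optional[int]  # log position of the previous rename record, if any


@dataclass
class ChangeLog:
    entries: List[Union[ChangeEvent, RenameRecord]] = field(default_factory=list)
    last_rename: Optional[int] = None  # log position of the newest rename record

    def record_change(self, op: str, item: str) -> int:
        """Append a change event and return its position in the log."""
        self.entries.append(ChangeEvent(op, item))
        return len(self.entries) - 1

    def record_rename(self, name: str) -> int:
        """Append a rename record linked to the previous rename record."""
        self.entries.append(RenameRecord(name, self.last_rename))
        self.last_rename = len(self.entries) - 1
        return self.last_rename

    def rename_chain(self) -> List[str]:
        """Follow the links backwards and return the names, oldest first."""
        names = []
        pos = self.last_rename
        while pos is not None:
            record = self.entries[pos]
            names.append(record.name)
            pos = record.prev
        return list(reversed(names))


log = ChangeLog()
log.record_rename("master-A")       # original master takes control
log.record_change("add", "file1")
log.record_change("add", "file2")
log.record_rename("temp-master-B")  # a subordinate takes over after a failure
log.record_change("remove", "file1")
print(log.rename_chain())
```

Keeping the rename records inside the same list as the change events means a single position in the log identifies both what has changed and under which master it changed.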
144 Citations
28 Claims
1. In a computerized device, a method for maintaining a stored dataset replicated from a master dataset, comprising the steps of:
- maintaining a change log of changes made to the stored dataset where each change is recorded as an event in a list of events, the list of events including an indicator chain where each link in the indicator chain is an indicator of dataset commonality;
- providing an update request to a responder node to update the stored dataset, the update request including a first reference to the list of events and a second reference to the indicator chain;
- receiving an update response from the responder node, the update response including information about a responder dataset derived from the first reference and the second reference; and
- reconciling the change log and the update response to update the stored dataset whereby the stored dataset is substantially current to the master dataset.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 27, 28
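The requester side of claim 1 can be illustrated roughly as follows. This is a sketch under assumptions, not the claimed implementation: log entries are modeled as tuples (`("change", op, item)` or `("indicator", name)`), the request and response are plain dictionaries, and the `"mode"` field with its `"full"` fallback is a hypothetical way for a responder to signal that no common history was found.

```python
from typing import Dict, List, Set, Tuple

Entry = Tuple[str, ...]  # ("change", op, item) or ("indicator", name)


def build_update_request(log: List[Entry]) -> Dict:
    """First reference: position of the last log entry.
    Second reference: newest indicator in the chain (None if there is none)."""
    indicators = [e[1] for e in log if e[0] == "indicator"]
    return {
        "event_ref": len(log) - 1,
        "indicator_ref": indicators[-1] if indicators else None,
    }


def reconcile(dataset: Set[str], log: List[Entry], response: Dict) -> None:
    """Apply the responder's entries to the local dataset and change log."""
    if response["mode"] == "full":
        # Assumed fallback: no common history, so start from scratch.
        dataset.clear()
        log.clear()
    for entry in response["entries"]:
        log.append(entry)
        if entry[0] == "change":
            _, op, item = entry
            if op == "add":
                dataset.add(item)
            else:
                dataset.discard(item)


dataset = {"a", "b"}
log = [("indicator", "m1"), ("change", "add", "a"), ("change", "add", "b")]
request = build_update_request(log)

# A hand-written response standing in for the responder node.
response = {
    "mode": "incremental",
    "entries": [("change", "remove", "a"), ("change", "add", "c")],
}
reconcile(dataset, log, response)
print(request, sorted(dataset))
```

Only the two references travel in the request, so its size does not grow with the dataset.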
12. In a computerized device, a method for maintaining a stored dataset to be replicated to at least one other node, comprising the steps of:
- maintaining a change log of changes made to the stored dataset where each change is recorded as an event in a list of events, the list of events including an indicator chain where each link in the indicator chain is an indicator of dataset commonality;
- detecting that a master node storing a master dataset has become non-functional;
- accessing a manifest file in response to detecting that the master node has become non-functional;
- computing change events from the manifest file;
- recording the change events in the change log; and
- updating the stored dataset according to the change log.
Dependent claims: 13, 14, 15
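The manifest steps of claim 12 amount to a diff between what the manifest lists and what the node currently stores. The sketch below assumes a hypothetical manifest format (one item name per line) and hypothetical `"add"`/`"remove"` operations; the claim itself does not specify either.

```python
from typing import List, Set, Tuple

Entry = Tuple[str, str, str]  # ("change", op, item)


def compute_change_events(stored: Set[str], manifest_text: str) -> List[Entry]:
    """Compare the manifest's item list with the stored dataset."""
    wanted = {line.strip() for line in manifest_text.splitlines() if line.strip()}
    events = [("change", "add", item) for item in sorted(wanted - stored)]
    events += [("change", "remove", item) for item in sorted(stored - wanted)]
    return events


def record_and_apply(stored: Set[str], log: List[Entry], events: List[Entry]) -> None:
    """Record each event in the change log, then update the dataset from it."""
    for entry in events:
        log.append(entry)
        _, op, item = entry
        if op == "add":
            stored.add(item)
        else:
            stored.discard(item)


stored = {"a.html", "b.html"}
log: List[Entry] = []
manifest = "b.html\nc.html\n"  # hypothetical manifest contents

events = compute_change_events(stored, manifest)
record_and_apply(stored, log, events)
print(events, sorted(stored))
```

Because the diff is recorded as ordinary change events, other nodes can later receive it through the same update exchange used during normal operation.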
16. In a computerized device, a method for maintaining a stored dataset to be replicated to at least one other node, comprising the steps of:
- maintaining a change log of changes made to the stored dataset where each change is recorded as an event in a list of events, the list of events including an indicator chain where each link in the indicator chain is an indicator of dataset commonality;
- receiving an update request from a requesting node storing a replicated dataset, the update request including a first reference to a change log event in a requesting node change log and a second reference to an indicator in a requesting node indicator chain; and
- providing an update response to the requesting node including information about the stored dataset derived from the change log based on the first reference and the second reference.
Dependent claims: 17, 18, 19, 20, 21
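On the responder side of claim 16, the two references decide how much of the change log has to be sent back. The sketch below rests on assumptions that the claim does not state: log entries are tuples (`("change", op, item)` or `("indicator", name)`), nodes that share an indicator are assumed to share the log prefix up to the requester's event reference, and a hypothetical `"full"` mode returns the whole log when the requester's indicator is unknown.

```python
from typing import Dict, List, Tuple

Entry = Tuple[str, ...]  # ("change", op, item) or ("indicator", name)


def build_update_response(log: List[Entry], request: Dict) -> Dict:
    """Derive the response from the responder's own change log."""
    event_ref = request["event_ref"]
    indicator_ref = request["indicator_ref"]

    # Positions of the indicators in the responder's own chain.
    positions = {e[1]: i for i, e in enumerate(log) if e[0] == "indicator"}
    pos = positions.get(indicator_ref)

    if pos is not None and pos <= event_ref < len(log):
        # Common history found: send only what the requester is missing.
        return {"mode": "incremental", "entries": list(log[event_ref + 1:])}
    # Assumed fallback: no commonality, send the entire log.
    return {"mode": "full", "entries": list(log)}


log = [
    ("indicator", "m1"),
    ("change", "add", "a"),
    ("change", "add", "b"),
    ("indicator", "t1"),  # added when a temporary master took over
    ("change", "remove", "a"),
]

known = build_update_response(log, {"event_ref": 2, "indicator_ref": "m1"})
unknown = build_update_response(log, {"event_ref": 1, "indicator_ref": "x9"})
print(known["mode"], len(known["entries"]), unknown["mode"], len(unknown["entries"]))
```

In this sketch the incremental response also carries the newer indicator `t1`, so the requester's chain catches up along with its dataset.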
22. A method for maintaining consistency among replicated datasets in a system having a master node and a plurality of subordinate nodes, wherein the master node stores a master dataset and each of the plurality of subordinate nodes stores a subordinate dataset, the method comprising the steps of:
- maintaining a change log in the master node and in each of the subordinate nodes, the change log in each node storing changes made in the dataset in that node where the changes are recorded as a list of events, the list of events including an indicator chain where each link in the indicator chain is an indicator of dataset commonality;
- detecting at one of the subordinate nodes that the master node is non-functional;
- selecting among the subordinate nodes a temporary master;
- the temporary master adding an indicator to its list of events and linked to its indicator chain in response to becoming temporary master;
- the temporary master accessing a manifest file;
- the temporary master updating its dataset according to the manifest file;
- the temporary master recording change events from the updating step in its change log;
- receiving at the temporary master an update request from another subordinate node, the update request including a first reference to a change log event in the other subordinate node's change log and a second reference to an indicator in the other subordinate node's indicator chain;
- the temporary master providing an update to the other subordinate node including information about the dataset of the temporary master derived from the change log of the temporary master based on the first reference and the second reference;
- receiving at the temporary master an update request from the master node, the update request including a third reference to a master node change log event and a fourth reference to an indicator in the master node's indicator chain;
- the temporary master providing a master node update to the master node including information about the dataset of the temporary master derived from the change log of the temporary master based on the third reference and the fourth reference; and
- the temporary master returning control of the system to the master node.
23. A computerized device to maintain a stored dataset replicated from a master dataset, comprising:
- a memory;
- a storage device storing the dataset and a change log having changes to the dataset recorded as a list of events, the list of events further including an indicator chain where each link in the indicator chain is an indicator of dataset commonality; and
- a controller coupled to the storage device and the memory, the controller configured to maintain the change log, to provide an update request to a responder node to update the stored dataset, the update request including a first reference to the list of events and a second reference to the indicator chain, the controller to receive an update response from the responder node, the update response including information about a responder dataset derived from the first reference and the second reference, and the controller to reconcile the change log and the update response to update the stored dataset whereby the stored dataset is made substantially current to the master dataset.
24. A computerized device to maintain a stored dataset to be replicated to at least one other node, comprising:
- a memory;
- a storage device storing the dataset and a change log having changes to the dataset recorded as a list of events, the list of events further including an indicator chain where each link in the indicator chain is an indicator of dataset commonality; and
- a controller coupled to the storage device and the memory, the controller configured to access a manifest file to compute dataset changes, to maintain a change log by including the computed dataset changes, to receive an update request from a requesting node storing a replicated dataset, the update request including a first reference to a change log event in a requesting node change log and a second reference to an indicator in a requesting node indicator chain, and to provide an update response to the requesting node including information about the stored dataset derived from the change log based on the first reference and the second reference.
25. A computer program product having a computer-readable medium including computer program logic encoded thereon that, when performed on a computer system having a coupling of a memory, a processor, and at least one communications interface, provides a method for maintaining a stored dataset replicated from a master dataset by performing the operations of:
- maintaining a change log of changes made to the stored dataset where each change is recorded as an event in a list of events, the list of events including an indicator chain where each link in the indicator chain is an indicator of dataset commonality;
- providing an update request to a responder node to update the stored dataset, the update request including a first reference to the list of events and a second reference to the indicator chain;
- receiving an update response from the responder node, the update response including information about a responder dataset derived from the first reference and the second reference; and
- reconciling the change log and the update response to update the stored dataset whereby the stored dataset is substantially current to the master dataset.
26. A computer program product having a computer-readable medium including computer program logic encoded thereon that, when performed on a computer system having a coupling of a memory, a processor, and at least one communications interface, provides a method for maintaining a stored dataset replicated from a master dataset to be replicated to at least one other node by performing the operations of:
- maintaining a change log of changes made to the stored dataset where each change is recorded as an event in a list of events, the list of events including an indicator chain where each link in the indicator chain is an indicator of dataset commonality;
- receiving an update request from a requesting node storing a replicated dataset, the update request including a first reference to a change log event in a requesting node change log and a second reference to an indicator in a requesting node indicator chain; and
- providing an update response to the requesting node including information about the stored dataset derived from the change log based on the first reference and the second reference.
Specification