System and method for discovery based data recovery in a store and forward replication process
Abstract
Systems and methods for discovery based data recovery in a store and forward replication system are presented. Data loss is discovered by comparing a list of changes made to a local copy of a replica object with a list of changes received over the network from other nodes also having a copy of the replica object. When the list of changes received contains changes that the local list does not, the local system knows its copy of the replica object is not up-to-date. Missing changes are then requested from other systems having the missing data. In making the request, care is taken to minimize the cost incurred in recovering the missing data and to balance network traffic among several other nodes, if possible.
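The discovery step described above amounts to a set difference between the received change list and the local change list. The following is a minimal sketch of that comparison, not the patented implementation; the `Change` record, its `origin` and `seq` fields, and the function name `discover_missing` are assumptions introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical change identifier (not defined in the source)."""
    origin: str  # assumed: ID of the node where the change was made
    seq: int     # assumed: per-origin sequence number


def discover_missing(local_changes, received_changes):
    """Return the changes listed in the received set but not in the local set."""
    return set(received_changes) - set(local_changes)


# Local node has seen A1, A2, B1; another node reports A1..A3 and B1..B2.
local = {Change("A", 1), Change("A", 2), Change("B", 1)}
received = {
    Change("A", 1), Change("A", 2), Change("A", 3),
    Change("B", 1), Change("B", 2),
}

missing = discover_missing(local, received)
print(sorted((c.origin, c.seq) for c in missing))  # [('A', 3), ('B', 2)]
```

A non-empty result tells the local node that its copy of the replica object is not up to date; an empty result means the received list revealed nothing new.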
43 Claims
1. In a network having a plurality of nodes, a method for discovering data missing from a local copy of a replica object replicated at one of said nodes, and for recovering the missing data, the method comprising the steps of:
keeping, at a local node, a local change set comprising a list of changes that have been made to a local copy of a replica object, said replica object replicated at at least one other node in the network, said list of changes comprising both changes made at said local node and changes made at said at least one other node that have been received by said local node and applied to said local copy of said replica object;
sending said local change set to said at least one other node;
receiving a change set from said at least one other node, said received change set comprising a list of changes including both changes made by said at least one other node to a copy of said replica object replicated at said at least one other node and changes received by said at least one other node from any other node and applied to said copy of said replica object replicated at said at least one other node; and
discovering at said local node any data missing from the local copy of the replica object by comparing the received change set to the local change set to identify changes listed in the received change set, but not listed in the local change set.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. In a network having a plurality of nodes, a method for discovering data missing from a local copy of a replica object replicated at one of said nodes, and for recovering the missing data, the method comprising the steps of:
keeping, at a local node, a local change set comprising a list of changes that have been made to a local copy of a replica object, said replica object replicated at at least one other node in the network, said list of changes comprising both changes made at said local node and changes made at said at least one other node that have been received by said local node and applied to said local copy of said replica object;
sending said local change set to said at least one other node;
receiving a change set from said at least one other node, said received change set comprising a list of changes including both changes made at said at least one other node to a copy of said replica object replicated at said at least one other node and changes received by said at least one other node from any other node and applied to said copy of said replica object replicated at said at least one other node;
discovering at said local node any data missing from the local copy of the replica object by comparing the received change set to the local change set to determine if the received change set contains changes not found in the local change set;
entering the changes listed in the received change set but not listed in the local change set into a backfill set comprising a list of changes missing from the local change set;
selecting at least one of the at least one other node according to a specified criteria designed to minimize the cost of receiving changes missing from the local change set; and
sending a message to said selected at least one of the at least one other node requesting the changes missing from the local change set.
Dependent claims: 14, 15, 16, 17, 18, 19
20. In a network having a plurality of nodes, a method for discovering data missing from a local copy of a replica object replicated at one of said nodes, and for recovering the missing data, the method comprising the steps of:
keeping at a local node a local change set comprising a list of changes that have been made to a local copy of a replica object, said replica object replicated at at least one other node in the network, said list of changes comprising both changes made at said local node and changes made at said at least one other node that have been received by said local node and applied to said local copy of said replica object;
broadcasting from said local node to one or more other nodes in the network an information request message, said information request message comprising a request for a change set from said one or more other nodes;
receiving at said local node said requested change set from said one or more other nodes; and
discovering at said local node any data missing from said local copy of the replica object comparing the received change set to the local change set to identify changes listed in the received change set, but not listed in the local change set.
Dependent claims: 21, 22
23. In a network system comprising a plurality of replica nodes, each logically connected to one another, an article of manufacture for use in a local replica node having a copy of a designated replica object, each replica node comprising a CPU, said article of manufacture comprising:
program storage means, accessible by the CPU, for storing and providing to the CPU program code means, said program code means comprising;
means for storing a local change set comprising changes made to a local copy of a designated replica object, said changes originating at either of said local replica node or any other replica node;
means for sending said local change set to other replica nodes;
means for receiving from said other replica nodes at least one received change set containing changes made to copies of the designated replica object at said other replica nodes and changes received by said other replica nodes from any other replica node and applied to said copies of the designated replica object; and
means for comparing said at least one received change set to said local change set in order to discover data missing from said local copy of the designated replica object.
Dependent claims: 24, 25, 26, 27, 28, 29, 30, 31, 32, 33
34. In a network system comprising a plurality of replica nodes, each logically connected to one another, an article of manufacture for use in a local replica node having a local copy of a designated replica object, each replica node comprising a CPU, said article of manufacture comprising:
program storage means, accessible by the CPU, for storing and providing to the CPU program code means, said program code means comprising;
means for storing a local change set comprising changes made to a local copy of a designated replica object, said changes originating at either of said local replica node or any other replica node;
means for sending said local change set to the other replica nodes;
means for receiving from said other replica nodes at least one received change set containing changes made to copies of the designated replica object at said other replica nodes and changes received by said other replica nodes from any other replica node and applied to said copies of the designated replica object;
means for comparing said at least one received change set to said local change set in order to discover data missing from said local copy of the designated replica object; and
means for requesting from said other replica nodes data missing from said local copy of the designated replica object.
Dependent claims: 35, 36, 37, 38, 39, 40, 41, 42, 43
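The backfill and node-selection steps recited in claim 13 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the method defined by the claims: changes are modeled as `(origin, sequence)` tuples, `cost` is a hypothetical per-node cost table, and ties are broken by sending the request to the node with the fewest requests assigned so far, as one possible way to spread traffic across nodes.

```python
def plan_backfill_requests(local_changes, received_sets, cost):
    """Build a backfill set and choose a node to ask for each missing change.

    local_changes: set of change IDs already applied locally.
    received_sets: dict mapping node name -> set of change IDs that node lists.
    cost:          dict mapping node name -> assumed cost of requesting from it.
    """
    local_changes = set(local_changes)
    backfill = set()   # changes missing from the local change set
    holders = {}       # change -> nodes known to list that change

    for node, changes in received_sets.items():
        for change in set(changes) - local_changes:
            backfill.add(change)
            holders.setdefault(change, set()).add(node)

    requests = {}      # node -> set of changes to request from it
    for change in sorted(backfill):
        # Prefer the cheapest node; on equal cost, prefer the node with the
        # fewest requests so far; the node name is a deterministic tie-break.
        node = min(
            holders[change],
            key=lambda n: (cost[n], len(requests.get(n, ())), n),
        )
        requests.setdefault(node, set()).add(change)

    return backfill, requests


local = {("A", 1), ("B", 1)}
received_sets = {
    "node2": {("A", 1), ("A", 2), ("B", 1)},
    "node3": {("A", 1), ("A", 2), ("B", 1), ("B", 2)},
}

backfill, requests = plan_backfill_requests(
    local, received_sets, cost={"node2": 1, "node3": 1}
)
print(sorted(backfill))  # [('A', 2), ('B', 2)]
print(requests)          # {'node2': {('A', 2)}, 'node3': {('B', 2)}}
```

With equal costs the two missing changes are split between the two nodes. If `node3` were cheaper than `node2`, both changes would be requested from `node3`, since cost takes priority over balancing in this sketch.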
Specification