Consistent group membership for semi-active and passive replication
First Claim
1. A method of maintaining consistent group membership for objects, wherein objects are distributed across a plurality of networked computers, said method comprising:
- classifying one member of a group as a Primary;
classifying at least one other member of said group as a Backup;
assigning a rank to each member of said group, with said Primary member having the lowest rank;
assigning a unique precedence to each member of said group;
based on rank of a Backup member, allowing a Backup member to claim status as the Primary member if said Backup member detects the failure of said Primary member;
announcing the failure of said Primary member by said Backup member; and
removing Backup members having a lower rank than said Backup member that has claimed status as said Primary member.
5 Assignments
0 Petitions
Accused Products
Abstract
This invention defines a method and mechanisms for maintaining a consistent group membership, based on the leader-follower strategy of Semi-Active or Passive replication. Each member of the group is assigned a rank, and a precedence, determined by the order in which it is added to the group. The Primary maintains the membership of the group, while each Backup monitors the behavior of the Primary. When a Backup detects that the Primary is faulty, the Backup announces that it is the new Primary and removes the faulty Primary from the membership of the group. The group membership algorithm disclosed here does not require a consensus decision to reconfigure the membership and effects a membership change more quickly in the common case where the Backup of lowest rank takes control as the new Primary when it determines that the existing Primary failed.
45 Citations
48 Claims
-
1. A method of maintaining consistent group membership for objects, wherein objects are distributed across a plurality of networked computers, said method comprising:
-
classifying one member of a group as a Primary; classifying at least one other member of said group as a Backup; assigning a rank to each member of said group, with said Primary member having the lowest rank; assigning a unique precedence to each member of said group; based on rank of a Backup member, allowing a Backup member to claim status as the Primary member if said Backup member detects the failure of said Primary member; announcing the failure of said Primary member by said Backup member; and removing Backup members having a lower rank than said Backup member that has claimed status as said Primary member. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
-
-
18. A method of maintaining consistent group membership for objects, wherein objects are distributed across a plurality of networked computers, said method comprising:
-
classifying one member of a group as a Primary; classifying at least one other member of said group as a Backup; assigning a rank to each member of said group, with said Primary member having the lowest rank; assigning a unique precedence to each member of said group; based on rank of a Backup member, allowing a Backup member to claim status as the Primary member if said Backup member detects the failure of said Primary member; and announcing the failure of said Primary member by said Backup member; wherein the determination that the Primary member has failed when a lack of valid activity from said Primary member is detected; wherein the determination that a lack of valid activity has occurred when a message has not been received from said Primary member within a fault detection timeout interval; and wherein said fault detection timeout interval depends on said rank of said member.
-
-
19. A method of maintaining consistent group membership for objects, wherein objects are distributed across a plurality of networked computers, said method comprising:
-
classifying one member of a group as a Primary; classifying at least one other member of said group as a Backup; assigning a rank to each member of said group, with said Primary member having the lowest rank; assigning a unique precedence to each member of said group; based on rank of a Backup member, allowing a Backup member to claim status as the Primary member if said Backup member detects the failure of said Primary member; and announcing the failure of said Primary member by said Backup member; wherein the determination that the Primary member has failed when a lack of valid activity from said Primary member is detected; wherein the determination that a lack of valid activity has occurred when a message has not been received from said Primary member within a fault detection timeout interval; wherein a fault in the Primary member is detected by lower ranking Backup members having a shorter fault detection timeout interval than higher ranking Backup members; and wherein said fault may be detected by lower ranking Backup members earlier than higher ranking Backup members.
-
-
20. A method of maintaining consistent group membership for objects, wherein objects are distributed across a plurality of networked computers, said method comprising:
-
classifying one member of a group as a Primary; classifying at least one other member of said group as a Backup; assigning a rank to each member of said group, with said Primary member having the lowest rank; assigning a unique precedence to each member of said group; based on rank of a Backup member, allowing a Backup member to claim status as the Primary member if said Backup member detects the failure of said Primary member; and announcing the failure of said Primary member by said Backup member; wherein the determination that the Primary member has failed when a lack of valid activity from said Primary member is detected; wherein the determination that a lack of valid activity has occurred when a message has not been received from said Primary member within a fault detection timeout interval; and wherein said fault detection timeout interval for the Backup member of lowest rank is set to a time period within which said Primary member is expected to have performed actions.
-
-
21. A computer system, in which application objects are replicated by semi-active or passive replication across a plurality of networked computers to provide fault tolerance, said system comprising:
-
replication software executable on one or more networked computers in said system; and means associated with said replication software for maintaining a consistent membership view between a Primary member and Backup members, and wherein any of said Backup members can determine that the Primary member is faulty and can determine the new membership; wherein replacement of a faulty Primary member and additional faulty Backup members is based on the ranks of the members, which determine the timeout periods necessary for detecting a fault. - View Dependent Claims (22, 23, 24, 25, 26)
-
-
27. A computer program having a computer readable medium with computer readable code stored thereon, said computer readable code executable on one or more computers in a system of networked computers wherein application objects are replicated by semi-active or passive replication across a plurality of networked computers for fault tolerance, said system comprising:
-
instructions for ranking members within a group of objects and communicating membership information from a Primary member to Backup members within a group of objects; and instructions for said Backup members to detect the failure of said Primary member and to claim to be the new Primary member; wherein the time required for a Backup member to detect the failure of said Primary member is determined by the ranking of said Backup member. - View Dependent Claims (28, 29, 30)
-
-
31. A method of maintaining group membership based on the leader-follower strategy of Semi-Active or Passive replication, comprising the steps of:
-
monitoring the behavior of a Primary member of a group by other members of the group; forming a new membership for the group by choosing a new Primary member for the group in response to said Primary member being detected as faulty by a Backup member; wherein the membership set and the choice of said new Primary member is consistent across the members of the group; wherein the determination of the membership set and the choice of said new Primary member avoids the need for a consensus decision; and wherein the determination of the membership set and the choice of said new Primary member avoids the overhead of the large number of messages that must be multicast in a consensus algorithm; wherein the determination that said Primary member is faulty is based on the non-occurrence of events relating to said Primary member within the interval of a timeout; and wherein the duration of said timeout at a Backup member is determined by the rank of the Backup member. - View Dependent Claims (32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42)
-
-
43. A method of maintaining group membership based on the leader-follower strategy of Semi-Active or Passive replication, comprising the steps of:
-
monitoring the behavior of a Primary member of a group by other members of the group; forming a new membership for the group by choosing a new Primary member for the group in response to said Primary member being detected as faulty by a Backup member; wherein the membership set and the choice of said new Primary member is consistent across the members of the group; wherein the determination of the membership set and the choice of said new Primary member avoids the need for a consensus decision; and wherein the determination of the membership set and the choice of said new Primary member avoids the overhead of the large number of messages that must be multicast in a consensus algorithm; and wherein a given Backup member, having determined that said Primary member is faulty, multicasts a message to the group members announcing that said Primary member is faulty, and that said Backup member is the new Primary member, and that all Backup members of lower ranks than said Backup member are to remove themselves from said group. - View Dependent Claims (44, 45, 46, 47, 48)
-
Specification