Switching between active-replication and active-standby for data synchronization in virtual synchrony
Abstract
Disclosed is a method and apparatus for causing the CPUs that form a fault tolerant process group to operate in an active-standby mode while newly online CPUs are synchronized and to revert to an active-replication mode when synchronization is complete. In one embodiment of the invention, this is accomplished by continuing to operate the primary processor in the active-standby mode while the newly online CPUs are updated in accordance with a single-pass intelligent update algorithm. When synchronization is complete, a message is transmitted to all CPUs in the group, causing every CPU, whether primary or standby, to revert to the active-replication mode. Any CPUs that were already synchronized standbys when the group switched to the active-standby mode are updated only by check-point message data; they ignore the data synchronization record messages being supplied to a newly online CPU.
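The mode switch described in the abstract can be pictured as a small handshake between two CPUs. The following is a minimal Python sketch under stated assumptions: the `Cpu` class, its method names, and the dictionary used for records are hypothetical, and a plain copy stands in for the single-pass update algorithm, whose details the abstract does not give.

```python
from enum import Enum


class Mode(Enum):
    ACTIVE_REPLICATION = "active-replication"
    ACTIVE_STANDBY = "active-standby"


class Cpu:
    """Hypothetical CPU model; class and method names are illustrative."""

    def __init__(self, name, records=None):
        self.name = name
        self.mode = Mode.ACTIVE_REPLICATION
        self.records = dict(records or {})
        self.peer = None

    def send_add_me(self, primary):
        # Backup side: a newly online CPU asks the primary to synchronize it.
        self.peer = primary
        self.mode = Mode.ACTIVE_STANDBY
        primary.on_add_me(self)

    def on_add_me(self, backup):
        # Primary side: leave active-replication for the duration of the sync.
        self.peer = backup
        self.mode = Mode.ACTIVE_STANDBY

    def transfer_records(self):
        # Primary side: a plain copy stands in for the update algorithm
        # (assumption). The "finished" signal follows the last record.
        for key, value in self.records.items():
            self.peer.records[key] = value
        self.peer.on_finished()
        self.mode = Mode.ACTIVE_REPLICATION

    def on_finished(self):
        # Backup side: all records received, resume active-replication.
        self.mode = Mode.ACTIVE_REPLICATION
```

In this sketch, calling `backup.send_add_me(primary)` puts both CPUs in the active-standby mode, and `primary.transfer_records()` copies the records and returns both to the active-replication mode.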
13 Claims
1. A fault tolerant system using total order and normally operating in an active-replication mode comprising:
a primary CPU normally operating in an active-replication mode;
a backup CPU interconnected with the primary CPU, the backup CPU requiring synchronization with the primary CPU;
means for sending an “add me” request signal from said backup CPU to said primary CPU to cause said primary CPU to temporarily switch to an active-standby mode;
means for sending a “finished” signal from said primary CPU to said backup CPU when copies of all data synchronization records have been transmitted to said backup CPU; and
means causing both said primary and said backup CPUs to revert to an active-replication mode substantially immediately after transmission of said “finished” signal.
2. (Dependent on claim 1)
means for sending check-point messages to said backup CPU during the time said primary CPU is operating in an active-standby mode;
means for receiving and storing external messages at both said primary and backup CPUs at all times said CPUs are operational; and
means for processing said external messages at said backup CPU only when said CPUs comprising said fault tolerant system are operating in an active-replication mode.
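The three elements above describe a backup that always stores external messages but processes them only in the active-replication mode. A minimal Python sketch of that behavior follows. The `BackupCpu` class, the `(seq, payload)` message shape, and the check-point format (the primary's state after processing message `last_seq`) are assumptions made for illustration; the claim does not specify them.

```python
from collections import deque


class BackupCpu:
    """Hypothetical backup CPU; names and message formats are assumed."""

    def __init__(self):
        self.active_replication = False  # False while in active-standby
        self.pending = deque()           # stored (seq, payload) messages
        self.state = {}
        self.processed = []              # sequence numbers applied locally

    def on_external_message(self, seq, payload):
        # External messages are received and stored at all times.
        self.pending.append((seq, payload))
        if self.active_replication:
            self.drain()

    def on_checkpoint(self, last_seq, state):
        # Assumed check-point format: primary state after message last_seq.
        # Stored messages already covered by the check-point are dropped.
        self.state = dict(state)
        while self.pending and self.pending[0][0] <= last_seq:
            self.pending.popleft()

    def on_finished(self):
        # Reverting to active-replication releases the stored messages.
        self.active_replication = True
        self.drain()

    def drain(self):
        while self.pending:
            seq, (key, value) = self.pending.popleft()
            self.state[key] = value
            self.processed.append(seq)
```

The design choice illustrated here is that storing is unconditional while processing is gated on the mode flag, so no external message is lost during synchronization.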

3. A method of synchronizing a newly added CPU in a fault-tolerant signal processing system normally operating in an active-replication mode comprising the steps of:
changing a primary CPU to operate in an active-standby mode with respect to a backup CPU to be synchronized when said primary CPU is notified that said backup CPU is ready to be synchronized;
supplying a notification to said backup CPU that copies of all data synchronization records have been transmitted to said backup CPU; and
changing said primary and said backup CPUs to an active-replication mode substantially immediately after transmission of said notification.
4. (Dependent on claim 3)
sending check-point messages to said backup CPU during the time said primary CPU is operating in an active-standby mode;
receiving and storing external messages at both said primary and backup CPUs at all times said CPUs are operational; and
processing said external messages at said backup CPU only when said CPUs comprising said fault tolerant system are operating in an active-replication mode.

5. A method of synchronizing a recently added backup CPU in a fault tolerant signal processing system normally operating in an active-replication mode comprising the steps of:
changing a primary CPU and any already synchronized standby CPUs to operate in an active-standby mode when a notification is received that a further CPU is ready to be synchronized;
supplying an end of record transmission notification to said further CPU that copies of all data synchronization records have been transmitted to said further CPU; and
changing said primary CPU and any standby CPUs to an active-replication mode upon receipt of a reply to said end of record transmission notification.
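Claim 5 differs from claims 1 and 3 in that the group reverts only upon receipt of a reply to the end of record transmission notification. A minimal Python sketch of that reply-gated reversion follows, combined with the abstract's statement that already synchronized standbys ignore the synchronization records. The `Member` and `Primary` classes, direct method calls in place of a network, and the `ignored` counter are all hypothetical.

```python
ACTIVE_REPLICATION = "active-replication"
ACTIVE_STANDBY = "active-standby"


class Member:
    """Hypothetical group member; names are illustrative only."""

    def __init__(self, name, records=None):
        self.name = name
        self.mode = ACTIVE_REPLICATION
        self.records = dict(records or {})
        self.ignored = 0  # sync records seen but meant for another CPU

    def on_sync_record(self, target, key, value):
        if target == self.name:
            self.records[key] = value
        else:
            self.ignored += 1


class Primary(Member):
    def __init__(self, name, records, standbys=()):
        super().__init__(name, records)
        self.standbys = list(standbys)
        self.pending = None  # CPU whose reply is awaited

    def on_ready(self, newcomer):
        # Primary and already synchronized standbys enter active-standby.
        for cpu in [self, *self.standbys, newcomer]:
            cpu.mode = ACTIVE_STANDBY
        for key, value in self.records.items():
            for cpu in [*self.standbys, newcomer]:
                cpu.on_sync_record(newcomer.name, key, value)
        # The end of record transmission notification would be sent here.
        self.pending = newcomer

    def on_reply(self, sender):
        # Revert only when the awaited CPU answers the notification.
        if sender is not self.pending:
            return False
        self.standbys.append(sender)
        self.pending = None
        for cpu in [self, *self.standbys]:
            cpu.mode = ACTIVE_REPLICATION
        return True
```

Until `on_reply` is called with the CPU being synchronized, every member in this sketch stays in the active-standby mode.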

6. A processing apparatus for use in a fault tolerant process system comprising:
CPU means operable to be synchronized to a primary CPU in a process group while the primary CPU continues processing and simultaneously supplies data synchronization records and check-point messages in an active-standby mode to complete the synchronization of a database and combination transaction queue and message list;
means within said CPU means for switching to a standby processor mode in an active-replication mode when synchronization of all recently added CPUs is complete and the primary CPU continues to process incoming messages in accordance with design specifications; and
means within said CPU means for operating said CPU means in a primary CPU active-replication mode with respect to any standby CPUs when failure of the previously primary CPU is detected and all other CPUs in the process group have less priority, in accordance with a predefined set of conditions, than said CPU means.
7. (Dependent on claim 6)

8. A computer program product for use in combination with a CPU of a fault tolerant process system, the computer program product having a medium with a computer program embodied thereon, the computer program comprising:
computer program code for causing a newly online first CPU to be synchronized by receiving updating check-point messages and data synchronization records from a primary CPU in a process group while the primary CPU continues processing in an active-standby mode until synchronization of a database and combination transaction queue and message list is completed;
computer program code for causing said first CPU to switch to a standby processor in an active-replication mode when synchronization is complete and the primary CPU continues to process incoming messages in accordance with predetermined conditions; and
computer program code for causing said first CPU to operate in a primary CPU active-replication mode with respect to any interconnected standby CPUs comprising a part of a process group when failure of the previously primary CPU is detected and all other CPUs in the process group have less priority, in accordance with a predefined set of conditions, than said first CPU.

9. A method of operating a CPU used in a fault tolerant process system comprising the steps of:
causing a newly online first CPU to be synchronized by receiving updating check-point messages and data synchronization records from a primary CPU in a process group while the primary CPU continues processing in an active-standby mode until synchronization of a database and combination transaction queue and message list is completed;
causing said first CPU to switch to a standby processor mode in an active-replication mode when synchronization is complete and the primary CPU continues to process incoming messages in accordance with predetermined conditions; and
causing said first CPU to operate in a primary CPU active-replication mode with respect to any interconnected standby CPUs comprising a part of a process group when failure of the previously primary CPU is detected and all other CPUs in the process group are determined to have less priority than said first CPU.
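The takeover condition in the last step (the previous primary has failed and every other CPU in the group has less priority) can be written as a single predicate. The Python sketch below assumes a numeric `priority` field in which a larger value means higher priority and values are unique; the claim itself leaves the priority scheme open, and the `Cpu` dataclass and function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    name: str
    priority: int  # assumption: larger value wins; values are unique


def should_take_over(me, group, failed_primary):
    """Return True if `me` outranks every surviving CPU in the group."""
    others = [cpu for cpu in group if cpu not in (me, failed_primary)]
    return all(cpu.priority < me.priority for cpu in others)
```

With a group of a failed primary and two standbys of priority 2 and 1, only the priority 2 standby satisfies the predicate and becomes the new primary.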

10. A fault tolerant signal processing system normally operating in an active-replication mode comprising:
at least a primary CPU operable to maintain synchronization with any operable standby CPUs in an active-replication mode;
means within any CPU newly placed on-line for sending a message to said primary CPU requesting that it be synchronized with said primary CPU;
means within said primary CPU and any operable standby CPUs for changing to an active-standby mode until the CPU newly placed on-line is synchronized; and
means within all CPUs for returning to an active-replication mode for normal operation.

11. A method of synchronizing newly added standby CPUs to a fault tolerant signal processing system group normally operating in an active-replication mode comprising:
sending a message from a CPU newly placed on-line, to a primary CPU, requesting that it be synchronized with said primary CPU;
changing all CPUs in the group to operate in an active-standby mode until the CPU newly placed on-line is synchronized; and
returning all CPUs to an active-replication mode when synchronization of all CPUs in a group occurs.

12. A method of operating a primary CPU used in a fault tolerant process system comprising the steps of:
switching from an active-replication mode to an active-standby mode upon receiving a request for synchronization from a newly online CPU; and
sending an end of record transmission notification to the newly online CPU that copies of all data synchronization records have been transmitted to the newly online CPU and then returning to an active-replication mode when synchronization is complete.
13. (Dependent on claim 12)

Specification