Method for highly available transaction recovery for transaction processing systems
First Claim
1. A method for recovering transactions comprising:
providing a cluster of servers executing on one or more computers, wherein each computer includes a computer readable medium and a processor operating thereon;
providing, at each of the servers within the cluster a transaction recovery service, and a corresponding transaction log for which the transaction recovery service has ownership and which is stored on a shared computer readable medium also accessible by other servers in the cluster;
detecting, within the cluster, a failure of a primary server associated with a first transaction recovery service and a first transaction log stored on the shared computer readable medium, wherein the first transaction log includes records of a current transaction and participants being coordinated by the primary server for participation in the current transaction;
performing failover migration of the first transaction recovery service from the failed primary server to a back-up server in the cluster that also has access to the shared computer readable medium, and maintaining ownership by the first transaction recovery service of the first transaction log;
performing transaction recovery for the failed primary server by the first transaction recovery service at the back-up server, using the first transaction log, and including processing any unfinished transactions; and
after primary server transaction recovery is complete, performing failback migration of the first transaction recovery service from the back-up server to the primary server to allow the primary server to service new or subsequent transactions.
Abstract
A highly available transaction recovery service migration system in accordance with one embodiment of the present invention implements a server's Transaction Recovery Service (TRS) as a migratable service. In one embodiment of the present invention, the TRS is a server instance or software module implemented in JAVA. When a server fails, its TRS migrates to an available server that resides in the same cluster as the failed server. The migrated TRS obtains the transaction log (TLOG) of the failed server, reads it, and performs transaction recovery on behalf of the failed server. The migration may occur manually or automatically on a migratable services framework. The TRS of the failed server migrates back in a failback operation once the failed primary server is restarted. The failback operation may occur whether recovery is completed or not. This expedites recovery and improves availability of the failed server, thereby preserving the efficiency of the network and other servers.
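The recovery step described in the abstract (read the failed server's TLOG, then finish whatever it left open) can be illustrated with a minimal sketch. The source does not specify a log format, so the record layout, the event names `COMMIT_DECIDED` and `COMPLETED`, and the `commit` callback are all hypothetical, chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TlogRecord:
    """One entry in a transaction log (hypothetical layout, not from the source)."""
    tx_id: str
    event: str                 # assumed event names: "COMMIT_DECIDED" or "COMPLETED"
    participants: tuple = ()   # resources the coordinator enlisted for this transaction


def unfinished_transactions(records):
    """Return {tx_id: participants} for transactions logged but never completed."""
    pending = {}
    for rec in records:
        if rec.event == "COMMIT_DECIDED":
            pending[rec.tx_id] = rec.participants
        elif rec.event == "COMPLETED":
            pending.pop(rec.tx_id, None)
    return pending


def recover(records, commit):
    """Finish each unfinished transaction by calling commit(tx_id, participant)."""
    recovered = []
    for tx_id, participants in unfinished_transactions(records).items():
        for participant in participants:
            commit(tx_id, participant)
        recovered.append(tx_id)
    return recovered
```

Because the log sits on shared storage, the same `recover` call could in principle run on any server that can read it, which is what lets a backup server act on behalf of the failed one.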
26 Claims
1. A method for recovering transactions, set out in full under First Claim above. Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17.
18. A method for performing transaction recovery comprising:
(a) providing a cluster of servers executing on one or more computers, wherein each computer includes a computer readable medium and a processor operating thereon;
(b) providing, at each of the servers within the cluster a transaction recovery service, and a corresponding transaction log for which the transaction recovery service has ownership and which is stored on a shared computer readable medium also accessible by other servers in the cluster;
(c) detecting a first server has failed, the first server having a first transaction log on the shared computer readable medium and a first transaction recovery service wherein the transaction log includes records of at least one transaction and related participants being coordinated by the first server for the at least one transaction;
(d) moving the first transaction recovery service to a second server wherein the second server has access to the shared computer readable medium and wherein the first transaction recovery service maintains ownership of the first transaction log;
(e) activating the first transaction recovery service on the second server;
(f) performing transaction recovery on behalf of the first server by the first transaction recovery service at the second server, using the first transaction log, while the first transaction recovery service resides on the second server wherein transaction recovery includes processing any unfinished transactions;
(g) completing transaction recovery on behalf of the first server by the first transaction recovery service;
(h) deactivating the first transaction recovery service on the second server;
(i) moving the first transaction recovery service from the second server to the first server to service new or subsequent transactions, wherein moving the first transaction recovery service from the second server to the first server is completed after transaction recovery on behalf of the first server by the first transaction recovery service is completed; and
(j) activating the first transaction recovery service on the first server.
Dependent claims: 19, 20, 21, 22, 23.
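The ordering of the move, activate, recover, deactivate, move back, and activate steps in claim 18 can be sketched as a small state machine. This is an illustration under stated assumptions, not the patented implementation: the class `MigratableTRS`, its method names, and the `history` list are hypothetical, and the guard that blocks failback until recovery is done follows the ordering in claim 18 rather than the abstract's looser statement.

```python
class MigratableTRS:
    """Hypothetical model of a transaction recovery service that can change hosts."""

    def __init__(self, owner):
        self.owner = owner          # server whose transaction log this service owns
        self.host = owner           # server currently hosting the service
        self.active = True
        self.recovery_done = True
        self.history = []           # ordered trace of lifecycle steps

    def activate(self):
        self.active = True
        self.history.append("activate")

    def deactivate(self):
        self.active = False
        self.history.append("deactivate")

    def _move(self, target):
        if self.active:
            raise RuntimeError("service must be inactive before it is moved")
        self.host = target
        self.history.append(f"move:{target}")

    def failover(self, backup):
        """Move the service to a backup after its owner has failed."""
        self.active = False         # the failed owner is no longer running the service
        self.recovery_done = False
        self._move(backup)
        self.activate()

    def run_recovery(self, recover_fn):
        """Run recovery for the owner while hosted elsewhere; log ownership is unchanged."""
        if not self.active or self.host == self.owner:
            raise RuntimeError("recovery runs only while active on a backup server")
        recover_fn(self.owner)
        self.recovery_done = True
        self.history.append("recover")

    def failback(self):
        """Return the service to its owner once recovery has completed."""
        if not self.recovery_done:
            raise RuntimeError("recovery not complete")
        self.deactivate()
        self._move(self.owner)
        self.activate()
```

Keeping `owner` fixed while `host` changes mirrors the claim language that the service maintains ownership of the first transaction log throughout the migration.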
24. A method for manually performing failback migration from a backup server to a primary server after transaction recovery on a primary server is complete comprising:
detecting, within the cluster, a failure of a primary server associated with a first transaction recovery service (TRS) and a first transaction log (TLOG) stored on the shared computer readable medium, wherein the first transaction log includes records of a current transaction and participants being coordinated by the primary server for participation in the current transaction;
performing failover migration of the first TRS from the failed primary server to a backup server in the cluster that also has access to the shared computer readable medium, and maintaining ownership by the first TRS of the first TLOG;
completing transaction recovery for the failed primary server by the first TRS associated with the primary server and residing on the backup server wherein transaction recovery includes using the first TLOG corresponding to the failed primary server to commit each unfinished transaction included in the first TLOG;
requesting a migratable framework to migrate the first TRS from the backup server to the primary server, the backup server making the request to the migratable framework;
deactivating the first TRS residing on the backup server by the migratable framework, wherein deactivating the first TRS includes the migratable framework calling a JAVA method residing in the first TRS to deactivate the first TRS on the backup server; and
migrating the first TRS from the backup server to the primary server by the migratable framework after primary server transaction recovery is complete to service new or subsequent transactions, wherein the primary server regains ownership of the first TLOG corresponding to the primary server upon primary server restart.
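The request-and-callback interaction in claim 24 (the backup server asks a migratable framework to move the TRS, and the framework itself calls the service's deactivate method before migrating it) might look roughly like the sketch below. The claim describes a JAVA method; Python is used here only for illustration. `MigratableFramework`, `RecordingTRS`, `request_migration`, and the rule that only the current host may request a move are hypothetical names and assumptions, not taken from the source.

```python
class RecordingTRS:
    """Stand-in service that records which callbacks the framework invoked."""

    def __init__(self):
        self.calls = []

    def deactivate(self):
        self.calls.append("deactivate")

    def activate(self, host):
        self.calls.append(f"activate@{host}")


class MigratableFramework:
    """Hypothetical coordinator that tracks where each migratable service runs."""

    def __init__(self):
        self.placement = {}   # service name -> hosting server
        self.services = {}    # service name -> service object

    def register(self, name, service, host):
        self.services[name] = service
        self.placement[name] = host

    def request_migration(self, requester, name, target):
        # Assumption: the request must come from the server currently hosting the service.
        if self.placement[name] != requester:
            raise PermissionError("only the current host may request migration")
        service = self.services[name]
        service.deactivate()            # the framework, not the service, drives deactivation
        self.placement[name] = target
        service.activate(target)
        return target
```

A failback request from the backup would then be a single call such as `fw.request_migration("backup", "trs-primary", "primary")`, with the framework handling deactivation and placement.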
25. In a cluster of servers, a server high availability method comprising the step of performing transaction recovery in a backup server using a transaction recovery service and a transaction log, wherein the transaction log includes records of at least one transaction and related participants being coordinated for the at least one transaction and wherein transaction recovery includes processing any unfinished transactions, stored in a memory shared between the backup server and a failed server before the step of restarting the failed server and migrating the transaction recovery service from the backup server to the failed server after transaction recovery is complete.
26. In a cluster of servers, a server high availability method comprising the step of performing transaction recovery in a backup server using a transaction recovery service and a transaction log, wherein the transaction log includes records of at least one transaction and related participants being coordinated for the at least one transaction and wherein transaction recovery includes processing any unfinished transactions, before the step of restarting a failed server and migrating the transaction recovery service from the backup server to the failed server after transaction recovery is complete.
Specification