Method of assessing restart approach to minimize recovery time
Abstract
A computer implemented method is provided for message queue failure recovery. The method comprises detecting a failure in a message queue or a queue manager for the message queue, detecting a current status of each of the message queue and the queue manager, examining a maintained active log for the message queue and a message recovery log, examining usage of system resources associated with the message queue and the queue manager, and executing one of a plurality of failure recovery procedures based on the current status of the message queue and the queue manager, the active log, the message recovery log, and the usage of the system resources.
20 Claims
1. A system for message queue failure recovery, comprising:
a non-transitory computer readable storage medium comprising a recovery management component stored as a set of computer instructions that, when executed by a processor, cause the processor to:
  detect a failure in a message queue or a queue manager for the message queue,
  iteratively detect a current status of each of the message queue and the queue manager,
  iteratively examine a maintained active log for the message queue and a message recovery log,
  iteratively select one of a plurality of failure recovery procedures based at least on the iteratively detected current status of the message queue and the queue manager, the maintained active log, and the message recovery log, wherein the plurality of failure recovery procedures comprise:
    a first procedure to restart the queue manager and reload one or more messages in the queue manager from a backup queue,
    a second procedure to shut down and restart a server that hosts the message queue and the queue manager, and
    a third procedure to shut down a server that hosts the message queue and the queue manager and signal a request for further investigation into the failure, and
  responsive to detecting the failure, execute the currently selected one of the plurality of failure recovery procedures.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
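For illustration only, the iterative selection recited in claim 1 might be sketched as follows. The claim does not state which observations map to which procedure, so the rules in `select_procedure`, the `Snapshot` fields, and the `RecoveryManager` class are all hypothetical names and assumptions, not the patented implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Procedure(Enum):
    RESTART_QUEUE_MANAGER = 1     # first procedure: restart manager, reload from backup queue
    RESTART_SERVER = 2            # second procedure: shut down and restart the host server
    SHUTDOWN_AND_INVESTIGATE = 3  # third procedure: shut down and request investigation


@dataclass
class Snapshot:
    """One round of observations (all fields are assumed for this sketch)."""
    queue_ok: bool          # current status of the message queue
    manager_ok: bool        # current status of the queue manager
    logs_consistent: bool   # active log and recovery log agree


def select_procedure(s: Snapshot) -> Procedure:
    # Assumed mapping; the claim leaves the selection criteria open.
    if not s.logs_consistent:
        return Procedure.SHUTDOWN_AND_INVESTIGATE
    if s.queue_ok and not s.manager_ok:
        return Procedure.RESTART_QUEUE_MANAGER
    return Procedure.RESTART_SERVER


class RecoveryManager:
    def __init__(self) -> None:
        # Assumed default until the first observation arrives.
        self.current = Procedure.RESTART_SERVER

    def refresh(self, s: Snapshot) -> None:
        """Called on every monitoring iteration to keep the selection current."""
        self.current = select_procedure(s)

    def on_failure(self) -> Procedure:
        """Called when a failure is detected; returns the pre-selected procedure."""
        return self.current
```

The point of the sketch is the ordering: the selection is refreshed continuously, so that when a failure is detected the procedure to run has already been chosen.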
11. A computer implemented method for message queue failure recovery, comprising:
  detecting, by a processor, a failure in a message queue or a queue manager for the message queue;
  detecting, by a processor, a current status of each of the message queue and the queue manager;
  examining, by a processor, a maintained active log for the message queue and a message recovery log;
  examining, by a processor, usage of system resources associated with the message queue and the queue manager; and
  executing one of a plurality of failure recovery procedures based on the current status of the message queue and the queue manager, the maintained active log, the message recovery log, and the usage of the system resources,
  wherein responsive to a difference between the maintained active log and the recovery log exceeding a determined quantity of logs, the executed failure recovery procedure shuts down a server that hosts the message queue and the queue manager and signals a request for investigation of the failure.
Dependent claims: 12, 13, 14, 15
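A minimal sketch of the log-difference rule in claim 11 is given below. Only the first branch (a log gap above a determined quantity leads to shutdown and an investigation request) comes from the claim; the resource-usage branch, the default thresholds, and the function and parameter names are assumptions made for illustration.

```python
def choose_recovery(active_log_count: int,
                    recovery_log_count: int,
                    cpu_usage: float,
                    max_log_gap: int = 5,
                    cpu_limit: float = 0.9) -> str:
    """Pick a recovery procedure from log counts and resource usage.

    max_log_gap and cpu_limit are hypothetical thresholds.
    """
    log_gap = abs(active_log_count - recovery_log_count)

    # From claim 11: a gap exceeding the determined quantity of logs
    # triggers shutdown plus a request for investigation.
    if log_gap > max_log_gap:
        return "shutdown_and_investigate"

    # Assumed rule: heavy resource usage suggests restarting the whole server.
    if cpu_usage > cpu_limit:
        return "restart_server"

    # Assumed fallback: the lightest-weight procedure.
    return "restart_queue_manager"
```

Checking the log gap first reflects the claim language, under which that condition determines the executed procedure regardless of the other inputs.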
16. A system for message queue failure recovery, comprising:
a non-transitory computer readable storage medium comprising a recovery management component stored as a set of computer instructions that, when executed by a processor, cause the processor to:
  detect a failure in a message queue or a queue manager for the message queue,
  iteratively detect a current status of each of the message queue and the queue manager,
  iteratively examine a maintained active log for the message queue and a message recovery log,
  examine status and log information for a plurality of other message queues and queue managers for the other message queues,
  iteratively select at least one of a plurality of failure recovery procedures based at least on the iteratively detected current status of the message queue and the queue manager, the maintained active log, and the message recovery log,
  responsive to detecting the failure, execute the currently selected at least one of the plurality of failure recovery procedures based at least on the iteratively detected current status of the message queue and the queue manager, the maintained active log, and the message recovery log, and
  redistribute a plurality of messages previously assigned to the failed message queue or queue manager to the other message queues or the queue managers of the other message queues based on the respective status and log information for the message queues and queue managers for the other message queues.
Dependent claims: 17, 18, 19, 20
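The redistribution step of claim 16 could be sketched, under stated assumptions, as a least-backlog assignment. The claim says only that redistribution is based on the status and log information of the other queues; treating that information as a `healthy` flag and a `backlog` count, and the names `PeerQueue` and `redistribute`, are hypothetical choices for this example.

```python
import heapq
from dataclasses import dataclass


@dataclass
class PeerQueue:
    name: str
    healthy: bool   # assumed summary of the peer's status
    backlog: int    # assumed to be derived from the peer's log information


def redistribute(messages, peers):
    """Assign each message to the healthy peer with the smallest backlog."""
    heap = [(p.backlog, p.name) for p in peers if p.healthy]
    if not heap:
        raise RuntimeError("no healthy peer queue available")
    heapq.heapify(heap)

    assignment = {name: [] for _, name in heap}
    for msg in messages:
        backlog, name = heapq.heappop(heap)
        assignment[name].append(msg)
        # The peer now holds one more message, so its backlog grows by one.
        heapq.heappush(heap, (backlog + 1, name))
    return assignment
```

Unhealthy peers are skipped entirely, and ties between equally loaded peers are broken by name because the heap compares `(backlog, name)` tuples.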
Specification