Method and system for proactively reducing the outage time of a computer system

US 6,978,398 B2
Filed: 08/15/2001
Issued: 12/20/2005
Est. Priority Date: 08/15/2001
Status: Active Grant

First Claim

1. A method of reducing a time for a computer system to recover from a degradation of performance in a hardware or a software in at least one first node of said computer system, comprising:

monitoring a state of said at least one first node;

predicting an outage of said hardware or said software based on monitoring;

based on said monitoring, transferring a state of said at least one first node to a second node prior to said degradation in performance of said hardware or said software of said at least one first node;

proactively invoking a state migration functionality to reduce said recovery time, wherein said proactively invoking includes migrating a dynamic state to stable storage of said second node, said second node being accessible to a recovering agent, to reduce an amount of time required by said recovering agent; and

connecting said at least one first node and said second node to a shared memory containing a stale state of the at least one first node and a redo log, wherein said shared memory includes at least one of a shared storage medium, a shared storage disk and a shared network,

wherein said degradation of performance comprises one of an outage and a failure,

wherein said second node selectively includes an application running corresponding to an application failing on said at least one first node while the at least one first node is still operational,

wherein said state transfer from said at least one first node to said second node occurs while the at least one first node is still operational, and

wherein said predicting comprises providing a failure predictor on at least one of said at least one first node and said second node, for commanding the at least one first node to start an application if not already running while the at least one first node is still operational, and commanding the second node to begin reading a state of said at least one node and redo log from the shared memory while the at least one first node is still operational.
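
The shared-memory arrangement in the last element of the claim can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `SharedStorage`, `StandbyNode`, `catch_up`, and `on_failure_predicted` are hypothetical, the redo log is assumed to be a simple list of key/value updates, and the application start is reduced to a flag on the standby node. The sketch only shows the idea that a standby node which starts reading the stale state and replaying the redo log while the first node is still operational has little left to replay when the outage actually occurs.

```python
from dataclasses import dataclass, field


@dataclass
class SharedStorage:
    """Hypothetical shared storage reachable by both nodes."""
    stale_state: dict = field(default_factory=dict)  # last checkpoint written by the first node
    redo_log: list = field(default_factory=list)     # (key, value) updates made since that checkpoint


@dataclass
class StandbyNode:
    """Hypothetical second node that rebuilds the first node's state."""
    app_running: bool = False
    loaded: bool = False
    applied: int = 0                                  # redo-log entries already replayed
    state: dict = field(default_factory=dict)

    def start_application(self) -> None:
        self.app_running = True

    def catch_up(self, shared: SharedStorage) -> None:
        # Load the stale checkpoint once, then replay only the new log entries.
        if not self.loaded:
            self.state = dict(shared.stale_state)
            self.loaded = True
        for key, value in shared.redo_log[self.applied:]:
            self.state[key] = value
        self.applied = len(shared.redo_log)


def on_failure_predicted(standby: StandbyNode, shared: SharedStorage) -> None:
    """Assumed reaction to a failure prediction, run while the first node is still up."""
    if not standby.app_running:
        standby.start_application()
    standby.catch_up(shared)


# Usage: the first node has checkpointed once and logged two later updates.
shared = SharedStorage(
    stale_state={"balance": 100},
    redo_log=[("balance", 120), ("owner", "alice")],
)
standby = StandbyNode()
on_failure_predicted(standby, shared)

# The first node is still operational and logs one more update before it fails.
shared.redo_log.append(("balance", 90))
standby.catch_up(shared)  # only the single new entry is replayed
```

In this sketch the work done at failure time shrinks to replaying whatever was logged after the last `catch_up` call, which is the recovery-time reduction the claim is aiming at.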

Abstract

A method (and system) of reducing a time for a computer system to recover from a degradation of performance in a hardware or a software in at least one first node of the computer system, includes monitoring a state of the at least one first node, and based on the monitoring, transferring a state of the at least one first node to a second node prior to the degradation in performance of the hardware or the software of the at least one first node.
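
The abstract does not say how the monitoring leads to a transfer decision. As one possible illustration, the sketch below assumes the monitored state is a resource-usage reading sampled over time, fits a least-squares line to it, and triggers migration when the extrapolated time to exhaustion falls within a chosen lead time. The function names, the linear-trend model, and the `limit` and `lead_time` parameters are all assumptions made for this example, not details from the patent.

```python
def predict_time_to_exhaustion(samples, limit):
    """Estimate the time left until usage reaches `limit`.

    `samples` is a list of (time, usage) pairs. Returns None when no
    upward trend can be fitted (too few samples, or flat/falling usage).
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None
    # Ordinary least-squares slope and intercept of usage over time.
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / var_t
    if slope <= 0:
        return None
    intercept = mean_u - slope * mean_t
    time_at_limit = (limit - intercept) / slope
    last_time = samples[-1][0]
    return max(0.0, time_at_limit - last_time)


def should_migrate(samples, limit, lead_time):
    """Assumed policy: migrate once the predicted outage is within `lead_time`."""
    remaining = predict_time_to_exhaustion(samples, limit)
    return remaining is not None and remaining <= lead_time


# Usage: usage grows by 1 unit per second, with a limit of 100 units.
readings = [(0, 10), (10, 20), (20, 30)]
remaining = predict_time_to_exhaustion(readings, limit=100)
migrate_now = should_migrate(readings, limit=100, lead_time=120)
```

A real predictor would need to handle noisy or non-linear trends; the point here is only that the transfer decision is made from monitored data before any degradation occurs.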

Current Assignee
International Business Machines Corporation
Original Assignee
International Business Machines Corporation
Inventors
Harper, Richard Edwin, Hunter, Steven Wade
Primary Examiner(s)
Badermann, Scott
Assistant Examiner(s)
Lohn, Joshua A

Application Number
US09/929,143
Publication Number
US 20030036882A1
Time in Patent Office
1,588 Days
Field of Search
714/13, 714/15, 714/47, 714/38
US Class Current
714/13
CPC Class Codes

G06F 11/1662   the resynchronized componen...

G06F 11/2028   eliminating a faulty proces...

G06F 11/203   using migration

G06F 11/2035   without idle spare hardware

G06F 11/2046   where the redundant compone...

G06F 11/2097   maintaining the standby con...

G06F 11/3466   Performance evaluation by t...

G06F 2201/875   Monitoring of systems inclu...

G21D 3/007   Expert systems
