Multiple node/virtual input/output (I/O) server (VIOS) failure recovery in clustered partition mobility
First Claim
1. In a cluster-aware data processing system having a processor, a memory coupled to the processor, at least one input/output (I/O) adapter that enables connection to an external network with a shared storage repository, and a plurality of virtual I/O servers (VIOSes) that form a VIOS cluster with a shared database, where each VIOS is cluster aware, a method comprising:
activating a first monitoring thread on a first VIOS of one or more VIOSes of a first server to track a status of a live partition mobility (LPM) event;
recording information about the LPM event within the shared database by using the first monitoring thread;
in response to the first VIOS sustaining a failure condition, identifying one or more functioning monitoring threads that continue to function on the first server, wherein the failure condition results in a loss of LPM event monitoring by the first monitoring thread;
determining whether the one or more functioning monitoring threads is a single, last monitoring thread; and
in response to receiving an indication that identifies at least the first VIOS of the one or more VIOSes on the first server as being in failed state, performing, via the last monitoring thread:
a query of a mobility table within the shared database to determine one or more failed VIOSes, including the first VIOS, within the one or more VIOSes of the first server that are in the failed state; and
a removal of one or more corresponding rows/entries that are associated with the one or more failed VIOSes from the mobility table.
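The final two steps of the claim (the query of the mobility table and the removal of rows for failed VIOSes, performed only by the last monitoring thread) can be illustrated with a minimal sketch. It is not the patented implementation: the in-memory SQLite database standing in for the shared database, the table layout, the column names, and the function names are all assumptions made for illustration.

```python
import sqlite3

def make_mobility_table(conn):
    # Hypothetical schema: one row per VIOS that takes part in an LPM event.
    conn.execute(
        "CREATE TABLE mobility ("
        "vios_id TEXT PRIMARY KEY, server TEXT, lpm_event TEXT, state TEXT)"
    )

def is_last_monitoring_thread(functioning_threads):
    # The claim asks whether exactly one functioning monitoring thread remains.
    return len(functioning_threads) == 1

def cleanup_failed_vioses(conn, server, functioning_threads):
    """Query the mobility table for failed VIOSes on `server` and remove their rows.

    Does nothing unless the caller is the single, last monitoring thread.
    Returns the IDs of the VIOSes whose rows were removed.
    """
    if not is_last_monitoring_thread(functioning_threads):
        return []
    rows = conn.execute(
        "SELECT vios_id FROM mobility "
        "WHERE server = ? AND state = 'failed' ORDER BY vios_id",
        (server,),
    ).fetchall()
    failed = [row[0] for row in rows]
    conn.executemany(
        "DELETE FROM mobility WHERE vios_id = ?", [(v,) for v in failed]
    )
    conn.commit()
    return failed

# Example: two of three VIOSes on the (assumed) server "serverA" have failed.
conn = sqlite3.connect(":memory:")
make_mobility_table(conn)
conn.executemany(
    "INSERT INTO mobility VALUES (?, ?, ?, ?)",
    [
        ("vios1", "serverA", "lpm-42", "failed"),
        ("vios2", "serverA", "lpm-42", "failed"),
        ("vios3", "serverA", "lpm-42", "running"),
    ],
)
removed = cleanup_failed_vioses(conn, "serverA", ["vios3"])
remaining = [r[0] for r in conn.execute("SELECT vios_id FROM mobility ORDER BY vios_id")]
```

In this sketch the cleanup is gated on the caller being the only surviving monitoring thread, so two surviving threads would not both attempt to delete the same rows.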
Abstract
A method utilizes cluster-awareness to effectively support a live partition mobility (LPM) event and provide recovery from node failure within a Virtual Input/Output (I/O) Server (VIOS) cluster. An LPM utility creates a monitoring thread on a first VIOS on initiation of a corresponding LPM event. The monitoring thread tracks the status of the LPM event and records status information in the mobility table of a database. The LPM utility creates other monitoring threads on other VIOSes running on the (same) source server. If the first VIOS sustains one of multiple failures, the LPM utility provides notification to other functioning nodes/VIOSes. The LPM utility enables a functioning monitoring thread to update the LPM status. In particular, a last monitoring thread may perform cleanup/update operations within the database based on an indication that there are nodes on the first server that are in a failed state.
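The flow described in the abstract (one monitor per VIOS on the source server, status recorded in a shared mobility table, and a surviving monitor taking over status updates after a failure) might be modeled roughly as follows. This is a sketch under stated assumptions, not the LPM utility itself: the class names `SharedDatabase`, `Monitor`, and `LpmUtility`, the dictionary-based table, and the status strings are all hypothetical, and plain objects stand in for real threads to keep the example deterministic.

```python
from dataclasses import dataclass, field

@dataclass
class SharedDatabase:
    # Hypothetical mobility table: vios_id -> row describing the LPM event.
    mobility: dict = field(default_factory=dict)

@dataclass
class Monitor:
    # Stand-in for a monitoring thread running on one VIOS.
    vios_id: str
    db: SharedDatabase
    alive: bool = True

    def record(self, lpm_event, status):
        if not self.alive:
            raise RuntimeError(f"{self.vios_id} has failed and cannot record status")
        self.db.mobility[self.vios_id] = {
            "event": lpm_event, "status": status, "state": "ok",
        }

class LpmUtility:
    def __init__(self, db, vios_ids):
        self.db = db
        # One monitor per VIOS on the source server.
        self.monitors = {v: Monitor(v, db) for v in vios_ids}

    def start(self, lpm_event):
        for monitor in self.monitors.values():
            monitor.record(lpm_event, "started")

    def functioning(self):
        return [m for m in self.monitors.values() if m.alive]

    def report_failure(self, vios_id):
        # Mark the VIOS as failed and return the monitors that were notified.
        self.monitors[vios_id].alive = False
        self.db.mobility[vios_id]["state"] = "failed"
        return self.functioning()

    def update_status(self, lpm_event, status):
        # Any functioning monitor may update the LPM status; pick the first.
        survivors = self.functioning()
        if not survivors:
            return None
        survivors[0].record(lpm_event, status)
        return survivors[0].vios_id

db = SharedDatabase()
utility = LpmUtility(db, ["vios1", "vios2"])
utility.start("lpm-42")
survivors = utility.report_failure("vios1")
updater = utility.update_status("lpm-42", "in_progress")
```

Here `vios2` becomes the last functioning monitor after `vios1` fails, which is the situation in which the abstract says cleanup of the database may be performed.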
11 Citations
9 Claims
1. Independent claim, recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
Specification