HP Serviceguard Extended Distance Cluster for Linux A.11.20.10 Deployment Guide, December 2012

6 Disaster Scenarios and Their Handling
The previous chapters provided information on deploying Software RAID in your environment. In
this chapter, you will find information on how Software RAID addresses various disaster scenarios.
Each disaster scenario described in this section is presented in the following three categories:
Disaster Scenario
Describes the type of disaster and provides details regarding the cause and the sequence of
failures leading to the disasters in the case of multiple failures.
What happens when this disaster occurs
Describes how the Extended Distance Cluster software handles this disaster.
Recovery Process
After the disaster strikes and the software takes the necessary actions to handle it, you
must ensure that your environment recovers from the disaster. This category describes the
steps that an administrator needs to take to repair the failures and restore the cluster to its
original state. All the commands listed under Recovery Process must be entered on a single line.
The following table lists all the disaster scenarios that are handled by the Extended Distance Cluster
software. All the scenarios assume that the setup is the same as the one described in “Extended
Distance Clusters” (page 11) of this document.
Table 4 Disaster Scenarios and Their Handling

Disaster Scenario:
A package (P1) is running on a node (Node 1). Node 1 experiences a failure.

What happens when this disaster occurs:
The package (P1) fails over to another node (Node 2). This node (Node 2) is configured
to take over the package when it fails on Node 1.

Recovery Process:
Because the network and both mirrored disk sets are accessible on Node 2, and were
also accessible when Node 1 failed, you only need to restore Node 1. After Node 1 is
repaired, enable the package to run on it by running the following command:
# cmmodpkg -e P1 -n N1

Disaster Scenario:
A package (P1) is running on a node (Node 1). The package uses a mirror (md0) that
consists of two storage components: S1 (local to Node 1, /dev/hpdev/mylink-sde)
and S2 (local to Node 2). Access to S1 is lost from both nodes, either because of a
power failure to S1 or a loss of the FC links to S1.

What happens when this disaster occurs:
The package (P1) continues to run on Node 1 with the mirror that consists of only S2.

Recovery Process:
Once you restore power to S1, or restore the FC links to S1, the corresponding mirror
half of S1 (/dev/hpdev/mylink-sde) is accessible from Node 1. To make the
restored mirror half part of the MD array, complete the following procedure:
1. Run the following command to remove the failed mirror half from the array:
# mdadm --remove /dev/md0 /dev/hpdev/mylink-sde
2. Run the following command to add the mirror half back to the array:
# mdadm --add /dev/md0 /dev/hpdev/mylink-sde
NOTE: This procedure does not work on SUSE Linux Enterprise Server. For more
information, see “Troubleshooting serviceguard-xdc packages”.
The re-mirroring process is initiated. When it is complete, the extended distance cluster
software detects the added mirror half and accepts S1 as part of md0.
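After the mirror half is re-added, the kernel MD driver reports re-mirroring progress in /proc/mdstat. The following is a minimal sketch of how an administrator might extract that progress figure with standard shell tools. The /proc/mdstat snapshot embedded below is a hypothetical example of what the file may contain during recovery; on a live system you would read the real file (for example, cat /proc/mdstat) instead.

```shell
#!/bin/sh
# Hypothetical snapshot of /proc/mdstat while md0 is re-mirroring.
# On a live node, replace this with: MDSTAT=$(cat /proc/mdstat)
MDSTAT='md0 : active raid1 sde[2] sdf[0]
      976630336 blocks [2/1] [U_]
      [=====>...............]  recovery = 28.7% (280354112/976630336) finish=58.2min speed=199232K/sec'

# Pull out the "recovery = NN.N%" figure, if any line contains one.
progress=$(printf '%s\n' "$MDSTAT" | sed -n 's/.*recovery = \([0-9.]*\)%.*/\1/p')

if [ -n "$progress" ]; then
    echo "re-mirroring in progress: ${progress}% complete"
else
    echo "no re-mirroring in progress"
fi
```

When the recovery line disappears from /proc/mdstat and the device status shows both members up (for example, [UU]), the re-mirroring is complete and S1 is again an active half of md0.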