HP Serviceguard Extended Distance Cluster for Linux A.11.20.20 Deployment Guide, August 2013
Table 4 Disaster Scenarios and Their Handling (continued)
Disaster Scenario | What Happens When This Disaster Occurs | Recovery Process
Disaster Scenario:
A package (P1) is running on a node (Node 1). The package uses a mirror
(md0) that consists of two storage components: S1 (local to Node 1,
/dev/hpdev/mylink-sde) and S2 (local to Node 2).
Data center 1 that consists of Node 1 and P1 experiences a failure.
NOTE: In this example, failures in a data center are instantaneous. For
example, a power failure.

What Happens When This Disaster Occurs:
The package (P1) fails over to Node 2 and starts running with the mirror
of md0 that consists of only the storage local to Node 2 (S2).

Recovery Process:
Complete the following procedure to initiate a recovery:
1. Restore data center 1, Node 1, and storage 1 (S1). Once Node 1 is
   restored, it rejoins the cluster. Once S1 is restored, it becomes
   accessible from Node 2.
   NOTE: Manual intervention is required to add back a disk to the MD
   device on SUSE Linux Enterprise Server. For more information, see
   Troubleshooting serviceguard-xdc packages.
2. Enable P1 to run on Node 1 by running the following command:
   # cmmodpkg -e P1 -n N1

Disaster Scenario:
This is a multiple failure scenario where the failures occur in a
particular sequence in the configuration that corresponds to Figure 2,
where Ethernet and FC links do not go over DWDM.
The package (P1) is running on a node (N1). P1 uses a mirror md0
consisting of S1 (local to node N1, say /dev/hpdev/mylink-sde) and S2
(local to node N2).
The first failure occurs with all FC links between the two data centers
failing, causing N1 to lose access to S2 and N2 to lose access to S1.
After recovery for the first failure has been initiated, the second
failure occurs when re-mirroring is in progress and N1 goes down.

What Happens When This Disaster Occurs:
The package (P1) continues to run on N1 after the first failure, with
md0 consisting of only S1.
After the second failure, the package (P1) fails over to N2 and starts
with S1. Since S2 is also accessible, the extended distance cluster adds
S2 and starts re-mirroring of S2.

Recovery Process:
For the first failure scenario, complete the following procedure to
initiate a recovery:
1. Restore the links in both directions between the data centers. As a
   result, S2 (/dev/hpdev/mylink-sdf) is accessible from N1 and S1 is
   accessible from N2.
   NOTE: Manual intervention is required to add back a disk to the MD
   device on SUSE Linux Enterprise Server. For more information, see
   Troubleshooting serviceguard-xdc packages.
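
The recovery for the single data center failure (P1 failing over from Node 1 to Node 2) can be sketched as a dry-run shell script. Only the cmmodpkg command comes from the table. The cmviewcl check, the mdadm --add step for the SUSE Linux Enterprise Server case, and the /proc/mdstat check are assumptions added for illustration, as are the helper names run and recover_p1 and the DRY_RUN variable. The section Troubleshooting serviceguard-xdc packages remains the authority for the manual disk re-add.

```shell
#!/bin/sh
# Hypothetical dry-run sketch, not an HP-supplied script.
# DRY_RUN=1 (default) only prints each command; set DRY_RUN=0 on a real node.
DRY_RUN="${DRY_RUN:-1}"

# run: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# recover_p1 PKG NODE MD_DEVICE DISK
# Intended to run after the data center, node, and storage are restored.
recover_p1() {
    pkg="$1"; node="$2"; md="$3"; disk="$4"

    # Assumption: confirm the restored node has rejoined the cluster.
    run cmviewcl -v

    # Assumption: on SUSE Linux Enterprise Server, re-add the restored
    # mirror half by hand on the node where the package is running.
    run mdadm "$md" --add "$disk"

    # Assumption: watch re-mirroring progress.
    run cat /proc/mdstat

    # From the table: enable the package to run on the restored node.
    run cmmodpkg -e "$pkg" -n "$node"
}

recover_p1 P1 N1 /dev/md0 /dev/hpdev/mylink-sde
```

With the default DRY_RUN=1 the script prints the four commands in order without touching the cluster, which makes it safe to review the sequence before running anything for real.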