HP Serviceguard Extended Distance Cluster for Linux A.12.00.00 Deployment Guide, March 2014
Table 4 Disaster Scenarios and Their Handling (continued)
Disaster Scenario | What Happens When This Disaster Occurs | Recovery Process
Disaster Scenario:
In this case, the package (P1) runs with RPO_TARGET set to 60 seconds.
Package P1 is running on node N1. P1 uses a mirror md0 consisting of S1
(local to node N1, for example /dev/hpdev/mylink-sde) and S2 (local to
node N2). The first failure occurs when all FC links between the two data
centers fail, causing N1 to lose access to S2 and N2 to lose access to S1.
After the package resumes activity and runs for 90 seconds, a second
failure occurs, causing node N1 to fail.
Recovery Process:
In this scenario, no attempt is made to repair the first failure until the
second failure occurs. Complete the following procedure to initiate a
recovery:
1. To recover from the first failure, restore the FC links between the
data centers. As a result, S1 is accessible from N2.
2. After the FC links are restored and S1 is accessible from N2, run the
following command to restart the package on N2:
# cmrunpkg <package_name>
When the package starts up on N2, it automatically adds S1 back into the
array and starts re-mirroring from S1 to S2. When re-mirroring is complete,
the extended distance cluster detects and accepts S1 as part of md0 again.
For the second failure, restore N1. Once it is restored, it joins the
cluster and can access S1 and S2.
1. Run the following command to enable P1 to run on N1:
# cmmodpkg -e P1 -n N1
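The order of the recovery commands above can be sketched as a small dry-run script. This is only an illustration, not part of the product: the run helper and the DRY_RUN, PKG, and NODE variables are hypothetical names introduced here, and reading /proc/mdstat is a generic Linux MD status check that this guide does not itself prescribe. By default the script only prints what it would execute.

```shell
#!/bin/sh
# Dry-run sketch of the recovery order described above.
# DRY_RUN, PKG, NODE and run() are hypothetical names used for illustration.
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = execute them
PKG=P1                  # package name from the scenario
NODE=N1                 # node that failed and has since rejoined the cluster

run() {
    if [ "$DRY_RUN" -eq 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Step 2 of the first recovery: with the FC links restored and S1 visible
# from N2, restart the package (to be issued on N2).
run cmrunpkg "$PKG"

# Assumption: /proc/mdstat (standard Linux MD status) is one way to watch
# the re-mirroring from S1 to S2 progress.
run cat /proc/mdstat

# Second recovery: once N1 is back in the cluster, allow P1 to run on N1.
run cmmodpkg -e "$PKG" -n "$NODE"
```

Setting DRY_RUN=0 would execute the commands for real, which should only be done on a cluster node after the preconditions in each step have been confirmed.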
What Happens When This Disaster Occurs:
After the first failure, the package (P1) continues to run on N1 with md0
consisting of only S1.
After the second failure, the package does not start up on N2 because,
when it tries to start with only S2 on N2, it detects that S2 has been
non-current for a period longer than the value of RPO_TARGET.
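The comparison behind this refusal can be illustrated with the numbers from the scenario. The following is a minimal sketch, not the actual serviceguard-xdc logic: rpo_start_allowed is a hypothetical helper, and treating a non-current period exactly equal to RPO_TARGET as acceptable is an assumption, since the text only states that a period greater than RPO_TARGET blocks the start.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# Usage: rpo_start_allowed <seconds_non_current> <rpo_target_seconds>
# Prints "refused" when the mirror half has been non-current for longer
# than RPO_TARGET, otherwise "allowed" (equality is assumed to be allowed).
rpo_start_allowed() {
    non_current=$1
    rpo_target=$2
    if [ "$non_current" -gt "$rpo_target" ]; then
        echo "refused"
    else
        echo "allowed"
    fi
}

# Scenario values: P1 ran on S1 alone for 90 seconds, so S2 is 90 seconds
# behind; RPO_TARGET is 60 seconds.
rpo_start_allowed 90 60   # prints: refused
```

Had node N1 failed within 60 seconds of the package resuming, the same comparison would print "allowed" under this simplified model.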
Recovery Process:
Complete the following procedure to initiate a recovery:
1. Reconnect the FC links between the data centers. As a result, S1
(/dev/hpdev/mylink-sde) becomes accessible from N2.
NOTE: On SUSE Linux Enterprise Server, manual intervention is required to
add a disk back to the MD device. For more information, see Troubleshooting
serviceguard-xdc packages.
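As a rough illustration of what the manual step in the NOTE might involve, the sketch below builds a generic mdadm command for re-adding a mirror half. Everything here is an assumption: readd_cmd is a hypothetical helper, /dev/md0 is an assumed device path for the mirror md0, and the supported procedure is the one in the Troubleshooting section, which may differ. The helper only prints the command and does not execute it.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) a generic mdadm command that
# would add a mirror half back to an MD device.
# Usage: readd_cmd <md_device> <member_device>
readd_cmd() {
    md_dev=$1
    member=$2
    echo "mdadm --manage $md_dev --add $member"
}

# Assumed paths: /dev/md0 for the mirror md0, and the S1 link from the scenario.
readd_cmd /dev/md0 /dev/hpdev/mylink-sde
```

Printing the command first gives an administrator the chance to compare it with the documented troubleshooting steps before running anything against a live mirror.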
What Happens When This Disaster Occurs:
If the FC links are not restored on N2, you can only start the package
forcefully. Start a package forcefully only if you have determined that
the associated data loss is acceptable.
After you execute the force start commands, package P1 starts on N2 and
runs with md0 consisting of only S2 (/dev/hpdev/mylink-sdf).
Disaster Scenario:
This scenario is an extension of the previous failure scenario. In the
previous scenario, when the package fails over to N2, it does not start
because the value of RPO_TARGET has been exceeded.
To forcefully start the package P1 on N2 when the FC links are not
restored on N2, check the package log file on N2 and execute the commands
that appear in it.