Cascading Failover in a Continentalclusters, December 2005

Failover from Primary Cluster to Recovery Cluster
After reception of the Continentalclusters alert and alarm, the administrators at the recovery site follow
the prescribed processes and recovery procedures to start the protected applications on the recovery
cluster. Note that data corruption may occur if a disaster strikes the primary cluster
while the data refresh from the secondary disk array to the recovery disk array is in progress. Under
these circumstances, the data in the recovery replication group devices in the recovery disk array is
not usable. The data can be recovered by restoring an older copy of the data from the local mirror
devices (device C’) in the recovery disk array, as shown in Figure 7.
Figure 7 – Failover from Primary Site to Recovery Cluster
Perform the following steps to restore the data:
1. If the data was being refreshed from the primary site to the recovery site, restore the data
from the local mirror to the recovery replication group devices in the recovery disk array.
2. If step 1 applies, check the progress of the data restore.
3. Once the restore has completed, split the local mirror device (device C’) from the recovery
replication group. The data in the recovery disk array may not be current but should be
consistent. No additional procedure is needed; Metrocluster is programmed to handle
this case.
4. After the application is up and running, re-establish the local mirror devices as mirrors of the
standard devices for an additional copy of the data.
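The decision logic in the steps above can be sketched as a small planner. This is an illustrative sketch only: the function name, the `refresh_in_progress` flag, and the step wording are assumptions introduced here, and the code issues no disk array or Serviceguard commands. It also assumes that step 4 applies whether or not a restore was needed, which the procedure does not state explicitly.

```python
from typing import List


def plan_recovery_steps(refresh_in_progress: bool) -> List[str]:
    """Return the ordered operator steps for starting on the recovery cluster.

    refresh_in_progress is a hypothetical flag: True if the disaster struck
    while data was being refreshed to the recovery disk array. The operator
    must determine this from the array state.
    """
    steps: List[str] = []
    if refresh_in_progress:
        # Steps 1-3: the replication group devices are not usable, so roll
        # back to the last consistent copy held on the local mirror (C').
        steps.append("restore recovery replication group devices from local mirror C'")
        steps.append("check restore progress until complete")
        steps.append("split local mirror C' from recovery replication group")
    # Application startup follows the site's own recovery procedures.
    steps.append("start protected applications on recovery cluster")
    # Step 4 (assumed to apply in both cases): keep an additional data copy.
    steps.append("re-establish local mirror C' as mirror of standard devices")
    return steps


if __name__ == "__main__":
    for number, step in enumerate(plan_recovery_steps(True), start=1):
        print(f"{number}. {step}")
```

Calling `plan_recovery_steps(False)` returns only the application start and mirror re-establish entries, reflecting that the restore from the local mirror is needed only when the refresh was interrupted.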
Failback from the Recovery Cluster to the Secondary Site within the Primary Cluster
This procedure is used when the application fails back and runs on the secondary site while the
primary site is still down.
1. Halt the Continentalclusters monitor package.
2. Halt the Continentalclusters recovery packages at the recovery site.
3. Split the local mirror device (device C’) from the recovery replication group in the recovery
disk array.