Cascading Failover in a Continentalclusters, December 2005

Failover from Primary Cluster to Recovery Cluster

After receiving the Continentalclusters alert and alarm, the administrators at the recovery site follow the prescribed processes and recovery procedures to start the protected applications on the recovery cluster. Note that data corruption may occur if a disaster strikes the primary cluster while the data refresh from the secondary disk array to the recovery disk array is in progress. Under these circumstances, the data in the recovery replication group devices in the recovery disk array is not usable. The data can be recovered by restoring an older copy from the local mirror devices (device C’) in the recovery disk array, as shown in Figure 7.

Figure 7 – Failover from Primary Site to Recovery Cluster

Execute the following commands to restore the data:

1. If the data was being refreshed from the primary site to the recovery site, restore the data from the local mirror to the recovery replication group devices in the recovery disk array.

2. If step 1 applies, check the progress of the data restore.

3. Once the restore has completed, split the local mirror device (device C’) from the recovery replication group. The data in the recovery disk array may not be current but should be consistent. No additional procedure is needed; Metrocluster is programmed to handle this case.

4. After the application is up and running, re-establish the local mirror devices as mirrors of the standard devices to keep an additional copy of the data.
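
The decision logic behind these steps can be sketched in a few lines of Python. This is only an illustrative model, not part of Continentalclusters or Metrocluster: the names Step and plan_recovery are hypothetical, no disk array commands are issued, and the sketch assumes that steps 2 and 3 apply only when step 1 does (the source states this explicitly only for step 2). The START_APPLICATION entry stands in for the normal recovery procedure that brings the application up before step 4.

```python
from enum import Enum, auto
from typing import List


class Step(Enum):
    """Hypothetical labels for the recovery steps described above."""
    RESTORE_FROM_LOCAL_MIRROR = auto()   # step 1
    CHECK_RESTORE_PROGRESS = auto()      # step 2
    SPLIT_LOCAL_MIRROR = auto()          # step 3
    START_APPLICATION = auto()           # normal recovery procedure
    REESTABLISH_LOCAL_MIRROR = auto()    # step 4


def plan_recovery(refresh_in_progress: bool) -> List[Step]:
    """Return the ordered steps for a failover to the recovery cluster.

    refresh_in_progress: True if the data refresh to the recovery disk
    array was running when the disaster occurred, which leaves the
    recovery replication group devices unusable.
    """
    steps: List[Step] = []
    if refresh_in_progress:
        # Assumption: the restore, progress check, and split are all
        # needed only when the interrupted refresh corrupted the data.
        steps.append(Step.RESTORE_FROM_LOCAL_MIRROR)
        steps.append(Step.CHECK_RESTORE_PROGRESS)
        steps.append(Step.SPLIT_LOCAL_MIRROR)
    steps.append(Step.START_APPLICATION)
    # The mirror is re-established only after the application is running.
    steps.append(Step.REESTABLISH_LOCAL_MIRROR)
    return steps


if __name__ == "__main__":
    for step in plan_recovery(refresh_in_progress=True):
        print(step.name)
```

Returning the plan as an ordered list keeps the ordering constraint visible: the local mirror is never re-established before the application is up, so the older consistent copy remains available until it is no longer needed.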

Failback from the Recovery Cluster to the Secondary Site within the Primary Cluster

This procedure is used when the application fails back and runs on the secondary site while the primary site is still down.

1. Halt the Continentalclusters monitor package.

2. Halt the Continentalclusters recovery packages at the recovery site.

3. Split the local mirror device (device C’) from the recovery replication group in the recovery disk array.