Disaster recovery rehearsal in Continentalclusters
Technical white paper

Table of contents
Executive summary
Audience
Terminology
Executive summary

This white paper supplements the information about disaster recovery (DR) rehearsal provided in the Designing Disaster Recovery HA Clusters Using Metrocluster and Continentalclusters manual. It provides examples of how to set up and run DR rehearsal in Continentalclusters using Continuous Access P9000 or XP data replication, Continentalclusters using EMC SRDF data replication, and Continentalclusters in the Three Data Center Disaster Recovery Solution (3DC DR Solution).
Recovery cluster: A cluster on which recovery of a package takes place following a failure on the primary cluster.
Recovery package: In the event of primary cluster failure, the recovery package takes over at the recovery cluster.
Rehearsal package: In the event of a rehearsal, the rehearsal package is started at the recovery cluster.
R1: The Symmetrix term indicating the data copy that is the primary copy.
R2: The Symmetrix term indicating the remote data copy that is the secondary copy.
Introduction to DR rehearsal

For a successful recovery in a Continentalclusters environment, it is critical that the configurations on all the systems at both the primary and recovery clusters are kept synchronized. After the initial Continentalclusters setup, the configurations are usually subject to change. It is the operator’s responsibility to ensure that any changes made on the primary cluster nodes are also applied to the recovery cluster nodes.
Figure 1: DR rehearsal in Continentalclusters

The cmdrprev command previews the data replication storage failover in Metrocluster with Continuous Access EVA, Metrocluster with Continuous Access for P9000 and XP, and Metrocluster with EMC SRDF. It can be run on either the primary cluster nodes or the recovery cluster nodes at any time to identify errors in the data replication environment. The command does not require any preparation and is non-intrusive (i.e.
Sample environment to illustrate DR rehearsal

The DR rehearsal feature in Continentalclusters can be used with any Single Instance or multi-instance application except when configured using SADTA. As an example, this paper considers the setup procedures for rehearsing Single Instance, Oracle 9i RAC, and Oracle 10g RAC applications in Continentalclusters using Continuous Access P9000 or XP data replication and Continentalclusters using EMC SRDF data replication.
5.  payroll_recgp    payroll_primpkg, payroll_recpkg    payroll_dg    payroll_vg (LVM VG)
6.  mkt_recgp        mkt_primpkg, mkt_recpkg            mkt_dg        mkt_vg (LVM VG)

In this white paper, the recovery groups sales_rac9i_recgp_1 and sales_rac9i_recgp_2 are assumed to be configured in Continentalclusters using EMC SRDF data replication. The recovery groups inv_rac10g_recgp, billing_recgp, payroll_recgp, and mkt_recgp are assumed to be configured in Continentalclusters using Continuous Access P9000 or XP data replication.
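A rehearsal script has to choose between the SRDF commands and the Continuous Access commands for each recovery group. The following is a minimal sketch of that lookup for the sample environment only. The function name replication_type is hypothetical and is not part of Continentalclusters.

```shell
# Sketch: map each sample recovery group to the replication type that this
# paper assumes for it. The function name is hypothetical.
replication_type() {
    case "$1" in
        sales_rac9i_recgp_1|sales_rac9i_recgp_2)
            echo "EMC SRDF" ;;
        inv_rac10g_recgp|billing_recgp|payroll_recgp|mkt_recgp)
            echo "Continuous Access P9000 or XP" ;;
        *)
            # Any other name is not part of the sample environment.
            echo "unknown recovery group: $1" >&2
            return 1 ;;
    esac
}
```

For example, replication_type billing_recgp prints the Continuous Access replication type, and an unrecognized group name returns a non-zero status so that a calling script can stop before touching the replication environment.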
Package configuration

This section describes the package-related configuration procedures that you need to follow:
1. Disable domino mode (required only for Continentalclusters using EMC SRDF data replication).
2. Set up the file system for the Continentalclusters state directory.
3. Configure the monitor package to mount the file system from the shared disk.
4. Configure the rehearsal package.
5.
4. Configure the volume group as cluster aware. Deactivate the volume group on ATLnode1:
#vgchange -a n /dev/vgcc
#vgchange -c y /dev/vgcc
5. Import the volume group on all other nodes in the Atlanta cluster (i.e., ATLnode2). Use the vgexport command with the -p option to export the volume group vgcc:
#vgexport -s -p -m mapfile /dev/vgcc
Copy the map file to the remaining nodes in the Atlanta cluster (i.e., ATLnode2).
6.
6. Use the following procedure to import volume groups on the remaining nodes of the Houston cluster (i.e., HOUnode2):
#mkdir /dev/vgcc
#mknod /dev/vgcc/group c 64 0xnn0000
7.
Rehearsal package configuration

Rehearsal of a recovery group starts the rehearsal package, which is configured with the recovery package’s volume group/filesystem directory. The rehearsal package is a non-Metrocluster type package (i.e., the Metrocluster environment file is not present in the package directory) with its own package directory containing a separate copy of the configuration and control files.
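Because the rehearsal package keeps its own copy of the configuration file, the copy can be derived from the recovery package configuration with a small filter. The following is a sketch under stated assumptions: the configuration is a legacy-style file in which PACKAGE_NAME, RUN_SCRIPT, and HALT_SCRIPT each start a line, and the helper name make_rehearsal_conf is hypothetical. SERVICE_NAME and any other site-specific values still have to be edited by hand.

```shell
# Hypothetical helper: write a rehearsal package configuration to stdout,
# based on a copy of the recovery package configuration (legacy style).
# ASSUMPTION: each parameter name starts at the beginning of its line.
make_rehearsal_conf() {
    src_conf=$1   # recovery package configuration file
    rh_pkg=$2     # rehearsal package name
    rh_dir=$3     # rehearsal package directory
    # The [[:space:]] after each name keeps RUN_SCRIPT_TIMEOUT and
    # HALT_SCRIPT_TIMEOUT lines untouched.
    sed -e "s|^PACKAGE_NAME[[:space:]].*|PACKAGE_NAME ${rh_pkg}|" \
        -e "s|^RUN_SCRIPT[[:space:]].*|RUN_SCRIPT ${rh_dir}/${rh_pkg}.cntl|" \
        -e "s|^HALT_SCRIPT[[:space:]].*|HALT_SCRIPT ${rh_dir}/${rh_pkg}.cntl|" \
        "$src_conf"
}
```

A possible use is to redirect the output of make_rehearsal_conf into the rehearsal package directory, review the result, and then validate it with cmcheckconf before applying it.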
Table 2 identifies the rehearsal packages to be used for the recovery groups inv_rac10g_recgp, sales_rac9i_recgp_1, sales_rac9i_recgp_2, and billing_recgp. This section describes the procedures to configure the rehearsal packages on the recovery Houston cluster.

Table 2. Recovery group configured with rehearsal package
No.  Recovery groups     Primary package, recovery package, rehearsal package
1.   inv_rac10g_recgp    inv_rac10g_primpkg, Sl.
c. service_name
3. For all other parameters, provide the same values as specified in the recovery package configuration.
4. Validate the package configuration.
# cmcheckconf -P
5. Apply the package configuration.
# cmapplyconf -P

Configure the rehearsal package billing_rhpkg (for recovery group billing_recgp) on the recovery Houston cluster.
Configure the rehearsal package inv_rac10g_rhpkg (for recovery group inv_rac10g_recgp) on the recovery Houston cluster.

To configure inv_rac10g_rhpkg in modular style, follow the steps in the section Configuring Continentalclusters rehearsal packages as modular packages. From HOUnode1 (or from any of the nodes in the Houston cluster), complete the following steps to configure inv_rac10g_rhpkg in legacy style:
1. Create the rehearsal package directory.
#mkdir /etc/cmcluster/inv_rac10g_rhpkg/
2.
3. Copy the recovery package configuration file and the control file to the rehearsal package directory and rename them for the rehearsal package.
#cp /tmp/sales_rac9i_recpkg_1.conf /etc/cmcluster/sales_rac9i_rhpkg_1/sales_rac9i_rhpkg_1.conf
#cp sales_rac9i_recpkg_1.cntl /etc/cmcluster/sales_rac9i_rhpkg_1/sales_rac9i_rhpkg_1.cntl
Note: If the ECMT Oracle toolkit is used for creating the recovery package, copy the files related to the Oracle toolkit from the recovery package directory to the rehearsal package directory.
4.
4. Edit the package configuration file saved in the rehearsal package directory and change PACKAGE_NAME, RUN_SCRIPT, and HALT_SCRIPT as follows:
PACKAGE_NAME  "sales_rac9i_rhpkg_2"
RUN_SCRIPT    /etc/cmcluster/sales_rac9i_rhpkg_2/sales_rac9i_rhpkg_2.cntl
HALT_SCRIPT   /etc/cmcluster/sales_rac9i_rhpkg_2/sales_rac9i_rhpkg_2.cntl
Also modify the SERVICE_NAME parameter.
5.
On ATLnode1, edit the file /etc/sales_rac9i_primpkg_1_srdf.env and set the variable AUTOSPLITR1 to 1, as shown below. The name of the package control file for package sales_rac9i_primpkg_1 is assumed to be sales_rac9i_primpkg_1.cntl.
Continentalclusters configuration

The Continentalclusters parameter CONTINENTAL_CLUSTER_STATE_DIR is the absolute path to the Continentalclusters state directory, where the state-related information required for the maintenance mode feature is stored. This filesystem is mounted by ccmonpkg at package startup and unmounted at package halt.
In this section, the Continentalclusters configuration is updated with 1) rehearsal package names used for rehearsing the recovery groups and 2) Continentalclusters state directory name. Complete the following procedure to update the configuration with the rehearsal packages and the Continentalclusters shared directory name. 1. In the Cluster section of the /etc/cmcluster/cmconcl.
Recovery group: billing_recgp
  Primary package   : Atlanta/billing_primpkg
  Recovery package  : Houston/billing_recpkg
  Rehearsal package : Houston/billing_rhpkg
Recovery group: mkt_recgp
  Primary package   : Atlanta/mkt_primpkg
  Recovery package  : Houston/mkt_recpkg
Recovery group: payroll_recgp
  Primary package   : Atlanta/payroll_primpkg
  Recovery package  : Houston/payroll_recpkg

3. Apply the Continentalclusters configuration information using the cmapplyconcl command.
$cmapplyconcl -C cmconcl.
Rehearsing recovery in Continentalclusters

This section describes how to perform a Continentalclusters recovery rehearsal for Single Instance, Oracle 10g RAC, and Oracle 9i RAC applications. Follow these steps to start and stop rehearsal for a recovery group:
1. Verify the data replication environment.
2. Move the recovery group into maintenance mode.
3. Prepare the replication environment for DR rehearsal.
4. Start rehearsal for the recovery group.
5.
PACKAGE RECOVERY GROUP inv_rac10g_recgp Maintenance Mode no
PACKAGE                          ROLE       STATUS
Atlanta/inv_rac10g_primpkg       primary    up
Houston/inv_rac10g_recpkg        recovery   down
Houston/inv_rac10g_rhpkg         rehearsal  down
PACKAGE RECOVERY GROUP sales_rac9i_recgp_1 Maintenance Mode no
PACKAGE                          ROLE       STATUS
Atlanta/sales_rac9i_primpkg_1    primary    up
Houston/sales_rac9i_recpkg_1     recovery   down
Houston/sales_rac9i_rhpkg_1      rehearsal  down
PACKAGE RECOVERY GROUP sales_rac9i_recgp_2 Maintenance Mode no
PACKAGE                          ROLE       STATUS
Atlanta/sales_rac9i_
Split the BC pair at the recovery cluster.
$) export HORCC_MRCF=1
$) pairsplit -g billing_dg
$) unset HORCC_MRCF
5. Start rehearsal.
$) cmrecovercl -r -g billing_recgp
Note: Using the cmviewcl command, verify that the rehearsal package billing_rhpkg was started.
Note: Before starting rehearsal, make any application configuration changes required by the change in the client access IP address, which is now the rehearsal package IP address.
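The BC split and rehearsal start above can be wrapped so that HORCC_MRCF is set only for the BC command and cannot leak into later commands. The following is a minimal sketch, not a supported tool. The names run, start_rehearsal, and DRY_RUN are hypothetical, and the sketch assumes that cmrecovercl needs no further input from the operator.

```shell
# DRY_RUN=1 (the default here) only prints each command. Set DRY_RUN=0 on a
# recovery cluster node to execute the commands instead.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical wrapper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Split the BC pair for a device group, then start rehearsal for the
# recovery group. Stops if the split fails.
start_rehearsal() {
    device_group=$1
    recovery_group=$2
    # env sets HORCC_MRCF=1 for this one pairsplit invocation only, which
    # replaces the export/unset pair used in the manual steps.
    run env HORCC_MRCF=1 pairsplit -g "$device_group" || return 1
    run cmrecovercl -r -g "$recovery_group"
}
```

With the default dry-run setting, start_rehearsal billing_dg billing_recgp prints the two commands for review. The cmviewcl check and the application reconfiguration described in the notes remain manual steps.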
3. Using the cmviewconcl command, verify that the recovery group inv_rac10g_recgp is in maintenance mode.
4. Prepare the replication environment for DR rehearsal. Manually suspend the replication and enable write access to the secondary mirror copy.
$) pairsplit -g inv_rac10g_dg -rw
For each volume group configured on the secondary mirror copy, change the cluster ID by running the following command from any of the recovery cluster nodes.
7. Stop the rehearsal package.
$) cmhaltpkg inv_rac10g_rhpkg
8. Restore the replication environment for recovery. First synchronize the secondary mirror copy with the primary mirror copy and then synchronize the BC with the secondary mirror copy:
$) pairresync -g inv_rac10g_dg
$) export HORCC_MRCF=1
$) pairresync -g inv_rac10g_dg
$) unset HORCC_MRCF
9. Move the recovery group inv_rac10g_recgp out of maintenance mode.
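The order of the restore steps matters, so it can help to print the sequence for review before running anything. The following sketch only prints commands. The function name restore_plan is hypothetical, and the last command assumes that cmrecovercl -e -g, which this paper uses for the SRDF recovery groups, also applies here. The sketch does not wait for the first resynchronization to finish, so the operator still has to confirm the pair state between steps.

```shell
# Hypothetical helper: print, in order, the commands that restore the
# replication environment after a rehearsal. Nothing is executed.
restore_plan() {
    device_group=$1
    recovery_group=$2
    cat <<EOF
pairresync -g $device_group
env HORCC_MRCF=1 pairresync -g $device_group
cmrecovercl -e -g $recovery_group
EOF
}
```

For example, restore_plan inv_rac10g_dg inv_rac10g_recgp prints the secondary mirror resynchronization first, then the BC resynchronization, then the command that moves the group out of maintenance mode.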
#/usr/sbin/cmdrprev -e "/etc/sales_rac9i_recpkg_2_srdf.env"
2. Move the recovery groups sales_rac9i_recgp_1 and sales_rac9i_recgp_2 into maintenance mode.
#cmrecovercl -d -g sales_rac9i_recgp_1
#cmrecovercl -d -g sales_rac9i_recgp_2
Note: Using the cmviewconcl command, verify that the recovery groups sales_rac9i_recgp_1 and sales_rac9i_recgp_2 are in maintenance mode.
3. Prepare the replication environment for DR rehearsal.
7. Move the recovery groups sales_rac9i_recgp_1 and sales_rac9i_recgp_2 out of maintenance mode.
#cmrecovercl -e -g sales_rac9i_recgp_1
#cmrecovercl -e -g sales_rac9i_recgp_2
Note: Using the cmviewconcl command, verify that the recovery groups sales_rac9i_recgp_1 and sales_rac9i_recgp_2 are not in maintenance mode.
Note: Restore the listener configuration on nodes HOUnode1 and HOUnode2 to use the original IP addresses, i.e., those of sales_rac9i_recpkg_1 and sales_rac9i_recpkg_2 respectively.
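The maintenance mode check in the note above can be scripted by filtering the cmviewconcl output. The following sketch rests on an assumption about the output layout: the recovery group name follows the words "PACKAGE RECOVERY GROUP" and the yes/no flag follows the words "Maintenance Mode", on the same line or on a later line. The function name maintenance_modes is hypothetical. Check the filter against real output before relying on it.

```shell
# Reads cmviewconcl-style text on stdin and prints one line per recovery
# group in the form "<recovery group> <yes|no>".
# ASSUMPTION: the group name is the field after "GROUP" and the flag is the
# field after "Mode", as described in the lead-in.
maintenance_modes() {
    awk '
        /PACKAGE RECOVERY GROUP/ {
            # Remember the most recent recovery group name.
            for (i = 1; i <= NF; i++)
                if ($i == "GROUP") group = $(i + 1)
        }
        /Maintenance Mode/ {
            # Report the flag for the remembered group.
            for (i = 1; i <= NF; i++)
                if ($i == "Mode") print group, $(i + 1)
        }
    '
}
```

On a cluster node, the filter would be used as cmviewconcl | maintenance_modes, and every group that was moved out of maintenance mode should be reported with the value no.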
- Successfully started package inv_rac10g_recpkg on node HOUnode1
Thu Aug 31 05:40:23 2006 - Running package inv_rac10g_recpkg on node HOUnode2
Thu Aug 31 05:41:03 2006 - Successfully started package inv_rac10g_recpkg on node HOUnode2
Successfully started package inv_rac10g_recpkg
cmrecovercl: Attempting to recover Recovery Group from cluster
Enabling recovery package billing_recpkg on recovery cluster
Running package billing_recpkg
Thu Aug 31 05:40:23 2006 - Running pack
Rehearsing recovery in the Three Data Center Disaster Recovery Solution (3DC DR Solution)

The Three Data Center Disaster Recovery Solution (3DC DR Solution) is a fully redundant disaster recovery solution that integrates HP Serviceguard, HP Metrocluster with Continuous Access for P9000 and XP, HP Continentalclusters, and the HP StorageWorks P9000 or XP 3DC Data Replication.
Complete the following steps to create a modular rehearsal package in a 3DC environment:
a. Create a package configuration identical to the recovery package configuration but without the Metrocluster with Continuous Access for P9000 and XP 3DC recovery module dts/recovery_xpca3dc.
b. Change the values of the following parameters:
• package_name
• package_ip
• service_name
For all other parameters, provide the same values as specified in the recovery package configuration.
c.
4. Prepare the replication environment for DR rehearsal:
a) In case of a 3DC CAJ/CAJ Tri-Link configuration using Delta Resync, delete the Delta Resync pair from a node in the recovery cluster. Otherwise, go to step b:
# pairsplit -g -R
b) Manually suspend the Active-CAJ pair and enable write access to the recovery disk at the recovery cluster:
# pairsplit -g -rw -P
c) Change the Cluster ID of all LVM and SLVM volume groups managed by the package.
c) d) Assign Remote Command devices to the journal volumes of the Delta Resync pair using XP Remote Web Console. For details on assigning remote command devices to journal volumes, see the “Assigning Remote Command Devices to Journal Volumes” section in the latest version of the Designing Disaster Recovery HA Clusters Using Metrocluster and Continentalclusters manual, available at www.hp.com/go/hpux-serviceguard-docs -> HP Serviceguard Metrocluster with Continuous Access for P9000 and XP.
Precautions

This section describes the precautions the operator must take while performing DR rehearsals.

Client access IP address at recovery cluster

During a DR rehearsal, Continentalclusters will start the rehearsal package which could be configured to bring up the application instance at the recovery cluster.
Strengths

This section lists the strengths of the DR rehearsal feature.
1. Minimal application reconfiguration for rehearsal: Because the rehearsal package is configured to use the volume group/filesystem that is used for the recovery package, the existing application setup at the recovery cluster can be started for rehearsal with minimal reconfiguration.
Limitations

This section lists the limitations of the DR rehearsal feature.
1. Preparing the replication environment for rehearsal and restoring it for recovery are manual tasks (i.e., the operator has to prepare and restore the replication environment for each recovery group by following the instructions provided in this white paper and the information provided in the Designing Disaster Recovery HA Clusters Using Metrocluster and Continentalclusters manual).
2.
Troubleshooting DR rehearsal problems

This section describes problems that you may encounter while using DR rehearsal. Each problem is followed by its workaround or solution.

The “cmrecovercl -d” command fails to move a recovery group into maintenance mode

Follow the steps below to move a recovery group into maintenance mode even if the primary package is halted.
1. Verify from stdout messages that the command failed because the primary package was down.
3. Restore the replication environment for recovery. Because the primary site is down, restore the secondary mirror copy from the BCV. This cleans up rehearsal changes (if any) on the secondary mirror copy and restores it with the point-in-time copy of production data taken at the time of rehearsal startup.
$) symmir -g sales_rac9i_dg restore
4. Move the recovery groups sales_rac9i_recgp_1 and sales_rac9i_recgp_2 out of maintenance mode.
Related documentation

The following related documents can be found at www.hp.com/go/hpux-serviceguard-docs.
• Managing Serviceguard Nineteenth Edition, September 2010
• Designing Disaster Recovery HA Clusters Using Metrocluster and Continentalclusters, latest version

© Copyright 2011 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.