Building Disaster Recovery Serviceguard Solutions Using Metrocluster with Continuous Access for P9000 and XP A.11.
Legal Notices © Copyright 2013 Hewlett-Packard Development Company, L.P. Confidential computer software. Valid license from HP required for possession, use, or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor’s standard commercial license. The information contained herein is subject to change without notice.
Contents
1 Introduction
  Overview of Continuous Access P9000 and XP concepts
  PVOLs and SVOLs
  Device groups
Setting up replication
  Setting up the complex workload redundantly in Metrocluster
  Configuring the storage device for the complex workload at the source disk site
  Configuring the storage device using SG SMS CFS or CVM
7 Administering Metrocluster
  Administering a Metrocluster using Continuous Access XP/P9000 replication
  Adding a node to Metrocluster
  Node maintenance
  Planned maintenance
3DC DR Solution Configuration
  Tri-Link Configuration
  Bi-Link Configuration
  Overview of Device Group Monitor Feature in 3DC DR Solution
  Data Maintenance with the Failure of a Metrocluster with Continuous Access for P9000 and XP Failover
  Swap Takeover Failure (for Continuous Access Device Group Pair Between DC1 and DC2)
  Takeover Timeout (for third data center)
  Configuring the network
  Configuring the storage device for installing Oracle clusterware
  Setting up CRS OCR and VOTING directories
  Installing and configuring Oracle clusterware
  Maintaining Oracle database RAC MNP packages on a site
  Maintaining Oracle database RAC
  Moving a site aware disaster tolerant Oracle RAC database to a remote site
Glossary
Index
1 Introduction This document describes the Continuous Access P9000 and XP software and the additional files that integrate the P9000 or XP Disk Arrays with Metrocluster. The document explains how to configure Metrocluster using Continuous Access P9000 or XP. For more information about the general characteristics of Metrocluster, see Understanding and Designing Serviceguard Disaster Recovery Architectures available at http://www.hp.com/go/hpux-serviceguard-docs.
Continuous Access Synchronous Replication In Continuous Access Synchronous replication, all write operations on the primary volume are replicated to the secondary volume before the write is acknowledged to the host. This synchronous replication mode ensures the highest level of data currency possible. Host I/O performance is directly impacted by the distance between the primary and secondary volumes.
A consistency (CT) group preserves write order (I/O ordering) across its volumes. A CT group corresponds to a device group in the Raid Manager configuration file. A consistency group ID (CTGID) is assigned automatically during pair creation.
NOTE: Different P9000 and XP models support different maximum numbers of Consistency Groups. For more information, see the P9000 user guide or XP user guide.
Figure 2 Journal Volume (shows the primary and secondary hosts and data volumes, the master and restore journal volumes, and the journal obtain, copy, and restore functions between the primary storage system at Site A and the secondary storage system at Site B)
By writing the records to journal disks instead of keeping them in cache, Continuous Access Journal overcomes the limitations of earlier asynchronous replication methods.
Resynchronization with those earlier methods typically involves a destructive process, such as rewriting all the changed tracks, with possible loss of data consistency for ordered writes. In contrast, Continuous Access Journal logs every change to the journal disk at the primary site, including the metadata required to apply the changes consistently.
The RAID Manager software provides the interface for running P9000 or XP Business Copy software and P9000 or XP Continuous Access software commands from an HP-UX host. Every execution of P9000 or XP RAID Manager is known as a RAID Manager instance. RAID Manager instances running on the local nodes communicate with the RAID Manager instances running on the remote nodes to get the status of the device group pair.
Figure 3 Sample configuration of Metrocluster for Linux for P9000 Continuous Access (shows a quorum server, Nodes 1 through 4, routers, and the XP arrays at Site 1 and Site 2 connected by a synchronous, asynchronous, or journal replication link)
NOTE: The P9000 disk array family does not support asynchronous replication using side file.
NOTE: If the monitor is configured to automatically resynchronize the data from PVOL to SVOL upon link recovery, a Business Copy (BC) volume of the SVOL must be configured as another mirror. If a rolling disaster occurs and the data in the SVOL becomes corrupt due to an incomplete resynchronization, the data in the BC volume can be restored to the SVOL. This results in non-current, but usable, data in the BC volumes.
2 Configuring an application in a Metrocluster environment

Installing the necessary hardware and software
When you complete the following procedures, an adoptive node is able to access the data belonging to a package after it fails over.

Setting up the storage hardware
To set up the storage hardware, do the following:
1.
Table 1 Site Aware Failover configuration
Attribute    Description
SITE_NAME    Defines a unique name for a site in the cluster.
SITE         The SITE keyword under the node's NODE_NAME definition.
1. Install the Raid Manager software for either P9000 or XP on every host system, depending on the disk array used in your environment.
2. Edit the /etc/services file to add an entry for the Raid Manager instance to use with the cluster. Use the following format to add an entry to the /etc/services file:
   horcm<instance> <port>/udp
   For example:
   horcm0 11000/udp   #Raid Manager instance 0
Now, use Raid Manager commands to get further information from the disk arrays. To verify the software revision of the Raid Manager and the firmware revision of the P9000 or XP disk array, run the following command:
# raidqry -l
NOTE: Verify that the XP/P9000 firmware and the Raid Manager software meet the minimum requirement levels for your version, as listed in the Metrocluster with Continuous Access for P9000 and XP Release Notes.
Also complete the HORCM_INST field by supplying the names of only those hosts that are attached to the P9000 or XP disk array that is remote from the disk array directly attached to this host. For example, if node 1 and node 2 are in the source disk site and node 3 and node 4 are in the recovery site, you must specify only node 3 and node 4 in the HORCM_INST field of the file you are creating on node 1 at the source disk site. Node 1 must have previously been specified in the HORCM_MON field.
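As an illustration only (the device group name and host names below are placeholders, not values from this document), the HORCM_INST section created on node 1 might look like the following:
HORCM_INST
#dev_group   ip_address   service
pkgdg        node3        horcm0
pkgdg        node4        horcm0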
Notes on the Raid Manager configuration
• A single P9000 or XP device group must be defined for every package on every host that is connected to the P9000 or XP series disk array.
• Device groups are defined in the Raid Manager configuration file under the heading HORCM_DEV or HORCM_LDEV.
4. Install a VxFS file system on the logical volume.
   # newfs -F vxfs /dev/<vg name>/rlvol1
5. Deactivate and export the Volume Groups on the primary system without removing the special device files.
   # vgchange -a n <vg name>
   # vgexport -s -p -m <map file> <vg name>
   Ensure that you copy the map files to all of the host systems.
6. On the source disk site, import the VGs on all the other systems that might run the Serviceguard package, and back up the LVM configuration, as in the sketch below.
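A minimal sketch of step 6, assuming a hypothetical map file /tmp/vgpkg.map and volume group vgpkg; run these commands on each of the other source disk site nodes:
# vgimport -s -m /tmp/vgpkg.map vgpkg
# vgcfgbackup vgpkg
The vgimport command re-creates the volume group from the map file created by vgexport, and vgcfgbackup saves the LVM configuration of the imported volume group.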
4. Create the disk group to be used, with the vxdg command, only on the primary system.
   # vxdg init <dg name> <disk name>
5. Verify the configuration.
   # vxdg list
6. Use the vxassist command to create the logical volume (XXXX is the size of the volume).
   # vxassist -g <dg name> make <volume name> XXXX
7. Verify the configuration.
   # vxprint -g <dg name>
8. Make the file system.
   # newfs -F vxfs /dev/vx/rdsk/<dg name>/<volume name>
9. Create a directory to mount the volume group.
   # mkdir /logs
10. Mount the volume group.
Repeat steps 4 through 9 on all the nodes in the cluster that require access to this disk group.
10. Resynchronize the Continuous Access pair device.
    # pairresync -g <device group> -c 15

Easy deployment of storage
Starting with Serviceguard version A.11.20 Patch PHSS_41628, you can use the cmpreparestg command to create LVM volume groups and VxVM/CVM disk groups with logical volumes, file systems, and mount points in a Metrocluster environment.
# cmmakepkg -m dts/mcxpca -m sg/filesystem -m sg/package_ip -m ecmt/oracle/oracle temp.config
2. Edit the following attributes in the temp.config file:
   a. dts/dts/dts_pkg_dir - This is the package directory for this Metrocluster Modular package. The Metrocluster Environment file is generated for this package in this directory. This value must be unique for all packages.
   b. DEVICE_GROUP - Specify the device group name managed by this package, as specified in the RAID Manager configuration file.
   c.
Table 4 Device Group Monitor Parameters (continued)
PARAMETER                           DESCRIPTION
MON_NOTIFICATION_SYSLOG <0 or 1>    If you want notification messages to be logged in the syslog file, uncomment the MON_NOTIFICATION_SYSLOG variable and set it to 1. If the variable is not set, the default value is 0.
MON_NOTIFICATION_CONSOLE <0 or 1>   If you want notification messages to be logged on the system's console, uncomment the MON_NOTIFICATION_CONSOLE variable and set it to 1.
NOTE: If external_pre_script is specified in a Metrocluster package configuration, the external_pre_script is executed after the Metrocluster module scripts during package startup. Metrocluster module scripts are always executed first during package startup.
6. Run the package on a node in the Serviceguard cluster.
   # cmrunpkg -n <node name> <package name>
7. Enable global switching for the package, as in the sketch below.
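A minimal sketch of step 7; the package name is a placeholder:
# cmmodpkg -e <package name>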
Figure 4 Creating modular package
4. If the product Metrocluster with Continuous Access XP Toolkit is installed, you are prompted to configure a Metrocluster package. Select the dts/mcxpca module, and then click Next.
Figure 5 Selecting Metrocluster module
5. You are prompted next to include any other toolkit modules. If the application being configured requires a Serviceguard toolkit, select the appropriate toolkit; otherwise, move to the next screen.
6.
Figure 6 Configuring package name
7. Select additional modules depending on the application. For example, if the application uses LVM volume groups or VxVM disk groups, select the volume_group module. Click Next.
Figure 7 Selecting additional modules
8. Review the node order in which the package will start, and modify other attributes, if required. Click Next.
Figure 8 Configuring generic failover attributes 9. You are prompted to configure the attributes for a Metrocluster package. Ensure that all the mandatory attributes (marked *) are accurately filled.
10. Enter the values for other modules selected in step 7.
11. After you enter the values for all modules, review all the inputs for the various attributes in the final screen, and then click OK to apply the configuration.
Figure 10 Applying the configuration

Easy deployment of Metrocluster modular packages
Starting with Serviceguard version A.11.20, the Package Easy Deployment feature is introduced. This feature is available from Serviceguard Manager version B.03.10 or later.
The following prerequisites and limitations are applicable to package easy deployment for Metrocluster with Continuous Access for P9000 and XP.
Prerequisite
The device group pair must have been already created.
Limitations
1. A device group configuration with a HORCMPERM file is not supported.
2. All physical devices must belong to exactly one DEVICE GROUP.
3. The Device Group Monitor is not configured as a part of Easy Deployment.
4. Three data center configurations are not supported.
3 Configuring complex workloads using SADTA SADTA enables deploying complex workloads in a Metrocluster. Complex workloads are applications configured using multi-node and failover packages with dependencies. For more information on SADTA, see Understanding and Designing Serviceguard Disaster Recovery Architectures at http://www.hp.com/go/hpux-serviceguard-docs.
SITE <site name>
. . .
NODE_NAME <node name>
  SITE <site name>
  . . .
NODE_NAME <node name>
  SITE <site name>
  . . .
NODE_NAME <node name>
  SITE <site name>
  . . .
3. Run the cmapplyconf command to apply the configuration file.
4. Run the cmruncl command to start the cluster.
After the cluster is started, you can run the cmviewcl command to view the site configuration.
The storage device for a complex workload must first be configured at the site with the source disk of the replication disk group. Then, a complex workload package stack must be created at this site. It is only at this stage that an identical complex workload using the target replicated disk must be configured with the complex workload stack at the other site.
8. Verify the package configuration file:
   # cmcheckconf -P cfspkg1.ascii
9. Apply the package configuration file:
   # cmapplyconf -P cfspkg1.ascii
10. Run the package:
    # cmrunpkg <package name>

Configuring the storage device using Veritas CVM
To set up the CVM disk group volumes, perform the following steps on the CVM cluster master node in the Source Disk Site:
1. Initialize the source disks of the replication pair.
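A minimal sketch of step 1, assuming a hypothetical disk device c4t0d1; disks are typically initialized for VxVM/CVM use with vxdisksetup:
# /etc/vx/bin/vxdisksetup -i c4t0d1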
   dependency_condition SG-CFS-pkg=up
   dependency_location same_node
5. Apply the newly created package configuration.
   # cmapplyconf -v -P <package name>.conf

Configuring the storage device using SLVM
To create volume groups on the Source Disk Site:
1. Define the appropriate volume groups on every host system in the Source Disk Site.
   # mkdir /dev/<vg name>
   # mknod /dev/<vg name>/group c 64 0xnn0000
   where the name /dev/<vg name> and the number nn are unique within the entire cluster.
Configuring complex workload packages to use SG SMS CVM or Veritas CVM When the storage used by complex workload is CVM disk groups, the complex workload packages must be configured to depend on the CVM disk group multi-node package. With this package dependency, the complex workload will not run until its dependent CVM disk group multi-node package is up, and will halt before the CVM disk group multi-node package is halted.
   node_name <node3>
   node_name <node4>
   package_name <package name>
   cvm_disk_group <disk group name>
   cvm_activation_mode "node3=sw node4=sw"
   cfs_mount_point <mount point>
   cfs_volume <disk group name>/<volume name>
   cfs_mount_options "node3=cluster node4=cluster"
   cfs_primary_policy ""
Where node3 and node4 are the nodes at the target disk site. Do not configure any mount-specific attributes, such as cfs_mount_point and cfs_mount_options, if SG SMS CVM is configured as raw volumes.
4.
Ensure that the map files are copied to all the nodes in the target disk site.
3. On the target disk site, import the VGs on all the systems that will run the Serviceguard complex workload package.
   # vgimport -s -m <map file> <vg name>

Configure the identical complex workload stack at the recovery site
The complex workload must be packaged as Serviceguard MNP packages. This step creates the complex workload stack at the target disk site that is configured to be managed by the Site Controller Package.
Figure 12 Creating a Site Controller package
4. If the product Metrocluster with Continuous Access for P9000 and XP Toolkit is installed, you are prompted to select the data replication type for the Site Controller package. Select the dts/mcxpca module, and click Next.
Figure 13 Selecting replication module
5. You are prompted to include any other toolkit modules, if installed. Skip this step if required, and move to the next screen.
6. Enter the package name, and click Next.
Figure 14 Configuring Package Name and Type
7. Next, you are prompted to select additional modules required by the package. Skip this step if required, and move to the next screen.
Figure 15 Selecting additional Modules
8. Review the node order in which the package will start, and modify other attributes if required. Click Next.
Figure 16 Configuring Metrocluster attributes
9. Select the complex workload packages to be managed by the Site Controller package on the sites. Click Next.
Figure 17 Selecting complex workload packages
10. You are prompted to configure the attributes for a Metrocluster package. Ensure that all the mandatory attributes (marked *) are accurately filled. Select the Fence Level and specify the DEVICE_GROUP attribute.
Figure 18 Configuring generic failover attributes
11. Enter the service module values.
Figure 19 Configuring service module attributes
12. After the values for all the modules are entered, review all the inputs entered for the various attributes in the final screen, and apply the configuration.
Figure 20 Applying the configuration
NOTE: The Site Controller package can also be created using the Package Easy Deployment feature available in Serviceguard Manager version B.03.10. For more details, see Using Easy Deployment in Serviceguard and Metrocluster Environments on HP-UX 11i v3 available at http://www.hp.com/go/hpux-serviceguard-docs —> HP Serviceguard. In case of easy deployment, the Metrocluster module parameters are auto-discovered.
to the other site. The site_preferred_manual failover policy provides automatic failover of packages within a site and manual failover across sites.
To configure the Site Controller package for the complex workload:
1. From any node, create a Site Controller package configuration file using the dts/sc module.
   # cmmakepkg -m dts/sc -m dts/mcxpca \
     /etc/cmcluster/cw_sc/cw_sc.config
2. Edit the cw_sc.config file; the attributes to set are illustrated in the sketch below.
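As an illustration only (all names below are placeholders, not values from this document), the attributes edited in cw_sc.config typically include the package name, the Site Controller package directory, and the per-site critical and managed packages:
package_name           cw_sc
dts/dts/dts_pkg_dir    /etc/cmcluster/cw_sc
site                   siteA
critical_package       siteA_cw_pkg
managed_package        siteA_cw_dg
site                   siteB
critical_package       siteB_cw_pkg
managed_package        siteB_cw_dg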
   resource_up_value != DOWN
   resource_start automatic
When using SLVM, Veritas CVM, or SG SMS CVM/CFS configured as a modular package, run the cmapplyconf command to apply the modified package configuration.
2. Verify the Site Safety Latch resource configuration at both sites. If you have SLVM, Veritas CVM, SG SMS CVM, or CFS configured in your environment, run the following command to view the EMS resource details:
   # cmviewcl -v -p <package name>

Modifying Site Controller to manage complex workload
1.
1. Run the cmviewcl command to view the complex workload configuration in a Metrocluster.
2. Enable all the nodes in the Metrocluster for the Site Controller package.
   # cmmodpkg -e -n <node1> -n <node2> -n <node3> -n <node4> cw_sc
3. Start the Site Controller Package.
   # cmmodpkg -e cw_sc
   The Site Controller package and the complex-workload packages start up on the local site.
4. Verify the Site Controller Package log file to ensure a clean startup.
4 Metrocluster features

Data replication storage failover preview
In an actual failure, packages are failed over to the standby site. As part of the package startup, the underlying storage is failed over based on the parameters defined in the Metrocluster environment file. The storage failover can fail for many reasons, which can be categorized as the following:
• Incorrect configuration or setup of the Metrocluster and data replication environment.
It is recommended that you set up a cron job to regularly run the cmcheckconf command; a sketch of such a cron entry follows. For more information about setting up the cron job, see the Setting up Periodic Cluster Verification section in the latest version of the Managing Serviceguard manual available at http://www.hp.com/go/hpux-serviceguard-docs —> HP Serviceguard. "Validating Metrocluster package" (page 52) lists the checks made on a Metrocluster Package.
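A minimal sketch of such a cron entry; the schedule and log file path are assumptions, not values from this document:
# Run cluster and package verification daily at 02:00
0 2 * * * /usr/sbin/cmcheckconf -v >> /var/adm/cmcheckconf.log 2>&1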
Table 6 Validating Metrocluster package (continued)
(This check is skipped for devices for which VxVM uses an internal format, that is, devices that do not support the legacy dsf format. It is also skipped if the naming convention is enclosure based.) In such cases, you can change the naming scheme using the vxddladm command so that the VxVM output shows the disk names in a persistent dsf format.
Table 7 (page 53) lists additional checks made on the Site Controller packages.
Table 7 Additional validation of Site Controller packages (continued)
Check: Check if the Site values are valid.
Commands: cmapplyconf (# cmapplyconf -P <package configuration file>); cmcheckconf [-P/-p] (# cmcheckconf -P <package configuration file>, # cmcheckconf -p <package name>)
Description: Checks whether the site values in this package are the sites that are configured in the cluster configuration.
Both these configurations protect the VM environment from disasters that result in the failure of an entire data center. For more information on the configuration, see the white paper Implementing disaster recovery in virtualized environment using HP vPars and HP Integrity VM with Metrocluster and Continentalclusters on HP-UX 11i available at http://www.hp.com/go/hpux-serviceguard-docs —> HP Serviceguard Metrocluster with Continuous Access for P9000 and XP.
NOTE:
• A remote command device can be configured in Continentalclusters.
• A remote command device need not be configured when P9000 or XP Continuous Access Journal replication is used.

Configuring a remote array RAID Manager instance
To configure the remote array RAID Manager instance in a Metrocluster environment with Continuous Access P9000 or XP:
1. Configure the command device to manage the remote P9000 or XP arrays.
Configuring the command device to manage the remote P9000 or XP arrays A command device to manage the remote P9000 or XP arrays can be configured using one of the following methods: • Configure a command device (remote command device) in a Metrocluster site by mapping a command device in the remote P9000 or XP array to a local device in the local P9000 or XP array using the P9000 or XP external storage feature. Present this remote command device to all Metrocluster nodes in the local site.
command devices. The remote array RAID Manager instances at both sites use the remote command devices.
• Configure a dedicated command device in the P9000 or XP array at the remote site that is directly presented to the nodes in the local site over an extended SAN, without using the P9000 or XP external storage feature. Similarly, configure a command device in the remote site P9000 or XP array and present it to the nodes of the local site over an extended SAN. Figure 22 illustrates this configuration.
3. Create a RAID Manager configuration file for the remote array RAID Manager instance using the RAID Manager configuration file template. For example:
   # cp /etc/horcm.conf /etc/horcm1.conf
4. Configure the following parameters in the RAID Manager configuration file:
   • HORCM_MON
     Enter the host name of the system on which you are editing the file and the TCP/IP port number specified for the RAID Manager instance in the /etc/services file.
Edit the following parameters in the configuration file /etc/rc.config.d/raidmgr:
• START_RAIDMGR
  Set this parameter to 1.
• RAIDMGR_INSTANCE
  Specify all the RAID Manager instances that must be started during node boot-up. Include the remote array RAID Manager instance number as the value for this parameter.
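For illustration only, assuming RAID Manager instances 0 and 1 (the instance numbers and the exact value syntax are assumptions; confirm the format against the comments in /etc/rc.config.d/raidmgr itself), the edited lines might look like:
START_RAIDMGR=1
RAIDMGR_INSTANCE="0 1"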
5 Using XP or P9000 features in a Metrocluster Environment

Metrocluster with Continuous Access for P9000 and XP and thin provisioning volumes
Metrocluster with Continuous Access for P9000 and XP includes support for Thin Provisioning Volumes (TPVOLs) only on the XP20000 and XP24000 arrays, the P9500 and later arrays, and the P9000 disk array family. When TPVOLs are configured, there is a possibility that the TPVOL utilization will exceed the specified capacity or threshold of the pool.
6 Understanding failover/failback scenarios
By default, a Metrocluster package fails to start if the data is not current or if it is not able to determine the status of the device group. In such situations, the user has options to start a package either by setting the value of the Metrocluster package AUTO parameters to "1" or by using FORCEFLAG. To use the FORCEFLAG option, you must create a FORCEFLAG file in the package directory (<package directory>/FORCEFLAG).
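A minimal sketch of the FORCEFLAG approach, assuming a hypothetical package named pkg1 with package directory /etc/cmcluster/pkg1:
# touch /etc/cmcluster/pkg1/FORCEFLAG
# cmrunpkg -n <node name> pkg1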
Table 8 Failover/failback scenarios (continued)
(Columns: Failover/Failback Scenario | Fence Level | Metrocluster Behavior (by default) | AUTO parameters or FORCEFLAG set)
from the recovery site to the primary site.
Scenario: Link failure followed by application failure of all nodes in the primary site. Failback to the primary site after link restoration.
Fence level: data or never
Default behavior: Metrocluster package fails to start because AUTO_NONCURDATA is set to "0" and FORCEFLAG is not present.

Fence level: async (side file)
Default behavior: Metrocluster package fails to start because AUTO_SVOLPSUE is set to "0" and FORCEFLAG is not present.
With AUTO parameters or FORCEFLAG set: The Metrocluster package checks the AUTO_SVOLPSUE setting and FORCEFLAG presence. If AUTO_SVOLPSUE is set to "1" or FORCEFLAG is present, the Metrocluster package starts and issues horctakeover -S, which results in an SVOL takeover.

Scenario: Manual suspend followed by failure of all nodes in the primary site.
Fence level: Any
Default behavior: Metrocluster package fails to start because AUTO_SVOLPSUS is set to "0" and FORCEFLAG is not present.
With AUTO parameters or FORCEFLAG set: The Metrocluster package checks the AUTO_SVOLPSUS setting and FORCEFLAG presence.
remote site node, the Site Controller package verifies whether all the instances of the failed active packages have halted cleanly. The Site Controller Package verifies the last_halt_failed flag for every instance of the workload packages. The flag is set to yes for an instance whose halt script execution resulted in an error. Even if one instance of any of the failed workload's packages did not halt successfully, the Site Controller package aborts site failover.
When the Site Controller Package starts on the adoptive node at the remote site, it detects that the active complex workload's packages have failed. Consequently, the Site Controller package performs a site failover and starts the corresponding complex workload's packages on the site where the cluster has reformed.

Disk array and SAN failure
When a disk array or the host access SAN at a site fails, the active complex workload database running on the site might hang or fail based on the component that has failed.
When the remote site starts, the Site Controller Package detects that the active complex-workload packages have failed and initiates a site failover by activating the passive complex-workload packages that are configured in the current site. The disaster tolerant complex workloads that have their active packages on the surviving site, where the cluster reformed, continue to run without any interruption.
7 Administering Metrocluster

Administering a Metrocluster using Continuous Access XP/P9000 replication

Adding a node to Metrocluster
To add a node to Metrocluster with Continuous Access for P9000 and XP, use the following procedure:
1. Add a node in the cluster by editing the Serviceguard cluster configuration file and applying the configuration:
   # cmapplyconf -C cluster.config
2. Create the RAID Manager configuration on the newly added node.
Failback
If the primary site is restored after a failover to the recovery site, you may want to fail back the package to the primary site. Manually resync the data from the recovery site to the primary site and wait for the resynchronization to complete. Before failing back the package from the recovery site to the primary site, run the cmdrprev command on the primary site nodes to preview the data replication storage failover.
Viewing the Continuous Access journal status
The following two sections describe using the pairdisplay and raidvchkscan commands for viewing the Continuous Access Journal status.

Viewing the pair and journal group information using the "pairdisplay" command
The command option "-fe" is added to the Raid Manager pairdisplay command. This option is used to display the Journal Group ID (and other data) of a device group pair.
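For example, with the device group name as a placeholder:
# pairdisplay -g <device group> -fe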
• In the case of the S-JNL, Q-Marker shows the latest sequence number placed in the cache.
• Q-CNT: Displays the number of remaining Q-Markers of a journal group.
Figure 23 Q-Marker and Q-CNT
• U(%): Displays the usage rate of the journal data.
• D-SZ: Displays the capacity for the journal data on the journal group.
• Seq#: Displays the serial number of the XP12000.
• Num: Displays the number of LDEVs (journal volumes) configured for the journal group.
Run the pairresync command in one of the following two ways, depending on the site from which you run the command:
• pairresync -swapp: run from the primary site.
• pairresync -swaps: run from the recovery site.
These options take advantage of the fact that the recovery site maintains a bit-map of the modified data sectors on the recovery array. Either version of the command swaps the personalities of the volumes, with the PVOL becoming the SVOL and the SVOL becoming the PVOL.
In this case, either use FORCEFLAG to start up the package on the SVOL site, or fix the problem and resume the data replication with the following procedure:
1. Split the device group pair completely (pairsplit -g <device group> -S).
2. Re-create a pair from the original PVOL as source (use the paircreate command).
3. Start up the package on either the PVOL site or the SVOL site.
Once the node maintenance procedures are complete, join the node to the cluster using the cmrunnode command. If the Site Controller Package is running on the site that the node belongs to, the active complex-workload package instances on the site that have the auto_run flag set to yes start automatically. If the auto_run flag is set to no, then these instances must be manually started on the restarted node.
1. Identify the node where the Site Controller Package is running.
   cmviewcl -p <site controller package name>
2. Log in to the node where the Site Controller Package is running and go to the Site Controller Package directory.
   cd <site controller package directory>
3. Run the HP-UX touch command with the DETACH flag in the Site Controller Package directory.
   touch DETACH
4. Halt the Site Controller Package.
After the complex-workload packages are up, examine the package log files for any errors that might have occurred at startup.

Shutting down a complex workload
You can shut down the complex workload in SADTA by halting the corresponding Site Controller Package. To shut down the complex workload, run the following command on any node in the cluster:
   cmhaltpkg <site controller package name>
This command halts the Site Controller Package and the currently active complex-workload packages.
1. Access one of the node's System Management Homepage at http://<node name>:2301. Log in using the root user's credentials of the node.
2. Click Tools. If Serviceguard is installed, one of the widgets will have Serviceguard as an option. Click the 'Serviceguard Manager' link.
3. On the Cluster's Home Page, click the Administration tab and select the required options. The different administrative options are listed below. Select the appropriate packages for the required option.
Figure 25 Rolling upgrade procedure for Metrocluster
The subsequent sections describe the procedures for completing a rolling upgrade for Metrocluster configurations with SADTA. These sections describe upgrading HP Serviceguard, HP-UX, and Metrocluster software in Metrocluster SADTA configurations.

Upgrading Metrocluster software
To perform a rolling upgrade of Metrocluster software:
1. Disable package switching for all Metrocluster packages.
2. Install the new Metrocluster software on all the nodes.
3.
1. Identify the sites in the Metrocluster and the associated nodes.
   # cmviewcl -l node
   Select a site to perform the rolling upgrade.
2. Select a node in a site to perform the rolling upgrade.
   # cmviewcl -l node -S <site name>
3. View all the packages running on the selected node.
   # cmviewcl -l package -n `hostname`
   Identify the Site Controller packages that are running on the node.
4.
Limitations of the rolling upgrade for Metrocluster
• The cluster or package configuration cannot be modified until the rolling upgrade is completed. If the configuration must be edited, upgrade all the nodes to the new release, and then modify the configuration file and copy it to all the nodes in the cluster.
• New features of the latest version of Metrocluster cannot be used until all the nodes are upgraded to the latest version.
8 Troubleshooting

Troubleshooting Metrocluster
Analyze the Metrocluster and Raid Manager log files to understand the problem in the respective environment, and follow the recommended action based on the error or warning messages.

Metrocluster log file
Regularly review the following files for messages, warnings, and recommended actions. It is good practice to review these files after every system, data center, or application failure:
• View the system log at /var/adm/syslog/syslog.log.
Troubleshooting the Metrocluster with Continuous Access for P9000 and XP device group monitor The following is a guideline to help identify the cause of potential problems with the Metrocluster with Continuous Access for P9000 and XP device group monitor. • Problems with email notifications: Metrocluster with Continuous Access for P9000 and XP device group monitor uses SMTP to send out email notifications. All email notification problems are logged in the package log file.
• The CVM DG MNP packages log to a file named /etc/cmcluster/cfs/<package name>.log on their corresponding CFS sub-cluster nodes.
• The Site Safety Latch mechanism logs are saved in the /etc/opt/resmon/log/api.log file.

Cleaning the site to restart the Site Controller package
The Site Controller Package startup on a site can fail for various reasons. In such cases, the Site Safety Latch is left in a special state: INTERMEDIATE.
For more information about using cmviewsc, see cmviewsc(1m).

Identifying and cleaning RAC MNP stack packages that are halted
The Site Controller Package does not start if the RAC MNP stack packages are not halted cleanly. An MNP package is halted uncleanly when the halt script does not run successfully on all the configured nodes of the package. This implies that there might be some stray resources configured with the package that are still online in the cluster.
Table 9 Error Messages and their resolution (continued)
Cause: A package dependency condition is not met on this site. It is possible that the CRS MNP packages are not running.
Resolution:
1. Verify the log file of the failed package.
2. Identify and fix the problem.
3. Enable node switching for the failed MNP package.
4. Clean the site using the cmresetsc tool.
5. Restart the Site Controller Package.

Log message: Unable to initiate site failover at site siteB.
Cause: The Site definitions in the Serviceguard cluster are no longer available.
Resolution:
1. Verify the Serviceguard cluster configuration file and reapply it with the sites defined appropriately.
2. Restart the Site Controller Package.

Log message: Failed to prepare the storage for site.
Cause: The preparation of the replicated disk and making it read-write on the site nodes failed.
Resolution:
1. Verify the host connectivity to the disk arrays.
2.
Table 9 Error Messages and their resolution (continued)
Log message: Failed to validate /etc/cmcluster/scripts/mscripts/master_control_script.sh. On node, validation failed with: /etc/cmcluster/scripts/dts/sc.sh[49]: SC_CRITICAL_PACKAGE: The specified subscript cannot be greater than 1024. ERROR: Failed to validate /etc/cmcluster/scripts/dts/sc.sh. Unable to execute command.
Cause: Failed to validate a parameter value in the Site Controller package configuration file.
Table 9 Error Messages and their resolution (continued)
Log message: Executing: cmrunpkg siteA_mg1 siteA_mg2.
Cause: This message is logged because one of the packages managed by the Site Controller Package has failed to start at this site.
Resolution:
1. Verify the log file of the package on the nodes where node switching is not enabled.
2. Clean any stray resources owned by the package that are still online on the node.
3. Enable node switching for the package on the nodes.
4.

Log message: Error: Not able to find CVM master node to import CVM DG(s)
Cause: This message is logged because the CVM commands failed.
Resolution: Fix the issue that resulted in the CVM commands failing, and restart the Site Controller Package.
9 Designing a Three Data Center Disaster Recovery Solution
This chapter describes the Three Data Center architecture through the following topics:
NOTE: For additional information, see the Release Notes for your Metrocluster and Continentalclusters products and the documentation for your storage solution.

Overview of Continuous Access 3DC Replication Technology
Continuous Access 3DC replication can be configured using either 3DC Sync/CAJ replication or 3DC CAJ/CAJ replication.
data replication is established to replicate data between the arrays at Site1 and Site2 when using 3DC CAJ/CAJ replication.
• Continuous Access Journal data replication must be established between one of the arrays located in the Metrocluster (Site1 or Site2) and the array located at Site3. This replication pair is used to replicate data to Site3 actively and is referred to as the Active-CAJ pair.
Figure 27 Bi-Link Configuration with the replication link between Site2 and Site3 (shows the primary cluster (Metrocluster) spanning Site 1 and Site 2, the recovery cluster (Serviceguard cluster) at Site 3, and the Continentalclusters quorum server for the recovery cluster)
In a Bi-Link configuration:
• Continuous Access Synchronous data replication is established to replicate data between the arrays at Site1 and Site2 when using 3DC Sync/CAJ replication.
data replication to the third site (Site3) by copying only the delta changes. This avoids a single point of failure for data replication following a site outage. Another advantage of the Tri-Link configuration is that you can balance the replication load to Site3 by using different replication links for different applications.
In the Multi-Target configuration, the data enters the configuration on a specific node and then splits into two directions. One direction is the replication to one site, and the other direction is the replication to another site. An example of data being replicated in the Multi-Target topology when the application is running at Site1 is shown in Figure 30 (page 95).
Delta Resync Pair
The Delta Resync pair must be configured when using the 3DC Tri-Link configuration. The Delta Resync pair must be created with the -nocsus option to the paircreate command. The -nocsus option creates a suspended journal volume, without copying the data. Figure 32 (page 96) shows an example of 3DC Sync/CAJ replication using a Delta Resync pair configured between Site2 and Site3.
Continuous Access Journal Software User Guide or HP StorageWorks XP24000/XP20000 Continuous Access Journal Software User Guide, available at:
http://h20000.www2.hp.com/bizsupport/TechSupport/DocumentIndex.jsp?lang=en&cc=us&taskId=101&prodClassId=-1&contentType=SupportManual&docIndexId=64255&prodTypeId=18964&prodSeriesId=4309847
Figure 34 Mirror Unit Descriptors (shows how the CA mirror unit descriptor CA-MU#0 can be used for either CA Sync/Async or CA Journal, the journal mirror unit descriptors Jnl-MU#1 through Jnl-MU#3 with their P-Jnl volumes and S-VOLs, and the Business Copy volumes BC1 through BC3 on BC-MU#0 through BC-MU#2 used with HORCC_MRCF=1)
In a 3DC Sync/CAJ replication, the C
Figure 36 3DC CAJ/CAJ solution and mirror unit descriptor usage example (shows the P-VOL and journal volume at Site 1, the S-VOLs and journal volumes at Site 2 and Site 3, the CA-Journal replication links, and the Delta Resync (DR) pair, with mirror unit descriptors #1 through #3 as marked)

Benefits of 3DC DR Solution
The 3DC DR Solution provides the following benefits:
• Maintains data currency
  ◦ When using the 3DC Sync/CAJ solution, synchronous replication over a short distance in a Metrocluster environment provides the highest level of data currency and application availability without significant
• Allows recovery even when a disaster exceeds regional boundaries or is of extended duration. A wide-area disaster could disable both data centers in the Metrocluster, but with semi-automatic functionality the operations can be shifted to the third site and continue unaffected by the disaster.
• Allows for additional staff at the remote data center outside the disaster area. A wide-area disaster affects people located within the disaster area, both professionally and personally.
Figure 37 Three Data Center Architecture (shows the quorum server, Nodes 1 through 6, heartbeat and standby LANs, the WAN, DWDM links, FC switches, FC-IP converters, and the arrays at the three data centers; A: Cluster Heartbeat Network and Storage Connections; B: Redundant DWDM links routed differently)
An application is deployed in the 3DC DR Solution by configuring i
over to in case of a disaster in its DC1 site. This hot-standby site is also referred to as the DC2 site for the application. An application must have its DC1 and DC2 sites within the Metrocluster. A second standby site (far site), where the recovery cluster is located, is referred to as the DC3 site for the application. In case of a disaster affecting the application's DC1 and DC2 sites, the application can be recovered at the recovery cluster in its DC3 site.
Figure 38 Applications Deployed in a 3DC DR Solution using Tri-Link Configuration (shows the Metrocluster and the Serviceguard recovery cluster within the Continental Cluster, with FC switches and arrays at each site, synchronous and journal replication for application A, and journal replication for application B)
A Continuous Access Synchronous device group pair must be established when using 3DC Sync/CAJ replication to replicate data between the arrays at the application's DC1 and DC2 sites.
Active-CAJ pair: The Continuous Access Journal device group pair from DC1 or DC2 to DC3 over which the replication is in progress. Delta Resync pair: The Delta Resync pair configured between DC1 or DC2 and DC3. If the Active-CAJ pair is configured between DC1 and DC3 then the Delta Resync pair must be configured between DC2 and DC3. Similarly, if the Active-CAJ pair is configured between DC2 and DC3 then the Delta Resync pair must be configured between DC1 and DC3.
Figure 39 Applications Deployed in a 3DC DR Solution using Bi-Link Configuration (shows the Metrocluster and the Serviceguard recovery cluster within the Continental Cluster, with FC switches and arrays at each site, synchronous replication for application B, and journal replication for applications A and B)
A Continuous Access Synchronous device group pair must be established when using 3DC Sync/CAJ replication to replicate data between the arrays at the application's DC1 and DC2 sites.
When a Continuous Access Journal device group pair is configured between DC2 and DC3, the data is always replicated in the Multi-Hop topology from DC1. This configuration is called the Multi-Hop Bi-Link configuration. In a typical customer DRS environment, more than one application is usually configured to run in a Metrocluster/Continentalclusters. Depending on the application distribution in a 3DC environment, some applications can have Site 1 as their DC1.
If the data is being replicated from DC1 to DC2 and then from DC2 to DC3, and there is a link failure between DC2 and DC3, data is not replicated to DC3. The DGM identifies the link failure, notifies the user, and performs appropriate actions based on the monitor's parameter settings. Configuring the DGM is optional but strongly recommended. The DGM runs as a package service. The user can configure the monitor's settings through the DGM-specific parameters in the package configuration file.
13. Configure a recovery package in the recovery cluster using the 3DC modules.
14. Configure Continentalclusters and a Continentalclusters Recovery Group.
NOTE: This section provides information about configuring a single-instance application in a 3DC environment. For configuring a complex workload in a 3DC environment using SADTA, in addition to this section, see "Deploying a Complex Workload in Three Data Center Solution using SADTA" (page 126).
HORCM_MON
Enter the host name or IP address of the system on which you are editing the file and the UDP port number specified for this Raid Manager instance in the /etc/services file. For example:
HORCM_MON
# ip_address   service   poll(10ms)   timeout(10ms)
NodeA          horcm0    1000         3000
HORCM_CMD
Enter the primary and alternate link device file names for both the primary and redundant command devices (for a total of four raw device file names).
10. Define a device group that contains all of these devices. For normal Three Data Center operations, a package requires three different device groups for the configuration.
Figure 40 Sample Multi-Target Bi-Link Configuration (shows Node A with the P-VOL and P-Jnl at Datacenter 1, Node B with an S-VOL at Datacenter 2, and Node C with an S-VOL and S-Jnl at Datacenter 3; dg12 is the CA Sync pair, dg13 the CA-Journal pair, dg23 the phantom device group, with MU# 0 through 2 as marked)

Sample Raid Manager Configuration on a DC1 NodeA (multi-target Bi-Link)
HORCM_MON
#ip_address   service   poll(10ms)   timeout(10ms)
NodeA         horcm0    1000         3000
HORCM_CMD
#dev_name            dev_name   dev_name
/dev/rdsk/c6t12d0    /dev/rdsk/c9t12d0
HORCM_DEV
#dev_gr

Sample Raid Manager Configuration on a DC2 NodeB (multi-target Bi-Link)
HORCM_MON
#ip_address   service   poll(10ms)   timeout(10ms)
NodeB         horcm0    1000         3000
HORCM_CMD
#dev_name            dev_name   dev_name
/dev/rdsk/c21t8d0    /dev/rdsk/c24t8d0
HORCM_DEV
#dev_group   dev_name   port#   TargetID   LU#
dg12         dg12_d0    CL1-A   13
#Phantom device group dg23
dg23         dg23_d0    CL1-A   13
HORCM_INST
#dev_group   ip_address      service
dg12         NodeA.dc1.net   horcm0
dg23         NodeC.dc3.net   horcm0

Figure 41 Multi-Hop Bi-Link (1:1:1) (shows Node A with the P-VOL at Datacenter 1, Node B with the S/P volume and P-Jnl at Datacenter 2, and Node C with the S-VOL and S-Jnl at Datacenter 3; dg12 is the CA Sync pair, dg23 the CA Journal pair, and dg13 the phantom device group, with MU# omitted, h1, and h2 as marked)

Sample Raid Manager Configuration on a DC1 NodeA (multi-hop-Bi-Link)
HORCM_MON
#ip_address   service   poll(10ms)   timeout(10ms)
NodeA         horcm0    1000         3000
HORCM_CMD
#dev_name            dev_name   dev_name
/dev/rdsk/c6t12d0    /dev/rdsk/c9t12d0
HORCM_DEV
#dev_gr

Sample Raid Manager Configuration on a DC3 NodeC (multi-hop-Bi-Link)
HORCM_MON
#ip_address   service   poll(10ms)   timeout(10ms)
NodeC         horcm0    1000         3000
HORCM_CMD
#dev_name           dev_name   dev_name
/dev/rdsk/c6t2d0    /dev/rdsk/c8t2d0
HORCM_DEV
#dev_group   dev_name   port#   TargetID   LU#
dg23         dg23_d0    CL2-A   0
#Phantom device group dg13
dg13         dg13_d0    CL2-A   0
HORCM_INST
#dev_group   ip_address      service
dg23         NodeB.dc2.net   horcm0
dg13         NodeA.dc1.net   horcm0

/dev/rdsk/c21t8d0    /dev/rdsk/c24t8d0
HORCM_LDEV
#dev_group   dev_name   Serial#   CU:LDEV(LDEV#)
dg12         dg12_d0    10048     02:05
dg23         dg23_d0    10048     02:05
HORCM_INST
#dev_group   ip_address      service
dg12         NodeA.dc1.net   horcm0
dg23         NodeC.dc3.net   horcm0
first create the Continuous Access Sync or Journal pair between DC1 and DC2, and then the Continuous Access Journal pair between DC1 and DC3. Then, create a Delta Resync pair between DC2 and DC3 after creating the Continuous Access Journal pairs. To create pairs with the Delta Resync pair between DC2 and DC3, use the following procedure:
1. Create the Continuous Access Sync or Journal device group between DC1 and DC2.
   a.
NOTE: HP Storage XP 3DC Sync/CAJ with Delta Resync replication does not allow the creation of the Delta Resync pair when data is being replicated in the Multi-Hop topology.
To create pairs with the Delta Resync pair between DC1 and DC3 when using 3DC Sync/CAJ replication:
1. Create the Continuous Access Synchronous device group between DC1 and DC2.
   a. Create the Continuous Access Synchronous pair dg12 from any node in DC1:
      # paircreate -g dg12 -f <fence level> -vl -c 15
   b.
Creating Device Group Pairs for Multi-Target Bi-Link configuration
For a Multi-Target Bi-Link configuration, first create the Continuous Access Sync or Continuous Access Journal pair between DC1 and DC2. Then create the Continuous Access Journal pair between DC1 and DC3. Use the following procedure to create the pairs:
1. Create the Continuous Access Sync or Journal device group between DC1 and DC2.
   a.
1. Define the appropriate Volume Groups on all cluster nodes that run the application package.
   # mkdir /dev/vgxx
   # mknod /dev/vgxx/group c 64 0xnn0000
   where the VG name and minor number nn are unique for each volume group defined in the node.
2. Create the Volume Group only on one node in the primary data center (DC1), as in the sketch below.
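A minimal sketch of step 2, run on one DC1 node only, assuming a hypothetical disk device c4t0d1 and the volume group /dev/vgxx:
# pvcreate /dev/rdsk/c4t0d1
# vgcreate /dev/vgxx /dev/dsk/c4t0d1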
5. Verify the configuration.
   # vxprint -g logdata
6. Make the file system.
   # newfs -F vxfs /dev/vx/rdsk/logdata/logfile
7. Create a directory to mount the volume group.
   # mkdir /logs
8. Mount the volume group.
   # mount /dev/vx/dsk/logdata/logfile /logs
9. Check that the file system exists, and then unmount the file system.
   # umount /logs
10. Deport the disk group on the primary node.
For the Multi-Target Bi-Link configuration, specify "multi-target-bi-link":
   3DC_TOPOLOGY multi-target-bi-link
For the Multi-Hop Bi-Link configuration, specify "multi-hop-bi-link":
   3DC_TOPOLOGY multi-hop-bi-link
a. Configure the parameters for the Device Group Monitor in the package configuration file. Specify the polling interval for the DGM. If the parameter is not defined (commented out), the default value is 10 minutes. Otherwise, the value will be set to the desired polling interval in minutes. MON_POLL_INTERVAL Specify the value for MON_NOTIFICATION_FREQUENCY. This value is used to control the frequency of notification message when the state of the device group remains the same.
NOTE: While the Device Group Monitor (DGM) is used, the online package re-configuration of the DGM parameters is not supported. The online package re-configuration of the 3DC parameters is supported only for the following: • AUTO_SVOLPSUS • AUTO_SVOLPSUE • AUTO_SVOLPFUS • AUTO_PSUSSSSWS • AUTO_PSUEPSUS • AUTO_NONCURDATA • AUTO_FENCEDATA_SPLIT When DGM is not used, the online package re-configuration of all 3DC parameters is supported. b.
The Metrocluster environment file is automatically generated on all nodes when this package configuration is applied in the cluster.
CAUTION: Do not delete or edit the Metrocluster environment file that is generated. This file is crucial for 3DC operations.
Complete the following procedure on a node in the recovery cluster to configure a recovery package using the 3DC modules:
1.
j. Specify the MU# used by the device group configured between DC2 and DC3 (DC2_DC3_DEVICE_GROUP) for the DC2_DC3_DEVICE_GROUP_MU parameter:
   DC2_DC3_DEVICE_GROUP_MU <MU#>
k. Specify the MU# used by the device group configured between DC1 and DC3 (DC1_DC3_DEVICE_GROUP) for the DC1_DC3_DEVICE_GROUP_MU parameter:
   DC1_DC3_DEVICE_GROUP_MU <MU#>
l. Specify one of the following values for the FENCE parameter:
   async if the Continuous Access Journal pair is configured between DC1 and DC2.
# cmapplyconcl -C <Continentalclusters configuration file>
You have now configured an application in the 3DC solution. At any point in time, you can preview the data replication storage failover on any node in the 3DC solution using the procedure described in "Previewing the Data Replication Storage Failover by Using cmdrprev" (page 140).
Figure 42 Package view for SADTA configuration in a 3DC solution
In the 3DC Solution, a complex workload is configured redundantly by configuring it at all sites. A separate set of packages is used to configure the complex workload at each site. A Site Controller Package is created in both the primary cluster (Metrocluster between Site1 and Site2) and the recovery cluster (Serviceguard cluster at Site3) to manage the complex workload.
Figure 43 Recovery of a complex workload after primary cluster failure
Figure 42 (page 127) illustrates a complex workload with 3DC replication in a Multi-Hop Bi-Link configuration. You can also have the 3DC replication in a Multi-Target Bi-Link or Tri-Link configuration.
IMPORTANT: In this section, subsequent topics describe configuring complex workloads in 3DC with SADTA using Oracle RAC as an example. These topics also explain how to configure volume managers such as SLVM and LVM in this solution.
between the sites in the primary cluster. A Site Controller Package is created in the recovery cluster to manage the RAC database configured in the recovery cluster. A recovery group is configured in Continentalclusters with the Site Controller Package in the primary cluster as a primary package and the Site Controller Package in the recovery cluster as a recovery package. Figure 44 (page 129) illustrates a sample Oracle RAC configuration in the 3DC solution.
   a. Configure Oracle RAC in the recovery cluster.
   b. Configure the Site Controller Package in the recovery cluster.
   c. Configure the Site Safety Latch Dependencies in the recovery cluster.
7. Configure Continentalclusters.
8. Configure a Continentalclusters Recovery Group.
For information on configuring the RAID Manager configuration files in the primary and recovery cluster, see "Creating Device Group Pairs" (page 131).

Creating Device Group Pairs
After configuring the Raid Manager configuration files on all nodes in the primary and recovery cluster, you must create Continuous Access device group pairs. An application configured for a 3DC solution may contain either three or two device group pairs, based on the Tri-Link or Bi-Link configuration respectively.
   /etc/cmcluster/hrdb_sc/hrdb_sc.config
2. Edit the hrdb_sc.config file and specify the following:
   • The name for the package_name attribute: package_name hrdb_sc
   • The names of the nodes explicitly, using the node_name attribute.
   • The Site Controller Package directory for the dts/dts/dts_pkg_dir attribute: dts/dts/dts_pkg_dir /etc/cmcluster/hrdb_sc
     This is the package directory for this Site Controller Package. The Metrocluster environment file is automatically generated for this package in this directory.
Configuring Site Safety Latch Dependencies in the Primary Cluster
After the Site Controller Package configuration is applied, the corresponding Site Safety Latch is also configured automatically in the cluster. The Site Safety Latch dependencies must be configured. For more information on configuring these dependencies in the primary cluster, see "Configuring the Site Safety Latch Dependencies".
5. Test the RAC database in the recovery cluster.
6. Resume replication to the recovery cluster.

Suspend replication to the recovery cluster
A RAC database using the replicated disk in the recovery cluster must be configured with the RAC MNP stack. Before preparing a RAC database in the recovery cluster, first split the data replication such that the disk at the recovery cluster is in Read/Write mode.
4. Set up the second RAC database instance on the recovery cluster. In this example, run the following commands from the second RAC database instance node in Site 3.
   # cd <ORACLE_HOME>/dbs
   # ln -s /oradata/hrdb/orapwhrdb orapwhrdb2
   # chown oracle:oinstall inithrdb2.ora
   # chown -h oracle:oinstall orapwhrdb2
5. Create the Oracle admin directory on the nodes in Site 3.
3. Halt the RAC database in the recovery cluster.
4. Start the Site Controller Package in the primary cluster.

Halt the Site Controller Package in the primary cluster
You must halt the Site Controller Package and the RAC database in the primary cluster before testing the RAC database startup in the recovery cluster. Halt the Site Controller Package running on the primary cluster using the cmhaltpkg command. This command halts the RAC database running in the primary cluster.
Complete the following procedure on a node in the recovery cluster to configure the Site Controller Package for the RAC database in the recovery cluster:
1. Create a Site Controller Package configuration file using the dts/sc and dts/recovery_xpca3dc modules.
   # cmmakepkg -m dts/sc -m dts/recovery_xpca3dc \
     /etc/cmcluster/hrdb_sc/hrdb_sc.config
2. Edit the hrdb_sc.config file.
◦ WAITTIME ◦ AUTO_NONCURDATA For more information on the 3DC parameters, see “Configuring Recovery Packages in the Recovery Cluster” (page 123). 3. Apply the empty Site Controller Package configuration file in the recovery cluster. # cmapplyconf -P /etc/cmcluster/hrdb_sc/hrdb_sc.config IMPORTANT: Ensure there are no packages configured with the critical_package or managed_package attributes in the Site Controller Package configuration file.
site site3
critical_package site3_hrdb
managed_package site3_hrdb_dg
managed_package site3_hrdb_mp
NOTE:
• Do not add any comments after specifying the critical and managed packages.
• Always set the auto_run parameter to yes for failover packages configured as critical or managed packages.
• Packages configured with mutual dependency must not be configured as critical or managed packages.
4. Re-apply the Site Controller Package configuration.
# cmapplyconf -v -P /etc/cmcluster/hrdb_sc/hrdb_sc.config
Previewing the Data Replication Storage Failover by Using cmdrprev
To preview the data replication storage failover and to identify potential problems that can cause a Metrocluster package failover or a Continentalclusters recovery to fail, run the following command on all nodes in the 3DC solution:
# cmdrprev {-p <package_name> | -e <environment_file>}
The cmdrprev command previews the failover of the data replication storage and indicates whether the storage preparation would succeed in the event of an actual failover.
When run on a Metrocluster node at the site connected to the target storage of the DC1-DC2 replication pair, the cmdrprev command previews data replication preparation for a Metrocluster remote failover. When run on a Metrocluster node connected to the source storage of the DC1-DC2 replication pair, the command previews the data replication preparation for a local failover.
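For example, a minimal sketch using the hrdb_sc package name from the examples in this chapter; run it once on a node attached to the target storage and once on a node attached to the source storage to preview both the remote and the local failover preparation:
# cmdrprev -p hrdb_sc
Appendix E shows sample output of this command.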
9. If the Delta Resync pair was deleted as part of Step 2, follow these steps:
• Re-create the Delta Resync pair, if it was deleted in Step 1, using the paircreate command from a DC3 node:
# paircreate -g <dg_name> -vr -f async -jp <journal_id> -js <journal_id> -nocsus
• Assign Remote Command devices to the journal volumes of the Delta Resync pair using Remote Web Console.
10. If the Delta Resync pair was deleted as part of Step 2:
• Re-create the Delta Resync pair deleted in Step 1 using the paircreate command from a DC3 node:
# paircreate -g <dg_name> -vr -f async -jp <journal_id> -js <journal_id> -nocsus
• Assign Remote Command devices to the journal volumes of the Delta Resync pair using Remote Web Console. For details on assigning remote command devices to journal volumes, see “Assigning Remote Command Devices to Journal Volumes” (page 118).
11.
Figure 46 Package Failover with data being replicated to DC3 from DC1
NOTE: When the application package is running on DC1 and the data is being replicated to DC3 from DC1 (that is, Multi-Target topology) as shown in Figure 9, upon the failure of all nodes in DC1, the application package fails over to a node in DC2. As part of the package startup, the data replication between DC1 and DC2 is reversed and the data starts replicating from DC2 to DC1. The firmware then internally suspends the Active-CAJ pair.
NOTE: In Figure 47 (page 144), when 3DC CAJ/CAJ replication is used, writes to the disk at DC2 are not accepted until the Delta Resync pair is re-synchronized. If the 3DC DR software fails to resynchronize the Delta Resync pair, the application package fails to come up. In this case, even though the DC1-DC2 device group pair is in the SSWS state at DC2, writes to the disk are rejected.
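To confirm the pair states described above from a DC2 node before retrying the package, the device groups can be queried with pairdisplay; a minimal sketch, assuming the dg12 and dg23 device group names used in the examples in this chapter (the -l option limits the display to the local host's view):
# pairdisplay -g dg12 -l
# pairdisplay -g dg23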
8. Enable the package on all nodes in the primary cluster.
9. Start the package on its primary node in the primary cluster.
Recovering the latest data from DC3
Recovering the latest data from DC3, as described in Step 6 above, guarantees that the latest data from DC3 is replicated to DC1 before the packages are run on DC1. The process varies depending on whether the 3DC configuration uses a Multi-hop Bi-Link, a Multi-target Bi-Link, or a Tri-Link configuration.
3. Resynchronize the sync device group to get the latest data from DC2 to DC1. Log on to any node at DC2.
• If, in Step 1, the dg12 pair was brought to the SMPL state:
a. Create the DC1-DC2 Sync device group when using 3DC Sync/CAJ replication.
# paircreate -g dg12 -f never/data -c 15 -vl
Or create the DC1-DC2 Journal device group when using 3DC CAJ/CAJ replication.
# paircreate -g dg12 -f async -c 15 -vl -jp <journal_id> -js <journal_id>
b. Wait for the PAIR state to come up for the dg12 device group.
a. Check the pair status of the dg12 device group.
# pairvolchk -g dg12 -s
b. If the local dg12 volume is in the PVOL or SVOL-SSWS state:
# pairsplit -g dg12
Go to Step 2.
If the local dg12 volume is in a state other than SVOL-SSWS or PVOL, perform an SVOL takeover to make the local volume SVOL-SSWS or PVOL.
# horctakeover -g dg12 -S
# pairsplit -g dg12
Go to Step 2. If the above command fails, split the pair to SMPL.
# pairsplit -g dg12 -S
2. Resync data from DC3 to DC1.
# pairsplit -g dg12
b. Resync the device group dg12.
# pairresync -g dg12
c. Wait for the PAIR state to come up.
# pairevtwait -g dg12 -t 300 -s pair
• If, in Step 1, the dg12 pair was not split to SMPL and the dg12 local volume is SVOL-SSWS:
# pairresync -g dg12 -c 15 -swaps
a. Wait for the PAIR state to come up.
# pairevtwait -g dg12 -t 300 -s pair
NOTE: See the “HP StorageWorks RAID Manager XP User's Guide” or the “HP StorageWorks P9000 RAID Manager User Guide” for an explanation of the different command options.
# pairvolchk -g dg13 -s -c
2. Get the recent data from DC3 to DC1 by creating a new journal pair between DC3 and DC1.
a. Create a DC1-DC3 device group pair with DC3 as the PVOL side. Log in to any DC3 node and perform the following:
# paircreate -g dg13 -f async -vl -c 15 -jp <journal_id> -js <journal_id>
b. Wait for the PAIR state to come up for the Journal device group.
# pairevtwait -g dg13 -t 300 -s pair
c.
NOTE: Do not use Serviceguard to fail back from DC3. You must take manual steps to replicate data back from DC3. See “Failback Scenarios” (page 348).
In a three data center configuration, whenever a package tries to start up a RAID Manager instance on a host, that host communicates with the RAID Manager instances in the other data centers.
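If the RAID Manager instance needs to be started or checked manually on a node, a minimal sketch follows; it assumes instance 0, matching the horcm0.conf sample files in the appendixes:
# horcmstart.sh 0
# raidqry -l
raidqry -l only confirms that the local instance answers; connectivity to the remote instances still depends on the links between the data centers.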
Data Maintenance with the Failure of a Metrocluster with Continuous Access for P9000 and XP Failover
The following section describes data maintenance in the event of a Swap Takeover failure in a Metrocluster Continuous Access P9000 or XP environment.
A Checklist and worksheet for configuring a Metrocluster with Continuous Access for P9000 and XP
Disaster recovery checklist
Use this checklist to ensure you have adhered to the disaster tolerant architecture guidelines for a configuration with two main data centers and a third location.
Data centers A and B have the same number of nodes to maintain quorum in case an entire data center fails.
Arbitrator nodes or Quorum Server nodes are located in a separate location from either of the primary data centers (A or B).
Timing Parameters _____________________________________________________ Member Timeout: ___________________________________________________ Network Polling Interval: ____________________________________________ AutoStart Delay: ____________________________________________________ Package configuration worksheet Use this package configuration worksheet either in place of, or in addition to the worksheet provided in the latest version of the Managing Serviceguard manual available at http://www.hp.
__________________________________________________________________________ Device Group: _____________ Fence Level: _____________ Raid Manager Instance#: _____________ Legacy package configuration worksheet Package configuration file data ________________________________________________________ Package configuration File data _________________________________________________________ Package Name: _________________________________________________________ Primary Node: _________________________ Data Center:
Table 15 Site configuration (columns: Item, Site, Site)
• Site Physical Location: Name of the location
• Site Name: One-word name for the site that is used in configurations
• Node Names (1, 2): Names of the nodes to be used for configurations
• 1st Heartbeat Subnet IP: IP address of the node on the 1st Serviceguard heartbeat subnet
• 2nd Heartbeat Subnet IP: IP address of the node on the 2nd Serviceguard heartbeat subnet
Replication Configuration
Table 16 Replication configuration (columns: Item, Data)
• Replication RAID Device Group: 1) 2) 3) 4) 5) 6) 7) 8) 9) 10)
CRS Sub-cluster Configuration – using CFS
Table 17 Configuring a CRS Sub-cluster using CFS (columns: Item, Site, Site)
• CRS Sub Cluster Name: Name of the CRS cluster
• CRS Home: Local FS path for CRS HOME
• CRS Shared Disk Group name: CVM disk group name for the CRS shared disk
• CRS cluster file system mount point: Mount point path where the vote and OCR are created
• CRS Vote Disk: Path to the vote disk or file
• CRS OCR Disk: Path to the OCR disk
• Private IPs: IP addresses for RAC Interconnect
• Private IP names: IP address names for RAC Interconnect
• Virtual IP: IP addresses for RAC VIP
• Virtual IP names: IP address names for RAC VIP
RAC Database Configuration
Table 18 RAC database configuration (columns: Property, Value)
• Database Name: Name of the database
• Database Instance Names: Instance names of the database
• RAC data files file system mount point: Mount point for Oracle RAC data files
• RAC data files MP MNP: CFS MP MNP package name for the RAC data files file system
• RAC Flash Area DG MNP: CFS DG MNP package name for the RAC flash file system
• RAC Flash Area MP MNP: CFS MP MNP package name for the RAC flash file system
• Node Names
• Database Instance Names
Site Controller Package Configuration
Table 19 Site Controller package configuration
• PACKAGE_NAME: Name of the Site Controller package
• Site Safety Latch /dts/mcsc/<site_name>: Name of the EMS resource.
B Package attributes for Metrocluster with Continuous Access for P9000 or XP This appendix lists all Serviceguard package attributes for Metrocluster with Continuous Access for P9000 or XP. HP recommends that you use the default settings for most of these variables, so exercise caution when modifying them. AUTO_FENCEDATA_SPLIT (Default = 1) This parameter applies only when the fence level is set to DATA, which causes the application to fail if the Continuous Access link fails or if the remote site fails.
determine that the data is current, it will not allow the package to start up. (Note: For fence levels DATA and NEVER, the data is current when both PVOL and SVOL are in the PAIR state.) 1 – Start up the application even when the data might not be current.
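These parameters are set in the package's Metrocluster environment file. The following is a minimal sketch of how such settings might appear, assuming the file uses shell-style assignments and a hypothetical package directory; the values shown are the documented defaults:
# Hypothetical excerpt from /etc/cmcluster/<package_name>/<package_name>_xpca.env
AUTO_FENCEDATA_SPLIT=1
AUTO_SVOLPSUS=0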
RCU. Data that does not make it to the RCU is stored on the bit map of the MCU. When failing back to the primary site any data that was in the MCU side file that is now stored on the bit map is lost during resynchronization. In synchronous mode with fence level NEVER, when the Continuous Access link fails, the application continues running and writing data to the PVOL. At this point the SVOL contains non-current data.
If the package is configured for a three data center (3DC) environment, this parameter is applicable only when the package is attempting to start up in either the primary (DC1) or secondary (DC2) data center. This parameter is not relevant in (the third data center) in the recovery cluster. Use this parameter’s default value in the third data center.
data center does not support Asynchronous mode of data replication. Leave this parameter with its default value in all data centers.
AUTO_SVOLPSUS (Default = 0)
This parameter applies when the PVOL and SVOL both have the suspended (PSUS) state. The problem with this situation is that the earlier state (COPY or PAIR) cannot be determined. If the earlier state was PAIR, it is completely safe to start up the package at the remote site.
but a phantom device group must be present between DC1 and DC3. • multi-target-Bi-Link: When the package is configured for 1:2 or multi target topology with two Continuous Access links. One, the sync Continuous Access link, is between DC1 and DC2 and the other, the journal Continuous Access link, is between DC1 and DC3. There is no physical Continuous Access link present between DC2 and DC3 but a phantom device group must be present between DC2 and DC3.
FENCE
Fence level. Possible values are NEVER, DATA, and ASYNC. Use ASYNC for improved performance over long distances. If a Raid Manager device group contains multiple items where either the PVOL or SVOL devices reside on more than a single P9000 or XP Series array, then the fence level must be set to “data” to prevent the possibility of inconsistent data on the remote side if a Continuous Access link or an array goes down.
then the package startup timeout value must be greater than the HORCTIMEOUT value, which is greater than the Continuous Access link timeout value: Pkg Startup Timeout > HORCTIMEOUT > Continuous Access link timeout value For Continuous Access Journal mode package, journal volumes in PVOL might contain a significant amount of journal data to be transferred to SVOL. Also, the package startup time might increase significantly when the package fails over and waits for all of the journal data to be flushed.
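As an illustration of this ordering only (the values below are hypothetical and must be sized for the actual array, link, and journal configuration; run_script_timeout is assumed to be the package startup timeout referred to above):
# Package configuration file (hypothetical value)
run_script_timeout 900
# Metrocluster environment file (hypothetical value)
HORCTIMEOUT=600
# Continuous Access link timeout configured on the array: for example, 300 seconds,
# so that 900 > 600 > 300 holds.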
Table 21 AUTO_NONCURDATA (continued) Local State Remote State Fence Level PVOL_PFUS EX_ENORMT ASYNC PVOL_PFUS EX_CMDIOE data in the control log file of the package. SVOL_PAIR PVOL_PSUE EX_ENORMT EX_CMDIOE NEVER/DATA/ASYNC Do not start with exit 1 PVOL_PSUE EX_ENORMT EX_CMDIOE ASYNC Perform SVOL takeover, which changes SVOL to PSUS (SSWS). After the takeover succeeds, package starts with a warning message about non-current data in the package log file.
Table 25 AUTO_SVOLPSUE
Local State: SVOL_PSUE
Remote State: PVOL_PSUS, EX_ENORMT, or EX_CMDIOE
Fence Level: NEVER/DATA/ASYNC
AUTO_SVOLPSUE=0 (Default): Do not start with exit 2.
AUTO_SVOLPSUE=1 or FORCEFLAG=yes: Perform an SVOL takeover, which changes the SVOL to PSUS (SSWS). After the takeover succeeds, the package starts with a warning message about non-current data in the control log file of the package.
notifications to the syslog file. If the parameter is not defined (commented out), the default value is 0. MON_NOTIFICATION_CONSOLE (Default = 0) This parameter defines whether the monitor sends console notifications. When the parameter is set to 0, the monitor will NOT send console notifications. When the parameter is set to 1, the monitor sends console notifications. If the parameter is not defined (commented out), the default value is 0.
module_version 1
module_name dts/dts
module_version 1
module_name dts/xpca
module_version 1
package_type failover
node_name *
auto_run yes
node_fail_fast_enabled no
run_script_timeout no_timeout
halt_script_timeout no_timeout
successor_halt_timeout no_timeout
script_log_file /etc/cmcluster/logg
operation_sequence /etc/cmcluster/scripts/dts/mc.sh
operation_sequence /etc/cmcluster/scripts/dts/xpca.sh
operation_sequence $SGCONF/scripts/sg/volume_group.
C Legacy packages
Configuring legacy Metrocluster package
To configure a legacy package:
1. Create a directory /etc/cmcluster/<package_name> for every package.
# mkdir /etc/cmcluster/<package_name>
2. Create a package configuration file.
# cd /etc/cmcluster/<package_name>
# cmmakepkg -p <package_name>.config
Customize the package configuration file as appropriate to your application. Be sure to include the pathname of the control script (/etc/cmcluster/<package_name>/<package_name>.cntl) for the RUN_SCRIPT and HALT_SCRIPT parameters.
NOTE: If you do not use a package name as the filename for the package control script, you must follow the environment file naming convention. The environment file name is the combination of the file name of the package control script without the file extension, an underscore, and the type of data replication technology (xpca) used. The extension of the file must be env. The following example demonstrates how the environment file name must be chosen. For example, if the file name of the control script is pkg.cntl, the environment file must be named pkg_xpca.env.
Using ftp might be preferable at your organization, because it does not require the use of a .rhosts file for root. Root access via the .rhosts file might create a security issue.
11. Verify that every node in the Serviceguard cluster has the following files in the directory /etc/cmcluster/<package_name>:
<package_name>.cntl Serviceguard package control script
<package_name>_xpca.env Metrocluster/Continuous Access environment file
<package_name>.config Serviceguard package ASCII configuration file
5. Validate the package configuration file.
# cmcheckconf -P <package_configuration_file>
6. Apply the package configuration with the modular configuration file created in step 3.
# cmapplyconf -P <package_configuration_file>
7. Run the package on a node in the Serviceguard cluster.
# cmrunpkg -n <node_name> <package_name>
8. Enable global switching for the package.
# cmmodpkg -e <package_name>
where node1 and node2 are the nodes in the Source Disk Site.
Configuring the storage device for the complex workload at the Target Disk Site using SG SMS CFS or CVM
To import CVM disk groups on the nodes in the target disk site and to create CFS disk group and mount point MNP packages:
1. From the CVM master node at the target disk site, import the disk groups used by the complex workload.
# vxdg -stfC import <disk_group_name>
2.
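The remaining steps create the CFS disk group and mount point MNP packages at the target disk site. A minimal sketch, assuming the same disk group, volume, and mount point names that were used at the source disk site, and hypothetical package and node names:
# cfsdgadm add <disk_group_name> all=sw
# cfsmntadm add <disk_group_name> <volume_name> <mount_point> <mp_mnp_package_name> all=rw <node1> <node2>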
there are no dependents on the legacy CFS mount point MNP packages. If CFS mount point MNP packages have not been configured, this step ensures that there are no dependents on the legacy CVM disk group MNP packages:
# cmgetconf -p <package_name> > <output_file>
# cmdeleteconf -p <package_name>
• Use the cfsmntadm command to delete all the legacy disk group and mount point MNP packages managed by the Site Controller from a node in the recovery site.
3. Validate the package configuration file.
# cmcheckconf -P <package_configuration_file>
4. Apply the package configuration file.
# cmapplyconf -P <package_configuration_file>
D Sample configuration files for Metrocluster with Continuous Access for P9000 and XP A.11.00
Sample Raid Manager configuration file
The following is an example of a Raid Manager configuration file for one node (ftsys1).
## horcm0.conf.ftsys1 - This is an example Raid Manager configuration file for node ftsys1. Note that this configuration file is for Raid Manager instance 0, which can be determined by the "0" in the filename "horcm0.conf".
#/************************* HORCM_DEV *************************************/ # # The HORCM_DEV parameter is used to define the addresses of the physical # volumes corresponding to the paired logical volume names. every group # name is a unique name used by the hosts which will access the volumes. # # The group and paired logical volume names defined here must be the same for # all other (remote) hosts that will access this device group.
Sample file for configuring automatic Raid Manager startup --------------------------RAID MANAGER -----------------------------------# Metrocluster with Continuous Access Toolkit script for configuring the # startup parameters for a HP Disk Array XP Raid Manager # instance. The Raid Manager instance must be running before any # Metrocluster package can start up successfully. # # @(#) $Revision: 1.
# run the raidscan command on ftsys1a or ftsys2a that is connected to the
# remote XP array.
#
# The HORCM_LDEV parameters are used to describe stable LDEV# and Serial#
# as another way of HORCM_DEV used 'port#,Target-ID,LUN'.
# [Note]: This feature depends on the micro version of RAID or kind of RAID.
# This parameter is used to define the device group name for paired logical
# volumes.
E Sample output of the cmdrprev command
The following procedure shows you how to use the cmdrprev command to preview the data replication preparation for a package in an MC with P9000/XP environment. Run the following command on <node_name>:
$> /usr/sbin/cmdrprev -p demopkg1
Following is the output that is displayed:
cmdrprev: Info - File pkg_data_xpca.env found in /etc/cmcluster/demopkg1.
Feb 29 10:03:24 - Node : Package Environment File: /etc/cmcluster/demopkg1/pkg_data_xpca.env
F Sample configuration files for a 3DC Environment
Sample Raid Manager configuration files on DC1, DC2 and DC3 nodes for Multi-Target-Bi-Link configuration
Sample Raid Manager Configuration on DC1 NodeA
HORCM_MON
#ip_address service poll(10ms) timeout(10ms)
NodeA horcm0 1000 3000
HORCM_CMD
#dev_name dev_name dev_name
/dev/rdsk/c6t12d0 /dev/rdsk/c9t12d0
HORCM_DEV
#dev_group dev_name port# TargetID LU# MU#
dg12 dg12_d0 CL3-E 6
dg13 dg13_d0 CL3-E 6
HORCM_INST
#dev_group ip_address service
dg12 NodeB.dc2.net horcm0
HORCM_INST
#dev_group ip_address service
dg23 NodeB.dc2.net horcm0
dg13 NodeA.dc1.net horcm0
G Configuring Oracle RAC in SADTA
Overview of Metrocluster for RAC
The Oracle RAC database can be deployed in a Metrocluster environment for disaster tolerance using SADTA. This configuration is referred to as Metrocluster for RAC. In this architecture, a disaster tolerant RAC database is configured as two RAC databases that are replicas of each other, one at every site of the Metrocluster.
Because a disaster tolerant RAC database has two identical but independent RAC databases configured over the replicated storage in a Metrocluster, it is important to prevent the packages of both sites' RAC MNP stacks from being up and running simultaneously. If the packages of the redundant stack at both sites are running simultaneously, it leads to data corruption.
Summary of required procedures
This section summarizes the procedures required to configure Oracle RAC database in a SADTA. To set up SADTA in your environment:
1. Set up P9000 or XP data replication in your environment.
2. Install software for configuring Metrocluster. This includes:
a. Creating Serviceguard Clusters
b. Configuring Cluster File System-Multi-node Package (SMNP)
3. Install Oracle.
a. Install and configure Oracle Clusterware.
b. Install and configure Oracle Real Application Clusters (RAC).
c.
If using SLVM, create appropriate SLVM volume groups with the required raw volumes over the replicated disks.
b. Set up file systems for RAC database flash recovery.
If you have SLVM, CVM, or CFS configured in your environment, see the following documents available at http://www.hp.
CFS file system at the host for database storage management. As the underlying Serviceguard cluster is configured with sites, there are two CFS sub-clusters: one at Site A with membership from the SFO_1 and SFO_2 nodes, and the other at Site B with membership from the SJC_1 and SJC_2 nodes.
Table 27 CRS Sub-clusters configuration in the Metrocluster (continued)
• CRS Voting Disk: Site A = /cfs/sfo_crs/VOTE/vote, Site B = /cfs/sjc_crs/VOTE/vote
• CRS mount point: Site A = /cfs/sfo_crs, Site B = /cfs/sjc_crs
• CRS MP MNP package: Site A = sfo_crs_mp, Site B = sjc_crs_mp
• CRS DG MNP package: Site A = sfo_crs_dg, Site B = sjc_crs_dg
• CVM DG Name: Site A = sfo_crsdg, Site B = sjc_crsdg
• Private IPs: Site A = 192.1.7.1 SFO_1p.hp.com and 192.1.7.2 SFO_2p.hp.com, Site B = 192.1.8.1 SJC_1p.hp.com and 192.1.8.2 SJC_2p.hp.com
• Virtual IPs: Site A = 16.89.140.202 SFO_1v.hp.com, Site B = 16.89.141.202 SJC_1v.hp.com
MNP packages must be configured using the critical_package attribute, and the CFS MP MNP and CVM DG MNP database packages must be configured using the managed_package attribute. As a result, the Site Controller Package monitors only the RAC database MNP package and initiates a site failover when it fails.
HEARTBEAT_IP 192.1.3.1
NETWORK_INTERFACE lan3 #SG HB 2
HEARTBEAT_IP 192.1.5.1
NETWORK_INTERFACE lan4 #SFO_CRS CSS HB
STATIONARY_IP 192.1.7.1
NETWORK_INTERFACE lan5 #SFO_CRS CSS HB standby
NETWORK_INTERFACE lan1 # SFO client access
STATIONARY_IP 16.89.140.201
NETWORK_INTERFACE lan6 # SFO client access standby
NODE_NAME sfo_2
SITE san_francisco
NETWORK_INTERFACE lan2 #SG HB 1
HEARTBEAT_IP 192.1.3.2
NETWORK_INTERFACE lan3 #SG HB 2
HEARTBEAT_IP 192.1.5.
Run the following command on any node, at both sites, to view the list of nodes and the status of every node:
# cfscluster status
Following is the output that is displayed:
Node            : SFO_1
Cluster Manager : up
CVM state       : up (MASTER)
Node            : SFO_2
Cluster Manager : up
CVM state       : up
Installing and configuring Oracle clusterware
After setting up replication in your environment and configuring the Metrocluster, you must install Oracle Clusterware.
export ORACLE_BASE=/opt/app/oracle
export ORACLE_HOME=$ORACLE_BASE/product/10.2.0/db_1
export ORA_CRS_HOME=/opt/crs/oracle/product/10.2.
8. Mount the clustered file system on the site CFS sub-cluster.
# cfsmount /cfs/sfo_crs
9. Create the Clusterware OCR directory in the clustered file system.
# mkdir /cfs/sfo_crs/OCR
# chmod 755 /cfs/sfo_crs/OCR
10. Create the Clusterware VOTE directory in the clustered file system.
# mkdir /cfs/sfo_crs/VOTE
# chmod 755 /cfs/sfo_crs/VOTE
11. Set oracle as the owner for the Clusterware directories.
8. In the Specify Voting Disk Location screen, select External Redundancy and specify the CFS file system directory if you have an independent backup mechanism for the Voting Disk. To use the internal redundancy feature of Oracle, select Normal Redundancy and specify additional locations. In this example, for the SFO Clusterware sub-cluster, the location is specified as: /cfs/sfo_crs/VOTE/vote 9. Complete the remaining on-screen instructions to complete the installation.
4. On the Select Configuration Option screen, select the Install Database Software Only option.
5. Create a listener on both nodes of the site using Oracle NETCA. For more information about using NETCA to configure listeners in a CRS cluster, see the Oracle RAC Installation and Configuration user's guide.
After installing Oracle RAC, you must create the RAC database.
7. Create mount points for the RAC database data files and set appropriate permissions. # mkdir /cfs # chmod 775 /cfs # mkdir /cfs/rac 8. Create the Mount Point MNP packages. # cfsmntadm add hrdbdg rac_vol /cfs/rac sfo_hrdb_mp all=rw SFO_1\ SFO_2 9. Mount the cluster file system on the CFS sub-cluster. # cfsmount /cfs/rac 10. Create a directory structure for the RAC database data files in the cluster file system. Set proper permission and owners for the directory.
8. Create Mount Point MNP package for the cluster file system. # cfsmntadm add flashdg flash_vol /cfs/flash sfo_flash_mp all=rw SFO_1 SFO_2 9. Mount the RAC database flash recovery file system in the site CFS sub-cluster. # cfsmount /cfs/flash 10. Create directory structure in the cluster file system for the RAC database flash recovery area.
For example, to prepare the replication environment: 1. Ensure the replication is in PAIR state at the replication Target Disk Site node. The STATUS field must have PAIR value. In this example, run the following command from the SJC_1 or SJC_2 node: # pairdisplay -g hrdb_devgroup 2. Perform a replication swap take over from the Target Disk Site node. In this example, run the following command from the SJC_1 or SJC_2 node. # horctakeover -g hrdb_devgroup -t 360 3.
# chown oracle:oinstall inithrdb1.ora
3. Copy the second RAC database instance pfile from the source site to the second RAC database instance node at the target site. In this example, copy the RAC database instance pfile from the SFO_2 node to the SJC_2 node.
# cd /opt/app/oracle/product/10.2.0/db_1/dbs
# rcp -p inithrdb2.ora SJC_2:$PWD
The -p option retains the permissions of the file.
4. Set up the second RAC database instance on the target site. In this example, run the following commands from the SJC_2 node.
Controller Package. For more information about configuring the RAC database in MNP packages, see the Serviceguard Extension for Oracle RAC toolkit README. Halting the RAC database on the recovery cluster You must halt the RAC database on the Target Disk Site so that it can be restarted at the source disk site. Use the cmhaltpkg command to halt the RAC MNP stack on the replication Target Disk Site node. Deport the disk groups at the replication Target Disk Site nodes using the vxdg deport command.
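A minimal sketch of these two operations, run on a Target Disk Site node, with hypothetical package and disk group names (substitute the RAC MNP package and the disk groups configured for your database):
# cmhaltpkg <rac_db_mnp_package_name>
# vxdg deport <disk_group_name>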
IMPORTANT: Following are some guidelines that you must follow while configuring the Site Controller Package: • The default value of the priority parameter is set at no_priority. The Site Controller Package must not be subjected to any movement due to package prioritization. Do not change this default value. • The default value of the failover_policy parameter for the Site Controller Package is set to site_preferred.
After applying the Site Controller Package configuration, run the cmviewcl command to view the packages that are configured. Starting the disaster tolerant RAC database in the Metrocluster At this point, you have completed configuring SADTA in your environment with the Oracle Database 10gR2 RAC. This section describes the procedure to start the disaster tolerant RAC database in the Metrocluster. To start the disaster tolerant RAC database: 1.
gather information about service availability on the RAC servers and assist in making client connections to the RAC instances. Additionally, they provide failure notifications and load advisories to clients, thereby enabling fast failover of client connections and client-side load-balancing. These capabilities are facilitated by an Oracle 10g feature called Fast Application Notification (FAN). For more information about Fast Application Notification, see the following documents: http://www.oracle.
To configure the SGeRAC Cluster Interconnect packages:
1. Create a package directory on all nodes in the site.
# mkdir -p /etc/cmcluster/pkg/sfo_ic
2. Create a package configuration file and control script file. Use site-specific names for the files. You must follow the legacy package creation steps.
# cmmakepkg -p sfo_ic.conf
# cmmakepkg -s sfo_ic.cntl
3. Specify a site-specific package name in the package configuration file (see the sketch after this list).
4. Specify only the nodes in the site for the node_name parameter.
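A minimal sketch of these site-specific edits in the legacy package configuration file; the attribute names are standard legacy Serviceguard attributes, and the values are only illustrative, following the sfo_ic example above:
PACKAGE_NAME sfo_ic
NODE_NAME sfo_1
NODE_NAME sfo_2
RUN_SCRIPT /etc/cmcluster/pkg/sfo_ic/sfo_ic.cntl
HALT_SCRIPT /etc/cmcluster/pkg/sfo_ic/sfo_ic.cntl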
Figure 50 Sample Oracle RAC database with ASM in SADTA
The Oracle Clusterware software must be installed at every site in the Metrocluster.
3. Install and configure Oracle Clusterware.
4. Install Oracle Real Application Clusters (RAC) software.
5. Create the RAC database with ASM:
a. Configure ASM disk group.
b. Configure SGeRAC Toolkit Packages for the ASM disk group.
c. Create the RAC database using the Oracle Database Configuration Assistant.
d. Configure and test the RAC MNP stack at the source disk site.
e. Halt the RAC database at the source disk site.
6. Configure the identical ASM disk group at the remote site.
Installing Oracle RAC software The Oracle RAC software must be installed twice in the Metrocluster, once at every site. Also, the RAC software must be installed in the local file system in all the nodes in a site. To install Oracle RAC, use the Oracle Universal Installer (OUI). After installation, the installer prompts you to create the database. Do not create the database until you install Oracle RAC at both sites. You must create identical RAC databases only after installing RAC at both sites.
1. When using Oracle 11g R2 with ASM, the remote_listener for the database is set to <SCAN name>:<port> by default. However, in the Metrocluster for RAC configuration, the SCAN name is different for every site CRS sub-cluster. So, the remote_listener for the database must be changed to the net service name configured in the tnsnames.ora file for the database. This task must be done prior to halting the RAC database stack on the Source Disk Site:
a. Log in as the Oracle user.
# su - oracle
b.
# rcp -p orapw+ASM1 <node_name>:$PWD
The -p option retains the permissions of the file.
4. Set up the first ASM instance on the target disk site. In this example, run the following commands from node1 in site2.
# cd /opt/app/oracle/product/11.1.0/db_1/dbs
# ln -s /opt/app/oracle/admin/+ASM/pfile/init.ora init+ASM1.ora
# chown -h oracle:oinstall init+ASM1.ora
# chown oracle:oinstall orapw+ASM1
5. Copy the second ASM instance pfile and password file from site1 to the second ASM instance node in site2.
# chown oracle:oinstall inithrdb1.ora
3. Copy the second RAC database instance pfile and password file from the source site to the second RAC database instance node at the target disk site. In this example, run the following commands from the second node in site1:
# cd /opt/app/oracle/product/11.1.0/db_1/dbs
# rcp -p inithrdb2.ora <node_name>:$PWD
# rcp -p orapwhrdb2 <node_name>:$PWD
The -p option retains the permissions of the file.
4. Set up the second RAC database instance on the target disk site.
Configuring the Site Safety Latch dependencies
After the Site Controller Package configuration is applied, the corresponding Site Safety Latch is also configured automatically in the cluster. This section describes the procedure to configure the Site Safety Latch dependencies.
To configure the Site Safety Latch dependencies:
1. Add the EMS resource details in the ASM DG package configuration file, as sketched below.
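A minimal sketch of the EMS resource entries that step 1 refers to; the attribute names are the standard Serviceguard resource attributes, the polling interval, start mode, and up value shown here are placeholders rather than documented values, and the Site Safety Latch resource path follows the /dts/mcsc/<site_name> form shown in the Site Controller worksheet:
resource_name             /dts/mcsc/<site_name>
resource_polling_interval <seconds>
resource_start            <automatic_or_deferred>
resource_up_value         <value required for an open Site Safety Latch>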
failure is handled based on the manner in which the RAC MNP stack is configured with the Site Controller Package. When the RAC MNP package is configured as a critical_package, the Site Controller Package considers only the RAC MNP package status to initiate a site failover. Since the RAC MNP package fails when the contained RAC database fails, the Site Controller Package fails over to start on the remote site node and initiates a site failover from the remote site.
Adding nodes online on a primary site where the RAC database is running To add nodes online on a primary site where the RAC database package stack is running: 1. Install the required software on the new node and prepare the node for Oracle installation. 2. Halt the Site Controller Package in the DETACH mode to avoid unnecessary site failover of the RAC database. 3. Ensure that the new node can access the Clusterware OCR and VOTE disks, and Oracle database disks and add the node to the Serviceguard cluster.
3. Delete an instance from the RAC database. For more information about deleting an instance, see the documentation available at the Oracle documentation website.
4. Delete the RAC database software and Oracle Clusterware. For more information about deleting the RAC database and Oracle Clusterware, see the documentation available at the Oracle documentation website.
5. Remove the node from the node list of the Site Controller Package.
6. Run the cmhaltnode command to halt the cluster on this node.
The Site Controller Package starts on the preferred node at the site. At startup, the Site Controller Package starts the corresponding RAC MNP stack packages in that site that are configured as managed packages. After the RAC MNP stack packages are up, you must check the package log files for any errors that might have occurred at startup. If the CRS MNP instance on a node is not up, the RAC MNP stack instance on that node does not start.
MNP package can only be started by restarting the Site Controller Package. This is because the Site Safety Latch closes when the Site Controller Package halts.
Maintaining Oracle database RAC
A RAC database configured using SADTA has two replicas of the RAC database configuration, one at every site. The database configuration is replicated between the replicas using replicated storage. Most of the maintenance changes made at the site with the active database configuration are propagated to the other site.
Glossary A arbitrator Nodes in a disaster tolerant architecture that act as tie-breakers in case all of the nodes in a data center go down at the same time. These nodes are full members of the Serviceguard cluster and must conform to the minimum requirements. The arbitrator must be located in a third data center to ensure that the failure of an entire data center does not bring the entire cluster down.
disaster recovery The process of restoring access to applications and data after a disaster. Disaster recovery can be manual, meaning human intervention is required, or it can be automated, requiring little or no human intervention. disaster recovery services Services and products offered by companies that provide the hardware, software, processes, and people necessary to recover from a disaster. disaster tolerant The characteristic of being able to recover quickly from a disaster.
N network failover The ability to restore a network connection after a failure in network hardware when there are redundant network links to the same IP subnet. notification A message that is sent following a cluster or package event. O off-line data replication. Data replication by storing data off-line, usually a backup tape or disk stored in a safe location; this method is best for applications that can accept a 24-hour recovery time.
T transparent failover A client application that automatically reconnects to a new server without the user taking any action. transparent IP failover Moving the IP address from one network interface card (NIC), in the same node or another node, to another NIC that is attached to the same IP subnet so that users or applications might always specify the same IP name/address whenever they connect, even after a failure.