HP Serviceguard Toolkits for Database Replication Solutions User Guide HP Part Number: 5900-1878 Published: August 2011 Edition: 2
© Copyright 2011 Hewlett-Packard Development Company, L.P. Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license. The information contained herein is subject to change without notice.
Contents
1 Introduction
2 Serviceguard toolkit for Oracle Data Guard
Overview
Advantages
Dependencies
Documentation feedback
Related information
Typographic conventions
A To configure SSH connection without password for root user between two nodes
1 Introduction
The HP Serviceguard Toolkits for Database Replication Solutions User Guide describes the HP Serviceguard toolkit for Oracle Data Guard (ODG toolkit) and the HP Serviceguard toolkit for DB2 High Availability Disaster Recovery (DB2 HADR toolkit). NOTE: The product name used for the depot is “Serviceguard Disaster Recovery Toolkits for Databases”. However, the product name was changed to “HP Serviceguard Toolkits for Database Replication Solutions” to match the functionality delivered.
2 Serviceguard toolkit for Oracle Data Guard
Overview
The HP Serviceguard toolkit for Oracle Data Guard (ODG toolkit) facilitates easy integration of Oracle Data Guard (ODG) in an HP Serviceguard cluster for improved high availability and disaster recovery for an Oracle database. This toolkit contains scripts that manage the ODG primary and standby database instances.
Dependencies The ODG toolkit requires the ECMT Oracle toolkit to provide high availability to a single-instance ODG. Similarly, in an RAC environment, the ODG toolkit requires the SGeRAC toolkit to provide high availability to ODG RAC database instances. NOTE: For information about supportability and compatibility with various versions of Serviceguard, Toolkits and HP-UX, see the HP Serviceguard Toolkit Compatibility Matrix available at http://www.hp.com/go/hpux-serviceguard-docs.
The Oracle database is started using the ECMT Oracle toolkit. The Data Guard processes are then started using the ODG toolkit, after which, the application is monitored. If either the Oracle database or any of the Data Guard processes fail, the package fails over because the Oracle database and the Data Guard are integrated in a single package.
NOTE: The package parameter, START_MODE, must be set to mount when an ECMT Oracle toolkit is used in combination with an ODG toolkit. For an Active Data Guard, the standby database is started up to the [open] state. Set the ACTIVE_STANDBY parameter to [yes] if you have purchased the optional license to enable Active Standby functionality in the Oracle Data Guard Enterprise Edition. Active Data Guard is supported in Oracle database version 11gR1 or later.
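The START_MODE requirement in the note above can be checked before the package is applied. The following is a minimal sketch, not part of the toolkit. It assumes that the attribute appears in the package configuration file as ecmt/oracle/oracle/START_MODE (inferred from the ecmt/oracle/oracle/ prefix used in the sample configuration in this guide), and the function name check_start_mode is hypothetical.

```shell
# check_start_mode FILE
# Succeeds only if the package configuration file sets START_MODE to "mount".
# Assumption: the attribute is written as "ecmt/oracle/oracle/START_MODE <value>".
check_start_mode() (
    conf=$1
    # Use the value of the last uncommented START_MODE line in the file.
    mode=$(awk '$1 == "ecmt/oracle/oracle/START_MODE" { v = $2 } END { print v }' "$conf")
    if [ "$mode" = "mount" ]; then
        echo "START_MODE is mount"
    else
        echo "START_MODE is '${mode:-unset}', expected mount" >&2
        exit 1
    fi
)
```

For example, check_start_mode /etc/cmcluster/pkg/dgpkg/dgpkg.conf could be run before cmapplyconf. The file name dgpkg.conf is only an illustration.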
Figure 3 Data Guard replication between RAC primary package and single-instance stand-alone standby database
Figure 3 (page 10) shows a Data Guard configuration where the primary database is Oracle RAC and the standby is a single-instance database. The RAC primary is configured on nodes 1 and 2 of SG cluster 1. NOTE: A cluster can have more than two nodes. This example uses two nodes for simplicity.
Figure 4 Data Guard replication between RAC primary package and single-instance standby package
Figure 4 (page 11) shows a Data Guard configuration where the primary database is configured as an RAC and the standby database is a single-instance database. Both primary and standby databases are configured in separate Serviceguard clusters for high availability. The RAC primary is combined with the ODG toolkit and the SGeRAC toolkit in a single package.
In RAC standby, only one standby instance performs the task of applying the redo logs to the database. This instance is known as the recovery instance and all other standby instances are known as receiving instances. When the recovery instance fails, the method of restarting the redo apply depends on whether the Data Guard Broker is configured. If the Broker is used, the redo apply restarts automatically on the first available standby instance.
Figure 6 Continentalclusters environment
In an Oracle single-instance environment, shown in Figure 6 (page 13), all three packages (Primary, Data Receiver, and Recovery packages) are configured as failover packages. When the primary database fails, Serviceguard on the primary cluster fails over the database to another node within the primary cluster, thus providing high availability to the primary database.
Figure 7 RAC environment
In an RAC environment, shown in Figure 7 (page 14), all three packages (Primary, Data Receiver, and Recovery packages) are configured as Multi-Node packages (MNPs). When the primary cluster is down, the recovery package on the recovery cluster must be brought up by manually running the cmrecovercl command. This command halts the Data Receiver package, which in turn halts the standby database.
While Metrocluster provides storage-based data replication across metropolitan distances, organizations often like to build in additional data replicas, either locally or at other data centers. Performing both storage and ODG replication helps to safeguard customer data. ODG protects against logical errors, such as a dropped table, while storage replication replicates any such errors to the other replica.
cluster with two nodes and a shared disk. The standby database instance is started on any one node in the third location, and the database and the archived logs are located on the shared disk. If high availability is not needed for the standby database, the third location need not be configured as a Serviceguard cluster. In this situation, there is only one server at the third location and the standby database instance must be brought up manually.
NOTE: This configuration is supported in both single-instance and RAC environments. For simplicity, the packages in Figure 9 (page 16) are shown for a single-instance database.
Three data center configuration
Figure 10 Single-instance Data Guard setup in a Continentalclusters environment where the primary cluster is configured as a Metrocluster
Figure 10 (page 17) describes a Continentalclusters setup with two clusters spread over three different sites.
NOTE: This configuration is supported in both single-instance and RAC environments. For simplicity, the packages in Figure 10 (page 17) are shown for a single-instance database.
Configuring multiple instances of Oracle Data Guard
To support multiple instances of ODG single-instance or RAC databases in one Serviceguard cluster, all the instances must be configured so that they function independently.
NOTE: These configurations are applicable both in single-instance and RAC environment. High availability for data guard broker High availability for Data Guard Broker is supported only on RAC and not on single-instance database because Oracle does not support high availability for Data Guard Broker with vendor clusterware for single-instance Oracle database.
a. Select Action > Mark For Installation (m)
b. To install only a specific toolkit in “T2259AA”, see the options by pressing Enter and then select as appropriate.
4. Select Action > Install to initiate the installation.
To verify the installation completion, run the command:
# swlist -l product T2259AA
This command returns the list of toolkits that you selected during installation.
NOTE: • After the installation of ODG toolkit is complete, “hadg.sh”, “hadg_rac.sh”, “hadg_rac_cc.sh”, “hadg.
Table 2 Files created on installation of the HP Serviceguard toolkit for Oracle Data Guard File Name Description Available in Directory SGAlert.sh Alert Mail generation script Main Script in Single Instance Environment (hadg.sh) This script contains a list of internally used variables and functions that support the starting, stopping, and monitoring of an ODG instance. This script is called by tkit_module.
Table 4 Module scripts of the HP Serviceguard toolkit for Oracle Data Guard
File Name: Toolkit Module Script (tkit_module.sh)
Description: This script is called by the Master Control Script and acts as an interface between the Master Control Script and the toolkit interface script (hadg.sh/hadg_rac.sh). It is also responsible for calling the toolkit Configuration File Generator Script (described below).
Available in Directory: /etc/cmcluster/scripts/tkit/dataguard
Table 5 Package attributes (continued)
Variable Name Description
START_STANDBY_AS_PRIMARY This parameter specifies whether the standby database must be started as the primary database. It must be set to [yes] in the recovery package of the Continentalclusters recovery group. When the primary package goes down, the user must run the cmrecovercl command to bring up the recovery package on the recovery cluster.
Single-instance environment The sample configuration mentioned below uses the installation directory mode operation. This example on ODG package setup and configuration is for an ODG configuration using LVM. It illustrates the creation of a package for ODG in a single-instance environment. 1. Creating a package configuration • Create two packages: one for the primary database on the primary cluster and the other for the standby database on the standby cluster.
# Define the instance type
#
ecmt/oracle/oracle/INSTANCE_TYPE database
------------------------------------------------------------------------
#
# Define Oracle home
#
ecmt/oracle/oracle/ORACLE_HOME /var/orahome
------------------------------------------------------------------------
#
# Define user name of Oracle database administrator
#
ecmt/oracle/oracle/ORACLE_ADMIN oracle
------------------------------------------------------------------------
#
# Define oracle session name
#
ecmt/oracle/oracle/SID_NAME ORC
#
#ecmt/oracle/oracle/LISTENER_RESTART
------------------------------------------------------------------------
NOTE: The following are the service commands for the package:
service_name oracle_service_test
service_cmd "$SGCONF/scripts/ecmt/oracle/tkit_module.sh oracle_monitor"
service_restart none
service_fail_fast_enabled no
service_halt_timeout 300
service_name oracle_listener_service_test
service_cmd "$SGCONF/scripts/ecmt/oracle/tkit_module.
#
# "vg" is used to specify which volume groups are used by this package.
#
vg vgora
------------------------------------------------------------------------
#
# "fs_name", "fs_directory", "fs_mount_opt", "fs_umount_opt", "fs_fsck_opt",
# and "fs_type" specify the file systems which are used by this package.
------------------------------------------------------------------------
#
# "run_script_timeout" is the number of Seconds allowed for package to start.
# "halt_script_timeout" is the number of Seconds allowed for package to halt.
#
run_script_timeout 600
halt_script_timeout 700
NOTE: The "halt_script_timeout" value must be greater than the sum of the individual "service_halt_timeout" values of all the services in the package. In the SGeRAC toolkit this value is 600, by default.
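The timeout rule in the note above is simple arithmetic and can be checked mechanically. The following is a minimal sketch, not part of the toolkit. It assumes that each timeout appears in the package configuration file as an uncommented "name value" line, as in the listings in this section. The function name check_halt_timeout is hypothetical.

```shell
# check_halt_timeout FILE
# Prints halt_script_timeout and the sum of all service_halt_timeout values,
# and fails unless halt_script_timeout is strictly greater than that sum.
check_halt_timeout() (
    conf=$1
    awk '
        $1 == "service_halt_timeout" { sum += $2 }   # one line per service
        $1 == "halt_script_timeout"  { halt = $2 }
        END {
            printf "halt_script_timeout=%d sum_of_service_halt_timeout=%d\n", halt, sum
            exit ((halt > sum) ? 0 : 1)
        }
    ' "$conf"
)
```

With two services that each set service_halt_timeout to 300, a halt_script_timeout of 700 passes the check (700 > 600), whereas a value of 600 fails it.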
tkit/dataguard/dataguard/START_STANDBY_AS_PRIMARY no
------------------------------------------------------------------------
#
# Define e-mail address for sending alerts
#
#tkit/dataguard/dataguard/ALERT_MAIL_ID
------------------------------------------------------------------------
Adding the package to the Serviceguard cluster
After the setup is complete, add the package to the Serviceguard cluster, and then start the cluster.
$ cmapplyconf -P dgpkg.
Single-instance environment
NOTE: In the example, the package name is dgpkg, the package directory is /etc/cmcluster/pkg/dgpkg, and ORACLE_HOME is configured as /orahome.
1. To disable the failover of the package, enter the following command at the prompt:
$ cmmodpkg -d dgpkg
2. To pause the monitor script, create an empty file /etc/cmcluster/pkg/dgpkg/dataguard.debug by entering the command:
$ touch /etc/cmcluster/pkg/dgpkg/dataguard.
verification logs appropriate warning messages. It does not lead to a package validation failure during package apply or package check. In a single-instance environment, consider a two-node cluster, where both nodes have Serviceguard A.11.20, ECMT B.06.00, and the same Oracle database version but different ODG toolkit versions. Use cmcheckconf to check the package configuration: node1# cmcheckconf -P pkg.
• When using the ODG Broker toolkit in a Continentalclusters environment, the “Fast Start Failover” feature of ODG Broker must be disabled. In case of a disaster at the primary site, the “Fast Start Failover” feature of ODG Broker enables automatic failover of the primary database to an available standby database. This may lead to data integrity issues when the toolkit attempts to fail over the primary to a different standby.
3 Serviceguard toolkit for DB2 High Availability Disaster Recovery Overview The HP Serviceguard toolkit for DB2 High Availability Disaster Recovery (DB2 HADR toolkit) enables you to configure the DB2 primary and standby database as two Serviceguard packages. It provides high availability for DB2 database and role management assistance, such as role takeover and role switch for DB2 HADR. DB2 HADR toolkit handles role takeover automatically.
Installing and uninstalling HP Serviceguard Toolkits for Database Replication Solutions
The DB2 HADR toolkit is part of the HP Serviceguard Toolkits for Database Replication Solutions and is installed when you install the HP Serviceguard Toolkits for Database Replication Solutions. NOTE: The product name used for the depot is “Serviceguard Disaster Recovery Toolkits for Databases”.
When primary and standby packages are in the same cluster Figure 12 Primary and Standby Packages in the Same Cluster In this configuration, Figure 12 (page 36), DB2 primary database is configured in a volume group shared between Node1 and Node2. The standby database is configured in a volume group shared between Node3 and Node4. HADR is configured between primary and standby database. The DB2 database and the HADR are packaged using the DB2 HADR toolkit.
If either the standby database or the standby HADR is down, the standby package fails on Node2. In this case, the standby package logs a failure message in the package log and sends an email if the ALERT_MAIL_ID package attribute is set. The primary package continues to run. Standby is not connected to the primary database, so the primary package logs a warning message, Standby disconnected and sends an email if ALERT_MAIL_ID package attribute is set. The standby package fails over to Node1.
3. If the RESTORE_ROLE package attribute is set to [yes], the original primary package performs a role switch to resume its role as the primary database. This automatically enables the original standby database to resume as the standby database.
4. If the RESTORE_ROLE package attribute is set to [no], the original primary (which is now standby) continues to run as standby.
To provide high availability only to primary database
Figure 14 HA to Primary Database
This configuration provides High Availability (HA) to primary databases. In Figure 14 (page 39), the DB2 primary database is configured in a volume group shared between Node1 and Node2 in a Serviceguard cluster. The standby database is running on Node3, which is placed outside the cluster. The primary package is configured to run on either Node1 or Node2 and is currently running on Node1.
4. Edit the following attributes manually in this file before creating the package:
Attributes Description
package_name The package name must be unique in the cluster.
package_type Package must be a failover package.
node_name Name of the cluster node on which the package will run. Ensure that the primary package and the standby package have different node names. The primary and the standby packages must not run on the same node.
Attributes Description
HADR_IP Set the IP address used for performing the role switch. Provide an IP address from the subnet that is monitored by Serviceguard.
NOTE: This IP address should not be used to configure HADR, and HADR must not use this IP address for any purpose.
The format of this value is the IP address followed by a colon and the network subnet. For example:
tkit/db2hadr/db2hadr/HADR_IP 10.76.1.200:10.76.1.0
where:
• 10.76.1.200 is the IP address
• 10.76.1.0 is the network subnet.
Attributes Description
For DB2 HADR Service
service_name Name of the service that Serviceguard monitors while the package is up. This name must be unique for both primary and standby packages in a Serviceguard cluster. The default value is db2hadr_service.
service_cmd $SGCONF/scripts/tkit/db2hadr/tkit_module.sh db2hadr_monitor
service_restart This attribute specifies the number of times the service is restarted before it is considered failed. The default value is [none].
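The HADR_IP attribute in the attribute table above packs two addresses into one colon-separated value, which is easy to mistype. The following is a minimal sketch of a format check, not part of the toolkit. The function names is_ipv4 and parse_hadr_ip are hypothetical, and the check only confirms that both parts are dotted-quad addresses. It does not confirm that the subnet is monitored by Serviceguard.

```shell
# is_ipv4 ADDRESS
# Succeeds if ADDRESS is four dot-separated decimal octets in the range 0-255.
is_ipv4() (
    echo "$1" | awk -F. '
        NF != 4 { exit 1 }
        { for (i = 1; i <= 4; i++) if ($i !~ /^[0-9]+$/ || $i > 255) exit 1 }
    '
)

# parse_hadr_ip VALUE
# Splits "IP:SUBNET" at the first colon and prints "ip=<IP> subnet=<SUBNET>".
parse_hadr_ip() (
    value=$1
    case $value in
        *:*) ;;
        *) echo "missing ':' separator in '$value'" >&2; exit 1 ;;
    esac
    ip=${value%%:*}       # text before the first colon
    subnet=${value#*:}    # text after the first colon
    is_ipv4 "$ip" && is_ipv4 "$subnet" || { echo "invalid address in '$value'" >&2; exit 1; }
    echo "ip=$ip subnet=$subnet"
)
```

Running parse_hadr_ip 10.76.1.200:10.76.1.0 prints ip=10.76.1.200 subnet=10.76.1.0, matching the example value in the table.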
1. To enable the maintenance mode, set the MAINTENANCE_FLAG attribute to [Yes] in the package configuration file before running the cmapplyconf command.
2. To start the maintenance mode for the package, create the db2.debug file in the TKIT_DIR path.
3. To stop the maintenance mode and bring the package back to the running state, remove the db2.debug file from the TKIT_DIR directory.
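The maintenance steps above can be sketched as two small helper functions. This is a minimal sketch under stated assumptions, not part of the toolkit. It assumes that the flag appears in the package configuration file as an uncommented line whose attribute name ends in MAINTENANCE_FLAG (for example tkit/db2hadr/db2hadr/MAINTENANCE_FLAG, inferred from the HADR_IP example in this guide). The function names are hypothetical. The sketch reads the file only and cannot tell whether the configuration was already applied with cmapplyconf.

```shell
# start_db2_maintenance CONF TKIT_DIR
# Creates TKIT_DIR/db2.debug only if MAINTENANCE_FLAG is set to yes in CONF.
start_db2_maintenance() (
    conf=$1
    tkit_dir=$2
    # Read the last uncommented MAINTENANCE_FLAG value, ignoring case.
    flag=$(awk '$1 !~ /^#/ && $1 ~ /MAINTENANCE_FLAG$/ { v = tolower($2) } END { print v }' "$conf")
    if [ "$flag" != "yes" ]; then
        echo "MAINTENANCE_FLAG is not set to yes in $conf" >&2
        exit 1
    fi
    touch "$tkit_dir/db2.debug"
)

# stop_db2_maintenance TKIT_DIR
# Removes the debug file so that the package returns to the running state.
stop_db2_maintenance() (
    rm -f "$1/db2.debug"
)
```

Checking the flag first avoids creating db2.debug for a package in which maintenance mode was never enabled.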
Troubleshooting
This section explains some of the problem scenarios that you might encounter while working with the DB2 HADR toolkit in an HP Serviceguard Cluster.
Problem Scenario: The package log contains the error message: Host key verification failed. Lost connection.
Possible Cause: The SSH connection without password is not configured properly.
Recommended Action: Configure SSH connection without password properly.
To verify the possible cause: 1.
Problem Scenario Possible Cause Recommended Action
1. Run the db2 takeover hadr on db sample by force command on the machine where the DB2 HADR is in one of the following states: remote catchup pending, peer, or disconnected peer.
2. After the state changes to “peer”, run the db2 takeover hadr on db sample command on the standby.
Limitations
This section lists the limitations of the DB2 HADR toolkit in an HP Serviceguard Cluster:
1. Start the standby package before you start the primary package.
4 Support and other resources Information to collect before contacting HP Be sure to have the following information available before you contact HP: • Software product name • Hardware product model number • Operating system type and version • Applicable error message • Third-party hardware or software • Technical support registration number (if applicable) How to contact HP Use the following methods to contact HP technical support: • See the Contact HP worldwide website: http://www.hp.
Warranty information HP will replace defective delivery media for a period of 90 days from the date of purchase. This warranty applies to all Insight software products. HP authorized resellers For the name of the nearest HP authorized reseller, see the following sources: • In the United States, see the HP U.S. service locator website: http://www.hp.com/service_locator • In other locations, see the Contact HP worldwide website: http://welcome.hp.com/country/us/en/wwcontact.
WARNING An alert that calls attention to important information that, if not understood or followed, results in personal injury. CAUTION An alert that calls attention to important information that, if not understood or followed, results in data loss, data corruption, or damage to hardware or software. IMPORTANT An alert that calls attention to essential information. NOTE An alert that contains additional or supplementary information. TIP An alert that provides helpful information.
A To configure SSH connection without password for root user between two nodes
This section describes how to configure SSH connection without password for root user between two nodes. In this example, DB2 HADR is assumed to be configured using the host names of the two nodes (Node2 and Node3), as shown in the result of the following db2 command:
db2 get db cfg for <database_name> | grep -i hadr
In the following output, Node2 and Node3 are the host names of the nodes that are used to configure DB2 HADR.
Node2# ssh Node3 cat /.ssh/id_rsa.pub >> /.ssh/authorized_keys
Node2# ssh Node3 cat /.ssh/id_dsa.pub >> /.ssh/authorized_keys
Node2# scp /.ssh/authorized_keys Node3:.ssh/authorized_keys
NOTE: Provide root user’s password when asked.
Node2# exec /usr/bin/ssh-agent $SHELL
Node2# /usr/bin/ssh-add
Identity added: /.ssh/id_rsa (/.ssh/id_rsa)
Identity added: /.ssh/id_dsa (/.ssh/id_dsa)
Node2# ssh Node2 ls /.
Offending key for IP in /home/user/.ssh/known_hosts:6
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
3. A warning message Offending key for IP in /home/user/.ssh/known_hosts is displayed.
4. Remove the [key number 6] from /home/user/.ssh/known_hosts, and then copy it to a temporary file.
5. To add a new key to /home/user/.ssh/known_hosts.
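Removing the offending key by its line number, as described above, can be sketched as follows. This is a minimal sketch, not a command from the toolkit. The function name remove_known_host_line and the .removed and .tmp file suffixes are hypothetical choices. The number after the colon in the warning (6 in the example) is taken to be the line number in known_hosts.

```shell
# remove_known_host_line FILE LINE_NUMBER
# Copies the given line of FILE to FILE.removed, then deletes it from FILE.
remove_known_host_line() (
    file=$1
    n=$2
    case $n in
        ''|*[!0-9]*) echo "line number must be a positive integer" >&2; exit 1 ;;
    esac
    total=$(awk 'END { print NR }' "$file")
    [ "$n" -ge 1 ] && [ "$n" -le "$total" ] || { echo "line $n is out of range" >&2; exit 1; }
    sed -n "${n}p" "$file" > "$file.removed"    # keep a copy of the removed key
    sed "${n}d" "$file" > "$file.tmp"           # write the file without that line
    cat "$file.tmp" > "$file" && rm -f "$file.tmp"
)
```

For the warning shown above, the call would be remove_known_host_line /home/user/.ssh/known_hosts 6. On systems with OpenSSH, ssh-keygen -R followed by the host name removes all keys for that host and is an alternative to editing the file by line number.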
B Sample package configuration file for the DB2 HADR standby package
This section provides a sample package configuration file for the DB2 HADR standby package:
# **********************************************************************
# ****** HIGH AVAILABILITY PACKAGE CONFIGURATION FILE (template) *******
# **********************************************************************
# ******* Note: This file MUST be edited before it can be used.
Glossary
ECMT Enterprise Cluster Master Toolkit
EDC Extended Distance Cluster
HA High Availability
HADR High Availability Disaster Recovery
MAA Maximum Availability Architecture
MNP Multi Node Package
ODG Oracle Data Guard
RAC Oracle Real Application Clusters
vg Volume Group