Using Serviceguard Extension for RAC Version A.11.20
Legal Notices © Copyright 2011 Hewlett-Packard Development Company, L.P. Confidential computer software. Valid license from HP required for possession, use, or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor’s standard commercial license. The information contained herein is subject to change without notice.
Contents Advantages of using SGeRAC.........................................................................8 User Guide Overview....................................................................................9 Where to find Documentation on the Web......................................................11 1 Introduction to Serviceguard Extension for RAC............................................12 What is a Serviceguard Extension for RAC Cluster? .........................................................
Network Monitoring...........................................................................................................28 SGeRAC Heartbeat Network..........................................................................................28 CSS Heartbeat Network.................................................................................................28 RAC Cluster Interconnect................................................................................................28 Public Client Access...
Mirror Detachment Policies with CVM..........................................................................55 Using CVM 5.x.................................................................................................................55 Preparing the Cluster for Use with CVM 5.x......................................................................55 Starting the Cluster and Identifying the Master Node....................................................56 Converting Disks from LVM to CVM.................
Startup and shutdown of the combined Oracle RAC-SGeRAC stack .........................................85 How Serviceguard Extension for RAC starts, stops and checks Oracle Clusterware ....................86 How Serviceguard Extension for RAC Mounts, dismounts and checks ASM disk groups...............86 How Serviceguard Extension for RAC Toolkit starts, stops, and checks the RAC database instance..................................................................................................................
Removing Serviceguard Extension for RAC from a System..........................................................130 Monitoring Hardware ...........................................................................................................131 Using Event Monitoring Service..........................................................................................131 Using EMS Hardware Monitors..........................................................................................
Advantages of using SGeRAC HP Serviceguard Extension for RAC (SGeRAC) increases the availability and simplifies the management of Oracle Real Application Clusters (RAC). SGeRAC allows you to integrate Oracle RAC into a Serviceguard cluster while also easily managing the dependency between Oracle Clusterware and Oracle RAC, with a full range of storage management options.
User Guide Overview This user guide covers how to use Serviceguard Extension for RAC (SGeRAC) to configure Serviceguard clusters for use with Oracle Real Application Clusters software on HP High Availability clusters running the HP-UX operating system. • Chapter 1— Introduction to Serviceguard Extension for RAC Describes a Serviceguard cluster and provides a roadmap for using this guide.
If you will be using Veritas Cluster Volume Manager (CVM) and Veritas Cluster File System (CFS) from Symantec with Serviceguard refer to the HP Serviceguard Storage Management Suite Version A.03.01 for HP-UX 11i v3 Release Notes. These release notes describe suite bundles for the integration of HP Serviceguard A.11.20 on HP-UX 11i v3 with Symantec’s Veritas Storage Foundation.
Where to find Documentation on the Web • SGeRAC Documentation Go to www.hp.com/go/hpux-serviceguard-docs, and then click HP Serviceguard Extension for RAC. • Related Documentation Go to www.hp.com/go/hpux-serviceguard-docs, www.hp.com/go/hpux-core-docs, and www.hp.com/go/hpux-ha-monitoring-docs. The following documents contain additional useful information: ◦ Clusters for High Availability: a Primer of HP Solutions.
1 Introduction to Serviceguard Extension for RAC Serviceguard Extension for RAC (SGeRAC) enables the Oracle Real Application Cluster (RAC), formerly known as Oracle Parallel Server RDBMS, to run on HP high availability clusters under the HP-UX operating system. This chapter introduces Serviceguard Extension for RAC and shows where to find different kinds of information in this book.
When properly configured, Serviceguard Extension for RAC provides a highly available database that continues to operate even if one hardware component fails. Group Membership Group membership allows multiple instances of RAC to run on each node. Related processes are configured into groups. Groups allow processes in different instances to choose which other processes to interact with. This allows the support of multiple databases within one RAC cluster.
are supported are those specified by Hewlett-Packard, and you can create your own multi-node packages. For example, the packages HP supplies for use with the Veritas Cluster Volume Manager (CVM) and the Veritas Cluster File System (CFS) (on HP-UX releases that support Veritas CFS and CVM; see also “About Veritas CFS and CVM from Symantec” (page 15)). • System multi-node package. A system multi-node package must run on all nodes that are active in the cluster. If it fails on one active node, that node halts.
Package Dependencies When CFS is used as shared storage, the applications and software using the CFS storage should be configured to start and stop using Serviceguard packages. These application packages should be configured with a package dependency on the underlying multi-node packages, which manage the CFS and CVM storage resources. Configuring the application to be started and stopped through a Serviceguard package ensures that storage activation/deactivation is synchronized with application startup/shutdown.
NOTE: Beginning with HP-UX 11i v3 1109 HA-OE/DC-OE, SGeRAC is included as a licensed bundle at no additional cost. To install SGeRAC A.11.20 on your system during 1109 HA-OE/DC-OE installation, you must select T1907BA (SGeRAC) in the Software tab.
Oracle 10g/11gR1/11gR2 RAC uses the following two subnets for cluster communication purposes: • CSS Heartbeat Network (CSS-HB)—Oracle Clusterware processes running on the various nodes of the cluster communicate among themselves using this network. • RAC Cluster Interconnect Network (RAC-IC)—Database instances of a database use this network to communicate among themselves. NOTE: In this document, the generic terms CRS and Oracle Clusterware will subsequently be referred to as Oracle Cluster Software.
For example, when a multi-node package (pkgA) is configured to run on all nodes of the cluster, and configured to monitor a subnet (SubnetA) using the CLUSTER_INTERCONNECT_SUBNET parameter: • If more than one instance of pkgA is running in the cluster and SubnetA fails on one of the nodes where the instance of pkgA is running, the failure is handled by halting the instance of pkgA on the node where the subnet has failed.
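As a hedged illustration only (the package name, node list, and subnet are hypothetical, and the full set of required parameters depends on your configuration), such a multi-node package might declare the monitored interconnect subnet in its package ASCII file as follows:

PACKAGE_NAME                  pkgA
PACKAGE_TYPE                  MULTI_NODE
NODE_NAME                     *
CLUSTER_INTERCONNECT_SUBNET   192.168.2.0

With this entry, the package manager treats a failure of 192.168.2.0 on a node as a failure of that node's instance of pkgA, as described above.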
configured to automatically fail over from the original node to an adoptive node. When the original node is restored, the listener package automatically fails back to the original node. In the listener package ASCII configuration file, the FAILBACK_POLICY is set to AUTOMATIC. The SUBNET parameter specifies the set of monitored subnets. The package can be set to start up automatically with the AUTO_RUN setting. Each RAC instance can be configured to register with listeners that are assigned to handle client connections.
Figure 4 After Node Failure In the above figure, pkg1 and pkg2 are not instance packages. They are shown to illustrate the movement of the packages. Larger Clusters Serviceguard Extension for RAC supports clusters of up to 16 nodes. The actual cluster size is limited by the type of storage and the type of volume manager used. Up to Four Nodes with SCSI Storage You can configure up to four nodes using a shared F/W SCSI bus; for more than four nodes, FibreChannel must be used.
Figure 5 Four-Node RAC Cluster In this type of configuration, each node runs a separate instance of RAC and may run one or more high availability packages as well. The figure shows a dual Ethernet configuration with all four nodes connected to a disk array (the details of the connections depend on the type of disk array). In addition, each node has a mirrored root disk (R and R′).
Figure 6 Eight-Node Cluster with EVA, XP or EMC Disk Array FibreChannel switched configurations also are supported using either an arbitrated loop or fabric login topology. For additional information about supported cluster configurations, refer to the HP 9000 Servers Configuration Guide, available through your HP representative.
4. Restart the Serviceguard cluster.
5. Restart Oracle Clusterware (for Oracle 10g, 11gR1, and 11gR2) and the Oracle RAC database instances on all nodes.
Use the following steps to disable the GMS authorization:
1. If the Oracle RAC database instances and Oracle Clusterware (for Oracle 10g, 11gR1, and 11gR2) are running, shut them down on all nodes.
2. Halt the Serviceguard cluster.
3. Edit /etc/opt/nmapi/nmutils.conf and comment out the GMS_USER[] settings on all nodes.
4. Restart the Serviceguard cluster.
5. Restart Oracle Clusterware (for Oracle 10g, 11gR1, and 11gR2) and the Oracle RAC database instances on all nodes.
Configuring Clusters with Serviceguard Manager You can configure clusters and packages in Serviceguard Manager. You must have root (UID=0) access to the cluster nodes.
2 Serviceguard Configuration for Oracle 10g, 11gR1, or 11gR2 RAC This chapter shows the additional planning and configuration that is needed to use Oracle Real Application Clusters 10g/11gR1/11gR2 with Serviceguard.
CSS Timeout When SGeRAC is on the same cluster as Oracle Cluster Software, the CSS timeout is set to a default value of 600 seconds (10 minutes) at Oracle software installation. This timeout is configurable with Oracle tools and should not be changed without ensuring that the CSS timeout allows enough time for Serviceguard Extension for RAC (SGeRAC) reconfiguration and for multipath reconfiguration (if configured) to complete.
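For example, a hedged sketch of checking and adjusting the CSS timeout with the Oracle crsctl utility (the exact syntax and required privileges depend on your Oracle Clusterware version; verify against the Oracle documentation before changing this value):

# $ORA_CRS_HOME/bin/crsctl get css misscount
# $ORA_CRS_HOME/bin/crsctl set css misscount 600

The second command is shown only to illustrate the mechanism; do not lower the value below what SGeRAC and multipath reconfiguration require.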
The file /var/opt/oracle/oravg.conf must not be present, so that Oracle Cluster Software does not activate or deactivate any shared storage. Multipathing Multipathing is automatically configured in HP-UX 11i v3 (this is often called native multipathing). Multipathing is supported through either SLVM pvlinks or CVM Dynamic Multipathing (DMP). In some configurations, SLVM or CVM does not need to be configured for multipathing because multipathing is provided by the storage array.
Manual Startup and Shutdown Manual listener startup and shutdown is supported through the following commands: srvctl and lsnrctl. Network Monitoring The SGeRAC cluster provides network monitoring. For networks that are redundant and monitored by the Serviceguard cluster, Serviceguard provides local failover between local network interfaces (LANs) that is transparent to applications using the User Datagram Protocol (UDP) and the Transmission Control Protocol (TCP).
Manual Startup and Shutdown Manual RAC instance startup and shutdown is supported through the following commands: srvctl or sqlplus. Shared Storage The shared storage is expected to be available when the RAC instance is started, so ensure that the shared storage is activated. For SLVM, the shared volume groups must be activated; for CVM, the disk group must be activated. For CFS, the cluster file system must be mounted.
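For example, a minimal sketch assuming an SLVM shared volume group /dev/vg_rac and a database named orcl with local instance orcl1 (all names are hypothetical):

# vgchange -a s /dev/vg_rac
$ srvctl start instance -d orcl -i orcl1
$ srvctl stop instance -d orcl -i orcl1

The first command activates the shared volume group in shared mode before the instance is started; the srvctl commands start and stop the local RAC instance.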
(DLPI) and supported over Serviceguard heartbeat subnet networks, including primary and standby links. • Highly available virtual IP (HAIP) (only when using Oracle Grid Infrastructure 11.2.0.2) — IP addresses that Oracle Database and Oracle ASM instances use to ensure highly available, load-balanced communication across the provided set of cluster interconnect interfaces.
Volume Planning with SLVM Storage capacity for the Oracle database must be provided in the form of logical volumes located in shared volume groups. The Oracle software requires at least two log files for each Oracle instance, several Oracle control files and data files for the database itself. For all these files, Serviceguard Extension for RAC uses HP-UX raw logical volumes located in volume groups that are shared between the nodes in the cluster.
Fill out the Veritas Volume worksheet to provide volume names for volumes that you will create using the Veritas utilities. The Oracle DBA and the HP-UX system administrator should prepare this worksheet together. Create entries for shared volumes only. For each volume, enter the full pathname of the raw volume device file. Be sure to include the desired size in MB. Following are sample worksheets filled out. Refer to Appendix B: “Blank Planning Worksheets”, for samples of blank worksheets.
Installing Serviceguard Extension for RAC Installing Serviceguard Extension for RAC includes updating the software and rebuilding the kernel to support high availability cluster operation for Oracle Real Application Clusters.
nomenclature. You are not required to migrate to agile addressing when you upgrade to 11i v3, though you should seriously consider its advantages. It is possible, though not a best practice, to have legacy DSFs on some nodes and agile addressing on others—this allows you to migrate the names on different nodes at different times, if necessary. NOTE: The examples in this document use legacy naming conventions.
Limitations of cDSFs • cDSFs are supported only within a single cluster; you cannot define a cDSF group that crosses cluster boundaries. • A node can belong to only one cDSF group. • cDSFs are not supported by CVM, CFS, or any other application that assumes DSFs reside only in /dev/disk and /dev/rdisk. • cDSFs do not support disk partitions. Such partitions can be addressed by a device file using the agile addressing scheme, but not by a cDSF.
any network configured for Oracle cluster interconnect must also be configured as SGeRAC A.11.20 heartbeat network. NOTE: Do not configure Serviceguard heartbeat and Oracle cluster interconnect in mutually exclusive networks. 2. Serviceguard standby interfaces must not be configured for the networks used for Serviceguard heartbeat and Oracle cluster interconnect. For Oracle Grid Infrastructure 11.2.0.
Alternate Configuration—Fast Reconfiguration with Low Node Member Timeout High RAC-IC traffic may interfere with SG-HB traffic and cause unnecessary member timeout if Serviceguard cluster configuration parameter MEMBER_TIMEOUT is low. If MEMBER_TIMEOUT cannot be increased, use of an additional network dedicated for SG-HB alone avoids unnecessary member timeouts when RAC-IC traffic is high.
monitoring the CSS-HB subnet (Oracle Cluster Interconnect Subnet Package as shown in the package configuration parameters examples below). NOTE: Do not configure CLUSTER_INTERCONNECT_SUBNET in the RAC Instance package, because the RAC-IC network is the same as the CSS-HB network.
the appropriate subnet if the RAC Instances use a RAC-IC network different from CSS-HB network. No special subnet monitoring is needed for CSS-HB Subnet because Serviceguard monitors the subnet (heartbeat) and will handle failures of the subnet. The database instances that use 192.168.2.0 must have cluster_interconnects defined in their SPFILE or PFILE as follows: orcl1.cluster_interconnects=’192.168.2.1’ orcl2.cluster_interconnects=’192.168.2.
NOTE: 1. The “F” represents the Serviceguard failover time as given by the max_reformation_duration field of cmviewcl -v -f line output. 2. SLVM timeout is documented in the whitepaper, LVM link and Node Failure Recovery Time. Limitations of Cluster Communication Network Monitor In Oracle Grid Infrastructure 11.2.0.2, using the HAIP feature, Oracle allows a subnet that is not configured in the SGeRAC cluster to be used for Oracle Clusterware (CSS-HB) and RAC interconnect (RAC-IC) communication.
• Creating RAC Volume Groups on Disk Arrays • Creating Logical Volumes for RAC on Disk Arrays The Event Monitoring Service HA Disk Monitor provides the capability to monitor the health of LVM disks. If you intend to use this monitor for your mirrored disks, you should configure them in physical volume groups. For more information, refer to the manual Using HA Monitors. NOTE: LVM version 2.x volume groups are supported with Serviceguard.
Creating a Volume Group with PVG-Strict Mirroring Use the following steps to build a volume group on the configuration node (ftsys9). Later, the same volume group will be created on other nodes. 1. Set up the group directory for vg_rac: # mkdir /dev/vg_rac 2.
Logical volume “/dev/vg_rac/redo1.log” has been successfully created with character device “/dev/vg_rac/rredo1.log” Logical volume “/dev/vg_rac/redo1.log” has been successfully extended NOTE: With LVM 2.1 and above, mirror write cache (MWC) recovery can be set to ON for RAC Redo Logs and Control Files volumes. Example: # lvcreate -m 1 -M y -s g -n redo1.
Logical volume “/dev/vg_rac/system.dbf” has been successfully created with character device “/dev/vg_rac/rsystem.dbf” Logical volume “/dev/vg_rac/system.dbf” has been successfully extended NOTE: The character device file name (also called the raw logical volume name) is used by the Oracle DBA in building the OPS database.
It is only necessary to do this with one of the device file names for the LUN. The -f option is only necessary if the physical volume was previously used in some other volume group. 4. Use the following to create the volume group with the two links: # vgcreate /dev/vg_rac /dev/dsk/c0t15d0 /dev/dsk/c1t3d0 LVM will now recognize the I/O channel represented by /dev/dsk/c0t15d0 as the primary link to the disk.
Table 1 Required Oracle File Names for Demo Database (continued) Logical Volume Name LV Size (MB) Raw Logical Volume Path Name Oracle File Size (MB)* opsdata2.dbf 208 /dev/vg_rac/ropsdata2.dbf 200 opsdata3.dbf 208 /dev/vg_rac/ropsdata3.dbf 200 opsspfile1.ora 5 /dev/vg_rac/ropsspfile1.ora 5 pwdfile.ora 5 /dev/vg_rac/rpwdfile.ora 5 opsundotbs1.dbf 508 /dev/vg_rac/ropsundotbs1.log 500 opsundotbs2.dbf 508 /dev/vg_rac/ropsundotbs2.log 500 example1.dbf 168 /dev/vg_rac/ropsexample1.
1. On ftsys9, copy the mapping of the volume group to a specified file. # vgexport -s -p -m /tmp/vg_rac.map 2. /dev/vg_rac Still on ftsys9, copy the map file to ftsys10 (and to additional nodes as necessary.) # rcp /tmp/vg_rac.map ftsys10:/tmp/vg_rac.map 3. On ftsys10 (and other nodes, as necessary), create the volume group directory and the control file named group.
For more information, refer to your version of the Serviceguard Extension for RAC Release Notes and HP Serviceguard Storage Management Suite Release Notes located at www.hp.com/go/hpux-serviceguard-docs. CAUTION: Once you create the disk group and mount point packages, you must administer the cluster with CFS commands, including cfsdgadm, cfsmntadm, cfsmount, and cfsumount. You must not use the HP-UX mount or umount command to provide or remove access to a shared file system in a CFS environment.
NODE ever3a ever3b 5. STATUS up up STATE running running Configure the Cluster Volume Manager (CVM). Configure the system multi-node package, SG-CFS-pkg, to configure and start the CVM/CFS stack. The SG-CFS-pkg does not restrict heartbeat subnets to a single subnet and supports multiple subnets. # cfscluster config -s The following output will be displayed: CVM is now configured Starting CVM When CVM starts up, it selects a master node.
11. Create volumes and add a cluster file system.
14. Check CFS mount points. # bdf | grep cfs /dev/vx/dsk/cfsdg1/vol1 10485760 36455 9796224 0% /cfs/mnt1 /dev/vx/dsk/cfsdg1/vol2 10485760 36455 9796224 0% /cfs/mnt2 /dev/vx/dsk/cfsdg1/vol3 614400 17653 559458 3% /cfs/mnt3 15. View the configuration.
The following output will be generated:
Mount point “/cfs/mnt3” was disassociated from the cluster
Cleaning up resource controlling shared disk group “cfsdg1”
Shared disk group “cfsdg1” was disassociated from the cluster.
NOTE: The disk group package is deleted if there is no dependency.
3. Delete the disk group multi-node package.
# cfsdgadm delete cfsdg1
The following output will be generated:
Shared disk group “cfsdg1” was disassociated from the cluster.
Using CVM 5.x or later This section has information on how to set up the cluster and the system multi-node package with CVM—without the CFS filesystem, on HP-UX releases that support them. See “About Veritas CFS and CVM from Symantec” (page 15). Preparing the Cluster and the System Multi-node Package for use with CVM 5.x or later The following steps describe how to prepare the cluster and the system multi-node package with CVM 5.x or later only. 1. Create the cluster file.
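As a hedged sketch of this step (node names and the output file path are placeholders; adjust to your cluster), the cluster configuration file can be generated with cmquerycl:

# cmquerycl -v -C /etc/cmcluster/clm.ascii -n ever3a -n ever3b

Edit the generated file as needed and verify it with cmcheckconf before applying it with cmapplyconf.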
that uses the volume group must be halted. This procedure is described in the Managing Serviceguard Eighteenth Edition user guide Appendix G. • Initializing disks for CVM. It is necessary to initialize the physical disks that will be employed in CVM disk groups. If a physical disk has been previously used with LVM, you should use the pvremove command to delete the LVM header data from all the disks in the volume group (this is not necessary if you have not previously used the disk with LVM).
IMPORTANT: After creating these files, use the vxedit command to change the ownership of the raw volume files to oracle and the group membership to dba, and to change the permissions to 660. Example: # cd /dev/vx/rdsk/ops_dg # vxedit -g ops_dg set user=oracle * # vxedit -g ops_dg set group=dba * # vxedit -g ops_dg set mode=660 * The logical volumes are now available on the primary node, and the raw logical volume names can now be used by the Oracle DBA.
After the above command completes, start the cluster and create disk groups for shared use as described in the following sections. Starting the Cluster and Identifying the Master Node Run the cluster to activate the special CVM package: # cmruncl After the cluster is started, it will run with a special system multi-node package named VxVM-CVM-pkg that is on all nodes.
Creating Volumes Use the vxassist command to create logical volumes. The following is an example: # vxassist -g ops_dg make log_files 1024m This command creates a 1024MB volume named log_files in a disk group named ops_dg. The volume can be referenced with the block device file /dev/vx/dsk/ops_dg/log_files or the raw (character) device file /dev/vx/rdsk/ops_dg/log_files.
Table 2 Required Oracle File Names for Demo Database (continued) Volume Name Size (MB) Raw Device File Name Oracle File Size (MB) ops2log3.log 128 /dev/vx/rdsk/ops_dg/ops2log3.log 120 opssystem.dbf 508 /dev/vx/rdsk/ops_dg/opssystem.dbf 500 opssysaux.dbf 808 /dev/vx/rdsk/ops_dg/opssysaux.dbf 800 opstemp.dbf 258 /dev/vx/rdsk/ops_dg/opstemp.dbf 250 opsusers.dbf 128 /dev/vx/rdsk/ops_dg/opsusers.dbf 120 opsdata1.dbf 208 /dev/vx/rdsk/ops_dg/opsdata1.dbf 200 opsdata2.
Prerequisites for Oracle 10g, 11gR1, or 11gR2 (Sample Installation) The following sample steps prepare an SGeRAC cluster for Oracle 10g, 11gR1, or 11gR2. Refer to the Oracle documentation for Oracle installation details. 1. Create inventory groups on each node. Create the Oracle inventory group if one does not exist, create the OSDBA group, and create the Operator Group (optional). # groupadd oinstall # groupadd dba # groupadd oper 2. Create Oracle user on each node.
# chown -R oracle:oinstall /mnt/app/crs/oracle/product/10.2.0/crs # chmod -R 775 /mnt/app/crs/oracle/product/10.2.0/crs 8. Create Oracle base directory (for RAC binaries on local file system). If installing RAC binaries on local file system, create the oracle base directory on each node. # mkdir -p /mnt/app/oracle # chown -R oracle:oinstall /mnt/app/oracle # chmod -R 775 /mnt/app/oracle # usermod -d /mnt/app/oracle oracle 9. Create Oracle base directory (for RAC binaries on cluster file system).
The following is a sample of the mapping file for DBCA: system=/dev/vg_rac/ropssystem.dbf sysaux=/dev/vg_rac/ropssysaux.dbf undotbs1=/dev/vg_rac/ropsundotbs01.dbf undotbs2=/dev/vg_rac/ropsundotbs02.dbf example=/dev/vg_rac/ropsexample1.dbf users=/dev/vg_rac/ropsusers.dbf redo1_1=/dev/vg_rac/rops1log1.log redo1_2=/dev/vg_rac/rops1log2.log redo2_1=/dev/vg_rac/rops2log1.log redo2_2=/dev/vg_rac/rops2log2.log control1=/dev/vg_rac/ropsctl1.ctl control2=/dev/vg_rac/ropsctl2.ctl control3=/dev/vg_rac/ropsctl3.
NOTE: LVM volume groups are supported with Serviceguard. The steps shown in the following section are for configuring volume groups with LVM version 1.0 in Serviceguard clusters. For more information on using and configuring LVM version 2.x, see the HP-UX System Administrator's Guide: Logical Volume Management located at www.hp.com/go/hpux-core-docs —> HP-UX 11i v3. Installing Oracle 10g, 11gR1, or 11gR2 Cluster Software The following are sample steps for an SGeRAC cluster for Oracle 10g, 11gR1, or 11gR2.
Installing RAC Binaries on Cluster File System Log on as the “oracle” user: $ export ORACLE_BASE=/cfs/mnt1/oracle $ export DISPLAY={display}:0.0 $ cd <10g/11g RAC installation disk directory> $ ./runInstaller Use the following guidelines when installing on a cluster file system: 1. In this example, the path to ORACLE_HOME is located on a CFS directory /cfs/mnt1/oracle/product//db_1. 2. Select installation for database software only. 3. When prompted, run root.sh on each node.
Creating a RAC Demo Database on CFS Export environment variables for “oracle” user: export ORACLE_BASE=/cfs/mnt1/oracle export ORACLE_HOME=$ORACLE_BASE/product//db_1 export ORA_CRS_HOME=/mnt/app/crs/oracle/product//crs LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib:$ORACLE_HOME/rdbms/lib SHLIB_PATH=$ORACLE_HOME/lib32:$ORACLE_HOME/rdbms/lib32 export LD_LIBRARY_PATH SHLIB_PATH export \ PATH=$PATH:$ORACLE_HOME/bin:$ORA_CRS_HOME/bin:/usr/local/bin: CLASSPATH=$ORACLE_HOME/jre:$ORA
IMPORTANT: Beginning with HP Serviceguard Storage Management Suite (SMS) Version A.04.00.00, /opt/VRTSodm/lib/libodm.sl is changed to /opt/VRTSodm/lib/libodm.so on the Itanium (IA) architecture. If you are using the Itanium architecture and HP Serviceguard SMS Version A.04.00.00 or later, then you must use the /opt/VRTSodm/lib/libodm.so file instead of the /opt/VRTSodm/lib/libodm.sl file. # ll -L /opt/VRTSodm/lib/libodm.sl output: -r-xr-xr-x 1 root sys 94872 Aug 25 2009 /opt/VRTSodm/lib/libodm.
io req: io calls: comp req: comp calls: io mor cmp: io zro cmp: cl receive: cl ident: cl reserve: cl delete: cl resize: cl same op: cl opt idn: cl opt rsv: **********: 3. 9102431 6911030 73480659 5439560 461063 2330 66145 18 8 1 0 0 0 332 17 Verify that the Oracle disk manager is loaded: # kcmodule -P state odm Output: state loaded 4. In the alert log, verify the Oracle instance is running. The log should contain output similar to the following: For CFS 4.1: Oracle instance running with ODM: VERITAS 4.
$ rm libodm11.so $ ln -s ${ORACLE_HOME}/lib/libodmd11.so ${ORACLE_HOME}/lib/libodm11.so 5. Restart the database. Using Serviceguard Packages to Synchronize with Oracle 10g/11gR1/11gR2 RAC It is recommended to start and stop Oracle Cluster Software in a Serviceguard package—the Oracle Cluster Software will start after SGeRAC is started, and will stop before SGeRAC is halted.
Modify the package control script to set the CVM disk group to “activate” for shared write and to specify the disk group. CVM_DG[0]=”ops_dg” • Storage Activation (CFS) When the Oracle Cluster Software required storage is configured on a Cluster File System (CFS), the Serviceguard package should be configured to depend on the CFS multi-node package through package dependency.
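For example, in a legacy-style package configuration file for the Oracle Cluster Software package, the dependency on a CFS mount point multi-node package might look like the following sketch (the package name SG-CFS-MP-1 is illustrative; use the mount point package that holds the Oracle Cluster Software files):

DEPENDENCY_NAME         SG-CFS-MP-1
DEPENDENCY_CONDITION    SG-CFS-MP-1=UP
DEPENDENCY_LOCATION     SAME_NODE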
3 Support of Oracle RAC ASM with SGeRAC Introduction This chapter discusses the use of the Oracle 10g Release 2 (10g R2) and 11g Release 1 (11g R1) database server feature called Automatic Storage Management (ASM) in configurations of HP Serviceguard Extension for Real Application Clusters (SGeRAC). We begin with a brief review of ASM—functionality, pros, cons, and method of operation. Then, we look in detail at how we configure ASM with SGeRAC (version A.11.17 or later is required).
for specific types of disk arrays. Other advantages of the "ASM-over-SLVM" configuration are as follows: • ASM-over-SLVM ensures that the HP-UX devices used for disk group members will have the same names (the names of logical volumes in SLVM volume groups) on all nodes, easing ASM configuration. • ASM-over-SLVM protects ASM data against inadvertent overwrites from nodes inside/outside the cluster.
Figure 10 1-1 mapping between SLVM logical and physical volumes for ASM configuration
4. If the LVM patch PHKL_36745 (or equivalent) is installed in the cluster, a timeout equal to (2 * PV timeout) will suffice to try all paths.
The SLVM volume groups are marked as shared volume groups and exported across the SGeRAC cluster using standard SGeRAC procedures.
# vgextend /dev/vgora_asm /dev/dsk/c10t0d1 # vgextend /dev/vgora_asm /dev/dsk/c10t0d2 2. For each of the two PVs, create a corresponding LV. • Create an LV of zero length. • Mark the LV as contiguous. • Extend each LV to the maximum size possible on that PV (the number of extents available in a PV can be determined via vgdisplay -v ). • Configure LV timeouts, based on the PV timeout and number of physical paths, as described in the previous section.
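The following is a hedged sketch of these steps for one PV, assuming the volume group /dev/vgora_asm created above, a logical volume named lvol1, two physical paths, and a PV timeout of 30 seconds (the extent count, names, and timeout values are placeholders):

# lvcreate -n lvol1 /dev/vgora_asm
# lvchange -C y /dev/vgora_asm/lvol1
# lvextend -l 2900 /dev/vgora_asm/lvol1 /dev/dsk/c10t0d1
# lvchange -t 60 /dev/vgora_asm/lvol1

The first command creates a zero-length LV, the second marks it contiguous, the third extends it to the chosen number of extents on that PV, and the last sets the LV timeout (here 2 paths x 30 seconds).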
Step 2 remains the same. Logical volumes are prepared for the new disks in the same way. In step 3, switch the volume group back to shared mode, using SNOR, and export the VG across the cluster, ensuring that the right ownership and access rights are assigned to the raw logical volumes. Activate the volume group, and restart ASM and the database(s) using ASM-managed storage on all nodes (they are already active on node A).
The advantages of the "ASM-over-SLVM" configuration are as follows: • ASM-over-SLVM ensures that the HP-UX devices used for disk group members will have the same names (the names of logical volumes in SLVM volume groups) on all nodes, easing ASM configuration. • ASM-over-SLVM protects ASM data against inadvertent overwrites from nodes inside/outside the cluster.
Figure 11 1-1 mapping between SLVM logical and physical volumes for ASM configuration The SLVM volume groups are marked as shared volume groups and exported across the SGeRAC cluster using standard SGeRAC procedures. Please note that, for the case in which the SLVM PVs being used by ASM are disk array LUs, the requirements in this section do not place any constraints on the configuration of the LUs.
• Extend each LV to the maximum size possible on that PV (the number of extents available in a PV can be determined via vgdisplay -v ) • Configure LV timeouts, based on the PV timeout and number of physical paths, as described in the previous section. If a PV timeout has been explicitly set, its value can be displayed via pvdisplay -v. If not, pvdisplay will show a value of default, indicating that the timeout is determined by the underlying disk driver.
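As a hedged illustration (the device file name and the 30-second value are placeholders), the PV timeout can be inspected and, if desired, set explicitly with standard LVM commands:

# pvdisplay -v /dev/dsk/c10t0d1
# pvchange -t 30 /dev/dsk/c10t0d1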
or later) to support ASM on raw disks/disk array LUs. In HP-UX 11i v3, a new DSF format is introduced. SGeRAC supports the DSF format that ASM supports, with the restriction that the native multipathing feature is enabled. The advantages for “ASM-over-raw” are as follows: • There is a small performance improvement from one less layer of volume management. • Online disk management (adding disks, deleting disks) is supported with ASM-over-raw.
of the volume groups is to first shut down the ASM instance and its clients (including all databases that use ASM based storage) on that node. The major implications of this behavior include the following: • Many SGeRAC customers use SGeRAC packages to start and shut down Oracle RAC instances. In the startup and shutdown sequences, the package scripts activate and deactivate the SLVM volume groups used by the instance.
Additional Documentation on the Web and Scripts • Oracle Clusterware Installation Guide 11g Release 1 (11.1) for HP-UX at www.oracle.com/ pls/db111/portal.portal_db?selected=11&frame= → HP-UX Installation Guides → Clusterware Installation Guide for HP-UX • ASM related sections in Oracle Manuals ◦ Oracle® Database Administrator's Guide 10g R2 (10.2) at www.oracle.com/pls/db102/ portal.
4 SGeRAC Toolkit for Oracle RAC 10g or later Introduction This chapter discusses how Serviceguard Extension for RAC Toolkit enables a new framework for the integration of Oracle 10g Release 2 (10.2.0.1) or later versions of Real Application Clusters (Oracle RAC) with HP Serviceguard Extension for Real Application Clusters A.11.17 or later (SGeRAC). SGeRAC Toolkit leverages the multi-node package and simple package dependency features introduced by HP Serviceguard (SG) A.11.
Clusterware voting and registry devices can also be configured using Oracle ASM (Automatic Storage Management) disk groups. The members of disk groups are configured as raw devices (on HP-UX 11i v3). Oracle 11gR2 is supported only on HP-UX 11i v3 (11.31) with SGeRAC A.11.19 or later. • The RAC database files can be configured as shared raw logical volumes managed by SGeRAC using SLVM or CVM. Beginning with SGeRAC A.11.17, the RAC database files may be configured as shared files managed by SGeRAC using CFS.
by, for example, using the command srvctl start instance... and srvctl stop instance... respectively. NOTE: The above mentioned steps are the mandatory prerequisite steps to be performed before you configure SGeRAC toolkit for CRS, ASMDG (if storage is ASM/SLVM), and RAC MNP’s.
Simple package dependencies have the following features/restrictions: • cmrunpkg will fail if the user attempts to start a package that has a dependency on another package that is not running. The package manager will not attempt to start a package if its dependencies are not met. If multiple packages are specified to cmrunpkg, they will be started in dependency order. If the AUTO_RUN attribute is set to YES, the package manager will start the packages automatically in dependency order.
An example of a bottleneck created if we only have a package for Oracle Clusterware is this: if we concentrate all storage management in the Oracle Clusterware package, then any time there is a change in the storage configuration for one database (for example, an SLVM volume group is added), we would have to modify the Oracle Clusterware package. These are the main arguments in favor of having separate packages for Oracle Clusterware and each RAC database.
Figure 12 Resources managed by SGeRAC and Oracle Clusterware and their dependencies Startup and shutdown of the combined Oracle RAC-SGeRAC stack The combined stack is brought up in proper order by cmrunnode or cmruncl as follows. 1. SGeRAC starts up. 2. The SGeRAC package manager starts up Oracle Clusterware via the Oracle Clusterware MNP, ensuring that the storage needed is made available first.
Next, SGeRAC package manager shuts down Oracle Clusterware via the Oracle Clusterware MNP, followed by the storage needed by Oracle Clusterware (this requires subsequent shutdown of mount point and disk group MNPs in the case of the storage needed by Oracle Clusterware being managed by CFS). It can do this since the dependent RAC database instance MNP is already down. Before shutting itself down, Oracle Clusterware shuts down the ASM instance if configured, and then the node applications.
package manager fails the corresponding ASMDG MNP and the RAC MNP that is dependent on ASMDG MNP. How Serviceguard Extension for RAC Toolkit starts, stops, and checks the RAC database instance Next, the toolkit interaction with the RAC database is discussed. The MNP for the RAC database instance provides start and stop functions for the RAC database instance and has a service for checking the status of the RAC database instance. The start function executes su to the Oracle software owner user id.
Use Case 2: Oracle Clusterware storage and database storage in CFS Figure 14 Use Case 2 Setup In this case, Oracle Clusterware quorum and registry device data is stored in files in a CFS. Oracle database files are also stored in a CFS. For each CFS used by Oracle Clusterware for its quorum and registry device data, there will be a dependency configured from the Oracle Clusterware MNP to the mount point MNP corresponding to that CFS. The mount point MNP has a dependency on the CFS system MNP (SMNP).
dependent on Oracle Clusterware MNP. Disk groups that are exclusively used by a RAC DB should be managed in a separate ASM DG MNP. If different RAC databases use different ASM disk groups, then those ASM DGs should not be configured in a single ASMDG MNP. As RAC DB Instance MNP 3 and RAC DB Instance MNP 4 use completely different ASM disk groups, they are made dependent on their respective ASMDG MNPs (ASMDG MNP 2, ASMDG MNP 3).
3. The user can maintain the Oracle ASM disk groups on that node while the Oracle ASMDG MNP package is still running.
4. After the maintenance work is completed, the user can remove the asm_dg.debug file created in step 2 to bring the Oracle ASMDG MNP package out of maintenance mode and resume normal monitoring by Serviceguard.
The maintenance mode message will appear in the Oracle database instance package log files, for example, “Starting ASM DG MNP checking again after maintenance.”
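As a hedged illustration, assuming the ASMDG MNP working directory (TKIT_DIR) is /etc/cmcluster/asmdgp (the path is hypothetical; use the directory configured for your package), the debug file is created to enter maintenance mode and removed to resume monitoring:

# touch /etc/cmcluster/asmdgp/asm_dg.debug
# rm /etc/cmcluster/asmdgp/asm_dg.debug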
Figure 16 Internal structure of SGeRAC for Oracle Clusterware Figure 17 Internal structure of SGeRAC for ASMDG MNP Serviceguard Extension for RAC Toolkit internal file structure 91
Figure 18 Internal structure of SGeRAC for RAC DB instance Support for the SGeRAC Toolkit NOTE: The content in this section was part of SGeRAC Toolkit README file till SGeRAC A.11.20 patch PHSS_41590. From SGeRAC A.11.20 patch PHSS_41642 onwards the README content has been moved to this Administration guide. CONTENTS: A. Overview B. SGeRAC Toolkit Required Software C. SGeRAC Toolkit File Structure D. SGeRAC Toolkit Files E.
membership information to the Oracle Clusterware and provides clustered storage to meet the needs of Oracle Clusterware and RAC database instances. The Oracle Clusterware manages the database and associated resources (e.g. database instances, services, virtual IP addresses, listeners, etc), and ASM instances if configured. The startup and shutdown of the various components in the combined SGeRAC-Oracle Clusterware configuration must be coordinated in the proper sequence.
----------------------------| | | | | | V V ----------------------------| | | | | | | | | CFS-DG1-MNP | | CFS-DG2-MNP | | | | | | | | | ----------------------------| | | | | | V V --------------------------------------| | | | | SG-CFS-pkg | | | | | --------------------------------------- 3. Dependency structure in the case of ASM over SLVM and ASM over HP-UX raw disks.
After installation of SGeRAC, the SGeRAC Toolkit module Attribute Definition Files (ADF) reside under the /etc/cmcluster/modules/sgerac directory and the module scripts reside under the /etc/cmcluster/scripts/sgerac directory. The SGeRAC Toolkit files reside under /opt/cmcluster/SGeRAC/toolkit. This directory contains three subdirectories crsp, asmp and racp. Subdirectory crsp contains the Toolkit scripts for OC MNP. Subdirectory racp contains the Toolkit scripts for RAC MNP.
which is read by the Toolkit script oc.sh.
oc.check - Toolkit monitor script that checks if the Oracle Clusterware is running.
oc.sh - Toolkit script to start, stop, and check the Oracle Clusterware.
The files under /opt/cmcluster/SGeRAC/toolkit/racp are for the RAC MNP:
toolkit_dbi.sh - The entry point to the Toolkit entity. It is an interface between the RAC database instance MNP package control script and the rac_dbi.* files listed below.
rac_dbi.
run_script_timeout, halt_script_timeout Default value is 600 seconds for a 4 node cluster. This value is suggested as an initial value. It may need to be tuned for your environment. script_log_file Set by default to "$SGRUN/log/$SG_PACKAGE.log" TKIT_DIR Set to the OC MNP working directory. After the cmapplyconf command, the OC MNP configuration file oc.conf will be created in this directory. If the oc.
package_description Set by default to "SGeRAC Toolkit Oracle ASMDG package" node_name Specify the names for the nodes that the ASMDG MNP will run on. auto_run Set to yes or no depending on whether the ASMDG MNP is to be started on cluster join or on demand. local_lan_failover_allowed Set by default to yes to allow the cluster to switch LANs locally in the event of a failure. node_fail_fast_enabled Set by default to no. script_log_file Set by default to "$SGRUN/log/$SG_PACKAGE.
Set to yes or no depending on whether the RAC MNP is to be started on cluster join or on demand. local_lan_failover_allowed Set by default to yes to allow the cluster to switch LANs locally in the event of a failure. node_fail_fast_enabled Set by default to no. script_log_file Set by default to "$SGRUN/log/$SG_PACKAGE.log" TKIT_DIR Set to the RAC MNP working directory. After the cmapplyconf command, the RAC MNP configuration file rac_dbi.conf will be created in this directory. If the rac_dbi.
DEPENDENCY_CONDITION OC-MNP-PKG=UP
DEPENDENCY_LOCATION SAME_NODE

DEPENDENCY_NAME MP-MNP-name
DEPENDENCY_CONDITION MP-MNP-PKG=UP
DEPENDENCY_LOCATION SAME_NODE

Note: For a modular style CFS DG-MP package, the OC MNP and the modular style CFS DG-MP MNP must be specified as dependencies in the RAC MNP configuration file.
PACKAGE_TYPE Set to MULTI_NODE. FAILOVER_POLICY, FAILBACK_POLICY Comment out. NODE_NAME Specify the names for the nodes that the ASMDG MNP will run on. AUTO_RUN Set to YES or NO depending on whether the ASMDG MNP is to be started on cluster join or on demand. LOCAL_LAN_FAILOVER_ALLOWED Set by default to YES to allow cluster to switch LANs locally in the event of a failure. NODE_FAIL_FAST_ENABLED Set by default to NO. RUN_SCRIPT, HALT_SCRIPT Set to the package control script.
DEPENDENCY_NAME DEPENDENCY_CONDITION DEPENDENCY_LOCATION DG-MNP-name DG-MNP-PKG=UP SAME_NODE For a package using CFS: DEPENDENCY_NAME DEPENDENCY_CONDITION DEPENDENCY_LOCATION OC-MNP-name OC-MNP-PKG=UP SAME_NODE DEPENDENCY_NAME DEPENDENCY_CONDITION DEPENDENCY_LOCATION MP-MNP-name MP-MNP-PKG=UP SAME_NODE When ASMDG package is configured: DEPENDENCY_NAME DEPENDENCY_CONDITION DEPENDENCY_LOCATION ASMDG-MNP-name ASMDG-MNP-PKG=UP SAME_NODE Note: When ASMDG MNP is configured, make sure you configure the dep
- set SERVICE_CMD[0] to "/toolkit_dbi.sh check" - set SERVICE_RESTART[0] to "" In the function customer_defined_run_cmds: - start the RAC instance using the command: /toolkit_dbi.sh start In the function customer_defined_halt_cmds: - stop the RAC instance using the command: /toolkit_dbi.sh stop For the ASMDG MNP: ------------ set VGCHANGE to "vgchange -a s" . When using ASM over HP-UX raw disks, ignore this step.
not. This parameter can be set to either Yes or No (default Yes).
ASM_DISKGROUP
ASM disk groups used by the database instance.
ASM_VOLUME_GROUP
Volume groups used in the ASM disk groups for this database instance.
CHECK_INTERVAL
Time interval in seconds (default 60) between consecutive checks of ASM disk group status by the MNP.
MAINTENANCE_FLAG
ASMDG MNP maintenance mode: yes or no (default no). This variable will enable or disable maintenance mode for the ASMDG MNP.
OC_TKIT_DIR Set to the OC MNP working directory. When MAINTENANCE_FLAG is yes, the RAC MNP uses this parameter to check the OC MNP maintenance status: If the OC MNP MAINTENANCE_FLAG is set to yes and oc.debug is in the OC_TKIT_DIR directory, the RAC MNP knows the OC MNP on the same node is in maintenance mode. In this case, because of the dependency on the OC MNP, the RAC MNP will go into maintenance mode as well regardless of the presence of its debug file.
Clusterware parameters in this file directly: : cmmakepkg -m sg/multi_node_all -m sgerac/erac_tk_oc pkgConfigFile Edit the package template files based on the description in E-1-1. 4. Now apply the package configuration file: : cmapplyconf -P pkgConfigFile F-3. OC MNP creation procedures [For Legacy packages]: 1. On one node of the cluster, create an OC MNP working directory under /etc/cmcluster and copy the files in the Toolkit directory /opt/cmcluster/SGeRAC/toolkit/crsp.
3. Generate the package configuration file for the ASMDG MNP and edit the file based on the description in E-1. Then configure ASMDG MNP. If asm_dg.conf is configured and tested in step 2, use the following command to create the package configuration file: : cmmakepkg -m sg/multi_node_all -m sgerac/erac_tk_asmdg -t asm_dg.
Edit the package template files based on the description in E-1. 4. Now apply the package configuration file: : cmapplyconf -P pkgConfigFile F-9. RAC MNP creation procedures [For Legacy Packages]: 1. On one node of the cluster, create a RAC MNP working directory under /etc/cmcluster and copy over the files in the Toolkit directory /opt/cmcluster/SGeRAC/toolkit/racp. : mkdir /etc/cmcluster/YourOwn-RACMNP-Dir : cd /etc/cmcluster/YourOwn-RACMNP-Dir : cp /opt/cmcluster/SGeRAC/toolkit/racp/* . 2.
H. SGeRAC Toolkit Supported Configurations This version of Toolkit supports the following configurations. The maximum number of nodes supported in SGeRAC Toolkit depends on the number of nodes supported by SGeRAC with each storage management configuration. Refer to "Number of nodes supported in SGeRAC for SLVM, CVM and CFS Matrix" posted on http://www.hp.com/go/hpux-serviceguard-docs -> Serviceguard Extension for RAC.
2. Shutdown the legacy RAC MNP if a RAC MNP is running. : cmhaltpkg 3. Create a new RAC MNP package working directory on one node, then cd to the new package directory. : cd /etc/cmcluster/RACMNP-Dir 4. Use the cmmigratepkg command to migrate the legacy RAC MNP to modular format. : cmmigratepkg -p -s -o 5. Create a new modular RAC MNP ascii file.
OC MNP ascii file with legacy style DG and MP package: DEPENDENCY_NAME DG-MNP-name DEPENDENCY_CONDITION DG-MNP-PKG=UP DEPENDENCY_LOCATION SAME_NODE DEPENDENCY_NAME DEPENDENCY_CONDITION DEPENDENCY_LOCATION DG-MNP-name DG-MNP-PKG=UP SAME_NODE OC MNP ascii file with modular style CFS DG-MP package : DEPENDENCY_NAME OC-DGMP-name DEPENDENCY_CONDITION OC-DGMP-PKG=UP DEPENDENCY_LOCATION SAME_NODE 7. Take backup of all RAC MNP configuration file.
3. Start the ASMDG MNP.
4. Edit the RAC MNP configuration script, and add a dependency on its corresponding ASMDG MNP.
5. Run cmapplyconf on the RAC MNP configuration file.
6. Start the RAC MNP.
N. SGeRAC Toolkit Package Cleanup
1. Shutdown the RAC MNP. : cmhaltpkg
2. Delete the RAC MNP configuration. : cmdeleteconf -p
3.
5 Maintenance This chapter includes information about carrying out routine maintenance on a Real Application Cluster configuration. Starting with version SGeRAC A.11.17, all log messages from cmgmsd log to /var/adm/syslog/syslog.log by default. As presented here, these tasks differ in some details from the similar tasks described in the Managing Serviceguard documentation.
CLUSTER cluster_mo NODE minie STATUS up STATUS up STATE running Quorum_Server_Status: NAME STATUS white up STATE running Network_Parameters: INTERFACE STATUS PRIMARY up PRIMARY up STANDBY up NODE mo PATH 0/0/0/0 0/8/0/0/4/0 0/8/0/0/6/0 STATUS up NAME lan0 lan1 lan3 STATE running Quorum_Server_Status: NAME STATUS white up STATE running Network_Parameters: INTERFACE STATUS PRIMARY up PRIMARY up STANDBY up PATH 0/0/0/0 0/8/0/0/4/0 0/8/0/0/6/0 NAME lan0 lan1 lan3 MULTI_NODE_PACKAGES PACKAGE SG-CF
NODE_NAME mo STATUS up Dependency_Parameters: DEPENDENCY_NAME SG-CFS-pkg PACKAGE SG-CFS-MP-1 NODE_NAME minie STATUS up STATUS up Dependency_Parameters: DEPENDENCY_NAME SG-CFS-DG-1 NODE_NAME mo STATUS up Dependency_Parameters: DEPENDENCY_NAME SG-CFS-DG-1 PACKAGE SG-CFS-MP-2 NODE_NAME minie STATUS up STATUS up Dependency_Parameters: DEPENDENCY_NAME SG-CFS-DG-1 NODE_NAME mo STATUS up Dependency_Parameters: DEPENDENCY_NAME SG-CFS-DG-1 PACKAGE SG-CFS-MP-3 NODE_NAME minie STATUS up STATUS up Dependenc
Cluster Status The status of a cluster may be one of the following: • Up. At least one node has a running cluster daemon, and reconfiguration is not taking place. • Down. No cluster daemons are running on any cluster node. • Starting. The cluster is in the process of determining its active membership. At least one cluster daemon is running. • Unknown. The node on which the cmviewcl command is issued cannot communicate with other nodes in the cluster.
Package Switching Attributes Packages also have the following switching attributes: • Package Switching. Enabled—the package can switch to another node in the event of failure. • Switching Enabled for a Node. Enabled—the package can switch to the referenced node. Disabled—the package cannot switch to the specified node until the node is enabled for the package using the cmmodpkg command. Every package is marked Enabled or Disabled for each node that is either a primary or adoptive node for the package.
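For example, to re-enable switching of a package to a particular node, or to re-enable package switching globally (package and node names are illustrative):

# cmmodpkg -e -n ftsys10 pkg2
# cmmodpkg -e pkg2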
Network Status The network interfaces have only status, as follows: • Up. • Down. • Unknown—Whether the interface is up or down cannot be determined. This can happen when the cluster is down. A standby interface has this status. NOTE: Serial Line Status has been de-supported as of Serviceguard A.11.18.
ftsys10 up running Network_Parameters: INTERFACE STATUS PRIMARY up STANDBY up PATH 28.1 32.
PACKAGE STATUS VxVM-CVM-pkg up NODE ftsys8 STATE running STATUS down NODE STATUS ftsys9 up Script_Parameters: ITEM STATUS Service up STATE halted STATE running MAX_RESTARTS 0 RESTARTS 0 NAME VxVM-CVM-pkg.
Alternate NODE ftsys10 up STATUS up enabled ftsys9 (current) STATE running Network_Parameters: INTERFACE STATUS PRIMARY up STANDBY up PATH 28.1 32.1 NAME lan0 lan1 Now pkg2 is running on node ftsys9. Note that it is still disabled from switching.
Policy_Parameters: POLICY_NAME CONFIGURED_VALUE Failover min_package_node Failback automatic Script_Parameters: ITEM STATUS Resource up Subnet up Resource up Subnet up Resource up Subnet up Resource up Subnet up NODE_NAME manx manx burmese burmese tabby tabby persian persian NAME /resource/random 192.8.15.0 /resource/random 192.8.15.0 /resource/random 192.8.15.0 /resource/random 192.8.15.
NOTE: • All of the checks below are performed when you run cmcheckconf without any arguments (or with only -v, with or without -k or -K). cmcheckconf validates the current cluster and package configuration, including external scripts and pre-scripts for modular packages, and runs cmcompare to check file consistency across nodes. (This new version of the command also performs all of the checks that were done in previous releases.) See “Checking Cluster Components” (page 123) for details.
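For example, to run the full set of checks against the current cluster and package configuration:

# cmcheckconf -v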
Table 5 Verifying Cluster Components (continued) Component (Context) Tool or Command; More Information Comments • Same physical volumes on each node • Physical volumes connected on each node Volume groups (package) cmcheckconf (1m), cmapplyconf (1m) Checked only on nodes configured to run the package. LVM logical volumes (package) cmcheckconf (1m), cmapplyconf (1m) Checked for modular packages only, as part of package validation (cmcheckconf -P).
Table 5 Verifying Cluster Components (continued) Component (Context) Tool or Command; More Information Comments exist and are executable. Service commands whose paths are nested within an unmounted shared filesystem are not checked. IP addresses (cluster) cmcheckconf (1m), cmapplyconf (1m) Commands check that all IP addresses configured into the cluster are in each node's /etc/ hosts.
Limitations Serviceguard does not check the following conditions: • Access Control Policies properly configured • File systems configured to mount automatically on boot (that is, Serviceguard does not check /etc/fstab) • Shared volume groups configured to activate on boot • Volume group major and minor numbers unique • Redundant storage paths functioning properly • Kernel parameters and driver configurations consistent across nodes • Mount point overlaps (such that one file system is obscured w
1) Create a directory for the RAC toolkit configuration file. This directory should be the same as the one created on the existing cluster nodes. For example: mkdir /etc/cmcluster/
# vgchange -S y -c y /dev/vg_rac This command is issued from the configuration node only, and the cluster must be running on all nodes for the command to succeed. Note that both the -S and the -c options are specified. The -S y option makes the volume group shareable, and the -c y option causes the cluster ID to be written out to all the disks in the volume group. This command specifies the cluster that a node must be a part of to obtain shared access to the volume group.
1. 2. Ensure that the Oracle RAC database is not active on either node. From node 2, use the vgchange command to deactivate the volume group: # vgchange -a n /dev/vg_rac 3. From node 2, use the vgexport command to export the volume group: # vgexport -m /tmp/vg_rac.map.old /dev/vg_rac This dissociates the volume group from node 2. 4. From node 1, use the vgchange command to deactivate the volume group: # vgchange -a n /dev/vg_rac 5.
Adding Additional Shared LVM Volume Groups To add capacity or to organize your disk resources for ease of management, you may wish to create additional shared volume groups for your Oracle RAC databases. If you decide to use additional shared volume groups, they must conform to the following rules: • Volume groups should include different PV links to each logical unit on the disk array. • Volume group names must be the same on all nodes in the cluster.
Monitoring Hardware Good standard practice in handling a high-availability system includes careful fault monitoring so as to prevent failures if possible, or at least to react to them swiftly when they occur. The following should be monitored for errors or warnings of all kinds.
13. Start up the Oracle RAC instances on all nodes. 14. Activate automatic cluster startup. NOTE: As you add new disks to the system, update the planning worksheets (described in Appendix B: “Blank Planning Worksheets”), so as to record the exact configuration you are using. Replacing Disks The procedure for replacing a faulty disk mechanism depends on the type of disk configuration you are using and on the type of Volume Manager software.
6. Issue the following command to extend the logical volume to the newly inserted disk: # lvextend -m 1 /dev/vg_sg01 /dev/dsk/c2t3d0 7. Finally, use the lvsync command for each logical volume that has extents on the failed physical volume. This synchronizes the extents of the new disk with the extents of the other mirror.
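For example, assuming the logical volumes with extents on the replaced disk are lvol1 and lvol2 (hypothetical names):

# lvsync /dev/vg_sg01/lvol1
# lvsync /dev/vg_sg01/lvol2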
NOTE: After executing one of the commands above, any I/O queued for the device will restart. If the device replaced in step #2 was a mirror copy, then it will begin the resynchronization process that may take a significant amount of time to complete. The progress of the resynchronization process can be observed using the vgdisplay(1M), lvdisplay(1M) or pvdisplay(1M) commands.
the bus without harm.) When using inline terminators and Y cables, ensure that all orange-socketed termination packs are removed from the controller cards. NOTE: You cannot use inline terminators with internal FW/SCSI buses on D and K series systems, and you cannot use the inline terminator with single-ended SCSI buses. You must not use an inline terminator to connect a node to a Y cable. Figure 19 shows a three-node cluster with two F/W SCSI buses.
Replacement of I/O Cards After an I/O card failure, you can replace the card using the following steps. It is not necessary to bring the cluster down to do this if you are using SCSI inline terminators or Y cables at each node. 1. Halt the node by using Serviceguard Manager or the cmhaltnode command. Packages should fail over normally to other nodes. 2. Remove the I/O cable from the card. With SCSI inline terminators, this can be done without affecting the disks or other nodes on the bus. 3.
1. Use the cmgetconf command to obtain a fresh ASCII configuration file, as follows: # cmgetconf config.ascii 2. Use the cmapplyconf command to apply the configuration and copy the new binary file to all cluster nodes: # cmapplyconf -C config.ascii This procedure updates the binary file with the new MAC address and thus avoids data inconsistency between the outputs of the cmviewcl and lanscan commands. Monitoring RAC Instances The DB Provider provides the capability to monitor RAC databases.
6 Troubleshooting
Go to www.hp.com/go/hpux-serviceguard-docs, and then click HP Serviceguard. In the User Guide section, click on the latest Managing Serviceguard manual and see the “Troubleshooting your Cluster” chapter.
NOTE: All messages from cmgmsd are logged to /var/adm/syslog/syslog.log by default.
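For example, to review recent cmgmsd messages in that log (the filter and line count are only an illustration):
# grep cmgmsd /var/adm/syslog/syslog.log | tail -20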
A Software Upgrades
Serviceguard Extension for RAC (SGeRAC) software upgrades can be done in the following two ways:
• rolling upgrade
• non-rolling upgrade
Instead of an upgrade, moving to a new version can be done with:
• migration with cold install
Rolling upgrade is a feature of SGeRAC that allows you to perform a software upgrade on a given node without bringing down the entire cluster. SGeRAC supports rolling upgrades on version A.11.
• Each node must be running a version of HP-UX that supports the new SGeRAC version.
• Each node must be running a version of Serviceguard that supports the new SGeRAC version.
For more information on support, compatibility, and features for SGeRAC, refer to the Serviceguard Compatibility and Feature Matrix, located at www.hp.com/go/hpux-serviceguard-docs —> HP Serviceguard Extension for RAC.
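As a quick pre-check of what is currently installed on a node (a sketch only; bundle names vary by release, so adjust the filter if necessary):
# uname -r
# swlist -l bundle | grep -i serviceguard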
   # cmhaltnode
   b.
3. Select the SGeRAC bundle T1907BA while upgrading the node to HP-UX 11i v3 1109 HA-OE/DC-OE.
   After upgrading to HP-UX 11i v3 1109 HA-OE/DC-OE, Serviceguard, and SGeRAC A.11.20, you must install Serviceguard A.11.20 patch PHSS_42137 on the upgraded node. This patch allows an upgraded Serviceguard and SGeRAC A.11.20 node to join the existing Serviceguard A.11.19 cluster.
NOTE: If Serviceguard A.11.20 patch PHSS_42137 is not installed on the upgraded Serviceguard and SGeRAC A.11.
1. Perform a rolling upgrade to Serviceguard A.11.19.
2. Perform a rolling or offline upgrade to HP-UX 11i v3 1109 HA-OE/DC-OE with SGeRAC. Refer to the procedures described in the scenario “Upgrading from an existing Serviceguard A.11.19 cluster to HP-UX 11i v3 1109 HA-OE/DC-OE along with SGeRAC” (page 140) to perform a rolling upgrade from Serviceguard A.11.19 to HP-UX 11i v3 1109 HA-OE/DC-OE with SGeRAC.
NOTE: Using DRD utilities can significantly reduce the planned maintenance time to perform this upgrade.
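As an illustration of the DRD approach only, and not the official upgrade procedure: the target disk, depot path, and options below are placeholders that depend on your DRD version and software depot layout. The general idea is to clone the active root disk, install the new software onto the inactive clone, and activate the clone for the next reboot:
# drd clone -v -x overwrite=true -t /dev/disk/disk5
# drd runcmd swinstall -s /var/depots/update_depot T1907BA
# drd activate -x reboot=true
Because the installation happens on the inactive clone, the node keeps running its current software until the reboot that activates the clone.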
NOTE: It is optional to set this parameter to “1.” If you want the node to join the cluster at boot time, set this parameter to “1”; otherwise, set it to “0.”
6. Restart the cluster on the upgraded node (if desired). You can do this in Serviceguard Manager, or from the command line, issue the Serviceguard cmrunnode command (an example follows this procedure).
7. Start Oracle (Clusterware, RAC) software on the local node.
8. Repeat steps 1-7 on the other nodes, one node at a time until all nodes have been upgraded.
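For example, from the command line (the node name node1 is a placeholder), restarting the cluster on the upgraded node and checking its status might look like this:
# cmrunnode node1
# cmviewcl -v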
Figure 20 Running Cluster Before Rolling Upgrade
Step 1.
1. Halt Oracle (RAC, Clusterware) software on node 1.
2. Halt node 1. This will cause the node’s packages to start up on an adoptive node. You can do this in Serviceguard Manager, or from the command line issue the following:
   # cmhaltnode -f node1
   This will cause the failover package to be halted cleanly and moved to node 2. The Serviceguard daemon on node 1 is halted, and the result is shown in Figure 21.
NOTE: If you install Serviceguard and SGeRAC separately, Serviceguard must be installed before installing SGeRAC.
Figure 22 Node 1 Upgraded to SG/SGeRAC 11.16
Step 3.
1. If you prefer, restart the cluster on the upgraded node (node 1). You can do this in Serviceguard Manager, or from the command line issue the following:
   # cmrunnode node1
2. At this point, different versions of the Serviceguard daemon (cmcld) are running on the two nodes, as shown in Figure 23.
3. Start Oracle (Clusterware, RAC) software on node 1.
Step 4.
1. Halt Oracle (RAC, Clusterware) software on node 2.
2. Halt node 2. You can do this in Serviceguard Manager, or from the command line issue the following:
   # cmhaltnode -f node2
   This causes both packages to move to node 1. See Figure A-5.
3. Upgrade node 2 to Serviceguard and SGeRAC (A.11.16) as shown in Figure A-5.
4. When upgrading is finished, enter the following command on node 2 to restart the cluster on node 2:
   # cmrunnode node2
5. Start Oracle (Clusterware, RAC) software on node 2.
Figure 25 Running Cluster After Upgrades
Limitations of Rolling Upgrades
The following limitations apply to rolling upgrades:
• During a rolling upgrade, you should issue Serviceguard/SGeRAC commands (other than cmrunnode and cmhaltnode) only on a node containing the latest revision of the software. Performing tasks on a node containing an earlier revision of the software will not work or will cause inconsistent results.
For more information on support, compatibility, and features for SGeRAC, refer to the Serviceguard and Serviceguard Extension for RAC Compatibility and Feature Matrix, located at www.hp.com/go/hpux-serviceguard-docs —> HP Serviceguard Extension for RAC.
• You cannot delete Serviceguard/SGeRAC software (via swremove) from a node while the cluster is in the process of a rolling upgrade.
refer to the Serviceguard Compatibility and Feature Matrix, located at www.hp.com/go/hpux-serviceguard-docs —> HP Serviceguard Extension for RAC.
6. Recreate any user accounts needed for the cluster applications.
7. Recreate the network and storage configurations (set up stationary IP addresses and create the LVM volume groups and/or CVM disk groups required for the cluster).
8. Recreate the SGeRAC cluster.
9. Restart the cluster.
10. Reinstall the cluster applications, such as RAC.
11. Restore the data.
to A.11.19 in preparation for a rolling upgrade to A.11.20, continue with the following subsection that provides information on upgrading to A.11.19.
B Blank Planning Worksheets
This appendix reprints blank planning worksheets used in preparing the RAC cluster. You can duplicate any of these worksheets that you find useful and fill them in as a part of the planning process.
Instance 1 Redo Log: _____________________________________________________
Instance 2 Redo Log 1: _____________________________________________________
Instance 2 Redo Log 2: _____________________________________________________
Instance 2 Redo Log 3: _____________________________________________________
Instance 2 Redo Log: _____________________________________________________
Instance 2 Redo Log: _____________________________________________________
Data: System _________________________
Index
A
activation
    of volume groups in shared mode, 128
administration
    cluster and package states, 113
array
    replacing a faulty mechanism, 132, 133, 134
B
building a cluster
    CVM infrastructure, 52
building an RAC cluster
    displaying the logical volume infrastructure, 46
    logical volume infrastructure, 40
building logical volumes for RAC, 45
C
CFS, 47, 51
cluster
    state, 118
    status options, 116
Cluster Communication Network Monitoring, 35
cluster volume group
    creating physical volumes, 41
creating a storage i
    adding disk hardware, 131
    making changes to shared volume groups, 128
    monitoring hardware, 131
N
network
    status, 118
node
    halting status, 121
    in an RAC cluster, 12
    status and state, 116
non-rolling upgrade
    DRD, 149
O
online hardware maintenance by means of in-line SCSI terminators, 134
Online node addition and deletion, 126
Online reconfiguration, 126
opsctl.ctl
    Oracle demo database files, 45, 57
opslog.
T
temp.