Using Serviceguard Extension for RAC Version 11.
Legal Notices © Copyright 2011 Hewlett-Packard Development Company, L.P. Confidential computer software. Valid license from HP required for possession, use, or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor’s standard commercial license. The information contained herein is subject to change without notice.
Contents

Advantages of using SGeRAC
User Guide Overview
Where to find Documentation on the Web
1 Introduction to Serviceguard Extension for RAC
    What is a Serviceguard Extension for RAC Cluster?
    Mirroring and Resilvering
    Shared Storage Activation
    Listener
    Automated Startup and Shutdown
    Deleting CFS from the Cluster
    Creating a Storage Infrastructure with CVM
    Initializing the Veritas Volume Manager
    Using CVM 5.x or later
4 SGeRAC Toolkit for Oracle RAC 10g or later
    Introduction
    Background
    Coordinating the Oracle RAC/Serviceguard Extension for RAC stack
    Online Node Addition and Deletion
    Managing the Shared Storage
    Making LVM Volume Groups Shareable
    Making a Volume Group Unshareable
Index
Advantages of using SGeRAC HP Serviceguard Extension for RAC (SGeRAC) enhances the availability and simplifies the management of Oracle Real Application Clusters (RAC). SGeRAC allows you to integrate Oracle RAC into a Serviceguard cluster and to easily manage the dependency between Oracle Clusterware and Oracle RAC, with a full range of storage management options.
User Guide Overview This user guide covers how to use Serviceguard Extension for RAC (SGeRAC) to configure Serviceguard clusters for use with Oracle Real Application Clusters (RAC) software, on HP High Availability clusters running the HP-UX operating system. • Chapter 1— Introduction to Serviceguard Extension for RAC Describes a Serviceguard cluster and provides a roadmap for using this guide.
If you will be using Veritas Cluster Volume Manager (CVM) and Veritas Cluster File System (CFS) from Symantec with Serviceguard, refer to the HP Serviceguard Storage Management Suite Version A.03.01 for HP-UX 11i v3 Release Notes. These release notes describe suite bundles for the integration of HP Serviceguard A.11.20 on HP-UX 11i v3 with Symantec’s Veritas Storage Foundation.
Where to find Documentation on the Web
• SGeRAC Documentation
Go to www.hp.com/go/hpux-serviceguard-docs, and then click HP Serviceguard Extension for RAC.
• Related Documentation
Go to www.hp.com/go/hpux-serviceguard-docs, www.hp.com/go/hpux-core-docs, and www.hp.com/go/hpux-ha-monitoring-docs. The following documents contain additional useful information:
◦ Clusters for High Availability: a Primer of HP Solutions.
1 Introduction to Serviceguard Extension for RAC Serviceguard Extension for RAC (SGeRAC) enables the Oracle Real Application Cluster (RAC), formerly known as Oracle Parallel Server RDBMS, to run on HP high availability clusters under the HP-UX operating system. This chapter introduces Serviceguard Extension for RAC and shows where to find different kinds of information in this book.
With RAC on HP-UX, you can maintain a single database image that is accessed by the HP servers in parallel and gain added processing power without the need to administer separate databases. When properly configured, Serviceguard Extension for RAC provides a highly available database that continues to operate even if one hardware component fails. Group Membership Group membership allows multiple instances of RAC to run on each node. Related processes are configured into groups.
automatically transfer control of the package to another cluster node, allowing services to remain available with minimal interruption. • System multi-node packages. There are also packages that run on several cluster nodes at once and do not fail over. These are called system multi-node packages and multi-node packages. As of Serviceguard Extension for RAC A.11.18, the only system multi-node packages supported are those specified by Hewlett-Packard, but you can create your own multi-node packages.
that support Veritas CFS and CVM). For more information, see “About Veritas CFS and CVM from Symantec” (page 16). For information on configuring CFS and CVM with Serviceguard, refer to the latest edition of the Managing Serviceguard user’s guide at www.hp.com/go/hpux-serviceguard-docs —> HP Serviceguard. Package Dependencies When CFS is used as shared storage, the application and software using the CFS storage should be configured to start and stop using Serviceguard packages.
for up-to-date information at www.hp.com/go/hpux-serviceguard-docs —> HP Serviceguard Extension for RAC. CAUTION: Once you create the disk group and mount point packages, you must administer the cluster with CFS commands, including cfsdgadm, cfsmntadm, cfsmount, and cfsumount. You must not use the HP-UX mount or umount command to provide or remove access to a shared file system in a CFS environment. Using these HP-UX commands under these circumstances is not supported. Use cfsmount and cfsumount instead.
Overview of SGeRAC Cluster Interconnect Subnet Monitoring In SGeRAC, the Cluster Interconnect Subnet Monitoring feature is used to monitor cluster communication subnets. This feature requires the use of a package configuration parameter known as the CLUSTER_INTERCONNECT_SUBNET. It can be set up to monitor certain subnets used by applications that are configured as Serviceguard multi-node packages.
The following describes the behavior of the cluster interconnect subnet monitoring feature in these scenarios: • For a multi-node package with CLUSTER_INTERCONNECT_SUBNET configured, upon an explicit request to start the package on a node, no attempt is made to start the package instance on that node if the subnet is not up on that node.
adds the package IP address. For subsequent connections for clients configured with basic failover, clients would connect to the next available listener package's IP address and listener. Node Failure RAC cluster configuration is designed so that in the event of a node failure, another node with a separate instance of Oracle can continue processing transactions. Figure 3 shows a typical cluster with instances running on both nodes.
Figure 4 After Node Failure In the above figure, pkg1 and pkg2 are not instance packages. They are shown to illustrate the movement of the packages. Larger Clusters Serviceguard Extension for RAC supports clusters of up to 16 nodes. The actual cluster size is limited by the type of storage and the type of volume manager used. Up to Four Nodes with SCSI Storage You can configure up to four nodes using a shared F/W SCSI bus; for more than four nodes, FibreChannel must be used.
Figure 5 Four-Node RAC Cluster In this type of configuration, each node runs a separate instance of RAC and may run one or more high availability packages as well. The figure shows a dual Ethernet configuration with all four nodes connected to a disk array (the details of the connections depend on the type of disk array). In addition, each node has a mirrored root disk (R and R).
Figure 6 Eight-Node Cluster with EVA, XP or EMC Disk Array FibreChannel switched configurations also are supported using either an arbitrated loop or fabric login topology. For additional information about supported cluster configurations, refer to the HP 9000 Servers Configuration Guide, available through your HP representative.
4. Restart the Serviceguard cluster.
5. Restart Oracle Clusterware (for Oracle 10g, 11gR1, and 11gR2) and the Oracle RAC database instances on all nodes.
Use the following steps to disable the GMS authorization:
1. If the Oracle RAC database instances and Oracle Clusterware (for Oracle 10g, 11gR1, and 11gR2) are running, shut them down on all nodes.
2. Halt the Serviceguard cluster.
3. Edit /etc/opt/nmapi/nmutils.conf and comment out the GMS_USER[] settings on all nodes.
4. Restart the Serviceguard cluster.
Configuring Clusters with Serviceguard Manager You can configure clusters and packages in Serviceguard Manager. You must have root (UID=0) access to the cluster nodes.
2 Serviceguard Configuration for Oracle 10g, 11gR1, or 11gR2 RAC This chapter shows the additional planning and configuration that is needed to use Oracle Real Application Clusters 10g/11gR1/11gR2 with Serviceguard.
CSS Timeout When SGeRAC is on the same cluster as Oracle Cluster Software, the CSS timeout is set to a default value of 600 seconds (10 minutes) at Oracle software installation. This timeout is configurable with Oracle tools and should not be changed without ensuring that it allows enough time for Serviceguard Extension for RAC (SGeRAC) reconfiguration and for multipath reconfiguration (if configured) to complete.
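For example, you can check the current CSS timeout (misscount) and, if necessary, change it with the Oracle crsctl utility. This is a sketch only; the Clusterware home path is a placeholder and the exact syntax can vary by Oracle release:
# <CRS_HOME>/bin/crsctl get css misscount
# <CRS_HOME>/bin/crsctl set css misscount 600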
The file /var/opt/oracle/oravg.conf must not be present so Oracle Cluster Software will not activate or deactivate any shared storage. Multipathing Multipathing is automatically configured in HP-UX 11i v3 (this is often called native multipathing). Multipathing is supported through either SLVM pvlinks or CVM Dynamic Multipath (DMP). In some configurations, SLVM or CVM does not need to be configured for multipath as the multipath is provided by the storage array.
Manual Startup and Shutdown Manual listener startup and shutdown is supported through the following commands: srvctl and lsnrctl. Network Monitoring SGeRAC cluster provides network monitoring. For networks that are redundant and monitored by Serviceguard cluster, Serviceguard cluster provides local failover capability between local network interfaces (LAN) that is transparent to applications utilizing User Datagram Protocol (UDP) and Transport Control Protocol (TCP).
NOTE: srvctl and sqlplus are Oracle commands. Manual Startup and Shutdown Manual RAC instance startup and shutdown is supported through the following commands: srvctl or sqlplus. Shared Storage The shared storage must be available when the RAC instance is started, so ensure that the shared storage is activated beforehand. For SLVM, the shared volume groups must be activated; for CVM, the disk groups must be activated.
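As an illustration only, using the ver10 database name and SID prefix from the sample configuration later in this chapter (the instance name ver101 is a placeholder), a manual startup on one node might look like this after confirming that the shared volume group is active in shared mode:
# vgdisplay /dev/vg_rac | grep "VG Status"
$ srvctl start instance -d ver10 -i ver101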
The most common network configuration is to have all interconnect traffic for cluster communications go over a single redundant heartbeat network, so that Serviceguard monitors the network and resolves interconnect failures by cluster reconfiguration. The following are situations in which it is not possible to place all interconnect traffic on a single network: • RAC GCS (cache fusion) traffic may be very high, so an additional dedicated heartbeat network for Serviceguard needs to be configured.
disk arrays in RAID modes. The logical units of storage on the arrays are accessed from each node through multiple physical volume links (PV links, also known as alternate links), which provide redundant paths to each unit of storage. Fill out a Logical Volume worksheet to provide logical volume names for logical volumes that you will create with the lvcreate command. The Oracle DBA and the HP-UX system administrator should prepare this worksheet together. Create entries for shared volumes only.
worksheets. Make as many copies as you need. Fill out the worksheet and keep it for future reference.

ORACLE LOGICAL VOLUME WORKSHEET FOR LVM                         Page ___ of ____
===============================================================================
                            RAW LOGICAL VOLUME NAME             SIZE (MB)
Oracle Cluster Registry:    /dev/vg_rac/rora_ocr                100    (once per cluster)
Oracle Cluster Vote Disk:   /dev/vg_rac/rora_vote               20     (once per cluster)
Oracle Control File:        /dev/vg_rac/ropsctl1.
Prior to installing Serviceguard Extension for RAC, the following must be installed:
• Correct version of HP-UX
• Correct version of Serviceguard
To install Serviceguard Extension for RAC, use the following steps for each node:
NOTE: All nodes in the cluster must be either SGeRAC nodes or Serviceguard nodes. For the up-to-date version compatibility for Serviceguard and HP-UX, see the SGeRAC release notes for your version.
1. Mount the distribution media in the tape drive, CD, or DVD reader.
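The remaining installation steps use the HP-UX swinstall utility; a minimal sketch, assuming the depot is mounted at /cdrom (the mount point is a placeholder, and the SGeRAC bundle name to select is listed in the release notes for your version):
# swinstall -s /cdrom
Select the SGeRAC bundle in the swinstall interface and follow the prompts to complete the installation on each node.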
have legacy DSFs on some nodes and agile addressing on others—this allows you to migrate the names on different nodes at different times, if necessary. NOTE: The examples in this document use legacy naming conventions. About Cluster-wide Device Special Files (cDSFs) Under agile addressing on HP-UX 11i v3, each device has a unique identifier as seen from a given host; this identifier is reflected in the name of the Device Special File (DSF).
• cDSFs are not supported by CVM, CFS, or any other application that assumes DSFs reside only in /dev/disk and /dev/rdisk. • cDSFs do not support disk partitions. Such partitions can be addressed by a device file using the agile addressing scheme, but not by a cDSF. For more information about Cluster-wide Device Special Files (cDSFs), see the Managing Serviceguard, Eighteenth Edition manual at www.hp.com/go/hpux-serviceguard-docs —> HP Serviceguard .
NOTE: A package with the CLUSTER_INTERCONNECT_SUBNET parameter is available for both Modular and Legacy packages. A package with this parameter can be configured only when all nodes of the cluster are running SGeRAC version A.11.18 or higher. For more information, see the latest edition of the Managing Serviceguard Eighteenth Edition user guide at www.hp.com/go/hpux-serviceguard-docs —> HP Serviceguard.
NOTE: Starting with Serviceguard A.11.19, the faster failover capability is in core Serviceguard. This configuration can be used for faster failover. Figure 8 SG-HB/RAC-IC Traffic Separation Each primary and standby pair protects against a single failure. With the SG-HB on more than one subnet, a single subnet failure will not trigger a Serviceguard reconfiguration.
PACKAGE_TYPE                    MULTI_NODE
LOCAL_LAN_FAILOVER_ALLOWED      YES
NODE_FAIL_FAST_ENABLED          NO
DEPENDENCY_NAME                 CI-PACKAGE
DEPENDENCY_CONDITION            CI-PACKAGE=UP
DEPENDENCY_LOCATION             SAME_NODE

Oracle Cluster Interconnect Subnet Package: Package to monitor the CSS-HB subnet

PACKAGE_NAME                    CI-PACKAGE
PACKAGE_TYPE                    MULTI_NODE
LOCAL_LAN_FAILOVER_ALLOWED      YES
NODE_FAIL_FAST_ENABLED          YES
CLUSTER_INTERCONNECT_SUBNET     192.168.1.
NOTE: Do not configure CLUSTER_INTERCONNECT_SUBNET in the RAC instance package if the RAC-IC network is the same as CSS-HB network.
The following is an example of the relevant package configuration parameters:

Oracle RAC Instance Package

PACKAGE_NAME                    RAC_PACKAGE
PACKAGE_TYPE                    MULTI_NODE
LOCAL_LAN_FAILOVER_ALLOWED      YES
NODE_FAIL_FAST_ENABLED          NO
CLUSTER_INTERCONNECT_SUBNET     192.168.2.
NOTE:
1. The “F” represents the Serviceguard failover time as given by the max_reformation_duration field of cmviewcl -v -f line output.
2. SLVM timeout is documented in the whitepaper, LVM link and Node Failure Recovery Time.
Limitations of Cluster Communication Network Monitor
The Cluster Interconnect Monitoring feature does not coordinate with any feature handling subnet failures (including self).
NOTE: LVM version 2.x volume groups are also supported with Serviceguard. The steps shown in the following section are for configuring volume groups with LVM version 1.0 in Serviceguard clusters. For more information on using and configuring LVM version 2.x, see the HP-UX 11i Version 3: HP-UX System Administrator's Guide: Logical Volume Management located at www.hp.com/go/hpux-core-docs —> HP-UX 11i v3. For LVM version 2.
1. Set up the group directory for vgops: # mkdir /dev/vg_rac 2. Create a control file named group in the directory /dev/vg_rac, as follows: # mknod /dev/vg_rac/group c 64 0xhh0000 The major number is always 64, and the hexadecimal minor number has the form 0xhh0000 where hh must be unique to the volume group you are creating. Use the next hexadecimal number that is available on your system, after the volume groups that are already configured.
NOTE: With LVM 2.1 and above, mirror write cache (MWC) recovery can be set to ON for RAC Redo Logs and Control Files volumes. Example: # lvcreate -m 1 -M y -s g -n redo1.log -L 408 /dev/vg_rac NOTE: The character device file name (also called the raw logical volume name) is used by the Oracle DBA in building the RAC database. Creating Mirrored Logical Volumes for RAC Data Files Following a system crash, the mirrored logical volumes need to be resynchronized, which is known as “resilvering.
Logical volume “/dev/vg_rac/system.dbf” has been successfully created with character device “/dev/vg_rac/rsystem.dbf” Logical volume “/dev/vg_rac/system.dbf” has been successfully extended NOTE: The character device file name (also called the raw logical volume name) is used by the Oracle DBA in building the OPS database.
It is only necessary to do this with one of the device file names for the LUN. The -f option is only necessary if the physical volume was previously used in some other volume group. 4. Use the following to create the volume group with the two links: # vgcreate /dev/vg_rac /dev/dsk/c0t15d0 /dev/dsk/c1t3d0 LVM will now recognize the I/O channel represented by /dev/dsk/c0t15d0 as the primary link to the disk.
Table 1 Required Oracle File Names for Demo Database (continued) Logical Volume Name LV Size (MB) Raw Logical Volume Path Name Oracle File Size (MB)* opsdata2.dbf 208 /dev/vg_rac/ropsdata2.dbf 200 opsdata3.dbf 208 /dev/vg_rac/ropsdata3.dbf 200 opsspfile1.ora 5 /dev/vg_rac/ropsspfile1.ora 5 pwdfile.ora 5 /dev/vg_rac/rpwdfile.ora 5 opsundotbs1.dbf 508 /dev/vg_rac/ropsundotbs1.log 500 opsundotbs2.dbf 508 /dev/vg_rac/ropsundotbs2.log 500 example1.dbf 168 /dev/vg_rac/ropsexample1.
To set up the volume group on ftsys10 (and other nodes), use the following steps:
1. On ftsys9, copy the mapping of the volume group to a specified file.
# vgexport -s -p -m /tmp/vg_rac.map /dev/vg_rac
2. Still on ftsys9, copy the map file to ftsys10 (and to additional nodes as necessary).
# rcp /tmp/vg_rac.map ftsys10:/tmp/vg_rac.map
3. On ftsys10 (and other nodes, as necessary), create the volume group directory and the control file named group.
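A sketch of step 3 and the import that typically follows on ftsys10; the minor number 0xhh0000 is a placeholder, chosen as described earlier in this chapter:
# mkdir /dev/vg_rac
# mknod /dev/vg_rac/group c 64 0xhh0000
# vgimport -s -m /tmp/vg_rac.map /dev/vg_rac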
Storage Infrastructure with CVM” (page 53)) has information about configuring the Veritas Cluster Volume Manager (CVM) with other filesystems, not CFS. For more information, refer to your version of the Serviceguard Extension for RAC Release Notes and HP Serviceguard Storage Management Suite Release Notes located at www.hp.com/go/hpux-serviceguard-docs.
The following output will be displayed:
CLUSTER          STATUS
ever3_cluster    up

  NODE           STATUS       STATE
  ever3a         up           running
  ever3b         up           running

5. Configure the Cluster Volume Manager (CVM). Configure the system multi-node package, SG-CFS-pkg, to configure and start the CVM/CFS stack. The SG-CFS-pkg does not restrict heartbeat subnets to a single subnet and supports multiple subnets.
10. Activate the disk group. # cfsdgadm activate cfsdg1 11. Creating volumes and adding a cluster filesystem.
# cfsmount /cfs/mnt3 14. Check CFS mount points. # bdf | grep cfs /dev/vx/dsk/cfsdg1/vol1 10485760 36455 9796224 0% /cfs/mnt1 /dev/vx/dsk/cfsdg1/vol2 10485760 36455 9796224 0% /cfs/mnt2 /dev/vx/dsk/cfsdg1/vol3 614400 17653 559458 3% /cfs/mnt3 15. View the configuration.
# cfsmntadm delete /cfs/mnt3
The following output will be generated:
Mount point “/cfs/mnt3” was disassociated from the cluster
Cleaning up resource controlling shared disk group “cfsdg1”
Shared disk group “cfsdg1” was disassociated from the cluster.
NOTE: The disk group package is deleted if there is no dependency.
3. Delete disk group multi-node package.
# cfsdgadm delete cfsdg1
The following output will be generated:
Shared disk group “cfsdg1” was disassociated from the cluster.
IMPORTANT: Creating a rootdg disk group is only necessary the first time you use the Volume Manager. CVM 5.x or later does not require a rootdg. Using CVM 5.x or later This section has information on how to set up the cluster and the system multi-node package with CVM—without the CFS filesystem, on HP-UX releases that support them. See “About Veritas CFS and CVM from Symantec” (page 16). Preparing the Cluster and the System Multi-node Package for use with CVM 5.
mode: enabled: cluster active MASTER
master: ever3b
• Converting disks from LVM to CVM. Use the vxvmconvert utility to convert LVM volume groups into CVM disk groups. Before you can do this, the volume group must be deactivated, which means that any package that uses the volume group must be halted. This procedure is described in the Managing Serviceguard Eighteenth Edition user guide Appendix G.
• Initializing disks for CVM.
IMPORTANT: After creating these files, use the vxedit command to change the ownership of the raw volume files to oracle and the group membership to dba, and to change the permissions to 660. Example: # cd /dev/vx/rdsk/ops_dg # vxedit -g ops_dg set user=oracle * # vxedit -g ops_dg set group=dba * # vxedit -g ops_dg set mode=660 * The logical volumes are now available on the primary node, and the raw logical volume names can now be used by the Oracle DBA.
WARNING! The above file should never be edited. After the above command completes, start the cluster and create disk groups for shared use as described in the following sections. Starting the Cluster and Identifying the Master Node Run the cluster to activate the special CVM package: # cmruncl After the cluster is started, it will run with a special system multi-node package named VxVM-CVM-pkg that is on all nodes.
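Because shared disk groups must be created from the CVM master node, you can identify the master by running the following command on each node (it is also shown later in this guide); one node reports itself as the master:
# vxdctl -c mode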
NAME       STATE              ID
rootdg     enabled            971995699.1025.node1
ops_dg     enabled,shared     972078742.1084.node2

Creating Volumes
Use the vxassist command to create logical volumes. The following is an example:
# vxassist -g ops_dg make log_files 1024m
This command creates a 1024MB volume named log_files in a disk group named ops_dg. The volume can be referenced with the block device file /dev/vx/dsk/ops_dg/log_files or the raw (character) device file /dev/vx/rdsk/ops_dg/log_files.
Table 2 Required Oracle File Names for Demo Database (continued) Volume Name Size (MB) Raw Device File Name Oracle File Size (MB) ops1log3.log 128 /dev/vx/rdsk/ops_dg/ops1log3.log 120 ops2log1.log 128 /dev/vx/rdsk/ops_dg/ops2log1.log 120 ops2log2.log 128 /dev/vx/rdsk/ops_dg/ops2log2.log 120 ops2log3.log 128 /dev/vx/rdsk/ops_dg/ops2log3.log 120 opssystem.dbf 508 /dev/vx/rdsk/ops_dg/opssystem.dbf 500 opssysaux.dbf 808 /dev/vx/rdsk/ops_dg/opssysaux.dbf 800 opstemp.
For more detailed information on the package configuration process, refer to the Managing Serviceguard Eighteenth Edition user’s guide. Prerequisites for Oracle 10g, 11gR1, or 11gR2 (Sample Installation) The following sample steps prepare a SGeRAC cluster for Oracle 10g, 11gR1, or 11gR2. Refer to the Oracle documentation for Oracle installation details. 1. Create inventory groups on each node.
7. Create Oracle cluster software home directory. For installing Oracle cluster software on local file system, create the directories on each node. # mkdir -p /mnt/app/crs/oracle/product/10.2.0/crs # chown -R oracle:oinstall /mnt/app/crs/oracle/product/10.2.0/crs # chmod -R 775 /mnt/app/crs/oracle/product/10.2.0/crs 8. Create Oracle base directory (for RAC binaries on local file system). If installing RAC binaries on local file system, create the oracle base directory on each node.
# mkdir -p $ORACLE_BASE/oradata/ver10 # chown -R oracle:oinstall $ORACLE_BASE/oradata # chmod -R 755 $ORACLE_BASE/oradata The following is a sample of the mapping file for DBCA: system=/dev/vg_rac/ropssystem.dbf sysaux=/dev/vg_rac/ropssysaux.dbf undotbs1=/dev/vg_rac/ropsundotbs01.dbf undotbs2=/dev/vg_rac/ropsundotbs02.dbf example=/dev/vg_rac/ropsexample1.dbf users=/dev/vg_rac/ropsusers.dbf redo1_1=/dev/vg_rac/rops1log1.log redo1_2=/dev/vg_rac/rops1log2.log redo2_1=/dev/vg_rac/rops2log1.
NOTE: LVM version 2.x volume groups are also supported with Serviceguard. The steps shown in the following section are for configuring volume groups with LVM version 1.0 in Serviceguard clusters. For more information on using and configuring LVM version 2.x, see the HP-UX System Administrator's Guide: Logical Volume Management located at www.hp.com/go/hpux-core-docs —> HP-UX 11i v3. Installing Oracle 10g, 11gR1, or 11gR2 Cluster Software The following are sample steps for preparing an SGeRAC cluster for Oracle 10g, 11gR1, or 11gR2.
1. In this example, the path to ORACLE_HOME is on a local file system /mnt/app/oracle/product//db_1.
2. Select installation for database software only.
3. When prompted, run root.sh on each node.
Installing RAC Binaries on Cluster File System
Log on as the “oracle” user:
$ export ORACLE_BASE=/cfs/mnt1/oracle
$ export DISPLAY={display}:0.0
$ cd <10g/11g RAC installation disk directory>
$ ./runInstaller
Use the following guidelines when installing on a local file system: 1.
$ dbca
Use the following guidelines when installing on a local file system:
a. In this sample, the database name and SID prefix are ver10.
b. Select the storage option for raw devices.
5.0.31.5 Veritas ODM manual pages
VRTSodm.ODM-RUN 5.0.31.5 Veritas ODM commands
3. Check that libodm.sl is present:
# ll -L /opt/VRTSodm/lib/libodm.sl
output:
-r-xr-xr-x 1 root sys 94872 Aug 25 2009 /opt/VRTSodm/lib/libodm.sl
Configuring Oracle to Use Oracle Disk Manager Library
NOTE: The following steps are specific to CFS 4.1 or later.
1. Log in as the Oracle user.
2. Shut down the database.
3. Link the Oracle Disk Manager library into the Oracle home (see the sketch below).
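A minimal sketch of the link step; the Oracle library file name differs by Oracle release and platform (for example, libodm10.sl or libodm11.so), so the names below are illustrative only:
$ cd $ORACLE_HOME/lib
$ mv libodm11.so libodm11.so.orig
$ ln -s /opt/VRTSodm/lib/libodm.sl libodm11.so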
io mor cmp:    461063
io zro cmp:    2330
cl receive:    66145
cl ident:      18
cl reserve:    8
cl delete:     1
cl resize:     0
cl same op:    0
cl opt idn:    0
cl opt rsv:    332
**********:    17
3. Verify that the Oracle disk manager is loaded:
# kcmodule -P state odm
Output:
state loaded
4. In the alert log, verify the Oracle instance is running. The log should contain output similar to the following:
For CFS 4.1:
Oracle instance running with ODM: VERITAS 4.1 ODM Library, Version 1.1
For CFS 5.
Using Serviceguard Packages to Synchronize with Oracle 10g/11gR1/11gR2 RAC It is recommended to start and stop Oracle Cluster Software in a Serviceguard package—the Oracle Cluster Software will start after SGeRAC is started, and will stop before SGeRAC is halted. Serviceguard packages should also be used to synchronize storage activation and deactivation with Oracle Cluster Software and RAC instances. Preparing Oracle Cluster Software for Serviceguard Packages CRS starts all RAC instances by default.
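If the RAC instances are to be started and stopped by Serviceguard packages rather than by CRS, the management policy of each database is normally changed to manual, as described for the SGeRAC Toolkit later in this guide. An illustrative sketch (the database name is a placeholder):
$ srvctl modify database -d <database_name> -y MANUAL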
• Storage Activation (CFS) When the Oracle Cluster Software required storage is configured on a Cluster File System (CFS), the Serviceguard package should be configured to depend on the CFS multi-node package through package dependency. With package dependency, the Serviceguard package that starts Oracle Cluster Software will not run until its dependent CFS multi-node package is up and will halt before the CFS multi-node package is halted.
3 Support of Oracle RAC ASM with SGeRAC Introduction This chapter discusses the use of the Oracle 10g Release 2 (10g R2) and 11g Release 1 (11g R1) database server feature called Automatic Storage Management (ASM) in configurations of HP Serviceguard Extension for Real Application Clusters (SGeRAC). We begin with a brief review of ASM—functionality, pros, cons, and method of operation. Then, we look in detail at how we configure ASM with SGeRAC (version A.11.17 or later is required).
for specific types of disk arrays. Other advantages of the "ASM-over-SLVM" configuration are as follows: • ASM-over-SLVM ensures that the HP-UX devices used for disk group members will have the same names (the names of logical volumes in SLVM volume groups) on all nodes, easing ASM configuration. • ASM-over-SLVM protects ASM data against inadvertent overwrites from nodes inside/outside the cluster.
Figure 10 1-1 mapping between SLVM logical and physical volumes for ASM configuration 4 If the LVM patch PHKL_36745 (or equivalent) is installed in the cluster, a timeout equal to (2* PV timeout) will suffice to try all paths. The SLVM volume groups are marked as shared volume groups and exported across the SGeRAC cluster using standard SGeRAC procedures.
# vgextend /dev/vgora_asm /dev/dsk/c10t0d1 # vgextend /dev/vgora_asm /dev/dsk/c10t0d2 2. For each of the two PVs, create a corresponding LV. • Create an LV of zero length. • Mark the LV as contiguous. • Extend each LV to the maximum size possible on that PV (the number of extents available in a PV can be determined via vgdisplay -v ). • Configure LV timeouts, based on the PV timeout and number of physical paths, as described in the previous section.
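The bulleted sub-steps above might look like the following sketch; the logical volume name, the extent count (taken from vgdisplay -v), and the timeout value are placeholders:
Create a zero-length LV and mark it contiguous:
# lvcreate -n asmlv1 /dev/vgora_asm
# lvchange -C y /dev/vgora_asm/asmlv1
Extend the LV to the maximum size available on its PV, then set the LV timeout:
# lvextend -l <free_extents_on_PV> /dev/vgora_asm/asmlv1 /dev/dsk/c10t0d1
# lvchange -t 60 /dev/vgora_asm/asmlv1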
Step 2 remains the same. Logical volumes are prepared for the new disks in the same way. In step 3, switch the volume group back to shared mode, using SNOR, and export the VG across the cluster, ensuring that the right ownership and access rights are assigned to the raw logical volumes. Activate the volume group, and restart ASM and the database(s) using ASM-managed storage on all nodes (they are already active on node A).
The advantages of the "ASM-over-SLVM" configuration are as follows: • ASM-over-SLVM ensures that the HP-UX devices used for disk group members will have the same names (the names of logical volumes in SLVM volume groups) on all nodes, easing ASM configuration. • ASM-over-SLVM protects ASM data against inadvertent overwrites from nodes inside/outside the cluster.
Figure 11 1-1 mapping between SLVM logical and physical volumes for ASM configuration The SLVM volume groups are marked as shared volume groups and exported across the SGeRAC cluster using standard SGeRAC procedures. Please note that, for the case in which the SLVM PVs being used by ASM are disk array LUs, the requirements in this section do not place any constraints on the configuration of the LUs.
• Extend each LV to the maximum size possible on that PV (the number of extents available in a PV can be determined via vgdisplay -v ) • Configure LV timeouts, based on the PV timeout and number of physical paths, as described in the previous section. If a PV timeout has been explicitly set, its value can be displayed via pvdisplay -v. If not, pvdisplay will show a value of default, indicating that the timeout is determined by the underlying disk driver.
or later) to support ASM on raw disks/disk array LUs. In HP-UX 11i v3, a new DSF format is introduced. SGeRAC supports the DSF format that ASM supports, with the restriction that the native multipathing feature is enabled. The advantages of “ASM-over-raw” are as follows: • There is a small performance improvement from one less layer of volume management. • Online disk management (adding disks, deleting disks) is supported with ASM-over-raw.
of the volume groups is to first shut down the ASM instance and its clients (including all databases that use ASM based storage) on that node. The major implications of this behavior include the following: • Many SGeRAC customers use SGeRAC packages to start and shut down Oracle RAC instances. In the startup and shutdown sequences, the package scripts activate and deactivate the SLVM volume groups used by the instance.
Additional Documentation on the Web and Scripts 80 • Oracle Clusterware Installation Guide 11g Release 1 (11.1) for HP-UX at www.oracle.com/ pls/db111/portal.portal_db?selected=11&frame= → HP-UX Installation Guides → Clusterware Installation Guide for HP-UX • ASM related sections in Oracle Manuals ◦ Oracle® Database Administrator's Guide 10g R2 (10.2) at www.oracle.com/pls/db102/ portal.
4 SGeRAC Toolkit for Oracle RAC 10g or later Introduction This chapter discusses how Serviceguard Extension for RAC Toolkit enables a new framework for the integration of Oracle 10g Release 2 (10.2.0.1) or later versions of Real Application Clusters (Oracle RAC) with HP Serviceguard Extension for Real Application Clusters A.11.17 or later (SGeRAC). SGeRAC Toolkit leverages the multi-node package and simple package dependency features introduced by HP Serviceguard (SG) A.11.
The responsibilities of Oracle Clusterware in this combined environment include the following: • Management of the database and associated resources (database instances, services, virtual IP addresses (VIPs), listeners, etc.). • Management of Oracle ASM instances, if configured. All pieces of the combined stack must start up and shut down in the proper sequence and we need to be able to automate the startup and shutdown sequences, if desired.
With these improvements, it became possible, using SGeRAC packages, to meet the sequencing requirements mentioned above for the startup and shutdown of Oracle Clusterware and RAC database instances with respect to the SGeRAC-managed storage used by these entities. Serviceguard/Serviceguard Extension for RAC multi-node packages and simple package dependencies The new features in SG/SGeRAC that enable the framework provided by the SGeRAC Toolkit are multi-node packages (MNPs) and simple package dependencies.
• The output of cmviewcl shows the current state of each dependency on each node where the package is configured. • A failover or multi-node package may define dependencies on multiple multi-node packages. Multiple failover or multi-node packages may depend on a multi-node package. Multi-level dependencies can exist; for example, A depends on B which in turn depends on C, etc. • If A depends on B and B fails, A (the appropriate instance, if A is of type multi-node) is halted.
Clusterware as one MNP and each RAC database as another MNP and we set up the database MNPs to depend on the Oracle Clusterware MNP. This is the core concept of SGeRAC. Both Oracle Clusterware and the RAC database are multi-instance applications well suited to being configured as MNPs. Further, the use of MNPs reduces the total package count and simplifies SGeRAC package configuration and administration.
Figure 12 Resources managed by SGeRAC and Oracle Clusterware and their dependencies Startup and shutdown of the combined Oracle RAC-SGeRAC stack The combined stack is brought up in proper order by cmrunnode or cmruncl as follows. 1. 2. 3. 4. 5. SGeRAC starts up. The SGeRAC package manager starts up Oracle Clusterware via the Oracle Clusterware MNP, ensuring that the storage needed is made available first.
Next, SGeRAC package manager shuts down Oracle Clusterware via the Oracle Clusterware MNP, followed by the storage needed by Oracle Clusterware (this requires subsequent shutdown of mount point and disk group MNPs in the case of the storage needed by Oracle Clusterware being managed by CFS). It can do this since the dependent RAC database instance MNP is already down. Before shutting itself down, Oracle Clusterware shuts down the ASM instance if configured, and then the node applications.
without using cmhaltpkg. The service that invokes the function fails at this point and the SGeRAC package manager fails the corresponding ASMDG MNP and the RAC MNP that is dependent on ASMDG MNP. How Serviceguard Extension for RAC Toolkit starts, stops, and checks the RAC database instance Next, the toolkit interaction with the RAC database is discussed.
In this case, Oracle Clusterware quorum and voting disk and RAC database files are stored in raw logical volumes managed by SLVM or CVM. The management of SLVM or CVM storage for Oracle Clusterware and database is specified in the package configuration of the respective MNPs. Use Case 2: Oracle Clusterware storage and database storage in CFS Figure 14 Use Case 2 Setup In this case, Oracle Clusterware quorum and registry device data is stored in files in a CFS.
Use case 3: Database storage in ASM over SLVM Figure 15 Use Case 3 Setup The above diagram can be considered as one use case. Here we have one Oracle Clusterware MNP, three ASMDG MNPs, and four RAC database MNPs. All the ASMDG MNPs should be made dependent on the Oracle Clusterware MNP. Disk groups that are exclusively used by a RAC database should be managed in a separate ASMDG MNP. If different RAC databases use different ASM disk groups, then those ASM DGs should not be configured in a single ASMDG MNP.
1. 2. 3. 4. Make sure the MAINTENANCE_FLAG parameter for Oracle Clusterware MNP, is set to yes when these packages are created. If not, shutdown the MNPs first, set the MAINTENANCE_FLAG to yes, and then restart MNPs. On the maintenance node, create a debug file called oc.debug in the Oracle Clusterware MNP working directory. All the three MNPs on this node will go into maintenance mode. The maintenance mode message will appear in the Toolkit package log files, e.g.
Serviceguard Extension for RAC Toolkit internal file structure There is a set of files in SGeRAC that deal with SGeRAC specific configuration and logic, and a different set of files that deal with Oracle Clusterware, ASMDG MNP, and RAC specific logic, with a bridge in between. On the SGeRAC-specific side is the MNP ASCII configuration file and the control script (for legacy packages), or module script (for modular packages).
Figure 17 Internal structure of SGeRAC for ASMDG MNP Figure 18 Internal structure of SGeRAC for RAC DB instance Support for the SGeRAC Toolkit NOTE: The content in this chapter was previously in the SGeRAC Toolkit README file. CONTENTS: A. Overview B. SGeRAC Toolkit Required Software C. SGeRAC Toolkit File Structure D. SGeRAC Toolkit Files E. SGeRAC Toolkit Configuration E-1 Package Configuration File Parameters E-2 Toolkit Configuration File Parameters F. SGeRAC Toolkit Configuration Procedures G.
J. SGeRAC Toolkit Limitation/Restriction K. SGeRAC Toolkit Legacy Package to Modular Package Migration L. Migration of Legacy CFS Disk group and Mount point Packages to Modular CFS Disk group and Mount point Packages (CFS DG-MP). M. SGeRAC Toolkit Adding new ASMDG MNP Package to the existing configured OC MNP and RAC MNP N. SGeRAC Toolkit Package Cleanup O.
 -----------                 -----------
|           |               |           |
|  OC-MNP   |<--------------|  RAC-MNP  |
|           |               |           |
 -----------                 -----------

The SLVM volume groups used for Oracle Clusterware storage are configured in the OC-MNP package. The SLVM volume groups used for RAC database storage are configured in the RAC-MNP package.
2.
The SLVM Volume groups used for Oracle Clusterware storage are configured in the OC-MNP package. The SLVM Volume groups used for RAC database storage are configured in the ASMDG MNP package.
In case of ASM over HP-UX raw disks:
Do not specify any HP-UX raw disk information either in the OC-MNP package or in the ASMDG MNP package.
B. SGeRAC Toolkit Required Software
To configure and run this version of SGeRAC Toolkit, the following software is required:
- HP-UX 11i v2 or HP-UX 11i v3
- SG and SGeRAC A.11.
oc_gen.sh - script to get Toolkit parameters from the SG configuration database and generate the Toolkit configuration file oc.conf at OC MNP configuration time. It is called by the OC MNP module script only. rac_gen.sh - script to get Toolkit parameters from the SG configuration database and generate the Toolkit configuration file rac_dbi.conf at RAC MNP configuration time. It is called by the RAC MNP module script only. asmdg_gen.
E-1-1: Modular package configuration file parameters: For modular packages, it is not necessary to create a package script file, and the package configuration file template can be created by running the Serviceguard command "cmmakepkg -m sg/multi_node_all -m [-t ]". For the OC MNP: ----------package_name Set to any name desired for the OC MNP.
Set by default to 300 dependency_name, dependency_condition, dependency_location If CVM or CFS is used for managing the storage of the Oracle Clusterware, and Serviceguard Disk Group (DG) MNP and Mount Point (MP) MNP are used to handle the disk group and file system mount point, configure a dependency for the corresponding DG MNP (for CVM) or MP MNP (for CFS).
service_fail_fast_enabled Set by default to no service_halt_timeout Default value is 300 dependency_name, dependency_condition, dependency_location Configure a dependency on the OC MNP. For example, DEPENDENCY_NAME DEPENDENCY_CONDITION DEPENDENCY_LOCATION OC-MNP-name OC-MNP-PKG=UP SAME_NODE For the RAC MNP: -----------package_name Set to any name desired for the RAC MNP. package_type Set by default to multi_node.
service_cmd Set by default to "$SGCONF/scripts/sgerac/erac_tk_rac.sh rac_check" service_restart Set by default to none service_fail_fast_enabled Set by default to no service_halt_timeout Default value is 300 dependency_name, dependency_condition, dependency_location Configure a dependency on the OC MNP.
and Serviceguard Disk Group (DG) MNP and Mount Point (MP) MNP are used to handle the disk group and file system mount point, configure a dependency for the corresponding DG MNP (for CVM) or MP MNP (for CFS).
cluster join or on demand. LOCAL_LAN_FAILOVER_ALLOWED Set by default to YES to allow cluster to switch LANs locally in the event of a failure. NODE_FAIL_FAST_ENABLED Set by default to NO. RUN_SCRIPT, HALT_SCRIPT Set to the package control script. RUN_SCRIPT_TIMEOUT, HALT_SCRIPT_TIMEOUT Default value is 600 seconds for a 4 node cluster. This value is suggested as an initial value. It may need to be tuned for your environment.
- set SERVICE_NAME[0] to the name of service specified in the ASCII configuration file - set SERVICE_CMD[0] to "/toolkit_oc.sh check" - set SERVICE_RESTART[0] to "" In the function customer_defined_run_cmds: - start Oracle Clusterware using the command: /toolkit_oc.sh start In the function customer_defined_halt_cmds: - stop Oracle Clusterware using the command: /toolkit_oc.
created manually on that node. During this maintenance period the Oracle Clusterware process checking is paused. Even if Oracle Clusterware is brought down on the local node, the OC MNP on that node will not be halted. When the OC MNP is in maintenance mode and the RAC MNP maintenance mode is enabled, the corresponding RAC MNP on the same node will be in maintenance mode as well regardless of the presence of its maintenance debug file.
instance status by the MNP. ORA_RESTART_TIMEOUT Time interval in seconds (default 180) for the Toolkit script to wait for Oracle to restart an instance which is terminated prematurely before exiting the package. ORA_SHUTDOWN_TIMEOUT Time interval in seconds (default 120) for the Toolkit script to wait for the Oracle abort to complete before killing the Oracle background processes.
For Oracle 11g: : $ORACLE_HOME/bin/srvctl modify database -d -y MANUAL F-2. OC MNP creation procedures [For Modular Package]: 1. On one node of the cluster, create an OC MNP working directory under /etc/cmcluster. The following step requires root privilege. : mkdir /etc/cmcluster/OCMNP-Dir 2. Go to step 3 if you don't want to test the oc.conf file before configuring OC MNP.
Cluster Synchronization Services appears healthy Cluster Ready Services appears healthy Event Manager appears healthy For Oracle 11g R2, messages like the following should be seen: CRS-4638: Oracle High Availability Services is online CRS-4537: Cluster Ready Services is online CRS-4529: Cluster Synchronization Services is online CRS-4533: Event Manager is online The RAC instances should not be running.
NOTE: To configure another ASMDG MNP package to manage the ASM disk group used by a different RAC Database, repeat the steps in F-6 and F-7. F-8. RAC MNP creation procedures [For Modular Package]: 1. On one node of the cluster, create a RAC MNP working directory under /etc/cmcluster. : mkdir /etc/cmcluster/YourOwn-RACMNP-Dir 2. Go to step 3 if you don't want to test the rac_dbi.conf before configuring the RAC MNP.
on a specified node. To halt the MNP, use cmhaltpkg to halt all instances, or use cmhaltpkg with the option "-n nodeName" to halt a single instance on a specified node. If the package configuration parameter AUTO_RUN is set to yes, the MNP will be started automatically when the cluster starts up or when the node joins the cluster.
Answer 2: Oracle automatically creates one database service when the database is created. For many installations, using the default service is sufficient and the default service is always started and does not require extra steps to start. If a user needs more flexibility in the management of the workload using the database, and creates some additional services, these services can be started using the SGeRAC Toolkit in 2 ways: 1.
packages, you may delete the legacy packages using the cmdeleteconf command. L. Migration of Legacy CFS Disk group and Mount point Packages to Modular CFS Disk group and Mount point Packages(CFS DG-MP). Beginning with the SG A.11.20 patch PHSS_41628 and SG CFS A.11.20 patch PHSS_41674, new modular CFS Disk group and Mount point feature has been introduced. It will allow to consolidate all disk group and mount point packages for an application into a single modular package.
: cmdeleteconf -p < RAC MNP > : cmdeleteconf -p < OC MNP > 13. Delete all legacy style Disk group MNPs and Mount Point MNPs from cluster : cmdeleteconf -p < legacy MP MNP > : cmdeleteconf -p < legacy DG MNP > 14. Apply and run both modular CFS DG-MP packages for Oracle Clusterware and RAC database storage created in step number [1] and [3] : cmapplyconf -P < OC-DGMP-MNP configuration file > : cmapplyconf -P < RAC-DGMP-MNP configuration file > : cmrunpkg < OC-DGMP-MNP > < RAC-DGMP-MNP > 15.
HP has published a whitepaper "Use of Serviceguard Extension For RAC Toolkit with Oracle RAC 10g Release 2 or later" that contains the SGeRAC Toolkit background and operation information. This whitepaper is posted on http://www.hp.com/go/hpux-serviceguard-docs -> Serviceguard Extension for RAC. Please note that this whitepaper has not been updated to include documentation for the new ASMDG MNP.
5 Maintenance This chapter includes information about carrying out routine maintenance on a Real Application Cluster configuration. Starting with version SGeRAC A.11.17, all log messages from cmgmsd log to /var/adm/syslog/syslog.log by default. As presented here, these tasks differ in some details from the similar tasks described in the Managing Serviceguard documentation.
CLUSTER        STATUS
cluster_mo     up

  NODE         STATUS       STATE
  minie        up           running

  Quorum_Server_Status:
  NAME         STATUS       STATE
  white        up           running

  Network_Parameters:
  INTERFACE    STATUS       PATH           NAME
  PRIMARY      up           0/0/0/0        lan0
  PRIMARY      up           0/8/0/0/4/0    lan1
  STANDBY      up           0/8/0/0/6/0    lan3

  NODE         STATUS       STATE
  mo           up           running

  Quorum_Server_Status:
  NAME         STATUS       STATE
  white        up           running

  Network_Parameters:
  INTERFACE    STATUS       PATH           NAME
  PRIMARY      up           0/0/0/0        lan0
  PRIMARY      up           0/8/0/0/4/0    lan1
  STANDBY      up           0/8/0/0/6/0    lan3

MULTI_NODE_PACKAGES

  PACKAGE      SG-CF

  NODE_NAME    STATUS
  mo           up

    Dependency_Parameters:
    DEPENDENCY_NAME
    SG-CFS-pkg

  PACKAGE      STATUS
  SG-CFS-MP-1  up

    NODE_NAME  STATUS
    minie      up

      Dependency_Parameters:
      DEPENDENCY_NAME
      SG-CFS-DG-1

    NODE_NAME  STATUS
    mo         up

      Dependency_Parameters:
      DEPENDENCY_NAME
      SG-CFS-DG-1

  PACKAGE      STATUS
  SG-CFS-MP-2  up

    NODE_NAME  STATUS
    minie      up

      Dependency_Parameters:
      DEPENDENCY_NAME
      SG-CFS-DG-1

    NODE_NAME  STATUS
    mo         up

      Dependency_Parameters:
      DEPENDENCY_NAME
      SG-CFS-DG-1

  PACKAGE      STATUS
  SG-CFS-MP-3  up

    NODE_NAME  STATUS
    minie      up

      Dependenc
Cluster Status The status of a cluster may be one of the following: • Up. At least one node has a running cluster daemon, and reconfiguration is not taking place. • Down. No cluster daemons are running on any cluster node. • Starting. The cluster is in the process of determining its active membership. At least one cluster daemon is running. • Unknown. The node on which the cmviewcl command is issued cannot communicate with other nodes in the cluster.
Package Switching Attributes Packages also have the following switching attributes: • Package Switching. Enabled—the package can switch to another node in the event of failure. • Switching Enabled for a Node. Enabled—the package can switch to the referenced node. Disabled—the package cannot switch to the specified node until the node is enabled for the package using the cmmodpkg command. Every package is marked Enabled or Disabled for each node that is either a primary or adoptive node for the package.
Network Status The network interfaces have only status, as follows: • Up. • Down. • Unknown—Whether the interface is up or down cannot be determined. This can happen when the cluster is down. A standby interface has this status. NOTE: Serial Line Status has been de-supported as of Serviceguard A.11.18.
ftsys10        up           running

   Network_Parameters:
   INTERFACE    STATUS       PATH
   PRIMARY      up           28.1
   STANDBY      up           32.
SYSTEM_MULTI_NODE_PACKAGES:

PACKAGE          STATUS       STATE
VxVM-CVM-pkg     up           running

  NODE           STATUS       STATE
  ftsys8         down         halted

  NODE           STATUS       STATE
  ftsys9         up           running

   Script_Parameters:
   ITEM      STATUS   MAX_RESTARTS   RESTARTS   NAME
   Service   up       0              0          VxVM-CVM-pkg.
    NODE_TYPE    STATUS       SWITCHING    NAME
    Primary      up           enabled      ftsys10
    Alternate    up           enabled      ftsys9       (current)

NODE           STATUS       STATE
ftsys10        up           running

   Network_Parameters:
   INTERFACE    STATUS       PATH         NAME
   PRIMARY      up           28.1         lan0
   STANDBY      up           32.1         lan1

Now pkg2 is running on node ftsys9. Note that it is still disabled from switching.
PKG3           down         halted       enabled      unowned

    Policy_Parameters:
    POLICY_NAME     CONFIGURED_VALUE
    Failover        min_package_node
    Failback        automatic

    Script_Parameters:
    ITEM       STATUS    NODE_NAME    NAME
    Resource   up        manx         /resource/random
    Subnet     up        manx         192.8.15.0
    Resource   up        burmese      /resource/random
    Subnet     up        burmese      192.8.15.0
    Resource   up        tabby        /resource/random
    Subnet     up        tabby        192.8.15.0
    Resource   up        persian      /resource/random
    Subnet     up        persian      192.8.15.
NOTE: • All of the checks below are performed when you run cmcheckconf without any arguments (or with only -v, with or without -k or -K). cmcheckconf validates the current cluster and package configuration, including external scripts and pre-scripts for modular packages, and runs cmcompare to check file consistency across nodes. (This new version of the command also performs all of the checks that were done in previous releases.) See “Checking Cluster Components” (page 125) for details.
NOTE: The table includes all the checks available as of A.11.20, not just the new ones.
Table 5 Verifying Cluster Components (continued)

Component (Context): File consistency (cluster)

Tool or Command; More Information: cmcheckconf (1m), cmcompare (1m). IMPORTANT: See the manpage for differences in return codes from cmcheckconf without options versus cmcheckconf -C.

Comments: To check file consistency across all nodes in the cluster, do the following:
1. Customize /etc/cmcluster/cmfiles2check
2. Distribute it to all nodes using cmsync (1m)
3.
NOTE: The job must run on one of the nodes in the cluster. Because only the root user can run cluster verification, and cron (1m) sets the job’s user and group ID’s to those of the user who submitted the job, you must edit the file /var/spool/cron/crontabs/root as the root user. Example The short script that follows runs cluster verification and sends an email to admin@hp.com when verification fails. #!/bin/sh cmcheckconf -v >/tmp/cmcheckconf.
Use the following steps for adding a node using online node reconfiguration (a command sketch follows the list):
1. Export the mapfile for the volume groups that need to be visible on the new node (vgexport -s -m mapfile -p).
2. Copy the mapfile to the new node.
3. Import the volume groups into the new node (vgimport -s -m mapfile).
4. Add the node to the cluster online—edit the cluster configuration file to add the node details and run cmapplyconf.
5. Make the new node join the cluster (cmrunnode) and run the services.
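A sketch of steps 1 through 3 for a volume group named vg_rac, assuming the new node is reachable as newnode (the names are placeholders); this mirrors the vgexport/vgimport procedure shown earlier in this guide:
# vgexport -s -p -m /tmp/vg_rac.map /dev/vg_rac
# rcp /tmp/vg_rac.map newnode:/tmp/vg_rac.map
On the new node, after creating the volume group directory and group file as shown earlier:
# vgimport -s -m /tmp/vg_rac.map /dev/vg_rac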
Making a Volume Group Unshareable
Use the following steps to unmark a previously marked shared volume group:
1. Remove the volume group name from the ASCII cluster configuration file.
2. Enter the following command:
# vgchange -S n -c n /dev/volumegroup
The above example marks the volume group as non-shared, and not associated with a cluster.
Activating an LVM Volume Group in Shared Mode
Activation and deactivation of shared volume groups is normally done through a control script.
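If you need to activate or deactivate a shared volume group manually, a sketch using the vg_rac volume group from earlier examples looks like the following; the -a s option activates the volume group in shared mode and -a n deactivates it:
# vgchange -a s /dev/vg_rac
# vgchange -a n /dev/vg_rac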
4. From node 1, use the vgchange command to deactivate the volume group:
# vgchange -a n /dev/vg_rac
5. Use the vgchange command to mark the volume group as unshareable:
# vgchange -S n -c n /dev/vg_rac
6. Prior to making configuration changes, activate the volume group in normal (non-shared) mode:
# vgchange -a y /dev/vg_rac
7. Use normal LVM commands to make the needed changes.
8. Be sure to set the raw logical volume device file's owner to oracle and group to dba with a mode of 660.
If you are adding or removing shared LVM volume groups, make sure that you modify the cluster configuration file and any package control script that activates and deactivates the shared LVM volume groups. Changing the CVM Storage Configuration To add new CVM disk groups, the cluster must be running. If you are creating new CVM disk groups, be sure to determine the master node on which to do the creation by using the following command: # vxdctl -c mode One node will identify itself as the master.
• LAN cards • Power sources • All cables • Disk interface cards Some monitoring can be done through simple physical inspection, but for the most comprehensive monitoring, you should examine the system log file (/var/adm/syslog/syslog.log) periodically for reports on all configured HA devices. The presence of errors relating to a device will show the need for maintenance.
NOTE: As you add new disks to the system, update the planning worksheets (described in Appendix B: “Blank Planning Worksheets”), so as to record the exact configuration you are using. Replacing Disks The procedure for replacing a faulty disk mechanism depends on the type of disk configuration you are using and on the type of Volume Manager software.
Online Replacement of a Mechanism in an HA Enclosure Configured with Shared LVM (SLVM) If you are using software mirroring for shared concurrent activation of Oracle RAC data with MirrorDisk/UX and the mirrored disks are mounted in a high-availability disk enclosure, use the following LVM command options to change/replace disks via OLR (On Line Replacement). NOTE: This procedure supports either LVM or SLVM VG and is “online” (activated) and uses an “online disk replacement” mechanism.
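A typical command sequence, offered only as a sketch; the device file c2t3d0 is a placeholder for the failed mirror copy, and vg_rac is the volume group used in earlier examples:
# pvchange -a N /dev/dsk/c2t3d0
Physically replace the disk mechanism, then restore the LVM configuration to the new disk and re-attach it:
# vgcfgrestore -n /dev/vg_rac /dev/rdsk/c2t3d0
# pvchange -a y /dev/dsk/c2t3d0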
NOTE: After executing one of the commands above, any I/O queued for the device will restart. If the device replaced in step #2 was a mirror copy, then it will begin the resynchronization process that may take a significant amount of time to complete. The progress of the resynchronization process can be observed using the vgdisplay(1M), lvdisplay(1M) or pvdisplay(1M) commands.
termination. (Nodes attached to the middle of a bus using a Y cable also can be detached from the bus without harm.) When using inline terminators and Y cables, ensure that all orange-socketed termination packs are removed from the controller cards. NOTE: You cannot use inline terminators with internal FW/SCSI buses on D and K series systems, and you cannot use the inline terminator with single-ended SCSI buses. You must not use an inline terminator to connect a node to a Y cable.
7. 8. Reconnect power and reboot the node. If AUTOSTART_CMCLD is set to 1 in the /etc/ rc.config.d/cmcluster file, the node will rejoin the cluster. If necessary, move packages back to the node from their alternate locations and restart them. Replacement of I/O Cards After an I/O card failure, you can replace the card using the following steps. It is not necessary to bring the cluster down to do this if you are using SCSI inline terminators or Y cables at each node. 1.
configuration file, and it will notify the other nodes in the cluster of the new MAC address. The cluster will operate normally after this. It is also recommended that you update the new MAC address in the cluster binary configuration file by re-applying the cluster configuration. Use the following steps for online reconfiguration:
1. Use the cmgetconf command to obtain a fresh ASCII configuration file, as follows:
# cmgetconf config.ascii
2. Re-apply the cluster configuration, for example:
# cmapplyconf -C config.ascii
6 Troubleshooting Go to www.hp.com/go/hpux-serviceguard-docs, and then click HP Serviceguard . In the User Guide section, click on the latest Managing Serviceguard manual and see the “Troubleshooting your Cluster” chapter. NOTE: 140 Troubleshooting All messages from cmgmsd log to /var/adm/syslog/syslog.log by default.
A Software Upgrades Serviceguard Extension for RAC (SGeRAC) software upgrades can be done in the two following ways: • rolling upgrade • non-rolling upgrade Instead of an upgrade, moving to a new version can be done with: • migration with cold install Rolling upgrade is a feature of SGeRAC that allows you to perform a software upgrade on a given node without bringing down the entire cluster. SGeRAC supports rolling upgrades on version A.11.
For more information on support, compatibility, and features for SGeRAC, refer to the Serviceguard Compatibility and Feature Matrix, located at www.hp.com/go/hpux-serviceguard-docs —> HP Serviceguard Extension for RAC. Steps for Rolling Upgrades Use the following steps when performing a rolling SGeRAC software upgrade: 1. Halt Oracle (RAC, Clusterware) software on the local node (if running). 2. Halt Serviceguard/SGeRAC on the local node by issuing the Serviceguard cmhaltnode command. 3. Edit the /etc/rc.
NOTE: While you are performing a rolling upgrade, warning messages may appear while the node is determining what version of software is running. This is a normal occurrence and not a cause for concern. Figure 20 Running Cluster Before Rolling Upgrade Step 1. 1. 2. Halt Oracle (RAC, Clusterware) software on node 1. Halt node 1. This will cause the node’s packages to start up on an adoptive node.
Step 2. Upgrade node 1 and install the new version of Serviceguard and SGeRAC (A.11.16), as shown in Figure 22. NOTE: If you install Serviceguard and SGeRAC separately, Serviceguard must be installed before installing SGeRAC. Figure 22 Node 1 Upgraded to SG/SGeRAC 11.16 Step 3. 1. If you prefer, restart the cluster on the upgraded node (node 1). You can do this in Serviceguard Manager, or from the command line issue the following: # cmrunnode node1 2. 3.
Step 4. 1. 2. Halt Oracle (RAC, Clusterware) software on node 2. Halt node 2. You can do this in Serviceguard Manager, or from the command line issue the following: # cmhaltnode -f node2 This causes both packages to move to node 1. See Figure A-5. 3. 4. Upgrade node 2 to Serviceguard and SGeRAC (A.11.16) as shown in Figure A-5. When upgrading is finished, enter the following command on node 2 to restart the cluster on node 2: # cmrunnode node2 5. Start Oracle (Clusterware, RAC) software on node 2.
Figure 25 Running Cluster After Upgrades Limitations of Rolling Upgrades The following limitations apply to rolling upgrades: 146 • During a rolling upgrade, you should issue Serviceguard/SGeRAC commands (other than cmrunnode and cmhaltnode) only on a node containing the latest revision of the software. Performing tasks on a node containing an earlier revision of the software will not work or will cause inconsistent results.
Non-Rolling Software Upgrades A non-rolling upgrade allows you to perform a software upgrade from any previous revision to any higher revision or between operating system versions. For example, you may do a non-rolling upgrade from SGeRAC A.11.14 on HP-UX 11i v1 to A.11.16 on HP-UX 11i v2, given both are running the same architecture. The cluster cannot be running during a non-rolling upgrade, therefore it is necessary to halt the entire cluster in order to perform the upgrade.
7. Recreate the network and storage configurations (Set up stationary IP addresses and create LVM volume groups and/or CVM disk groups required for the cluster). 8. Recreate the SGeRAC cluster. 9. Restart the cluster. 10. Reinstall the cluster applications, such as RAC. 11. Restore the data. Upgrade Using DRD DRD stands for Dynamic Root Disk.
B Blank Planning Worksheets This appendix reprints blank planning worksheets used in preparing the RAC cluster. You can duplicate any of these worksheets that you find useful and fill them in as a part of the planning process.
Instance 1 Redo Log: _____________________________________________________ Instance 2 Redo Log 1: _____________________________________________________ Instance 2 Redo Log 2: _____________________________________________________ Instance 2 Redo Log 3: _____________________________________________________ Instance 2 Redo Log: _____________________________________________________ Instance 2 Redo Log: _____________________________________________________ Data: System _________________________