Sample Configurations with SGeRAC and Oracle RAC 10gR2
Version 1.6, March 2009

Contents
Introduction
  Audience
  Terms and definitions
Configuring Oracle RAC 10g on CFS
  Assumptions for this sample configuration
  CFS and ODM requirement
  Creating a SGeRAC cluster with CFS for Oracle 10g
Introduction
This document discusses the various aspects of architecting, planning, and implementing an Oracle Real Application Clusters (RAC) 10g Release 2 solution on HP-UX 11i with Serviceguard and Serviceguard Extension for RAC (SGeRAC).
• OCR – Oracle Cluster Registry: shared storage used to keep Oracle cluster and configuration information.
• ODM – Oracle Disk Manager: a standard API specified by Oracle for database I/O.
• RAC – Real Application Clusters: enables a database to be shared concurrently by multiple instances.
• RAC-DB-IC – Real Application Clusters interconnect traffic for both Global Cache Service and Global Enqueue Service.
Memory
Sufficient physical memory should be available for all processes. Insufficient memory may result in swapping activity that reduces CPU availability for components with timed heartbeat communications. In general, provision enough memory to minimize paging, fit the Oracle System Global Area (SGA) into main memory, and allow for user processes.

Network: clients
Insufficient bandwidth on the client network affects availability to the client.
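As a rough, back-of-envelope illustration of the memory sizing advice above (every figure below is a hypothetical assumption, not a value from this document), the minimum physical memory can be estimated by summing the SGA, expected process memory, and OS overhead:

```shell
# Illustrative memory estimate; every size here is an assumption.
sga_mb=4096        # Oracle SGA target per instance
user_mb=2048       # aggregate user and server process memory
os_mb=1024         # OS, Serviceguard, and Clusterware daemons
total_mb=$((sga_mb + user_mb + os_mb))
echo "Plan for at least ${total_mb} MB of physical memory per node"
```

Anything beyond this total is headroom that further reduces the risk of paging interfering with heartbeat processing.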
• With RAC traffic, if the interconnect is not configured to be monitored and acted upon by the other components, RAC discovers the interconnect failure within the Instance Membership Recovery (IMR) timeout. The failover time requirement determines important timeouts, such as the Serviceguard heartbeat timeout, network polling intervals, and cluster interconnect monitoring.
Local LAN failover using APA
When APA is used to bond network interface cards, it provides traffic distribution and load balancing across multiple physical network interface cards (NICs) or links. Load balancing is desirable in configurations where a single interface cannot handle the network traffic. When a physical NIC or link fails, APA provides HA by distributing traffic among the remaining NICs or links.
Using Serviceguard primary and standby links is the preferred model for providing HA on the cluster communications interconnect network. With redundancy through Serviceguard primary and standby links, Serviceguard monitors the network and performs a local failover if the primary network becomes unavailable.
has sufficient bandwidth. If the primary network fails, Serviceguard performs a local LAN failover to the standby network. Node failure is detected when Serviceguard misses heartbeats. Configurations with heavy RAC-DB-IC traffic may limit how aggressively the Serviceguard heartbeat timeout can be set, since SG-HB packets may not be processed in time. Therefore, a longer Serviceguard heartbeat timeout may be needed to avoid false cluster reconfigurations. Figure 2.
Figure 3. Single SG-HB with CSS-HB and RAC-DB-IC on separate subnet
[Diagram: Node A and Node B, each with LAN 1–LAN 4; SG-HB and CSS-HB share one private primary/standby pair, while RAC-DB-IC runs on a separate private primary/standby pair.]

Figure 3 is a variation of figure 2, showing the CSS-HB residing on the same subnet as the SG-HB. The RAC-DB-IC is on a separate network and thus does not affect the HB traffic. If the primary (lan1) fails, Serviceguard performs a local LAN failover.
Figure 4. Preferred: single subnet with Ethernet primary and standby including GAB/LLT
[Diagram: Node A and Node B, each with LAN 1 (private primary) and LAN 2 (private standby) carrying SG-HB, CSS-HB, RAC-DB-IC, and GAB/LLT.]

Figure 4 shows a common configuration where SG-HB, CSS-HB, RAC-DB-IC, and GAB/LLT share the same network for cluster communications.
Figure 6. Dual primary and standby Ethernet including GAB/LLT
[Diagram: Node A and Node B, each with LAN 1–LAN 4; SG-HB, CSS-HB, and GAB/LLT share one private primary/standby pair, while RAC-DB-IC runs on the other.]

Figure 6 shows the same variation as figure 3, except that this configuration is for CFS and CVM. It is intended for heavily loaded configurations where RAC-DB-IC traffic would otherwise interfere with heartbeats and other cluster communications.
Storage Oracle Clusterware (OC) assumes the required storage is available when OC starts. Therefore, OC does not perform any storage activation and leaves it up to the platform or users to activate the storage prior to starting OC. For SGeRAC configurations, Serviceguard packages are used as the mechanism to activate storage prior to starting OC. For SLVM and CVM configurations, the shared storage activation is performed by the Serviceguard package that starts OC.
system that can fail over within a Serviceguard package.

Prerequisites
In the sample configurations, the following prerequisites apply:

Software
• HP-UX 11i v2 0505 Enterprise Operating Environment
• Serviceguard A.11.16 or A.11.17/A.11.18/A.11.19 (A.11.17 or later required for CFS support)
• Serviceguard Extension for RAC A.11.16 or A.11.17/A.11.18/A.11.19 (A.11.17 or later required for CFS support)
• HP Serviceguard Storage Management Suite bundles A.01.00 or later
Figure 7. Cluster for SLVM (eenie and meenie)
Figure 8. Cluster for CFS (minie and mo)
Assumptions for this sample configuration
1. Cluster hardware configured.
2. HP-UX 11i v2 0505 Enterprise Operating Environment.
3. Serviceguard and Serviceguard Extension for RAC installed.
4. Same private interconnect used for all inter-node traffic (Serviceguard, RAC, CSS).
5. One shared disk for shared volume group.
6.
Creating volume group and logical volumes
1. Initialize the LVM disk on node ("eenie").
# pvcreate /dev/rdsk/c4t3d0
2. Create the volume group on node ("eenie").
# mkdir /dev/vg_ops
# mknod /dev/vg_ops/group c 64 0x070000
Note: <0x070000> is the minor number in this sample configuration.
# vgcreate /dev/vg_ops /dev/dsk/c4t3d0
# vgextend /dev/vg_ops /dev/dsk/c5t3d0
Note: /dev/dsk/c5t3d0 is a redundant link (alternate path) to /dev/dsk/c4t3d0.
3. Create logical volumes on node ("eenie").
FIRST_CLUSTER_LOCK_VG /dev/vg_ops
NODE_NAME eenie
  NETWORK_INTERFACE lan0
    STATIONARY_IP 15.13.170.64
  NETWORK_INTERFACE lan3
  NETWORK_INTERFACE lan1
    HEARTBEAT_IP 192.1.1.1
  NETWORK_INTERFACE lan2
  FIRST_CLUSTER_LOCK_PV /dev/dsk/c4t3d0
NODE_NAME meenie
  NETWORK_INTERFACE lan0
    STATIONARY_IP 15.13.170.80
  NETWORK_INTERFACE lan3
  NETWORK_INTERFACE lan1
    HEARTBEAT_IP 192.1.1.
Create groups on each node
Create the Oracle Inventory group (oinstall) if one does not exist, create the OSDBA group (dba), and optionally create the operator group (oper).
# /usr/sbin/groupadd oinstall
# /usr/sbin/groupadd dba
# /usr/sbin/groupadd oper

Create Oracle user on each node
# /usr/bin/useradd -u 203 -g oinstall -G dba,oper oracle

Change password on each node
# passwd oracle

Enable remote access (ssh or remsh) for the Oracle user on all nodes
For remsh, add the oracle user to the .rhosts file or hosts.equiv file.
/mnt/app/crs/oracle/product/10.2.0/crs

When installing Oracle Cluster Software, set the ORACLE_HOME environment variable to this directory. Note that at installation time, before running the root.sh script, the parent directories of the Oracle Cluster Software home directory must be changed so that only the root user can write to them.
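As a hedged sketch of that permission requirement (demonstrated on a throwaway directory rather than the real CRS home; `stat -c` is the GNU form, so on HP-UX you would verify with `ll -d` instead):

```shell
# Create a stand-in parent directory and restrict write access to its
# owner (root, in the real case), as root.sh expects for the parents of
# the Oracle Cluster Software home.
parent=$(mktemp -d)/crs_parent
mkdir -p "$parent"
chmod 755 "$parent"            # owner rwx; group/other read and search only
mode=$(stat -c '%a' "$parent")
echo "mode=$mode"
```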
undotbs1=/dev/vg_ops/ropsundotbs01.dbf
undotbs2=/dev/vg_ops/ropsundotbs02.dbf
example=/dev/vg_ops/ropsexample1.dbf
users=/dev/vg_ops/ropsusers.dbf
redo1_1=/dev/vg_ops/rops1log1.log
redo1_2=/dev/vg_ops/rops1log2.log
redo2_1=/dev/vg_ops/rops2log1.log
redo2_2=/dev/vg_ops/rops2log2.log
control1=/dev/vg_ops/ropsctl1.ctl
control2=/dev/vg_ops/ropsctl2.ctl
control3=/dev/vg_ops/ropsctl3.ctl
temp=/dev/vg_ops/ropstmp.dbf
spfile=/dev/vg_ops/ropsspfile1.
Creating a RAC demo database on SLVM
Export environment variables for the "oracle" user.
export ORACLE_BASE=/mnt/app/oracle
export DBCA_RAW_CONFIG=/mnt/app/oracle/oradata/ver10/ver10_raw.conf
export ORACLE_HOME=$ORACLE_BASE/product/10.2.0/db_1
export ORA_CRS_HOME=/mnt/app/crs/oracle/product/10.2.
12. Choose Recovery Configuration.
a. In this sample, use the default parameters (no Flash Recovery Area and no archiving).
b. A Flash Recovery Area and archiving can be configured. When configuring archiving, choose the Enable Archive Mode parameter and specify where to place the archive logs. If a Flash Recovery Area is configured, archive logs default to the Flash Recovery Area.
c.
Creating Serviceguard package for Oracle Clusterware
1. Create the package directory and copy the toolkit files.
# mkdir /etc/cmcluster/crsp-slvm
# cd /etc/cmcluster/crsp-slvm
# cp /opt/cmcluster/SGeRAC/toolkit/crsp/* ./
2. Create the package files.
# cmmakepkg -p crsp-slvm.conf
# cmmakepkg -s crsp-slvm.ctl
3. Edit the package configuration file crsp-slvm.conf.
6. Add the package to the cluster.
Distribute the Oracle Clusterware multi-node package (MNP) directory to all nodes.
# cd /etc/cmcluster
# rcp -r crsp-slvm root@meenie:/etc/cmcluster
Add the package to the cluster.
# cd /etc/cmcluster/crsp-slvm
# cmapplyconf -P crsp-slvm.conf
Modify the cluster configuration ([y]/n)? y
Completed the cluster creation

Starting and stopping Serviceguard packages and Oracle RAC
On each node, halt Oracle Clusterware if running.
Stop Oracle Clusterware on each node
For 10g 10.2.0.1:
# crsctl stop crs
Wait until Oracle Cluster Software completely stops. Check the CRS logs or check for Oracle processes, for example:
# ps -ef | grep ocssd.bin

Prevent Oracle Cluster Software from starting at boot time on each node
For 10g 10.2.0.1:
# crsctl disable crs

Creating Serviceguard Packages
In this configuration, each node is configured with one Serviceguard package that starts and stops Oracle Clusterware.
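Returning to the "wait until Oracle Cluster Software completely stops" step above: that wait can be scripted. This is only a sketch (the function name and timeout handling are my own additions, and `ps -o comm=` needs UNIX95 set on HP-UX):

```shell
# Poll once per second until no process with the given name remains,
# or give up after the timeout; returns 0 when the process is gone.
wait_for_process_exit() {
    name=$1; timeout=${2:-60}; waited=0
    while ps -e -o comm= | grep -qx "$name"; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1; waited=$((waited + 1))
    done
    return 0
}

# After `crsctl stop crs`:
wait_for_process_exit ocssd.bin 120 && echo "Oracle Clusterware stopped"
```

Matching on the process name alone (rather than `ps -ef | grep`) avoids false matches against the grep command line itself.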
    /etc/cmcluster/pkg/crs_eenie_pkg/cssd.sh start
    test_return 51
}

function customer_defined_halt_cmds
{
    # ADD customer defined halt commands.
    /etc/cmcluster/pkg/crs_eenie_pkg/cssd.sh stop
    test_return 52
}

Note: The cssd.sh script is a sample script in the Appendix for starting, monitoring, and stopping OC.

5. Add the package to the cluster.
# cmapplyconf -P crs_eenie_pkg.
Configuring Oracle RAC 10g on CFS
The following sections describe the process for configuring Oracle RAC 10g on CFS.

Assumptions for this sample configuration
1. Cluster hardware configured.
2. HP-UX 11i v2 0505 Enterprise Operating Environment.
3. HP Serviceguard Storage Management Suite (A.01.00 or later) installed.
4. Same private interconnect used for all inter-node traffic (Serviceguard, RAC, CSS, GAB/LLT).
5. One shared disk for CFS.
6.
  NETWORK_INTERFACE lan2
AUTO_START_TIMEOUT 600000000
NETWORK_POLLING_INTERVAL 2000000
NETWORK_FAILURE_DETECTION INOUT
MAX_CONFIGURED_PACKAGES 150

For A.11.18 and prior, the heartbeat timeout is:
HEARTBEAT_INTERVAL 1000000
NODE_TIMEOUT 5000000

For A.11.19 and later, the heartbeat timeout is:
MEMBER_TIMEOUT 14000000

Create cluster (sample)
# cmapplyconf -C clm.
Create disk groups for RAC
Use the vxdg command to create disk groups. Use the -s option to specify shared mode, as in the following example:
# vxdg -s init cfsdg1 c4t1d0

Create disk group multi-node package
Add the disk group to the cluster.
[bdf output showing the CFS file systems /cfs/mnt1, /cfs/mnt2, and /cfs/mnt3 mounted from volumes of disk group cfsdg1]

Viewing configuration
# cmviewcl

CLUSTER      STATUS
cluster_mo   up

  NODE    STATUS   STATE
  minie   up       running
  mo      up       running

MULTI_NODE_PACKAGES

  PACKAGE       STATUS   STATE     AUTO_RUN   SYSTEM
  SG-CFS-pkg    up       running   enabled    yes
  SG-CFS-DG-1   up       running   enabled    no
  SG-CFS-MP-1   up       running   enabled    no
  SG-CFS-MP-2   up       running   enabled    no
  SG-CFS-MP-3   up       running   enabled    no

Prerequisites
# ln -s /usr/lib/libXt.3 /usr/lib/libXt.sl
# ln -s /usr/lib/libXtst.2 /usr/lib/libXtst.sl

Create file system for Oracle directories
In the following samples, /mnt/app is a mounted file system for the Oracle software. Assume there is an 18 GB private disk c2t0d0 on all nodes. Create the local file system on each node.
# chmod 775 /cfs

Create directory for Oracle demo database on Cluster File System
Create the CFS directory to store the Oracle database files. Run these commands on one node only.
# chmod 775 /cfs/mnt2
# cd /cfs/mnt2
# mkdir oradata
# chown oracle:oinstall oradata
# chmod 775 oradata

Change directory permission on each node (if needed).
# chmod 775 /cfs

Installing and configuring Oracle 10g Clusterware on local file system
Login as "oracle" user.
$ export DISPLAY=:0.
Configuring ODM
ODM is required when using Oracle RAC with SGeRAC and CFS. For this sample configuration, the ODM libraries are included with the HP Serviceguard Storage Management Suite bundle for RAC.

Previously, there was a confirmed problem with creating an Oracle database with dbca after enabling ODM (linking the ODM library); the Oracle bug number is 5103839. The workaround was to create the database first (see §4.2.8) and then link ODM (§4.2.7). Starting with Oracle 10.2.0.
For HP 9000 systems:
$ rm ${ORACLE_HOME}/lib/libodm10.sl
$ ln -s ${ORACLE_HOME}/lib/libodmd10.sl ${ORACLE_HOME}/lib/libodm10.sl

For HP Integrity systems:
$ rm ${ORACLE_HOME}/lib/libodm10.so
$ ln -s ${ORACLE_HOME}/lib/libodmd10.so ${ORACLE_HOME}/lib/libodm10.so

Creating RAC demo database on CFS
Export environment variables for the "oracle" user.
export ORACLE_BASE=/cfs/mnt1/oracle
export ORACLE_HOME=$ORACLE_BASE/product/10.2.0/db_1
export ORA_CRS_HOME=/mnt/app/crs/oracle/product/10.2.
8. Provide passwords for user accounts.
9. Select listeners to register the database.
a. In this sample, the listeners used are "LISTENER_MO" and "LISTENER_MINIE".
10. Select storage options.
a. In this sample, select the Cluster File System storage option.
11. Provide database file locations.
a. In this sample, choose "Use Common Location for all Database Files" and enter /cfs/mnt2/oradata as the common directory.
12. Choose Recovery Configuration.
a.
# crsctl disable crs

Creating Serviceguard packages
In this configuration, the cluster is configured with one Serviceguard multi-node package that starts and stops Oracle Clusterware.

Creating Serviceguard package for Oracle Clusterware
1. Create the package directory and copy the toolkit files.
# mkdir /etc/cmcluster/crsp
# cd /etc/cmcluster/crsp
# cp /opt/cmcluster/SGeRAC/toolkit/crsp/* ./
2. Create the package files.
# cmmakepkg -p crsp.conf
# cmmakepkg -s crsp.ctl
3. Edit the package configuration file crsp.
    # ADD customer defined run commands.
    /etc/cmcluster/crsp/toolkit_oc.sh start
    test_return 51
}

function customer_defined_halt_cmds
{
    # ADD customer defined halt commands.
    /etc/cmcluster/crsp/toolkit_oc.sh stop
    test_return 52
}

5. Edit the toolkit configuration file oc.conf.
ORA_CRS_HOME=/mnt/app/crs/oracle/product/10.2.0/crs
6. Add the package to the cluster.
Distribute the Oracle Clusterware multi-node package (MNP) directory to all nodes.
Verify Oracle Clusterware status.
# $ORA_CRS_HOME/bin/crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy

Cluster start and stop
The following sections describe the process for starting and stopping Oracle 10g Clusterware.

Start and stop Oracle Clusterware 10g
Placing the start and stop of Oracle Clusterware in Serviceguard packages ensures that the shared storage required by Oracle Clusterware is available.
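The health check shown earlier in this section can be automated. A small sketch (the real command is `$ORA_CRS_HOME/bin/crsctl check crs`; its output is canned here so the parsing can run standalone):

```shell
# Count the "appears healthy" lines; 10gR2 crsctl reports one each for
# CSS, CRS, and EVM, so anything less than three means a degraded stack.
crs_output="CSS appears healthy
CRS appears healthy
EVM appears healthy"
healthy=$(printf '%s\n' "$crs_output" | grep -c 'appears healthy')
if [ "$healthy" -eq 3 ]; then
    echo "Oracle Clusterware healthy"
else
    echo "Oracle Clusterware degraded (${healthy}/3)"
fi
```

In a monitoring script, `crs_output` would instead be captured from the live command, e.g. `crs_output=$($ORA_CRS_HOME/bin/crsctl check crs)`.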
Appendix

Sample configuration for SLVM with Serviceguard Extension for RAC

Cluster configuration file for cluster (eenie and meenie)

CLUSTER_NAME cluster_eenie
FIRST_CLUSTER_LOCK_VG /dev/vg_ops
NODE_NAME eenie
  NETWORK_INTERFACE lan0
    STATIONARY_IP 15.13.170.64
  NETWORK_INTERFACE lan3
  NETWORK_INTERFACE lan1
    HEARTBEAT_IP 192.1.1.1
  NETWORK_INTERFACE lan2
  FIRST_CLUSTER_LOCK_PV /dev/dsk/c4t3d0
NODE_NAME meenie
  NETWORK_INTERFACE lan0
    STATIONARY_IP 15.13.170.
SERVICE_CMD[0]="/etc/cmcluster/crsp-slvm/toolkit_oc.sh check"
SERVICE_RESTART[0]=""

function customer_defined_run_cmds
{
    # ADD customer defined run commands.
    /etc/cmcluster/crsp-slvm/toolkit_oc.sh start
    test_return 51
}

function customer_defined_halt_cmds
{
    # ADD customer defined halt commands.
    /etc/cmcluster/crsp-slvm/toolkit_oc.sh stop
    test_return 52
}

Serviceguard Extension for RAC configuration for Oracle Clusterware
ORA_CRS_HOME=/mnt/app/crs/oracle/product/10.2.
MEMBER_TIMEOUT 14000000

Package configuration file for node eenie for SLVM ("crs_eenie_pkg.conf")

PACKAGE_NAME crs_eenie_pkg
NODE_NAME eenie
RUN_SCRIPT /etc/cmcluster/pkg/crs_eenie_pkg/crs_eenie_pkg.sh
HALT_SCRIPT /etc/cmcluster/pkg/crs_eenie_pkg/crs_eenie_pkg.
VG[0]="vg_ops"
SERVICE_NAME[0]="css_check_meenie"
SERVICE_CMD[0]="/etc/cmcluster/pkg/crs_meenie_pkg/cssd.sh monitor"
SERVICE_RESTART[0]=""

function customer_defined_run_cmds
{
    # ADD customer defined run commands.
    /etc/cmcluster/pkg/crs_meenie_pkg/cssd.sh start
    test_return 51
}

function customer_defined_halt_cmds
{
    # ADD customer defined halt commands.
    /etc/cmcluster/pkg/crs_meenie_pkg/cssd.sh stop
    test_return 52
}

Note: The cssd.
  Node_Switching_Parameters:
  NODE_TYPE  STATUS  SWITCHING  NAME
  Primary    up      enabled    eenie (current)

NODE     STATUS  STATE
meenie   up      running

  Cluster_Lock_LVM:
  VOLUME_GROUP  PHYSICAL_VOLUME   STATUS
  /dev/vg_ops   /dev/dsk/c4t3d0   up

  Network_Parameters:
  INTERFACE  STATUS  PATH         NAME
  PRIMARY    up      0/0/0/0      lan0
  PRIMARY    up      0/8/0/0/4/0  lan1
  STANDBY    up      0/8/0/0/6/0  lan3
  STANDBY    up      0/8/0/0/5/0  lan2

PACKAGE          STATUS  STATE    AUTO_RUN  NODE
crs_meenie_pkg   up      running  disabled  meenie

  Policy_Parameters:
  POLICY_NAME  CONFIGURED_VALUE
  Fai
Sample configuration for CFS
The following sections describe sample configurations with CFS:

Cluster configuration file for cluster (minie and mo)

CLUSTER_NAME cluster_mo
QS_HOST white
QS_POLLING_INTERVAL 120000000
QS_TIMEOUT_EXTENSION 2000000

NODE_NAME minie
  NETWORK_INTERFACE lan0
    STATIONARY_IP 15.13.170.82
  NETWORK_INTERFACE lan3
  NETWORK_INTERFACE lan1
    HEARTBEAT_IP 192.1.1.
DEPENDENCY_LOCATION SAME_NODE
SERVICE_NAME crsp-srv
SERVICE_FAIL_FAST_ENABLED NO
SERVICE_HALT_TIMEOUT 300

Package control script for CFS (Oracle Clusterware MNP)

SERVICE_NAME[0]="crsp-srv"
SERVICE_CMD[0]="/etc/cmcluster/crsp/toolkit_oc.sh check"
SERVICE_RESTART[0]=""

function customer_defined_run_cmds
{
    # ADD customer defined run commands.
    /etc/cmcluster/crsp/toolkit_oc.sh start
    test_return 51
}

function customer_defined_halt_cmds
{
    # ADD customer defined halt commands.
    /etc/cmcluster/crsp/toolkit_oc.
# Enable the Oracle Cluster Software to autostart
# Disable the Oracle Cluster Software from autostart

############################################################################
# Function: log_message
#
# This function logs any message with the date, time, and node name affixed
# to it. It accepts just one parameter.
# Parameter:
# 1.
        then
            break
        fi
        sleep $MONITOR_INTERVAL
    done
}

############################################################################
# Function: cssd_stop_cmds
#
# Stop cssd daemons
############################################################################
function cssd_stop_cmds
{
    typeset -i n=0
    # Grab the PID of the CSS daemon
    for i in ${CSSD_MONITOR_PROCESSES[@]}
    do
        CSSD_MONITOR_PROCESSES_PID[$n]=`ps -fu $ORACLE_USER | awk '/'${i}$'/ { print $2 }'`
        print "Monitored process = ${i}, pid = ${CSSD_MONITOR_PROCESS
# Monitor cssd daemons
############################################################################
function monitor_processes
{
    typeset -i n=0
    # Grab the PID of the CSS daemon
    for i in ${CSSD_MONITOR_PROCESSES[@]}
    do
        CSSD_MONITOR_PROCESSES_PID[$n]=`ps -fu $ORACLE_USER | awk '/'${i}$'/ { print $2 }'`
        print "Monitored process = ${i}, pid = ${CSSD_MONITOR_PROCESSES_PID[$n]} "
        if [[ ${CSSD_MONITOR_PROCESSES_PID[$n]} = "" ]]
        then
            print "\n\n"
            ps -ef
            print "\n *** ${i} is not running ***"
            return 0
        fi
        (( n = n
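The `ps -fu $ORACLE_USER | awk '/'${i}$'/ { print $2 }'` idiom in monitor_processes pulls field 2 (the PID) from any line whose command ends with the monitored process name. A standalone sketch against a canned ps-style line (the PID and path below are made up for illustration):

```shell
# Demonstrate the PID-extraction idiom on one fabricated `ps -f` line;
# the awk pattern anchors the process name at end of line, and field 2
# is the PID column.
proc=ocssd.bin
line="oracle   4321      1  0 10:00:00 ?     0:05 /oracle/crs/bin/ocssd.bin"
pid=$(printf '%s\n' "$line" | awk '/'"${proc}"'$/ { print $2 }')
echo "pid=$pid"
```

Anchoring with `$` keeps the pattern from matching unrelated commands that merely mention the name mid-line (such as a grep).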
############################################################################
function css_disable_cmds
{
    $ORA_CRS_HOME/bin/crsctl disable crs
}

############################################################################
# MAIN
#
# Check the command-line option and take the appropriate action.
############################################################################
PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/lbin
ORA_ver=10.2.0.
Document revision history

Revision  Date          Description    Comment
1.0       Dec 6, 2005   First version
1.1       Jan 23, 2006  Minor update   OC files on CFS
1.2       Feb 13, 2006  Minor update   From extended team feedback
1.2.1     Feb 21, 2006  Minor update   Directory ownership
1.2.2     Feb 28, 2006  Minor update   Directory ownership
1.2.3     Apr 7, 2006   Minor update   Add ODM issue and Oracle bug #
1.3.0     May 10, 2006  Minor update   Update IB, add RIP/VIP co-existence
1.3.
Oracle documentation
All of the following materials can be found on the Oracle Technical Documentation web site at http://www.oracle.com/technology/documentation/database10gr2.html.

• Oracle Clusterware and Oracle Real Application Clusters Installation Guide, 10g Release 2 (10.2) for HP-UX
  http://download-west.oracle.com/docs/cd/B19306_01/install.102/b14202.pdf
• Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide, 10g Release 2 (10.