Using Serviceguard Extension for RAC HP Part Number: T1859-90054 Published: April 2008
Legal Notices © Copyright 2003-2008 Hewlett-Packard Development Company, L.P. Confidential computer software. Valid license from HP required for possession, use, or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor’s standard commercial license. The information contained herein is subject to change without notice.
Table of Contents Printing History ...........................................................................................................................15 Preface.......................................................................................................................................17 1 Introduction to Serviceguard Extension for RAC...........................................................................21 What is a Serviceguard Extension for RAC Cluster? .............................
Monitoring...............................................................................................................37 Allowed Characters for Oracle 10g RAC Cluster Names........................................37 Shared Storage...............................................................................................................37 Multipath .................................................................................................................37 OCR and Vote Device.....................
Creating Volume Groups and Logical Volumes .....................................................56 Selecting Disks for the Volume Group...............................................................56 Creating Physical Volumes.................................................................................57 Creating a Volume Group with PVG-Strict Mirroring.......................................57 Building Mirrored Logical Volumes for RAC with LVM Commands..........................
Verifying Oracle Disk Manager is Configured...................................................................88 Configuring Oracle to Use Oracle Disk Manager Library.................................................88 Verify that Oracle Disk Manager is Running.....................................................................89 Configuring Oracle to Stop Using Oracle Disk Manager Library......................................90 Using Serviceguard Packages to Synchronize with Oracle 10g RAC.................
Creating a SGeRAC Cluster with CFS for Oracle 9i....................................................111 Initializing the Veritas Volume Manager....................................................................112 Deleting CFS from the Cluster....................................................................................116 Creating a Storage Infrastructure with CVM...................................................................117 Initializing the Veritas Volume Manager............................
4 Maintenance and Troubleshooting...........................................................................................139 Reviewing Cluster and Package States with the cmviewcl Command..........................139 Types of Cluster and Package States...........................................................................140 Examples of Cluster and Package States................................................................140 Types of Cluster and Package States....................................
Replacing a Lock Disk.................................................................................................161 On-line Hardware Maintenance with In-line SCSI Terminator .................................161 Replacement of I/O Cards.................................................................................................163 Replacement of LAN Cards..............................................................................................163 Off-Line Replacement............................
List of Figures
1-1 Overview of Oracle RAC Configuration on HP-UX..................................22
1-2 Group Membership Services......................................................23
1-3 Before Node Failure............................................................30
1-4 After Node Failure.............................................................
List of Tables
1 Document Edition and Printing Date...............................................15
2-1 Required Oracle File Names for Demo Database...................................61
2-2 Required Oracle File Names for Demo Database...................................79
3-1 RAC Software, Archive, Datafiles, SRVM.........................................95
3-2 Required Oracle File Names for Demo Database...................................
3-3 Required Oracle File Names for Demo Database...................................
Printing History
Table 1 Document Edition and Printing Date
Printing Date     Part Number     Edition
June 2003         T1859-90006     First Edition; Print, CD-ROM (Instant Information), and Web (http://www.docs.hp.com/)
June 2004         T1859-90017     Second Edition; Print, CD-ROM (Instant Information), and Web (http://www.docs.hp.com/)
February 2005     T1859-90017     Second Edition February 2005 Update; Web (http://www.docs.hp.com/)
October 2005      T1859-90033     Third Edition; Print, CD-ROM (Instant Information), and Web (http://www.
New editions of this manual will incorporate all material updated since the previous edition. To ensure that you receive the new editions, you should subscribe to the appropriate product support service. See your HP sales representative for details.
Preface This Sixth edition of the manual includes information for Serviceguard Extension for RAC (Oracle Real Application Cluster) Version A.11.18 on HP-UX 11i v2 and 11i v3, Veritas Cluster File System (CFS)/Cluster Volume Manager (CVM) from Symantec version 5.0, Cluster Interconnect Subnet Monitoring feature, SGeRAC Toolkit, and Logical Volume Manager version 2 (See “Creating a Storage Infrastructure with LVM” (page 55)).
• • • • • • Using High Availability Monitors (B5736-90046) Using the Event Monitoring Service (B7612-90015) Using Advanced Tape Services (B3936-90032) Managing Serviceguard Extension for SAP (T1859-90043) Managing Systems and Workgroups (5990-8172) Managing Serviceguard NFS (B5140-90017) Before attempting to use VxVM storage with Serviceguard, please refer to the following: • VERITAS Volume Manager Administrator’s Guide. This contains a glossary of VERITAS terminology.
Book Title: The title of a book. On the web and on the Instant Information CD, it may be a hot link to the book itself.
KeyCap: The name of a keyboard key. Note that Return and Enter both refer to the same key.
Emphasis: Text that is emphasized.
Emphasis: Text that is strongly emphasized.
Term: The defined use of an important word or phrase.
ComputerOut: Text displayed by the computer.
UserInput: Commands and other text that you type.
Command: A command name or qualified command phrase.
Variable
[] {} ... |
1 Introduction to Serviceguard Extension for RAC Serviceguard Extension for RAC (SGeRAC) enables the Oracle Real Application Cluster (RAC), formerly known as Oracle Parallel Server RDBMS, to run on HP high availability clusters under the HP-UX operating system. This chapter introduces Serviceguard Extension for RAC and shows where to find different kinds of information in this book.
Figure 1-1 Overview of Oracle RAC Configuration on HP-UX In the figure, two loosely coupled systems (each one known as a node) are running separate instances of Oracle software that read data from and write data to a shared set of disks. Clients connect to one node or the other via LAN. RAC on HP-UX lets you maintain a single database image that is accessed by the HP servers in parallel, thereby gaining added processing power without the need to administer separate databases.
of RAC each on Node 3 and Node 4. The RAC processes accessing the Sales database constitute one group, and the RAC processes accessing the HR database constitute another group. Figure 1-2 Group Membership Services Using Packages in a Cluster In order to make other important applications highly available (in addition to the Oracle Real Application Cluster), you can configure your RAC cluster to use packages.
for use with the Veritas Cluster Volume Manager (CVM) and the Veritas Cluster File System (CFS) (on HP-UX releases that support Veritas CFS and CVM; see “About Veritas CFS and CVM from Symantec” (page 25)). A system multi-node package must run on all nodes that are active in the cluster. If it fails on one active node, that node halts. A multi-node package can be configured to run on one or more cluster nodes. It is considered UP as long as it is running on any of its configured nodes.
Overview of SGeRAC and Cluster File System (CFS)/Cluster Volume Manager (CVM) SGeRAC supports Veritas Cluster File System (CFS)/Cluster Volume Manager (CVM) from Symantec through Serviceguard. CFS and CVM are not supported on all versions of HP-UX; see “About Veritas CFS and CVM from Symantec” (page 25). For information on configuring CFS and CVM with Serviceguard, refer to the Managing Serviceguard Fifteenth Edition user’s guide at http://docs.hp.
version of Serviceguard for up-to-date information at http://www.docs.hp.com -> High Availability -> Serviceguard. Overview of SGeRAC and Oracle 10g RAC Starting with Oracle 10g RAC, Oracle has bundled its own cluster software. The initial release is called Oracle Cluster Ready Service (CRS). CRS is used both as a generic term referring to the Oracle cluster software and as a specific term referring to a component within the Oracle cluster software.
The following describes the characteristics of subnets that can be monitored by using the CLUSTER_INTERCONNECT_SUBNET parameter: • A subnet used only for the communications among instances of an application configured as a multi-node package. • A subnet whose health does not matter if there is only one instance of an application (package) running in the cluster.
• • • If more than one instance of pkgA is running in the cluster and SubnetA fails on one of the nodes where the instance of pkgA is running, the failure is handled by halting the instance of pkgA on the node where the subnet has failed. If pkgA is running on only one node of the cluster and SubnetA fails on that node, pkgA will continue to run on that node after the failure.
How Serviceguard Works with Oracle 9i RAC Serviceguard provides the cluster framework for Oracle, a relational database product in which multiple database instances run on different cluster nodes. A central component of Real Application Clusters is the distributed lock manager (DLM), which provides parallel cache management for database instances.
For example, on a two-node cluster with one database, each node can have one RAC instance and one listener package. Oracle clients can be configured to connect to either package IP address (or corresponding hostname) using Oracle Net Services. When a node failure occurs, existing client connections to the package IP address will be reset after the listener package fails over and adds the package IP address.
2 can now access both Package 1’s disk and Package 2’s disk. Oracle instance 2 now handles all database access, since instance 1 has gone down. Figure 1-4 After Node Failure In the above figure, pkg1 and pkg2 are not instance packages. They are shown to illustrate the movement of packages in general. Larger Clusters Serviceguard Extension for RAC supports clusters of up to 16 nodes. The actual cluster size is limited by the type of storage and the type of volume manager used.
Figure 1-5 Four-Node RAC Cluster In this type of configuration, each node runs a separate instance of RAC and may run one or more high availability packages as well. The figure shows a dual Ethernet configuration with all four nodes connected to a disk array (the details of the connections depend on the type of disk array). In addition, each node has a mirrored root disk (R and R').
array configured with 16 I/O ports. Each node is connected to the array using two separate Fibre Channels configured with PV Links. Each channel is a dedicated bus; there is no daisy-chaining. Figure 1-6 Eight-Node Cluster with XP or EMC Disk Array Fibre Channel switched configurations are also supported using either an arbitrated loop or fabric login topology.
configurations that use basic Serviceguard technology with software mirroring (using MirrorDisk/UX or CVM) and Fibre Channel.
2 Serviceguard Configuration for Oracle 10g RAC This chapter shows the additional planning and configuration that is needed to use Oracle Real Application Clusters 10g with Serviceguard.
NOTE: HP and Oracle support SGeRAC to provide group membership to CSS. Serviceguard Cluster Timeout The Serviceguard cluster heartbeat timeout is set according to user requirements for availability. The Serviceguard cluster reconfiguration time is determined by the cluster timeout, configuration, the reconfiguration algorithm, and activities during reconfiguration.
Automated Oracle Cluster Software Startup and Shutdown The preferred mechanism for notifying Oracle Cluster Software to start and requesting Oracle Cluster Software to shut down is the use of Serviceguard packages. Monitoring Oracle Cluster Software daemon monitoring is performed through programs initiated by the HP-UX init process. SGeRAC monitors Oracle Cluster Software to the extent that CSS is an NMAPI2 group membership client and group member.
Mirroring and Resilvering For node and cluster-wide failures, when SLVM mirroring is used and Oracle resilvering is available, the recommended logical volume mirror recovery policy is full mirror resynchronization (NOMWC) for control and redo files, and no mirror resynchronization (NONE) for datafiles, since Oracle performs resilvering on the datafiles based on the redo log.
NOTE: Serviceguard cannot be responsible for networks or connection endpoints that it is not configured to monitor. SGeRAC Heartbeat Network Serviceguard supports multiple heartbeat networks, private or public. The Serviceguard heartbeat network can be configured as a single network connection with a redundant LAN, or as multiple connections with multiple LANs (single or redundant). CSS Heartbeat Network The CSS IP addresses for peer communications are fixed IP addresses.
NOTE: srvctl and sqlplus are Oracle commands. Manual Startup and Shutdown Manual RAC instance startup and shutdown is supported through the following commands: srvctl or sqlplus. Shared Storage The shared storage must be available when the RAC instance is started, so ensure that it is activated beforehand. For SLVM, the shared volume groups must be activated; for CVM, the disk groups must be activated.
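For example, a minimal sketch of manual startup and shutdown with srvctl (the database name ver10 and instance name ver101 are illustrative, following the ver10 examples later in this chapter):
$ srvctl start instance -d ver10 -i ver101    # database and instance names are examples
$ srvctl stop instance -d ver10 -i ver101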
communicates over a link-level protocol (DLPI) and is supported over Serviceguard heartbeat subnet networks, including primary and standby links. The most common network configuration is to have all interconnect traffic for cluster communications go over a single redundant heartbeat network, so that Serviceguard monitors the network and resolves interconnect failures by cluster reconfiguration.
Planning Storage for Oracle 10g RAC Volume Planning with SLVM Storage capacity for the Oracle database must be provided in the form of logical volumes located in shared volume groups. The Oracle software requires at least two log files for each Oracle instance, several Oracle control files and data files for the database itself. For all these files, Serviceguard Extension for RAC uses HP-UX raw logical volumes, which are located in volume groups that are shared between the nodes in the cluster.
Instance 1 Redo Log 3: ___/dev/vg_ops/rops1log3.log_____120_______ Instance 1 Redo Log: __________________________________________________ Instance 1 Redo Log: __________________________________________________ Instance 2 Redo Log 1: ___/dev/vg_ops/rops2log1.log____120________ Instance 2 Redo Log 2: ___/dev/vg_ops/rops2log2.log____120________ Instance 2 Redo Log 3: ___/dev/vg_ops/rops2log3.
Data: System ___/dev/vx/rdsk/ops_dg/opssystem.dbf___500__________ Data: Sysaux ___/dev/vx/rdsk/ops_dg/opssysaux.dbf___800__________ Data: Temp ___/dev/vx/rdsk/ops_dg/opstemp.dbf______250_______ Data: Users ___/dev/vx/rdsk/ops_dg/opsusers.dbf_____120_________ Data: User data ___/dev/vx/rdsk/ops_dg/opsdata1.dbf_200__________ Data: User data ___/dev/vx/rdsk/ops_dg/opsdata2.dbf__200__________ Data: User data ___/dev/vx/rdsk/ops_dg/opsdata3.dbf__200__________ Parameter: spfile1 ___/dev/vx/rdsk/ops_dg/opsspfile1.
truly cluster-aware, obtaining information about cluster membership from Serviceguard directly. Cluster information is provided via a special system multi-node package, which runs on all nodes in the cluster. The cluster must be up and must be running this package before you can configure VxVM disk groups for use with CVM. Disk groups must be created from the CVM Master node. The Veritas CVM package for version 3.5 is named VxVM-CVM-pkg; the package for CVM version 4.1 and later is named SG-CFS-pkg.
/dev/dsk/c3t15d0 would indicate SCSI controller instance 3, SCSI target 15, and SCSI LUN 0. HP-UX 11i v3 introduces a new nomenclature for device files, known as agile addressing (sometimes also called persistent LUN binding). Under the agile addressing convention, the hardware path name is no longer encoded in a storage device’s name; instead, each device file name reflects a unique instance number, for example /dev/[r]disk/disk3, that does not need to change when the hardware path does.
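For example, on an HP-UX 11i v3 system you can display the mapping between persistent and legacy device special files with ioscan (a sketch; the device name /dev/rdisk/disk3 is illustrative):
# ioscan -m dsf /dev/rdisk/disk3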
concurrency control provided by Oracle RAC. Such disks are considered cluster aware. Volume groups listed under this parameter are marked for activation in shared mode. The entry can contain up to 40 characters. STORAGE_GROUP This parameter is used for CVM disk groups. Enter the names of all the CVM disk groups the package will use. In the ASCII package configuration file, this parameter is called STORAGE_GROUP.
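For example, the corresponding entries might look like the following (the volume group and disk group names are illustrative, matching the vg_ops and ops_dg examples used elsewhere in this chapter). In the cluster configuration ASCII file:
OPS_VOLUME_GROUP /dev/vg_ops
In the package configuration ASCII file:
STORAGE_GROUP ops_dg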
NOTE: A package with the CLUSTER_INTERCONNECT_SUBNET parameter is available for both Modular and Legacy packages. A package with this parameter can be configured only when all nodes of the cluster are running SGeRAC version A.11.18 or higher. For more information, see the latest edition of the Managing Serviceguard Fifteenth Edition user’s guide at http://docs.hp.com -> High Availability.
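For example, a package configuration file might include an entry such as the following (a sketch; the subnet value is illustrative, matching the 192.168.2.0 RAC-IC subnet used in the examples below):
CLUSTER_INTERCONNECT_SUBNET 192.168.2.0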
Figure 2-2 SG-HB/RAC-IC Traffic Separation Each primary and standby pair protects against a single failure. With the SG-HB on more than one subnet, a single subnet failure will not trigger a Serviceguard reconfiguration. If the subnet with CSS-HB fails, unless subnet monitoring is used, CSS will resolve the interconnect subnet failure with a CSS cluster reconfiguration.
NODE_FAIL_FAST_ENABLED parameter is set to NO for the Oracle Clusterware package, and is set to YES for the package monitoring the CSS-HB subnet (the Oracle Cluster Interconnect Subnet package, as shown in the package configuration parameter examples below). NOTE: Do not configure CLUSTER_INTERCONNECT_SUBNET in the RAC Instance package, because the RAC-IC network is the same as the CSS-HB network.
As shown in Figure 2-3, each primary and standby pair protects against a single failure. If the subnet with SG-HB (lan1/lan2) fails, Serviceguard will resolve the subnet failure with a Serviceguard cluster reconfiguration. If the 192.168.2.0 subnet (lan3 and lan4) fails, Oracle instance membership recovery (IMR) will resolve the interconnect subnet failure, unless Serviceguard subnet monitoring is used. Oracle waits for the IMR time interval before resolving the subnet failure.
Figure 2-4 Faster Failover Configuration As shown in Figure 2-4, the Faster Failover configuration uses two SG-HBs on two primary networks with no standby, which enables the quickest determination of node failure and faster failover. • First network with primary for SG-HB #1 (lan1). • Second network with primary for SG-HB #2 (lan2). • Third network with primary and standby for CSS-HB and RAC-IC (lan3/lan4). • Single failure is protected by primary/standby.
of the CSS-HB subnet on a node will bring down the instance of the multi-node package and the node where the subnet has failed. A failure of the CSS-HB subnet on all nodes will cause the multi-node package to fail node by node (bringing each of those nodes down), until one instance of the multi-node package and its node remain to provide services to the clients. NOTE: Do not configure CLUSTER_INTERCONNECT_SUBNET in the RAC Instance package, since the RAC-IC network is the same as the CSS-HB network.
NOTE: To reduce the risk of multiple subnets failing simultaneously, each subnet must have its own networking infrastructure (including networking switches). • A double switch failure resulting in the simultaneous failure of the CSS-HB subnet and the RAC-IC network on all nodes may result in loss of services (assuming the CSS-HB subnet is different from the RAC-IC network).
The Event Monitoring Service HA Disk Monitor provides the capability to monitor the health of LVM disks. If you intend to use this monitor for your mirrored disks, you should configure them in physical volume groups. For more information, refer to the manual Using HA Monitors. NOTE: LVM version 2.0 volume groups are supported with Serviceguard. The steps for configuring volume groups in Serviceguard clusters are the same for both LVM version 1.0 and 2.0.
In the following examples, we use /dev/rdsk/c1t2d0 and /dev/rdsk/c0t2d0, which happen to be the device names for the same disks on both ftsys9 and ftsys10. In the event that the device file names are different on the different nodes, make a careful note of the correspondences. Creating Physical Volumes On the configuration node (ftsys9), use the pvcreate command to define disks as physical volumes. This only needs to be done on the configuration node.
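For example, a minimal sketch using the example device names above (the -f option forces re-initialization of a disk that was previously used in a volume group; omit it if not needed):
# pvcreate -f /dev/rdsk/c1t2d0
# pvcreate -f /dev/rdsk/c0t2d0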
Building Mirrored Logical Volumes for RAC with LVM Commands After you create volume groups and define physical volumes for use in them, you define mirrored logical volumes for data, logs, and control files. It is recommended that you use a shell script to issue the commands described in the next sections. The commands you use for creating logical volumes vary slightly depending on whether you are creating logical volumes for RAC redo log files or for use with Oracle data.
The -m 1 option specifies single mirroring; the -M n option ensures that mirror write cache recovery is set off; the -c y means that mirror consistency recovery is enabled; the -s g means that mirroring is PVG-strict, that is, it occurs between different physical volume groups; the -n system.dbf option lets you specify the name of the logical volume; and the -L 408 option allocates 408 megabytes.
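For redo log and control files, a similar command can be used; in this sketch only the logical volume name and size change, and both are illustrative:
# lvcreate -m 1 -M n -c y -s g -n redo01.log -L 128 /dev/vg_ops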
8/0.15.0 /dev/dsk/c0t15d0 /* I/O Channel 0 (8/0) */
8/0.15.1 /dev/dsk/c0t15d1 /* I/O Channel 0 (8/0) */
8/0.15.2 /dev/dsk/c0t15d2 /* I/O Channel 0 (8/0) */
8/0.15.3 /dev/dsk/c0t15d3 /* I/O Channel 0 (8/0) */
8/0.15.4 /dev/dsk/c0t15d4 /* I/O Channel 0 (8/0) */
8/0.15.5 /dev/dsk/c0t15d5 /* I/O Channel 0 (8/0) */
10/0.3.0 10/0.3.1 10/0.3.2 10/0.3.3 10/0.3.4 10/0.3.
to add additional disks to the volume group, specifying the appropriate physical volume name for each PV link. Repeat the entire procedure for each distinct volume group you wish to create. For ease of system administration, you may wish to use different volume groups to separate logs from data and control files. NOTE: The default maximum number of volume groups in HP-UX is 10.
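A minimal sketch of creating a volume group with PV links from the listing above (the group file minor number is illustrative, and /dev/dsk/c1t3d0 is assumed to be the alternate path to the same LUN as /dev/dsk/c0t15d0):
# mkdir /dev/vg_ops
# mknod /dev/vg_ops/group c 64 0x060000
# pvcreate -f /dev/rdsk/c0t15d0
# vgcreate /dev/vg_ops /dev/dsk/c0t15d0
# vgextend /dev/vg_ops /dev/dsk/c1t3d0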
Table 2-1 Required Oracle File Names for Demo Database (continued) Logical Volume Name LV Size (MB) Raw Logical Volume Path Name Oracle File Size (MB)* opssysaux.dbf 808 /dev/vg_ops/ropssysaux.dbf 800 opstemp.dbf 258 /dev/vg_ops/ropstemp.dbf 250 opsusers.dbf 128 /dev/vg_ops/ropsusers.dbf 120 opsdata1.dbf 208 /dev/vg_ops/ropsdata1.dbf 200 opsdata2.dbf 208 /dev/vg_ops/ropsdata2.dbf 200 opsdata3.dbf 208 /dev/vg_ops/ropsdata3.dbf 200 opsspfile1.ora 5 /dev/vg_ops/ropsspfile1.
NOTE: Serviceguard Manager is the graphical user interface for Serviceguard. It is available as a “plug-in” to the System Management Homepage (SMH). SMH is a web-based graphical user interface (GUI) that replaces SAM as the system administration GUI as of HP-UX 11i v3 (but you can still run the SAM terminal interface; see “Using SAM” on page 32 of the Managing Serviceguard Fifteenth Edition user’s guide).
Installing Oracle Real Application Clusters NOTE: Some versions of Oracle RAC require installation of additional software. Refer to your version of Oracle for specific requirements. Before installing the Oracle Real Application Cluster software, make sure the storage cluster is running. Log in as the oracle user on one node and then use the Oracle installer to install Oracle software and to build the correct Oracle runtime executables.
# Lock Disk Parameters. Use the FIRST_CLUSTER_LOCK_VG and
# FIRST_CLUSTER_LOCK_PV parameters to define a lock disk.
# The FIRST_CLUSTER_LOCK_VG is the LVM volume group that
# holds the cluster lock. This volume group should not be
# used by any other cluster as a cluster lock device.
#
# Quorum Server Parameters. Use the QS_HOST, QS_POLLING_INTERVAL,
# and QS_TIMEOUT_EXTENSION parameters to define a quorum server.
# The NODE_TIMEOUT parameter defaults to 2000000 (2 seconds).
# This default setting yields the fastest cluster reformations.
# However, the use of the default value increases the potential
# for spurious reformations due to momentary system hangs or
# network load spikes. For a significant portion of installations,
# a setting of 5000000 to 8000000 (5 to 8 seconds) is more
# appropriate. The maximum value recommended for NODE_TIMEOUT
# is 30000000 (30 seconds).
# USER_NAME john
# USER_HOST noir
# USER_ROLE FULL_ADMIN

# List of cluster aware LVM Volume Groups. These volume groups will
# be used by package applications via the vgchange -a e command.
# Neither CVM or VxVM Disk Groups should be used here.
# For example:
# VOLUME_GROUP /dev/vgdatabase
# VOLUME_GROUP /dev/vg02

# List of OPS Volume Groups. Formerly known as DLM Volume Groups,
# these volume groups will be used by OPS or RAC cluster applications
# via the vgchange -a s command.
In the example below, both the Oracle RAC software and datafiles reside on CFS. There is a single Oracle home. Three CFS file systems are created for Oracle home, Oracle datafiles, and for the Oracle Cluster Registry (OCR) and vote device. The Oracle Cluster Software home is on a local file system.
# cfscluster config -s
The following output will be displayed:
CVM is now configured
Starting CVM...
It might take a few minutes to complete
When CVM starts up, it selects a master node, which is the node from which you must issue the disk group configuration commands.
Package name “SG-CFS-DG-1” was generated to control the resource.
Shared disk group “cfsdg1” is associated with the cluster.
10. Activate the Disk Group
# cfsdgadm activate cfsdg1
11.
Package name “SG-CFS-MP-2” was generated to control the resource. Mount point “/cfs/mnt2” was associated with the cluster. # cfsmntadm add cfsdg1 vol3 /cfs/mnt3 all=rw The following output will be displayed: Package name “SG-CFS-MP-3” was generated to control the resource. Mount point “/cfs/mnt3” was associated with the cluster. NOTE: The disk group and mount point multi-node packages (SG-CFS-DG_ID# and SG-CFS-MP_ID#) do not monitor the health of the disk group and mount point.
SG-CFS-MP-2 SG-CFS-MP-3 up up running running enabled enabled no no CAUTION: Once you create the disk group and mount point packages, it is critical that you administer the cluster with the cfs commands, including cfsdgadm, cfsmntadm, cfsmount, and cfsumount. If you use the general commands such as mount and umount, it could cause serious problems, such as writing to the local file system instead of the cluster file system.
The following output will be generated: Shared disk group “cfsdg1” was disassociated from the cluster. NOTE: “cfsmntadm delete” also deletes the disk group if there is no dependent package. To ensure the disk group deletion is complete, use the above command to delete the disk group package. 4. De-configure CVM # cfscluster stop The following output will be generated: Stopping CVM...
IMPORTANT: Creating a rootdg disk group is only necessary the first time you use the Volume Manager. CVM 4.1 or later does not require a rootdg. Using CVM 4.x or later This section has information on how to set up the cluster and the system multi-node package with CVM (without the CFS file system) on HP-UX releases that support them; see “About Veritas CFS and CVM from Symantec” (page 25). Preparing the Cluster and the System Multi-node Package for use with CVM 4.
When CVM starts up, it selects a master node, which is the node from which you must issue the disk group configuration commands. To determine the master node, issue the following command from each node in the cluster:
# vxdctl -c mode
The following output will be displayed:
mode: enabled: cluster active - SLAVE
master: ever3b
or
mode: enabled: cluster active - MASTER
slave: ever3b
• Converting Disks from LVM to CVM Use the vxvmconvert utility to convert LVM volume groups into CVM disk groups.
NODE         STATUS    STATE
ever3a       up        running
ever3b       up        running

MULTI_NODE_PACKAGES
PACKAGE      STATUS    STATE      AUTO_RUN    SYSTEM
SG-CFS-pkg   up        running    enabled     yes

IMPORTANT: After creating these files, use the vxedit command to change the ownership of the raw volume files to oracle and the group membership to dba, and to change the permissions to 660.
NOTE: Cluster configuration is described in the previous section. To prepare the cluster for CVM disk group configuration, you need to ensure that only one heartbeat subnet is configured. Then use the following command, which creates the special package that communicates cluster information to CVM: # cmapplyconf -P /etc/cmcluster/cvm/VxVM-CVM-pkg.conf WARNING! The above file should never be edited.
Initializing Disks for CVM You need to initialize the physical disks that will be employed in CVM disk groups. If a physical disk has been previously used with LVM, you should use the pvremove command to delete the LVM header data from all the disks in the volume group (this is not necessary if you have not previously used the disk with LVM).
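For example (a sketch; the device name is illustrative and matches the vxdisksetup example used for CVM elsewhere in this guide):
# pvremove /dev/rdsk/c0t3d2
# /usr/lib/vxvm/bin/vxdisksetup -i /dev/dsk/c0t3d2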
IMPORTANT: After creating these files, use the vxedit command to change the ownership of the raw volume files to oracle and the group membership to dba, and to change the permissions to 660. Example: # cd /dev/vx/rdsk/ops_dg # vxedit -g ops_dg set user=oracle * # vxedit -g ops_dg set group=dba * # vxedit -g ops_dg set mode=660 * The logical volumes are now available on the primary node, and the raw logical volume names can now be used by the Oracle DBA.
Table 2-2 Required Oracle File Names for Demo Database (continued) Volume Name Size (MB) Raw Device File Name Oracle File Size (MB) ops2log1.log 128 /dev/vx/rdsk/ops_dg/ops2log1.log 120 ops2log2.log 128 /dev/vx/rdsk/ops_dg/ops2log2.log 120 ops2log3.log 128 /dev/vx/rdsk/ops_dg/ops2log3.log 120 opssystem.dbf 508 /dev/vx/rdsk/ops_dg/opssystem.dbf 500 opssysaux.dbf 808 /dev/vx/rdsk/ops_dg/opssysaux.dbf 800 opstemp.dbf 258 /dev/vx/rdsk/ops_dg/opstemp.dbf 250 opsusers.
Adding Disk Groups to the Cluster Configuration For CVM 4.x or later, if the multi-node package was configured for disk group activation, the application package should be configured with package dependency to ensure the CVM disk group is active. For CVM 3.5 and CVM 4.x or later (without using multi-node package) after creating units of CVM storage with VxVM commands, you need to specify the disk groups in each package configuration ASCII file.
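For example, when the application package activates the disk group itself, its configuration ASCII file might include an entry like the following (the disk group name ops_dg is illustrative); when the disk group multi-node package performs activation, the application package instead declares a package dependency on that multi-node package:
STORAGE_GROUP ops_dg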
# ln -s /usr/lib/libXt.3 /usr/lib/libXt.sl
# ln -s /usr/lib/libXtst.2 /usr/lib/libXtst.sl
5. Enable Remote Access (ssh and remsh) for Oracle User on all Nodes
6. Create File System for Oracle Directories
In the following samples, /mnt/app is a mounted file system for Oracle software. Assume there is a private disk c4t5d0 at 18 GB size on all nodes. Create the local file system on each node.
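The exact commands depend on your disk layout; a minimal sketch of one possibility, assuming the private disk c4t5d0 mentioned above (the volume group name, minor number, and size are illustrative):
# pvcreate -f /dev/rdsk/c4t5d0
# mkdir /dev/vg01
# mknod /dev/vg01/group c 64 0x010000
# vgcreate /dev/vg01 /dev/dsk/c4t5d0
# lvcreate -L 17000 -n lvol1 /dev/vg01
# newfs -F vxfs /dev/vg01/rlvol1
# mkdir -p /mnt/app
# mount /dev/vg01/lvol1 /mnt/app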
# mkdir -p /cfs/mnt1/oracle # chown -R oracle:oinstall /cfs/mnt1/oracle # chmod -R 775 /cfs/mnt1/oracle # chmod 775 /cfs # chmod 775 /cfs/mnt1 Modify oracle user to use new home directory on each node. # usermod -d /cfs/mnt1/oracle oracle 10. Prepare Shared Storage on SLVM This section assumes the OCR, Vote device, and database files are created on SLVM volume group vg_ops. a.
redo2_2=/dev/vg_ops/rops2log2.log control1=/dev/vg_ops/ropsctl1.ctl control2=/dev/vg_ops/ropsctl2.ctl control3=/dev/vg_ops/ropsctl3.ctl temp=/dev/vg_ops/ropstmp.dbf spfile=/dev/vg_ops/ropsspfile1.ora In this sample, create the DBCA mapping file and place at: /mnt/app/oracle/oradata/ver10/ver10_raw.conf. 11. Prepare Shared Storage on CFS This section assumes the OCR, Vote device, and database files are created on CFS directories.
NOTE: LVM version 2.0 volume groups are supported with Serviceguard. The steps for configuring volume groups in Serviceguard clusters are the same for both LVM version 1.0 and 2.0. For more information on using and configuring LVM version 2.0, see the HP-UX 11i Version 3: HP-UX System Administrator's Guide: Logical Volume Management located at: http://docs.hp.com -> Core HP-UX 11iv3 -> LVM Volume Manager LVM version 2 is only supported with SGeRAC version A.11.
Installing Oracle 10g RAC Binaries
The following are sample steps for installing Oracle 10g on a SGeRAC cluster. Refer to the Oracle documentation for Oracle installation details.
Installing RAC Binaries on a Local File System
Log on as the “oracle” user:
$ export ORACLE_BASE=/mnt/app/oracle
$ export DISPLAY={display}:0.0
$ cd <10g RAC installation disk directory>
$ ./runInstaller
Use the following guidelines when installing on a local file system:
1.
export ORA_CRS_HOME=/mnt/app/crs/oracle/product/10.2.0/crs LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib:$ORACLE_HOME/rdbms/lib SHLIB_PATH=$ORACLE_HOME/lib32:$ORACLE_HOME/rdbms/lib32 export LD_LIBRARY_PATH SHLIB_PATH export \ PATH=$PATH:$ORACLE_HOME/bin:$ORA_CRS_HOME/bin:/usr/local/bin: CLASSPATH=$ORACLE_HOME/jre:$ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib:$ORACLE_HOME/network/jlib export CLASSPATH export DISPLAY={display}:0.0 1.
a. In this sample, the database name and SID prefix are ver10.
b. Select the storage option for Cluster File System.
c. Enter /cfs/mnt2/oradata as the common directory.
Verifying Oracle Disk Manager is Configured
NOTE: The following steps are specific to CFS 4.1 or later.
1. Check the license for CFS 4.1 or later.
# /opt/VRTS/bin/vxlictest -n “VERITAS Storage Foundation for Oracle” -f “ODM”
output: ODM feature is licensed
2. Check that the VRTSodm package is installed:
# swlist VRTSodm
output for CFS 4.
For Integrity Systems:
$ rm ${ORACLE_HOME}/lib/libodm10.so
$ ln -s /opt/VRTSodm/lib/libodm.sl \
${ORACLE_HOME}/lib/libodm10.so
4. Start Oracle database
Verify that Oracle Disk Manager is Running
NOTE: The following steps are specific to CFS 4.1 or later.
Output: state loaded
4. In the alert log, verify the Oracle instance is running. The log should contain output similar to the following:
For CFS 4.1: Oracle instance running with ODM: VERITAS 4.1 ODM Library, Version 1.1
For CFS 5.0: Oracle instance running with ODM: VERITAS 5.0 ODM Library, Version 1.0
Configuring Oracle to Stop Using Oracle Disk Manager Library
NOTE: The following steps are specific to CFS 4.1 or later.
Preparing Oracle Cluster Software for Serviceguard Packages • Stopping the Oracle Cluster Software on each Node For 10g 10.1.0.04 or later: # /sbin/init.d/init.crs stop For 10g 10.2.0.01 or later: # /bin/crsctl stop crs Wait until Oracle Cluster Software completely stops. (Check CRS logs or check for Oracle processes, ps -ef | grep ocssd.bin) • Change Oracle Cluster Software from Starting at Boot Time on each Node For 10g 10.1.0.04 or later: # /sbin/init.d/init.crs disable For 10g 10.2.0.
• Storage Activation (CFS) When the Oracle Cluster Software required storage is configured on a Cluster File System (CFS), the Serviceguard package should be configured to depend on the CFS multi-node package through package dependency. With package dependency, the Serviceguard package that starts Oracle Cluster Software will not run until its dependent CFS multi-node package is up and will halt before the CFS multi-node package is halted.
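For example, a legacy package that starts Oracle Cluster Software might declare a dependency such as the following (a sketch; the multi-node package name SG-CFS-MP-1 is illustrative):
DEPENDENCY_NAME SG-CFS-MP-1
DEPENDENCY_CONDITION SG-CFS-MP-1=UP
DEPENDENCY_LOCATION SAME_NODE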
3 Serviceguard Configuration for Oracle 9i RAC This chapter shows the additional planning and configuration that is needed to use Oracle Real Application Clusters 9i with Serviceguard.
RAW LOGICAL VOLUME NAME SIZE (MB) Oracle Control File 1:_____/dev/vg_ops/ropsctl1.ctl_______108______ Oracle Control File 2: ___/dev/vg_ops/ropsctl2.ctl______108______ Oracle Control File 3: ___/dev/vg_ops/ropsctl3.ctl______104______ Instance 1 Redo Log 1: ___/dev/vg_ops/rops1log1.log_____20______ Instance 1 Redo Log 2: ___/dev/vg_ops/rops1log2.log_____20_______ Instance 1 Redo Log 3: ___/dev/vg_ops/rops1log3.
Table 3-1 RAC Software, Archive, Datafiles, SRVM
Configuration    RAC Software, Archive    RAC Datafiles, SRVM
1                CFS                      CFS
2                CFS                      Raw (SLVM or CVM)
3                Local FS                 CFS
4                Local FS                 Raw (SLVM or CVM)
NOTE: Mixing CFS database files and raw volumes is allowable, but not recommended. RAC datafiles on CFS require Oracle Disk Manager (ODM).
Using Single CFS Home or Local Home
With a single CFS home, the software installs once and all the files are visible on all nodes.
• CVM 4.x or later — Disk group activation performed by disk group multi-node package. — Disk group activation performed by application package (without the HP Serviceguard Storage Management Suite bundle). • CVM 3.x — Disk group activation is performed by application package. Volume Planning with CVM Storage capacity for the Oracle database must be provided in the form of volumes located in shared disk groups.
Instance 2 undotbs2: /dev/vx/rdsk/ops_dg/undotbs2.dbf___312___ Data: example1__/dev/vx/rdsk/ops_dg/example1.dbf__________160____ data: cwmlite1__/dev/vx/rdsk/ops_dg/cwmlite1.dbf__100____ Data: indx1__/dev/vx/rdsk/ops_dg/indx1.dbf____70___ Data: drsys1__/dev/vx/rdsk/ops_dg/drsys1.
Cluster information is provided via a special system multi-node package, which runs on all nodes in the cluster. The cluster must be up and must be running this package before you can configure VxVM disk groups for use with CVM. Disk groups must be created from the CVM Master node. The Veritas CVM package for version 3.5 is named VxVM-CVM-pkg; the package for CVM version 4.1 and later is named SG-CFS-pkg.
addressing convention, the hardware path name is no longer encoded in a storage device’s name; instead, each device file name reflects a unique instance number, for example /dev/[r]disk/disk3, that does not need to change when the hardware path does. Agile addressing is the default on new 11i v3 installations, but the I/O subsystem still recognizes the pre-11i v3 nomenclature.
• Building Volume Groups for RAC on Mirrored Disks
• Building Mirrored Logical Volumes for RAC with LVM Commands
• Creating RAC Volume Groups on Disk Arrays
• Creating Logical Volumes for RAC on Disk Arrays
The Event Monitoring Service HA Disk Monitor provides the capability to monitor the health of LVM disks. If you intend to use this monitor for your mirrored disks, you should configure them in physical volume groups. For more information, refer to the manual Using HA Monitors.
NOTE: LVM version 2.0 volume groups are supported with Serviceguard.
In the following examples, we use /dev/rdsk/c1t2d0 and /dev/rdsk/c0t2d0, which happen to be the device names for the same disks on both ftsys9 and ftsys10. In the event that the device file names are different on the different nodes, make a careful note of the correspondences. Creating Physical Volumes On the configuration node (ftsys9), use the pvcreate command to define disks as physical volumes. This only needs to be done on the configuration node.
NOTE: For more information on using LVM, refer to the HP-UX Managing Systems and Workgroups manual. Building Mirrored Logical Volumes for RAC with LVM Commands After you create volume groups and define physical volumes for use in them, you define mirrored logical volumes for data, logs, and control files. It is recommended that you use a shell script to issue the commands described in the next sections.
Create logical volumes for use as Oracle data files by using the same options as in the following example: # lvcreate -m 1 -M n -c y -s g -n system.dbf -L 408 /dev/vg_ops The -m 1 option specifies single mirroring; the -M n option ensures that mirror write cache recovery is set off; the -c y means that mirror consistency recovery is enabled; the -s g means that mirroring is PVG-strict, that is, it occurs between different physical volume groups; the -n system.
The following example shows how to configure alternate links using LVM commands. The following disk configuration is assumed:
8/0.15.0 /dev/dsk/c0t15d0 /* I/O Channel 0 (8/0) */
8/0.15.1 /dev/dsk/c0t15d1 /* I/O Channel 0 (8/0) */
8/0.15.2 /dev/dsk/c0t15d2 /* I/O Channel 0 (8/0) */
8/0.15.3 /dev/dsk/c0t15d3 /* I/O Channel 0 (8/0) */
8/0.15.4 /dev/dsk/c0t15d4 /* I/O Channel 0 (8/0) */
8/0.15.5 /dev/dsk/c0t15d5 /* I/O Channel 0 (8/0) */
10/0.3.0 10/0.3.1 10/0.3.2 10/0.3.3 10/0.3.4 10/0.3.
alternate I/O channel represented by /dev/dsk/c1t3d0. Use the vgextend command to add additional disks to the volume group, specifying the appropriate physical volume name for each PV link. Repeat the entire procedure for each distinct volume group you wish to create. For ease of system administration, you may wish to use different volume groups to separate logs from data and control files. NOTE: The default maximum number of volume groups in HP-UX is 10.
Table 3-2 Required Oracle File Names for Demo Database (continued) Logical Volume Name LV Size (MB) Raw Logical Volume Path Name Oracle File Size (MB)* opstemp.dbf 108 /dev/vg_ops/ropstemp.dbf 100 opsusers.dbf 128 /dev/vg_ops/ropsusers.dbf 120 opstools.dbf 24 /dev/vg_ops/ropstools.dbf 15 opsdata1.dbf 208 /dev/vg_ops/ropsdata1.dbf 200 opsdata2.dbf 208 /dev/vg_ops/ropsdata2.dbf 200 opsdata3.dbf 208 /dev/vg_ops/ropsdata3.dbf 200 opsrollback.dbf 308 /dev/vg_ops/ropsroolback.
Exporting the Logical Volume Infrastructure Before the Oracle volume groups can be shared, their configuration data must be exported to other nodes in the cluster. This is done either in Serviceguard Manager or by using HP-UX commands, as shown in the following sections. NOTE: Serviceguard Manager is the graphical user interface for Serviceguard. It is available as a “plug-in” to the System Management Homepage (SMH).
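A minimal sketch of the command sequence (the map file name and node names are illustrative; the full procedure is described in the sections referred to above). On the configuration node:
# vgexport -p -s -m /tmp/vg_ops.map /dev/vg_ops
# rcp /tmp/vg_ops.map ftsys10:/tmp/vg_ops.map
On each of the other nodes (choose an unused minor number for the group file):
# mkdir /dev/vg_ops
# mknod /dev/vg_ops/group c 64 0x060000
# vgimport -s -m /tmp/vg_ops.map /dev/vg_ops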
Installing Oracle Real Application Clusters NOTE: Some versions of Oracle RAC require installation of additional software. Refer to your version of Oracle for specific requirements. Before installing the Oracle Real Application Cluster software, make sure the cluster is running. Log in as the oracle user on one node and then use the Oracle installer to install Oracle software and to build the correct Oracle runtime executables.
# recommended. For a cluster of more than four nodes, a cluster
# lock is recommended. If you decide to configure a lock for a
# cluster of more than four nodes, it must be a quorum server.
#
# Lock Disk Parameters. Use the FIRST_CLUSTER_LOCK_VG and
# FIRST_CLUSTER_LOCK_PV parameters to define a lock disk.
# The FIRST_CLUSTER_LOCK_VG is the LVM volume group that holds
# the cluster lock. This volume group should not be used by
# any other cluster as a cluster lock device.
# Cluster Timing Parameters (microseconds).
#
# The NODE_TIMEOUT parameter defaults to 2000000 (2 seconds).
# This default setting yields the fastest cluster reformations.
# However, the use of the default value increases the potential
# for spurious reformations due to momentary system hangs or
# network load spikes. For a significant portion of installations,
# a setting of 5000000 to 8000000 (5 to 8 seconds) is more
# appropriate.
# policies that can be configured in the cluster is 200.
#
# List of cluster aware LVM Volume Groups. These volume groups will
# be used by package applications via the vgchange -a e command.
# Neither CVM or VxVM Disk Groups should be used here.
# For example:
# VOLUME_GROUP /dev/vgdatabase
# VOLUME_GROUP /dev/vg02
#
# List of OPS Volume Groups. Formerly known as DLM Volume Groups,
# these volume groups will be used by OPS or RAC cluster applications
# via the vgchange -a s command.
With CFS, the database software and database files (control, redo, data files), and archive logs may reside on a cluster file system, which is visible by all nodes. In the following example, both the Oracle software and datafiles reside on CFS. There is a single Oracle home.
5. Configure the Cluster Volume Manager (CVM)
Configure the system multi-node package, SG-CFS-pkg, to configure and start the CVM/CFS stack. Unlike VxVM-CVM-pkg, the SG-CFS-pkg does not restrict heartbeat subnets to a single subnet and supports multiple subnets.
# cfscluster config -s
The following output will be displayed:
CVM is now configured
Starting CVM...
9. Create the Disk Group Multi-Node package. Use the following command to add the disk group to the cluster:
# cfsdgadm add cfsdg1 all=sw
The following output will be displayed:
Package name “SG-CFS-DG-1” was generated to control the resource.
Shared disk group “cfsdg1” is associated with the cluster.
10. Activate the Disk Group
# cfsdgadm activate cfsdg1
11.
Package name “SG-CFS-MP-1” was generated to control the resource. Mount point “/cfs/mnt1” was associated with the cluster. # cfsmntadm add cfsdg1 vol2 /cfs/mnt2 all=rw The following output will be displayed: Package name “SG-CFS-MP-2” was generated to control the resource. Mount point “/cfs/mnt2” was associated with the cluster. # cfsmntadm add cfsdg1 volsrvm /cfs/cfssrvm all=rw The following output will be displayed: Package name “SG-CFS-MP-3” was generated to control the resource.
Deleting CFS from the Cluster Use the following steps to halt the applications that are using CFS file systems: 1. Unmount CFS Mount Points # cfsumount /cfs/mnt1 # cfsumount /cfs/mnt2 # cfsumount /cfs/cfssrvm 2.
CVM is now unconfigured Creating a Storage Infrastructure with CVM In addition to configuring the cluster, you create the appropriate logical volume infrastructure to provide access to data from different nodes. This is done with Logical Volume Manager (LVM), Veritas Volume Manager (VxVM), or Veritas Cluster Volume Manager (CVM).
NOTE: To prepare the cluster for CVM configuration, you need to set MAX_CONFIGURED_PACKAGES to a minimum of 3 (the default value for MAX_CONFIGURED_PACKAGES for Serviceguard A.11.17 is 150) in the cluster configuration file. In the sample, set the value to 10.
2. Create the Cluster
# cmapplyconf -C clm.asc
• Start the Cluster
# cmruncl
# cmviewcl
The following output will be displayed:
CLUSTER ever3_cluster NODE ever3a ever3b
3.
procedure is described in the Managing Serviceguard Fifteenth Edition user’s guide Appendix G. • Initializing Disks for CVM You need to initialize the physical disks that will be employed in CVM disk groups. If a physical disk has been previously used with LVM, you should use the pvremove command to delete the LVM header data from all the disks in the volume group (this is not necessary if you have not previously used the disk with LVM).
IMPORTANT: After creating these files, use the vxedit command to change the ownership of the raw volume files to oracle and the group membership to dba, and to change the permissions to 660. Example: # cd /dev/vx/rdsk/ops_dg # vxedit -g ops_dg set user=oracle * # vxedit -g ops_dg set group=dba * # vxedit -g ops_dg set mode=660 * The logical volumes are now available on the primary node, and the raw logical volume names can now be used by the Oracle DBA.
WARNING! The above file should never be edited. After the above command completes, start the cluster and create disk groups for shared use as described in the following sections. Starting the Cluster and Identifying the Master Node Run the cluster, which will activate the special CVM package: # cmruncl After the cluster is started, it will now run with a special system multi-node package named VxVM-CVM-pkg, which is on all nodes.
To initialize a disk for CVM, log on to the master node, then use the vxdiskadm program to initialize multiple disks, or use the vxdisksetup command to initialize one disk at a time, as in the following example: # /usr/lib/vxvm/bin/vxdisksetup -i /dev/dsk/c0t3d2 Creating Disk Groups for RAC Use the vxdg command to create disk groups.
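For example, a sketch of creating a shared disk group and a raw volume in it from the CVM master node (the names and size are illustrative, following the ops_dg examples in this chapter):
# vxdg -s init ops_dg c0t3d2
# vxassist -g ops_dg make opsctl1.ctl 108m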
IMPORTANT: After creating these files, use the vxedit command to change the ownership of the raw volume files to oracle and the group membership to dba, and to change the permissions to 660. Example: # cd /dev/vx/rdsk/ops_dg # vxedit -g ops_dg set user=oracle * # vxedit -g ops_dg set group=dba * # vxedit -g ops_dg set mode=660 * The logical volumes are now available on the primary node, and the raw logical volume names can now be used by the Oracle DBA.
Table 3-3 Required Oracle File Names for Demo Database (continued) Volume Name Size (MB) Raw Device File Name Oracle File Size (MB) ops2log1.log 28 /dev/vx/rdsk/ops_dg/ops2log1.log 20 ops2log2.log 28 /dev/vx/rdsk/ops_dg/ops2log2.log 20 ops2log3.log 28 /dev/vx/rdsk/ops_dg/ops2log3.log 20 opssystem.dbf 408 /dev/vx/rdsk/ops_dg/opssystem.dbf 400 opstemp.dbf 108 /dev/vx/rdsk/ops_dg/opstemp.dbf 100 opsusers.dbf 128 /dev/vx/rdsk/ops_dg/opsusers.dbf 120 opstools.
control1=/dev/vx/rdsk/ops_dg/opsctl1.ctl or control1=/u01/ORACLE/db001/ctrl01_1.ctl 2. Set the following environment variable where filename is the name of the ASCII file created. # export DBCA_RAW_CONFIG=/filename Adding Disk Groups to the Cluster Configuration For CVM 4.x or later, if the multi-node package was configured for disk group activation, the application package should be configured with package dependency to ensure the CVM disk group is active. For CVM 3.5 and CVM 4.
# passwd oracle
b. Set up for remote commands
Set up user equivalence for all nodes by adding node name entries to /etc/hosts.equiv or add entries to the .rhosts of the oracle account.
c. Set up CFS directory for Oracle datafiles.
# cd /cfs/mnt2
# mkdir oradata
# chown oracle:dba oradata
# chmod 755 oradata
# ll
total 0
drwxr-xr-x  2 root   root   96 Jun  3 11:43 lost+found
drwxr-xr-x  2 oracle dba    96 Jun  3 13:45 oradata
d. Set up CFS directory for Server Management.
CLASSPATH=/opt/java1.3/lib
CLASSPATH=$CLASSPATH:$ORACLE_HOME/JRE:$ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib:$ORACLE_HOME/network/jlib
export CLASSPATH
export DISPLAY={display}:0.0
2. Set up Listeners with Oracle Network Configuration Assistant
$ netca
3. Start GSD on all Nodes
$ gsdctl start
Output: Successfully started GSD on local node
4.
Configure Oracle to use Oracle Disk Manager Library
NOTE: The following steps are specific to CFS 4.1 or later.
1. Log on as the Oracle user
2. Shut down the database
3. Link the Oracle Disk Manager library into the Oracle home using the following commands:
For HP 9000 systems:
$ rm ${ORACLE_HOME}/lib/libodm9.sl
$ ln -s /opt/VRTSodm/lib/libodm.sl \
${ORACLE_HOME}/lib/libodm9.sl
For Integrity systems:
$ rm ${ORACLE_HOME}/lib/libodm9.so
$ ln -s /opt/VRTSodm/lib/libodm.sl \
${ORACLE_HOME}/lib/libodm9.so
4.
comp calls:    5439560
io mor cmp:    461063
io zro cmp:    2330
cl receive:    66145
cl ident:      18
cl reserve:    8
cl delete:     1
cl resize:     0
cl same op:    0
cl opt idn:    0
cl opt rsv:    332
**********:    17
3. Verify that Oracle Disk Manager is loaded with the following command:
# kcmodule -P state odm
The following output will be displayed:
state loaded
4. In the alert log, verify the Oracle instance is running. The log should contain output similar to the following:
For CFS 4.1: Oracle instance running with ODM: VERITAS 4.1 ODM Library, Version 1.1
For CFS 5.0: Oracle instance running with ODM: VERITAS 5.0 ODM Library, Version 1.0
Configuring Oracle to Stop Using Oracle Disk Manager Library
NOTE: The following steps are specific to CFS 4.1 or later.
$ ln -s ${ORACLE_HOME}/lib/libodmd9.so \ ${ORACLE_HOME}/lib/libodm9.so 5. Restart the database Using Packages to Configure Startup and Shutdown of RAC Instances To automate the startup and shutdown of RAC instances on the nodes of the cluster, you can create packages which activate the appropriate volume groups and then run RAC. Refer to the section “Creating Packages to Launch Oracle RAC Instances” NOTE: The maximum number of RAC instances for Oracle 9i is 127 per cluster.
Creating Packages to Launch Oracle RAC Instances To coordinate the startup and shutdown of RAC instances with cluster node startup and shutdown, you create a one-node package for each node that runs an RAC instance. In the package configuration file, you should specify only the single node on which the instance will run and specify the control script that is to be executed every time the instance node or the entire RAC cluster starts up or shuts down.
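For example, the relevant entries of such a one-node package configuration file might look like this (a sketch; the package name, node name, and script path are illustrative, following the ORACLE_TEST0 example later in this section):
PACKAGE_NAME ORACLE_TEST0
NODE_NAME ftsys9
AUTO_RUN NO
RUN_SCRIPT /etc/cmcluster/pkg/ORACLE_TEST0/control.sh
HALT_SCRIPT /etc/cmcluster/pkg/ORACLE_TEST0/control.sh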
1. 2. In the ASCII package configuration file, set the AUTO_RUN parameter to NO, or if you are using Serviceguard Manager to configure packages, set Automatic Switching to Disabled. This keeps the package from starting up immediately when the node joins the cluster, and before RAC is running. You can then manually start the package using the cmmodpkg -e packagename command after RAC is started.
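For example (using the illustrative package name from the sketch above):
# cmmodpkg -e ORACLE_TEST0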
You may customize the script, as described in the section “Customizing the Package Control Script.”
Customizing the Package Control Script
Check the definitions and declarations at the beginning of the control script using the information in the Package Configuration worksheet. You need to customize as follows:
• Update the PATH statement to reflect any required paths needed to start your services.
NOTE: Use care in defining service run commands. Each run command is executed by the control script in the following way: • The cmrunserv command executes each run command and then monitors the process id of the process created by the run command. • When the command started by cmrunserv exits, Serviceguard determines that a failure has occurred and takes appropriate action, which may include transferring the package to an adoptive node.
OPS_VOLUME_GROUP parameters.) Be sure to specify shared activation with the vgchange command by setting the VGCHANGE parameter as follows: VGCHANGE="vgchange -a s” If your disks are mirrored with LVM mirroring on separate physical paths and you want to override quorum, use the following setting: VGCHANGE="vgchange -a s -q n” Enter the names of the CVM disk groups you wish to activate in shared mode in the CVM_DG[] array. Use a different array element for each RAC disk group.
For an ORACLE RAC Instance for a two-node cluster, each node would have an SID_NAME.
2. Gather the RAC Instance package name for each node, which should be the same as the SID_NAME for each node. Example: ORACLE_TEST0
3. Gather the shared volume group name for the RAC database. In Serviceguard Manager, see cluster Properties. Example: /dev/vgora92db
4. Create the Oracle RAC Instance Package directory:
/etc/cmcluster/pkg/${SID_NAME}
Example: /etc/cmcluster/pkg/ORACLE_TEST0
5.
Using Serviceguard Manager to Configure Oracle RAC Instance Package Serviceguard Manager can be used to configure an Oracle RAC instance. Refer to the Serviceguard Manager documentation for specific configuration information. NOTE: Serviceguard Manager is the graphical user interface for Serviceguard. For version A.11.18, it is available as a “plug-in” to the System Management Homepage (SMH). For more information see “About the Serviceguard Manager SMH Plug-In” in the Serviceguard Version A.11.
4 Maintenance and Troubleshooting This chapter includes information about carrying out routine maintenance on a Real Application Cluster configuration. As presented here, these tasks differ in some details from the similar tasks described in the Managing Serviceguard documentation.
Types of Cluster and Package States A cluster or its component nodes may be in several different states at different points in time. The following sections describe many of the common conditions the cluster or package may be in.
    Service        up           5
    Service        up           0
    Service        up           0

  NODE_NAME        STATUS       SWITCHING
  mo               up           enabled

    Script_Parameters:
    ITEM           STATUS       MAX_RESTARTS
    Service        up           0
    Service        up           5
    Service        up           5
    Service        up           0
    Service        up           0

  PACKAGE          STATUS       STATE        AUTO_RUN     SYSTEM
  SG-CFS-DG-1      up           running      enabled      no

    NODE_NAME      STATUS       SWITCHING
    minie          up           enabled

    Dependency_Parameters:
    DEPENDENCY_NAME              SATISFIED
    SG-CFS-pkg                   yes

    NODE_NAME      STATUS       SWITCHING
    mo             up           enabled

    Dependency_Parameters:
    DEPENDENCY_NAME              SATISFIED
    SG-CFS-pkg                   yes

  PACKAGE          STATUS       STATE        AUTO_RUN     SYSTEM
  SG-CFS-MP-1      up           running      enabled      no

    NODE_NAME      STATUS       SWITCHING
    minie          up           enabled

    Dependency_Parameters:
    DEPENDENCY_NAME              SATISFIED
    SG-CFS-DG-1                  yes

    NODE_NAME      STATUS       SWITCHING
    mo             up           enabled

    Dependency_Parameters:
    DEPENDENCY_NAME              SATISFIED
    SG-CFS-DG-1                  yes

  PACKAGE          STATUS       STATE        AUTO_RUN     SYSTEM
  SG-CFS-MP-2      up           running      enabled      no

    NODE_NAME      STATUS       SWITCHING
    minie          up           enabled

    Dependency_Parameters:
    DEPENDENCY_NAME              SATISFIED
    SG-CFS-DG-1                  yes

    NODE_NAME      STATUS       SWITCHING
    mo             up           enabled

    Dependency_Parameters:
    DEPENDENCY_NAME              SATISFIED
    SG-CFS-DG-1                  yes

  PACKAGE          STATUS       STATE        AUTO_RUN     SYSTEM
  SG-CFS-MP-3      up           running      enabled      no

    NODE_NAME      STATUS       SWITCHING
    minie          up           enabled

    Dependency_Parameters:
    DEPENDENCY_NAME              SATISFIED
    SG-CFS-DG-1                  yes

    NODE_NAME      STATUS       SWITCHING
    mo             up           enabled

    Dependency_Parameters:
    DEPENDENCY_NAME              SATISFIED
    SG-CFS-DG-1                  yes
Node Status and State

The status of a node is either up (active as a member of the cluster) or down (inactive in the cluster), depending on whether its cluster daemon is running or not. Note that a node might be down from the cluster perspective, but still up and running HP-UX.

A node may also be in one of the following states:
• Failed. A node never sees itself in this state.
specified node until the node is enabled for the package using the cmmodpkg command. Every package is marked Enabled or Disabled for each node that is either a primary or adoptive node for the package. For multi-node packages, node switching Disabled means the package cannot start on that node.

Status of Group Membership

The state of the cluster for Oracle RAC is one of the following:
• Up. Services are active and being monitored. The membership appears in the output of cmviewcl -l group.
• Down.
Network Status

The network interfaces have only status, as follows:
• Up.
• Down.
• Unknown. We cannot determine whether the interface is up or down. This can happen when the cluster is down. A standby interface has this status.

Serial Line Status

The serial line has only status, as follows:
• Up. Heartbeats are received over the serial line.
• Down. Heartbeat has not been received over the serial line within 2 times the NODE_TIMEOUT value.
• Recovering.
Normal Running Status

Everything is running normally; both nodes in a two-node cluster are running, and each Oracle RAC instance package is running as well. The only packages running are Oracle RAC instance packages.

CLUSTER        STATUS
example        up

  NODE         STATUS       STATE
  ftsys9       up           running

    Network_Parameters:
    INTERFACE    STATUS       PATH
    PRIMARY      up           56/36.
    STANDBY      up
Quorum Server Status

If the cluster is using a quorum server for tie-breaking services, the display shows the server name, state, and status following the entry for each node, as in the following excerpt from the output of cmviewcl -v:

CLUSTER        STATUS
example        up

  NODE         STATUS       STATE
  ftsys9       up           running

  Quorum Server Status:
  NAME         STATUS
  lp-qs        up
...
NODE           STATUS       STATE
ftsys9         up           running

  Script_Parameters:
  ITEM         STATUS       MAX_RESTARTS   RESTARTS   NAME
  Service      up           0              0          VxVM-CVM-pkg.srv

Status After Moving the Package to Another Node

After issuing the following command:

# cmrunpkg -n ftsys9 pkg2

the output of the cmviewcl -v command is as follows:

CLUSTER        STATUS
example        up

  NODE         STATUS       STATE
  ftsys9       up           running

    Network_Parameters:
    INTERFACE    STATUS       PATH
    PRIMARY      up           56/36.
    STANDBY      up
    Script_Parameters:
    ITEM         STATUS       MAX_RESTARTS   RESTARTS   NAME
    Service      up           0              0          service2.1
    Subnet       up           0              0          15.13.168.0

    Node_Switching_Parameters:
    NODE_TYPE    STATUS       SWITCHING      NAME
    Primary      up           enabled        ftsys10
    Alternate    up           enabled        ftsys9       (current)

  NODE         STATUS       STATE
  ftsys10      up           running

    Network_Parameters:
    INTERFACE    STATUS       PATH         NAME
    PRIMARY      up           28.1         lan0
    STANDBY      up           32.1         lan1

Now pkg2 is running on node ftsys9. Note that it is still disabled from switching.
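To allow the package to fail over again, switching can be re-enabled with the same cmmodpkg command used in the rolling upgrade example later in this document:

   # cmmodpkg -e pkg2

After this, cmviewcl should again show the package with switching (AUTO_RUN) enabled.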
  NODE         STATUS       STATE
  ftsys9       up           running

    PACKAGE      STATUS       STATE        AUTO_RUN     NODE
    pkg1         up           running      enabled      ftsys9
    pkg2         up           running      enabled      ftsys9

  NODE         STATUS       STATE
  ftsys10      down         halted

This output is seen on both ftsys9 and ftsys10.

Viewing Data on Unowned Packages

The following example shows packages that are currently unowned, that is, not running on any configured node.
Online Node Addition and Deletion

Online node addition and deletion enables nodes to be added to or deleted from a running SGeRAC cluster. Nodes are added or deleted by changing the cluster configuration: edit the cluster configuration file and re-apply the configuration to the already running cluster. Before deleting a node online, halt it; then delete it from the cluster configuration.
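The following is a minimal sketch of that procedure, assuming a cluster named cluster1 and a configuration file path chosen only for this example; substitute your own names:

   # cmgetconf -c cluster1 /etc/cmcluster/cluster1.ascii
   (Edit /etc/cmcluster/cluster1.ascii to add or remove NODE_NAME entries.)
   # cmcheckconf -C /etc/cmcluster/cluster1.ascii
   # cmapplyconf -C /etc/cmcluster/cluster1.ascii

cmcheckconf verifies the edited file before cmapplyconf distributes the new configuration to the running cluster. A newly added node can then join the cluster with cmrunnode; a node to be deleted must first be halted with cmhaltnode.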
NOTE: For more information, see the Serviceguard Version A.11.18 Release Notes at http://docs.hp.com -> High Availability -> Serviceguard.

Making LVM Volume Groups Shareable

Normally, volume groups are marked to be activated in shared mode when they are listed with the OPS_VOLUME_GROUP parameter in the cluster configuration file or in Serviceguard Manager; the marking occurs when the configuration is applied. However, in some cases you may want to make a volume group shareable manually.
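A minimal sketch of the manual procedure, assuming the /dev/vg_ops volume group used in the examples in this manual and a cluster that is already running:

   # vgchange -S y -c y /dev/vg_ops

Each node can then activate the volume group in shared mode with vgchange -a s /dev/vg_ops, which produces the Server and Client messages shown below.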
The following message is displayed:

   Activated volume group in shared mode. This node is the Server.

When the same command is entered on the second node, the following message is displayed:

   Activated volume group in shared mode. This node is a Client.

NOTE: Do not share volume groups that are not part of the RAC configuration unless shared access is controlled.
6. Prior to making configuration changes, activate the volume group in normal (non-shared) mode:
   # vgchange -a y /dev/vg_ops
7. Use normal LVM commands to make the needed changes. Be sure to set the raw logical volume device file's owner to oracle and group to dba, with a mode of 660.
8. Next, still from node 1, deactivate the volume group:
   # vgchange -a n /dev/vg_ops
9. Use the vgexport command with the options shown in the example to create a new map file:
   # vgexport -p -m /tmp/vg_ops.
Adding Additional Shared LVM Volume Groups

To add capacity or to organize your disk resources for ease of management, you may wish to create additional shared volume groups for your Oracle RAC databases. If you decide to use additional shared volume groups, they must conform to the following rules:
• Volume groups should include different PV links to each logical unit on the disk array.
• Volume group names must be the same on all nodes in the cluster.
NOTE: For CVM without CFS, if you are adding a disk group to the cluster configuration, make sure you also modify any package or create the package control script that imports and deports this disk group. If you are adding a CVM disk group, be sure to add the STORAGE_GROUP entry for the disk group to the package ASCII file. For CVM with CFS, if you are adding a disk group to the cluster configuration, make sure you also create the corresponding multi-node package.
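For the CVM-without-CFS case described in this note, the entry in the package ASCII file might look like the following; the disk group name is a placeholder, and one STORAGE_GROUP line is typically used for each CVM disk group the package activates:

   STORAGE_GROUP        ops_dg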
• All cables
• Disk interface cards

Some monitoring can be done through simple physical inspection, but for the most comprehensive monitoring, you should examine the system log file (/var/adm/syslog/syslog.log) periodically for reports on all configured HA devices. The presence of errors relating to a device will show the need for maintenance.

Using Event Monitoring Service

Event Monitoring Service (EMS) allows you to configure monitors of specific devices and system resources.
NOTE: As you add new disks to the system, update the planning worksheets (described in Appendix B: "Blank Planning Worksheets"), so as to record the exact configuration you are using.

Replacing Disks

The procedure for replacing a faulty disk mechanism depends on the type of disk configuration you are using and on the type of Volume Manager software.
3. On the node on which the volume group is currently activated, use the following command for each logical volume that has extents on the failed physical volume:
   # lvreduce -m 0 /dev/vg_sg01/lvolname /dev/dsk/c2t3d0
4. At this point, remove the failed disk and insert a new one. The new disk will have the same HP-UX device name as the old one.
   # pvchange -a n [pv path]

Alternatively, use the pvchange -a N [pv path] command to detach the disk (all paths to the disk) and close it. Use this to allow diagnostics or to replace a multi-ported disk.

NOTE: If the volume group is mirrored, applications can continue accessing data on mirror copies after the commands above. If the volume is not mirrored, then any access attempts to the device may hang indefinitely or time out, depending upon the LV timeout value configured for the logical volume.
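Once the disk has been replaced, the detached path (or disk) can be re-enabled with the matching form of the command, sketched below; depending on the state of the volume group, additional steps (such as restoring the LVM configuration) may be required first, as described in the remaining steps of this procedure:

   # pvchange -a y [pv path]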
If you are using software mirroring for shared concurrent activation of Oracle RAC data with MirrorDisk/UX and the mirrored disks are mounted in a high availability disk enclosure, use the following steps to carry out offline replacement:
1. Make a note of the physical volume name of the failed mechanism (for example, /dev/dsk/c2t3d0).
2. Deactivate the volume group on all nodes of the cluster:
   # vgchange -a n vg_ops
3. Replace the bad disk mechanism with a good one.
of the F/W SCSI bus without breaking the bus's termination. (Nodes attached to the middle of a bus using a Y cable can also be detached from the bus without harm.) When using in-line terminators and Y cables, ensure that all orange-socketed termination packs are removed from the controller cards.

NOTE: You cannot use in-line terminators with internal F/W SCSI buses on D-series and K-series systems, and you cannot use in-line terminators with single-ended SCSI buses.
1. Move any packages on the node that requires maintenance to a different node.
2. Halt the node that requires maintenance. The cluster will re-form, and activity will continue on other nodes. Packages on the halted node will switch to other available nodes if they are configured to switch.
3. Disconnect the power to the node.
4. Disconnect the node from the in-line terminator cable or Y cable if necessary.
1. Halt the node by using the cmhaltnode command.
2. Shut down the system using /etc/shutdown, then power down the system.
3. Remove the defective LAN card.
4. Install the new LAN card. The new card must be exactly the same card type, and it must be installed in the same slot as the card you removed.
5. Power up the system.
6. If necessary, add the node back into the cluster by using the cmrunnode command. (You can omit this step if the node is configured to join the cluster automatically.)
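As a quick check after the replacement, commands such as the following can be used; lanscan confirms that HP-UX sees the new card, and cmviewcl confirms cluster and package status once the node has rejoined:

   # lanscan
   # cmviewcl -v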
Monitoring RAC Instances

The DB Provider provides the capability to monitor RAC databases. Role Based Access (RBA) enables a non-root user to monitor RAC instances using Serviceguard Manager.
A Software Upgrades

Serviceguard Extension for RAC (SGeRAC) software upgrades can be done in the following two ways:
• rolling upgrade
• non-rolling upgrade

Instead of an upgrade, moving to a new version can be done with:
• migration with cold install

Rolling upgrade is a feature of SGeRAC that allows you to perform a software upgrade on a given node without bringing down the entire cluster. SGeRAC supports rolling upgrades on version A.11.
• "Rolling Software Upgrades"
  — "Steps for Rolling Upgrades"
  — "Example of Rolling Upgrade"
  — "Limitations of Rolling Upgrades"
• "Non-Rolling Software Upgrades"
  — "Steps for Non-Rolling Upgrades"
  — "Limitations of Non-Rolling Upgrades"
• "Migrating a SGeRAC Cluster with Cold Install"

Rolling Software Upgrades

SGeRAC version A.11.
NOTE: It is optional to set this parameter to "1". If you want the node to join the cluster at boot time, set this parameter to "1"; otherwise, set it to "0".

6. Restart the cluster on the upgraded node (if desired). You can do this in Serviceguard Manager, or from the command line, issue the Serviceguard cmrunnode command.
7. Restart Oracle (RAC, CRS, Clusterware, OPS) software on the local node.
8. Repeat steps 1-7 on the other nodes, one node at a time until all nodes have been upgraded.
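The boot-time parameter referred to in the note above is assumed here to be the standard Serviceguard auto-start setting in /etc/rc.config.d/cmcluster; if so, the entry might look like this:

   AUTOSTART_CMCLD=1      # 1 = join the cluster at boot; 0 = start manually with cmrunnode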
NOTE: While you are performing a rolling upgrade, warning messages may appear while the node is determining what version of software is running. This is a normal occurrence and not a cause for concern.

Figure A-1 Running Cluster Before Rolling Upgrade

Step 1.
1. Halt Oracle (RAC, CRS, Clusterware, OPS) software on node 1.
2. Halt node 1. This will cause the node's packages to start up on an adoptive node.
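For example, node 1 can be halted from the command line with the same form of the cmhaltnode command used for node 2 later in this example; the -f option halts the node even though packages are running, allowing them to fail over:

   # cmhaltnode -f node1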
Figure A-2 Running Cluster with Packages Moved to Node 2

Step 2. Upgrade node 1 and install the new version of Serviceguard and SGeRAC (A.11.16), as shown in Figure A-3.
NOTE: If you install Serviceguard and SGeRAC separately, Serviceguard must be installed before installing SGeRAC.

Figure A-3 Node 1 Upgraded to SG/SGeRAC 11.16

Step 3.
1. Restart the cluster on the upgraded node (node 1), if desired. You can do this in Serviceguard Manager, or from the command line issue the following:
   # cmrunnode node1

At this point, different versions of the Serviceguard daemon (cmcld) are running on the two nodes, as shown in Figure A-4.
Figure A-4 Node 1 Rejoining the Cluster

Step 4.
1. Halt Oracle (RAC, CRS, Clusterware, OPS) software on node 2.
2. Halt node 2. You can do this in Serviceguard Manager, or from the command line issue the following:
   # cmhaltnode -f node2
   This causes both packages to move to node 1; see Figure A-5.
3. Upgrade node 2 to Serviceguard and SGeRAC (A.11.16) as shown in Figure A-5.
4. When upgrading is finished, enter the following command on node 2 to restart the cluster on node 2:
   # cmrunnode node2
Figure A-5 Running Cluster with Packages Moved to Node 1

Step 5. Move PKG2 back to its original node. Use the following commands:

   # cmhaltpkg pkg2
   # cmrunpkg -n node2 pkg2
   # cmmodpkg -e pkg2

The cmmodpkg command re-enables switching of the package, which is disabled by the cmhaltpkg command. The final running cluster is shown in Figure A-6.
Figure A-6 Running Cluster After Upgrades

Limitations of Rolling Upgrades

The following limitations apply to rolling upgrades:
• During a rolling upgrade, you should issue Serviceguard/SGeRAC commands (other than cmrunnode and cmhaltnode) only on a node containing the latest revision of the software. Performing tasks on a node containing an earlier revision of the software will not work or will cause inconsistent results.
• You can perform a rolling upgrade only on a configuration that has not been modified since the last time the cluster was started.
• Rolling upgrades are not intended as a means of using mixed releases of Serviceguard and SGeRAC within the same cluster. SGeRAC requires a compatible version of Serviceguard. Upgrade all cluster nodes as quickly as possible to the new release level.
Migrating a SGeRAC Cluster with Cold Install

There may be circumstances when you prefer a cold install of the HP-UX operating system rather than an upgrade. The cold install process erases the pre-existing operating system and data and then installs the new operating system and software; you must then restore the data.

CAUTION: The cold install process erases the pre-existing software, operating system, and data.
B Blank Planning Worksheets

This appendix reprints blank planning worksheets used in preparing the RAC cluster. You can duplicate any of these worksheets that you find useful and fill them in as a part of the planning process.
Physical Volume Name: ______________________________________________________
Physical Volume Name: ______________________________________________________
Physical Volume Name: ______________________________________________________
Physical Volume Name: ______________________________________________________
Physical Volume Name: ______________________________________________________
Physical Volume Name: ______________________________________________________
Physical Volume Name: __________________________________
Instance 2 Redo Log 3: _____________________________________________________
Instance 2 Redo Log: _____________________________________________________
Instance 2 Redo Log: _____________________________________________________
Data: System _____________________________________________________
Data: Rollback _____________________________________________________
Data: Temp _____________________________________________________
Data: Users _____________________________________________________
Data
Index A D activation of volume groups in shared mode, 152 adding packages on a running cluster, 132 administration cluster and package states, 139 array replacing a faulty mechanism, 158, 159, 160 AUTO_RUN parameter, 131 AUTO_START_TIMEOUT in sample configuration file, 64, 108 deactivation of volume groups, 153 deciding when and where to run packages, 29 deleting from the cluster, 72 deleting nodes while the cluster is running, 155 demo database files, 61, 79, 105, 123 disk choosing for volume groups, 56
group membership services, 29 group membership services define, 29 H hardware adding disks, 157 monitoring, 156 heartbeat subnet address parameter in cluster manager configuration, 47 HEARTBEAT_INTERVAL in sample configuration file, 64, 108 HEARTBEAT_IP in sample configuration file, 64, 108 high availability cluster defined, 21 I in-line terminator permitting online hardware maintenance, 161 installing Oracle RAC, 64, 108 installing software Serviceguard Extension for RAC, 44, 97 IP in sample package cont
switching status, 149 package configuration service name parameter, 47 writing the package control script, 132 package control script generating with commands, 132 packages accessing OPS database, 131 deciding where and when to run, 29 launching OPS instances, 131 startup and shutdown volume groups, 130 parameter AUTO_RUN, 131 NODE_FAILFAST_ENABLED, 131 performance optimizing packages for large numbers of storage units, 134 physical volumes creating for clusters, 57, 101 filled in planning worksheet, 179 pl
in package control script, 134 volume group creating for a cluster, 57, 101 creating physical volumes for clusters, 57, 101 volume groups adding shared volume groups, 155 displaying for RAC, 62, 106 exporting to other nodes, 62, 107 making changes to shared volume groups, 153 making shareable, 152 making unshareable, 152 OPS startup and shutdown, 130 VOLUME_GROUP in sample configuration file, 64, 108 VxVM-CVM-pkg, 77, 120 VXVM_DG in package control script, 134 W worksheet logical volume planning, 42, 43, 9