Best Practices for SGeRAC and Oracle RAC on HP-UX 11i, March 2009

When both the primary and standby interfaces fail, Serviceguard resolves the interconnect failure by
performing a cluster reconfiguration. After Serviceguard completes its reconfiguration, SGeRAC
notifies CSS and CSS updates RAC.
Serviceguard allows addition and deletion of subnets. When subnets are changed, be sure to keep
the SG-HB on the same network as CSS-HB and RAC-DB-IC to maintain the monitoring behavior.
Timeouts
The SG-HB timeout (MEMBER_TIMEOUT; see footnote 4) should be set based on service availability requirements.
The usual determining factor is how soon the service must be available when a node failure or a
complete heartbeat network failure has occurred. On installation, CSS-HB and RAC-DB-IC use default
timeouts of 10 minutes and 17 minutes respectively. ASM-IC default timeout is predefined on
installation and should not be changed. GAB/LLT timeout is set automatically at CVM/CFS
configuration time to coordinate with Serviceguard MEMBER_TIMEOUT and should not require
manual change.
Serviceguard MEMBER_TIMEOUT is in the Serviceguard cluster configuration file. The CSS-HB
timeout is the CSS MISSCOUNT. The RAC-DB-IC timeout is the timeout for Instance Membership
Recovery (IMR) and is specified by the Oracle parameter _dlm_send_timeout.
Note:
The _dlm_send_timeout parameter is unavailable in 11g.
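As an illustration of where the Serviceguard value lives, the following minimal sketch reads MEMBER_TIMEOUT from the text of a cluster configuration file. It assumes the file uses whitespace-separated keyword/value lines with "#" comments and that the value is expressed in microseconds. The function name and the sample excerpt are hypothetical and are not part of any Serviceguard tool.

```python
import re


def read_member_timeout_seconds(config_text):
    """Return MEMBER_TIMEOUT in seconds, or None if the parameter is absent.

    Assumes keyword/value lines, '#' comments, and a value in microseconds.
    """
    for raw_line in config_text.splitlines():
        # Drop any trailing comment and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        match = re.fullmatch(r"MEMBER_TIMEOUT\s+(\d+)", line)
        if match:
            return int(match.group(1)) / 1_000_000
    return None


# Hypothetical excerpt of a cluster configuration file.
sample = """\
# cluster-wide parameters
CLUSTER_NAME        rac_cluster
MEMBER_TIMEOUT      14000000
"""

print(read_member_timeout_seconds(sample))
```

Converting the value to seconds makes it easier to compare against the CSS-HB and RAC-DB-IC timeouts, which are discussed in seconds and minutes.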
Since Serviceguard resolves the interconnect failure, the CSS-HB timeout should be greater than the
Serviceguard reconfiguration time. The default values for CSS-HB timeout and RAC-DB-IC timeout
should be sufficient. If there is a need to tune any timeouts, the CSS-HB timeout should be tuned to
provide an opportunity for Serviceguard to complete reconfiguration and update CSS through group
membership service (GMS) prior to CSS timeout. The RAC-DB-IC timeout should be 15 seconds
above CSS-HB timeout. For maximum availability, tune Serviceguard MEMBER_TIMEOUT, CSS-HB
timeout, and RAC-DB-IC timeout together.
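The relationships described above can be sketched as a simple consistency check. This is only an illustration under stated assumptions: the Serviceguard reconfiguration time is an estimate supplied by the administrator (it is not derived here from MEMBER_TIMEOUT), "15 seconds above" is interpreted as "at least 15 seconds above", and the function name is hypothetical.

```python
def check_timeouts(sg_reconfig_s, css_hb_s, rac_db_ic_s, margin_s=15):
    """Return a list of guideline violations; an empty list means none found.

    All values are in seconds. sg_reconfig_s is an administrator-supplied
    estimate of the Serviceguard reconfiguration time (an assumption).
    """
    problems = []
    # CSS must not time out before Serviceguard finishes reconfiguring.
    if css_hb_s <= sg_reconfig_s:
        problems.append(
            "CSS-HB timeout should be greater than the "
            "Serviceguard reconfiguration time"
        )
    # IMR should start only after CSS has had its chance to act.
    if rac_db_ic_s < css_hb_s + margin_s:
        problems.append(
            "RAC-DB-IC timeout should be at least "
            f"{margin_s} seconds above the CSS-HB timeout"
        )
    return problems


# Installation defaults (10 and 17 minutes) with an assumed 60-second
# reconfiguration estimate.
print(check_timeouts(sg_reconfig_s=60, css_hb_s=600, rac_db_ic_s=1020))

# A tuned-down combination that violates both guidelines.
print(check_timeouts(sg_reconfig_s=60, css_hb_s=45, rac_db_ic_s=50))
```

Checking all three values in one place reflects the recommendation to tune them together rather than one at a time.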
Alternate configuration: multiple RAC databases
When RAC-DB-IC traffic is very high, it may interfere with other traffic. When there are multiple
independent RAC databases in the same cluster and a single network has insufficient bandwidth, a
second network can be used for other database interconnect traffic (see footnote 5).
If ASM is used, ASM-IC traffic will be on the same network as CSS-HB (lan1/lan2). If
CFS/CVM is used, GAB/LLT traffic will be on the same network as SG-HB (lan1/lan2).
Each primary and standby pair protects against a single failure. If the subnet with SG-HB (lan1/lan2)
fails, Serviceguard will resolve the subnet interconnect failure with a Serviceguard cluster
reconfiguration. If the subnet with RAC-DB2-IC (lan3/lan4) fails, unless subnet monitoring is used, IMR
will resolve the subnet interconnect failure.
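The failure handling described above can be summarized in a small decision function. This is a hedged sketch only: the function name and the representation of traffic types as a set of strings are hypothetical, and the case where subnet monitoring is used is left unresolved because this section does not describe it.

```python
def resolver_for_failed_subnet(traffic, subnet_monitoring=False):
    """Return which mechanism resolves the failure of a whole subnet.

    traffic: set of traffic types carried by the failed subnet,
             for example {"SG-HB", "CSS-HB"} or {"RAC-DB2-IC"}.
    """
    if "SG-HB" in traffic:
        # Loss of the heartbeat subnet triggers a cluster reconfiguration.
        return "Serviceguard cluster reconfiguration"
    if subnet_monitoring:
        # Not covered in this section, so no answer is given here.
        return None
    return "IMR"


print(resolver_for_failed_subnet({"SG-HB", "CSS-HB"}))
print(resolver_for_failed_subnet({"RAC-DB2-IC"}))
```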
4. Serviceguard A.11.19 introduced a new cluster manager protocol and a new parameter, MEMBER_TIMEOUT, to specify the heartbeat
timeout.
5. See the Oracle RAC Administration documentation for how to specify an additional RAC-DB-IC.