Best Practices for SGeRAC and Oracle RAC on HP-UX 11i, March 2009
Remote failover
In the case of a catastrophic failure such as node failure or network failure, OC fails over the VIP
address to a surviving node.
Network for cluster communication
The network for cluster communication carries private, internal cluster traffic. Although
one network adapter per node is sufficient for the private network, HP recommends a minimum of
two network adapters for each network to provide higher availability.
Serviceguard, OC, and each RAC instance maintain communication with peers on other nodes.
When communication is broken, either through network partition or node failure, each of these
components needs to re-form its membership and eject non-members as needed.
The categories of traffic between nodes are distinguished as follows:
• SG-HB – Serviceguard heartbeat and communications traffic. Supported over single or multiple
subnet networks.
• CSS-HB – Cluster Synchronization Service (CSS) heartbeat traffic and communications traffic for
Oracle Clusterware. CSS-HB uses a single logical connection over a single subnet network.
• RAC-DB-IC – RAC instance peer to peer traffic and communications for Global Cache Service (GCS)
and Global Enqueue Service (GES), formerly Cache Fusion (CF) and Distributed Lock Manager
(DLM). Per RAC database. Network HA is provided by the HP-UX 11i platform (Serviceguard or
APA bonding).
• ASM-IC – Applicable only when using Automatic Storage Management (ASM). ASM instance peer
to peer traffic. When it exists, ASM-IC should be on the same network as CSS-HB. Network HA is
required either through Serviceguard failover or APA bonding.
• GAB/LLT – Applicable only when using CFS/CVM, Cluster File System and Cluster Volume
Manager. Symantec cluster heartbeat and communications traffic. GAB/LLT communicates over a
link-level protocol (DLPI) and is supported over Serviceguard heartbeat subnet networks, including
primary and standby links. GAB/LLT is not supported over APA or virtual LANs (VLANs).
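The placement rules stated in the list above can be sketched as a small checker. This is only an illustrative sketch in Python, not an HP or Oracle tool: the `Network` class, the `check_plan` function, the network names, and the HA labels ("sg-standby", "apa", "none") are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Network:
    name: str
    sg_heartbeat: bool   # True if this is a Serviceguard heartbeat subnet
    ha: str              # "sg-standby", "apa", or "none" (illustrative labels)
    vlan: bool = False   # True if the network is a virtual LAN


def check_plan(placement, networks):
    """Return a list of rule violations; an empty list means none were found.

    placement maps a traffic category name to the list of network names it uses.
    """
    nets = {n.name: n for n in networks}
    problems = []

    # CSS-HB uses a single logical connection over a single subnet.
    css = placement.get("CSS-HB", [])
    if len(css) != 1:
        problems.append("CSS-HB should use exactly one subnet")

    # ASM-IC, when present, should share the CSS-HB network and needs network HA.
    asm = placement.get("ASM-IC")
    if asm is not None:
        if asm != css:
            problems.append("ASM-IC should be on the same network as CSS-HB")
        for name in asm:
            if nets[name].ha == "none":
                problems.append(f"ASM-IC network {name} has no network HA")

    # GAB/LLT runs over Serviceguard heartbeat subnets, not over APA or VLANs.
    for name in placement.get("GAB/LLT", []):
        net = nets[name]
        if not net.sg_heartbeat:
            problems.append(f"GAB/LLT network {name} is not a Serviceguard heartbeat subnet")
        if net.ha == "apa" or net.vlan:
            problems.append(f"GAB/LLT is not supported over APA or VLAN ({name})")

    return problems


# Hypothetical networks and placements for illustration only.
hb = Network("hb0", sg_heartbeat=True, ha="sg-standby")
apa = Network("apa0", sg_heartbeat=False, ha="apa")

good = {"SG-HB": ["hb0"], "CSS-HB": ["hb0"], "RAC-DB-IC": ["hb0"],
        "ASM-IC": ["hb0"], "GAB/LLT": ["hb0"]}
bad = {**good, "GAB/LLT": ["apa0"]}

print(check_plan(good, [hb, apa]))
print(check_plan(bad, [hb, apa]))
```

The checker encodes only the constraints named in the list; it does not model timeouts or any product-specific configuration syntax.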
Note that each category maintains its own timeout, after which members may be evicted from its
respective membership.
The interconnect network requires an HA configuration. With HA, all cluster nodes continue to
operate when a single network failure occurs, for example a LAN card or switch failure. Without HA,
a single network failure results in a network partition between the nodes, and evicted nodes are halted.
Using Serviceguard primary and standby links is the preferred model for providing HA for the
cluster communications interconnect network. With redundancy through Serviceguard primary and
standby links, Serviceguard monitors the network and performs a local failover if the primary
network becomes unavailable.
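The primary and standby behavior described above can be illustrated with a minimal Python sketch. It is a simplified model under stated assumptions, not Serviceguard's actual logic: the function name `active_link` and its return values are hypothetical, and real monitoring involves polling and timeouts that are not modeled here.

```python
from typing import Optional


def active_link(primary_up: bool, standby_up: Optional[bool]) -> Optional[str]:
    """Pick the link that carries interconnect traffic on one node.

    standby_up is None when no standby link is configured (no HA).
    Returns None when no link is usable, that is, the node is cut off
    from its peers and a network partition results.
    """
    if primary_up:
        return "primary"
    if standby_up:
        return "standby"   # local failover to the standby link
    return None


# With a standby link, a single primary failure is absorbed locally.
print(active_link(primary_up=False, standby_up=True))
# Without a standby link, the same failure leaves no usable link.
print(active_link(primary_up=False, standby_up=None))
```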
General principles
It is preferred that all interconnect traffic for cluster communications go over a single,
redundant heartbeat network so that Serviceguard monitors the network and resolves interconnect
failures by cluster reconfiguration. This is the recommended common configuration.
The following are examples of situations in which it is not possible to place all interconnect
traffic on the same network: