Sample Configuration with HP Serviceguard Extension for RAC and Oracle Real Application Clusters 11g release 2 using Cluster File System

Service (RAC-DB-IC) failure is discovered, the speed of recovery actions by impacted components (for example, SG, Group Membership Service (GMS), Cluster Synchronization Service (CSS), and/or RAC), and database recovery time.

– For a complete SG cluster interconnect failure, SG sees the failure within the MEMBER_TIMEOUT timeframe.

– With SG/CFS, Group Membership Services/Atomic Broadcast (GAB), Low Latency Transport (LLT), and SG share the same networks, and SG sees the interconnect failure within the MEMBER_TIMEOUT timeframe.

– HP recommends configuring the CSS heartbeat (CSS-HB) on the same network as the Serviceguard heartbeat (SG-HB). In configurations where the CSS-HB and the SG-HB share the same interconnect network, SG reacts to failures within the MEMBER_TIMEOUT timeframe (sooner than the CSS timeout) and fails the node with a transfer of control (TOC). If the CSS traffic is on an SG-monitored network (but not on an SG-HB network), SG packages can be configured with cluster interconnect subnet monitoring to detect failure of the CSS network and TOC the node sooner than the CSS timeout. If CSS traffic does not share the SG-HB network and SG is not configured to monitor the CSS-HB network, CSS detects the interconnect failure within the CSS timeout and TOCs the node. HP does not recommend this architecture because it can cause inconsistencies in cluster membership.

– With RAC-DB-IC, in configurations where the CSS-HB, the SG-HB, and the RAC-DB-IC share the same interconnect network, SG sees the failure within the MEMBER_TIMEOUT timeframe (sooner than the instance membership recovery [IMR] timeout). SG packages can be configured with cluster interconnect subnet monitoring to monitor the RAC-DB-IC and detect a failure before the IMR timeout is completed. If the RAC-DB-IC does not share the SG-HB network and SG is not configured to monitor the RAC-DB-IC network, RAC discovers the interconnect failure within the IMR timeout.
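The detection rules in the list above can be summarized as a small decision function. The following is only a sketch for reasoning about a planned layout: the function name, the `Detection` type, and its fields are hypothetical and are not part of Serviceguard or Oracle Clusterware, and the function returns which timeout bounds detection rather than any numeric value.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    detector: str      # component that notices the interconnect failure first
    bound: str         # timeout that bounds the detection
    discouraged: bool  # True where the text says HP does not recommend the layout


# Native detector and timeout for each kind of traffic, as named in the text.
_NATIVE = {
    "CSS-HB": ("CSS", "CSS timeout"),
    "RAC-DB-IC": ("RAC", "IMR timeout"),
}


def first_detector(traffic: str,
                   shares_sg_hb_network: bool,
                   sg_subnet_monitoring: bool) -> Detection:
    """Return who detects a failure of the network carrying `traffic`."""
    if traffic not in _NATIVE:
        raise ValueError(f"unknown traffic type: {traffic!r}")
    component, timeout = _NATIVE[traffic]

    # Shared with the SG heartbeat: SG reacts within MEMBER_TIMEOUT.
    if shares_sg_hb_network:
        return Detection("SG", "MEMBER_TIMEOUT", False)

    # Separate network, but an SG package monitors the interconnect subnet:
    # SG detects the failure before the native timeout expires.
    if sg_subnet_monitoring:
        return Detection("SG", f"before the {timeout}", False)

    # Neither shared nor monitored: the Oracle component detects it itself.
    # The text flags this layout as not recommended for CSS traffic only.
    return Detection(component, timeout, traffic == "CSS-HB")
```

For example, `first_detector("CSS-HB", False, False)` reports that CSS itself detects the failure within the CSS timeout and marks the layout as discouraged, which matches the last case in the CSS-HB bullet.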

The failover time requirement determines important timeouts, such as SG MEMBER_TIMEOUT, network polling intervals, and cluster interconnect monitoring.

Note: Cluster interconnect subnet monitoring provides better availability by detecting and resolving RAC-DB-IC subnet failures quickly, and providing services (on one node) when the Oracle CSS-HB/RAC-IC subnet fails on all nodes.

Planning for high availability

A properly configured high availability (HA) configuration should survive a single point of failure and continue to operate.

Public network HA

Public network high availability for clients is sustained in two ways: redundant components and client failover.

• Redundant network interfaces and switches, with local LAN failover provided by SG (or bonding by Auto-Port Aggregation (APA)), protect against single point network failures.

• Client failover protects against failure of existing or new client sessions. These failures include node failures (such as those caused by a power failure) and network failures (for example, failure of all redundant network interfaces or links). Protection is available at three levels: Oracle FAN, remote VIP failover, and client connection timeout. Clients that are FAN integrated, or that use the FAN API, may interrupt existing sessions and fail over. Remote VIP failover is useful for non-FAN clients attempting to connect to the local node, because it avoids a TCP connection timeout. The client connection timeout is useful when a client connection takes a long time for any reason.
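The effect of a client connection timeout can be illustrated with a generic TCP sketch. This is not Oracle Net or FAN code: the function name, the address list, and the default timeout value are assumptions made for illustration. It only shows how a bounded connect attempt lets a client move on to the next address instead of waiting on an unresponsive one.

```python
import socket
from typing import Iterable, Optional, Tuple

Address = Tuple[str, int]  # (host, port); hypothetical listener addresses


def connect_first_available(addresses: Iterable[Address],
                            timeout_s: float = 3.0) -> socket.socket:
    """Try each address in order and return the first connected socket.

    Each attempt is bounded by `timeout_s`, so an unreachable address costs
    at most that long before the next one is tried.
    """
    last_error: Optional[OSError] = None
    for host, port in addresses:
        try:
            return socket.create_connection((host, port), timeout=timeout_s)
        except OSError as exc:  # refused, unreachable, or timed out
            last_error = exc
    raise ConnectionError(f"no address reachable: {last_error}")
```

A caller would pass the node addresses in preference order. When an address actively refuses the connection, the attempt fails immediately, and the timeout only matters when nothing answers at all.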