Install guide

Tip
The time a node takes to reboot depends on several factors, including BIOS settings. Many
servers scan all of memory and then scan PCI buses for boot candidates from NICs or HBAs (of
which there should be only one). Disabling these scans, and any other BIOS steps that take
time, will improve recovery performance. The grub.conf file often contains a built-in 5-second
delay for screen hold. Sometimes, every second counts.
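The grub delay can be shortened by lowering the timeout value in the boot loader configuration. A minimal sketch, assuming a RHEL-style /etc/grub.conf (the rest of the file is unchanged; the value shown is illustrative):

```
# /etc/grub.conf -- reduce the menu hold from the default 5 seconds to 1.
# 0 disables the menu entirely; keeping at least 1 second leaves a window
# to interrupt the boot for diagnostics.
timeout=1
```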
4.1.1. Cluster Recovery Time
In RAC/GFS, the road to transaction resumption starts with GFS filesystem recovery, and this is nearly
instantaneous once fencing is complete. Oracle RAC must wait for CRS to recover the state of the
cluster, and then the RDBMS can start to recover the locks for the failed instance (LMS recovery). Once
complete, the redo logs from the failed instance must be processed. One of the surviving nodes must
acquire the redo logs of the failed node, and determine which objects need recovery. Oracle activity is
partially resumed as soon as RECO (DB recovery process) determines the list of embargoed objects
that need recovery. Once roll-forward is complete, all non-embargoed and recovered objects are
available. Oracle (and especially RAC) recovery is a complex subject, but tuning its performance can
reduce downtime, and that could mean millions of dollars in recovered revenue.
Tip
It is possible to push the CSS Timeout below 300 seconds, if the nodes can boot in 60 seconds
or less.
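The CSS timeout is governed by the Clusterware misscount setting, which can be inspected and changed with crsctl. A hedged sketch (run as root on one node; the value 120 is illustrative, and the exact syntax varies by Clusterware version):

```shell
# Query the current CSS misscount (in seconds)
crsctl get css misscount

# Lower it only after verifying that every node can boot well inside
# the new window; an overly aggressive value causes false evictions
crsctl set css misscount 120
```

Measure worst-case reboot time (including BIOS scans and the grub delay) before reducing this value.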
4.2. Network Topology
Clusterware requires a heartbeat network, and an inter-node network for moving database blocks
between nodes (GCS). These are usually the same network, and often the same network as the Red
Hat Cluster Suite network.
It is critical that Red Hat Cluster Suite operates heartbeat services over the private, bonded network and
not the public network. If the private network fails for a node, then that node must be removed from the
cluster. If the public network fails, the application tier cannot access the database on that node, but
public network failures are the responsibility of the CRS VIP service.
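A bonded private interface is typically defined with RHEL-style network scripts. A minimal sketch, assuming two slave NICs in active-backup mode (the device names, IP address, and bonding options are illustrative assumptions, not values from this guide):

```
# /etc/sysconfig/network-scripts/ifcfg-bond0  (private cluster network)
DEVICE=bond0
IPADDR=192.168.100.10
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="mode=active-backup miimon=100"

# /etc/sysconfig/network-scripts/ifcfg-eth1  (first slave; repeat for eth2)
DEVICE=eth1
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none
```

Active-backup mode trades aggregate bandwidth for simplicity and works with any switch; verify failover by pulling each slave's link in turn before putting the cluster into production.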
Chapter 4. RAC/GFS Cluster Configuration