Managing Serviceguard Extension for SAP, December 2007

SAP Supply Chain Management
More About Hot Standby
A hot standby liveCache is a second liveCache instance that runs with
the same System ID as the original master liveCache. It waits on the
secondary node of the cluster during normal operation. A failover of the
liveCache cluster package does not require any time-consuming
filesystem move operations or instance restarts. The hot standby is
simply notified to promote itself to the new master. The Serviceguard
cluster software ensures that the primary system has already been shut
down, which prevents a split-brain situation in which two liveCache
systems try to serve the same purpose. Thus, a hot
standby scenario provides extremely fast and reliable failover. The delay
caused by a failover becomes predictable and tunable. No liveCache data
inconsistencies can occur during failover.
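
The ordering described above (shut down the old master first, then promote
the standby) can be illustrated with a minimal Python sketch. It is a toy
model, not SGeSAP or Serviceguard code: the Instance class, the role names
and the functions promote and fail_over are hypothetical and exist only for
this illustration.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Toy stand-in for a liveCache instance (hypothetical, not an SGeSAP type)."""
    node: str
    role: str  # "master", "standby" or "offline"


def promote(standby: Instance, old_master: Instance) -> None:
    """Promote the standby, but only if the old master is no longer active."""
    # Guard against split-brain: never allow two masters at the same time.
    if old_master.role == "master":
        raise RuntimeError("split-brain risk: old master is still running")
    standby.role = "master"


def fail_over(master: Instance, standby: Instance) -> Instance:
    """Model a failover: halt the primary first, then promote the standby."""
    # Stands in for the cluster software shutting down the primary system.
    master.role = "offline"
    # No filesystem move and no instance restart: the standby is only notified.
    promote(standby, master)
    return standby
```

Calling fail_over on a master and a standby leaves exactly one instance in
the master role. Calling promote while the old master is still active raises
an error instead of creating a second master.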
The hot standby mechanism also includes data replication. The standby
maintains its own set of liveCache data on storage at all times.
SGeSAP provides liveCache with a runtime library that automatically
creates a valid local set of liveCache devspace data via
StorageWorks XP Business Copy volume pairs (pvol/svol BCVs) as part of
the standby startup. If required, the master liveCache can remain
running during this operation. The copy utilizes fast storage replication
mechanisms within the storage array hardware to keep the effect on the
running master liveCache minimal. Once the volume pairs are
synchronized, they are split immediately. During normal operation,
each of the two liveCache instances operates on a set of LUNs in
simplex (SMPL) state.
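
The life cycle of a volume pair during standby startup can be pictured as a
small state machine. The following Python sketch is a simplified model under
stated assumptions: SMPL, COPY and PAIR are pair status names used by the XP
array software, but the event names (create, copy_done, split) and the
functions step and standby_startup_copy are hypothetical and do not
correspond to SGeSAP or array commands.

```python
from enum import Enum


class PairState(Enum):
    SMPL = "SMPL"  # volume is not part of a pair
    COPY = "COPY"  # copy from pvol to svol is in progress
    PAIR = "PAIR"  # svol is synchronized with pvol


# Allowed transitions in this simplified model (event names are hypothetical).
TRANSITIONS = {
    (PairState.SMPL, "create"): PairState.COPY,
    (PairState.COPY, "copy_done"): PairState.PAIR,
    (PairState.PAIR, "split"): PairState.SMPL,
}


def step(state: PairState, event: str) -> PairState:
    """Apply one event and return the new state, rejecting illegal events."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not allowed in state {state.name}"
        ) from None


def standby_startup_copy(state: PairState = PairState.SMPL) -> list:
    """Walk through pair creation, synchronization and the immediate split."""
    history = [state]
    for event in ("create", "copy_done", "split"):
        state = step(state, event)
        history.append(state)
    return history
```

In this model the volumes start and end in SMPL state, which matches the
statement that both instances operate on unpaired LUNs during normal
operation.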
Volumes that need replication are identified dynamically within the
startup procedure of the standby. No manual maintenance steps are
required to trigger volume pair synchronizations and subsequent split
operations. Synchronizations are usually needed only in rare cases, for
example at the first startup of a standby or after a standby has been
intentionally shut down for a longer period of time. In all other cases,
the liveCache logging devspaces contain enough delta information to
update the standby data without hardware replication of full LUNs.
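
The choice between a full hardware copy and a log-based update can be
sketched as follows. This is an illustration only: the manual does not
describe how the runtime library makes the decision, so the integer log
positions and the function needs_full_copy are assumptions introduced for
this example.

```python
from typing import Optional


def needs_full_copy(standby_pos: Optional[int], oldest_log_pos: int) -> bool:
    """Return True if the standby data must be rebuilt from a full LUN copy.

    standby_pos    -- last log position the standby has applied, or None if
                      the standby has no local data yet (hypothetical model)
    oldest_log_pos -- oldest position still available in the master log
                      (hypothetical model)
    """
    if standby_pos is None:
        # First startup of a standby: there is no local data to update.
        return True
    # The standby needs every log entry after standby_pos. If the oldest
    # entry still in the log is later than standby_pos + 1, there is a gap
    # and the log no longer holds enough delta information.
    return oldest_log_pos > standby_pos + 1
```

For example, a standby that stopped at position 100 can catch up from a log
that still starts at position 90, but not from a log whose oldest entry is
at position 150.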
Neither the ongoing operation of the standby nor the master failover
requires the Business Copy mechanisms. The standby synchronizes its
data regularly by accessing the master log files, which therefore reside
on CVM/CFS volumes. No liveCache content data needs to be transferred
via LAN at any point in time.