Managing Serviceguard Fifteenth Edition, reprinted May 2008
Designing Highly Available Cluster Applications
Controlling the Speed of Application Failover
Appendix C
Use Raw Volumes
If your application stores data on disk, use raw volumes rather than
file systems. Raw volumes do not require a file system check (fsck),
which eliminates one of the potentially lengthy steps during a failover.
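As an illustration of what using a raw volume can mean for the application, the following minimal Python sketch stores fixed-size records at computed offsets, so no file system structure is needed to locate them. The 512-byte record size, the length-prefix layout, and the function names are assumptions made for this example, not part of Serviceguard; raw devices typically require I/O that is sized and aligned to the device block size, so check the actual requirement for your volume.

```python
import os
import struct

RECORD_SIZE = 512   # assumed block size; verify against the real device
HEADER = ">I"       # 4-byte big-endian payload length (illustrative layout)
HEADER_SIZE = struct.calcsize(HEADER)


def write_record(fd, slot, payload):
    """Write one fixed-size record at the offset implied by its slot number."""
    if len(payload) > RECORD_SIZE - HEADER_SIZE:
        raise ValueError("payload too large for one record")
    record = struct.pack(HEADER, len(payload)) + payload
    record = record.ljust(RECORD_SIZE, b"\0")   # pad to a full record
    os.pwrite(fd, record, slot * RECORD_SIZE)
    os.fsync(fd)                                # push the write to the device


def read_record(fd, slot):
    """Read the record in the given slot and strip the padding."""
    record = os.pread(fd, RECORD_SIZE, slot * RECORD_SIZE)
    (length,) = struct.unpack(HEADER, record[:HEADER_SIZE])
    return record[HEADER_SIZE:HEADER_SIZE + length]
```

The file descriptor would come from os.open() on the raw device path with os.O_RDWR; for experimentation, an ordinary file can stand in for the device.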
Evaluate the Use of JFS
If a file system must be used, JFS offers significantly faster file
system recovery than HFS, because a journaled file system can replay its
intent log instead of checking the entire file system. However, the
performance of JFS may vary with the application.
Minimize Data Loss
Minimize the amount of data that might be lost at the time of an
unplanned outage. It is impossible to prevent some data from being lost
when a failure occurs. However, it is advisable to take certain actions to
minimize the amount of data that will be lost, as explained in the
following discussion.
Minimize the Use and Amount of Memory-Based Data
Any in-memory data (the in-memory context) will be lost when a failure
occurs. The application should be designed to minimize the amount of
in-memory data that exists unless this data can be easily recalculated.
When the application restarts on the standby node, it must recalculate or
reread from disk any information it needs to have in memory.
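One way to keep the in-memory context recoverable is to journal every update to disk before applying it in memory, and to replay the journal at startup. The Python sketch below shows the idea under stated assumptions: the class name DurableStore, the JSON-lines journal format, and the journal path are all hypothetical, and the journal is assumed to live on storage the standby node can reach.

```python
import json
import os


class DurableStore:
    """Dictionary whose updates are journaled to disk before being applied."""

    def __init__(self, journal_path):
        self.journal_path = journal_path
        self.state = {}
        self._replay()                      # rebuild memory from disk
        self._journal = open(journal_path, "ab")

    def _replay(self):
        if not os.path.exists(self.journal_path):
            return
        good = 0                            # bytes of complete, valid records
        with open(self.journal_path, "rb") as f:
            for line in f:
                if not line.endswith(b"\n"):
                    break                   # record cut short by a failure
                try:
                    key, value = json.loads(line)
                except ValueError:
                    break                   # unreadable record: stop here
                self.state[key] = value
                good += len(line)
        os.truncate(self.journal_path, good)  # drop any damaged tail

    def put(self, key, value):
        line = json.dumps([key, value]).encode("utf-8") + b"\n"
        self._journal.write(line)
        self._journal.flush()
        os.fsync(self._journal.fileno())    # on disk before memory changes
        self.state[key] = value

    def close(self):
        self._journal.close()
```

Because every put() waits for the disk, this design trades throughput for a restart that only has to reread the journal; the longer the journal, the longer that reread takes.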
One way to estimate the speed of failover is to measure how long it takes
the application to start up on a normal system after a reboot. Does the
application start up immediately? Or are there a number of steps the
application must go through before an end user can connect to it?
Ideally, the application can start up quickly without having to
reinitialize in-memory data structures or tables.
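To see where startup time goes, each startup step can be timed separately. The following Python sketch is one minimal way to do that; the function name timed_startup and the step names in the usage example are hypothetical placeholders for whatever your application actually does before it accepts connections.

```python
import time


def timed_startup(steps):
    """Run (name, callable) pairs in order and time each one.

    Returns a list of (name, seconds) tuples and the total in seconds.
    """
    timings = []
    for name, step in steps:
        begin = time.perf_counter()
        step()
        timings.append((name, time.perf_counter() - begin))
    total = sum(seconds for _, seconds in timings)
    return timings, total


if __name__ == "__main__":
    # Placeholder steps; replace with the application's real startup work.
    steps = [
        ("open data files", lambda: time.sleep(0.01)),
        ("rebuild lookup tables", lambda: time.sleep(0.02)),
    ]
    timings, total = timed_startup(steps)
    for name, seconds in timings:
        print(f"{name}: {seconds:.3f} s")
    print(f"total before accepting connections: {total:.3f} s")
```

Steps that dominate the total are the first candidates for reducing or persisting in-memory state.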
Performance concerns might dictate that data be kept in memory rather
than written to the disk. However, the risk associated with the loss of
this data should be weighed against the performance impact of posting
the data to the disk.
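The trade-off can be made explicit by bounding how many records may sit in memory before they are forced to disk. In the Python sketch below, the class name BoundedLossLog and the parameter max_unflushed are assumptions for illustration: a value of 1 syncs every record (least loss, slowest), while larger values reduce disk traffic at the cost of losing up to max_unflushed - 1 recent records in a failure.

```python
import os


class BoundedLossLog:
    """Append-only log that syncs to disk after a fixed number of records."""

    def __init__(self, path, max_unflushed=10):
        if max_unflushed < 1:
            raise ValueError("max_unflushed must be at least 1")
        self._file = open(path, "ab")
        self.max_unflushed = max_unflushed
        self.unflushed = 0      # records written since the last sync

    def append(self, record):
        self._file.write(record + b"\n")
        self.unflushed += 1
        if self.unflushed >= self.max_unflushed:
            self.sync()

    def sync(self):
        self._file.flush()                  # Python buffer -> operating system
        os.fsync(self._file.fileno())       # operating system -> disk
        self.unflushed = 0

    def close(self):
        self.sync()
        self._file.close()
```

Choosing max_unflushed then becomes a direct statement of how much data the application is willing to lose in exchange for performance.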
Data that is read from a shared disk into memory and then used only as
read-only data can be kept in memory without concern, because it can be
reread from the shared disk after a failover.