Managing Serviceguard Sixteenth Edition, March 2009

A Enterprise Cluster Master Toolkit ...................................... 385
B Designing Highly Available Cluster Applications ........................ 387
  Automating Application Operation ....................................... 387
    Insulate Users from Outages .......................................... 388
    Define Application Startup and Shutdown .............................. 388
  Controlling the Speed of Application Failover .......................... 389
    Replicate Non-Data File Systems ...................................... 389
    Use Raw Volumes ...................................................... 389
    Evaluate the Use of JFS .............................................. 389
    Minimize Data Loss ................................................... 389
      Minimize the Use and Amount of Memory-Based Data ................... 390
      Keep Logs Small .................................................... 390
      Eliminate Need for Local Data ...................................... 390
    Use Restartable Transactions ......................................... 390
    Use Checkpoints ...................................................... 391
      Balance Checkpoint Frequency with Performance ...................... 391
    Design for Multiple Servers .......................................... 391
    Design for Replicated Data Sites ..................................... 392
  Designing Applications to Run on Multiple Systems ...................... 392
    Avoid Node-Specific Information ...................................... 393
      Obtain Enough IP Addresses ......................................... 393
      Allow Multiple Instances on Same System ............................ 393
    Avoid Using SPU IDs or MAC Addresses ................................. 394
    Assign Unique Names to Applications .................................. 394
      Use DNS ............................................................ 394
    Use uname(2) With Care ............................................... 395
    Bind to a Fixed Port ................................................. 395
    Bind to Relocatable IP Addresses ..................................... 395
      Call bind() before connect() ....................................... 396
    Give Each Application its Own Volume Group ........................... 396
    Use Multiple Destinations for SNA Applications ....................... 397
    Avoid File Locking ................................................... 397
  Restoring Client Connections ........................................... 397
  Handling Application Failures .......................................... 398
    Create Applications to be Failure Tolerant ........................... 399
    Be Able to Monitor Applications ...................................... 399
  Minimizing Planned Downtime ............................................ 400
    Reducing Time Needed for Application Upgrades and Patches ............ 400
      Provide for Rolling Upgrades ....................................... 400
      Do Not Change the Data Layout Between Releases ..................... 401
    Providing Online Application Reconfiguration ......................... 401
    Documenting Maintenance Operations ................................... 401