Managing Serviceguard A.11.20, March 2013
A Enterprise Cluster Master Toolkit ..............................................................346
B Designing Highly Available Cluster Applications ........................................347
Automating Application Operation ........................................................................................347
Insulate Users from Outages .............................................................................................348
Define Application Startup and Shutdown ..........................................................................348
Controlling the Speed of Application Failover ..........................................................................348
Replicate Non-Data File Systems .......................................................................................349
Use Raw Volumes ...........................................................................................................349
Evaluate the Use of JFS ....................................................................................................349
Minimize Data Loss .........................................................................................................349
Minimize the Use and Amount of Memory-Based Data ....................................................349
Keep Logs Small ........................................................................................................349
Eliminate Need for Local Data .....................................................................................349
Use Restartable Transactions .............................................................................................350
Use Checkpoints .............................................................................................................350
Balance Checkpoint Frequency with Performance ...........................................................350
Design for Multiple Servers ..............................................................................................350
Design for Replicated Data Sites .......................................................................................351
Designing Applications to Run on Multiple Systems ..................................................................351
Avoid Node-Specific Information .......................................................................................351
Obtain Enough IP Addresses .......................................................................................352
Allow Multiple Instances on Same System ......................................................................352
Avoid Using SPU IDs or MAC Addresses ............................................................................352
Assign Unique Names to Applications ...............................................................................352
Use DNS ..................................................................................................................352
Use uname(2) With Care .................................................................................................353
Bind to a Fixed Port .........................................................................................................353
Bind to Relocatable IP Addresses ......................................................................................353
Call bind() before connect() .........................................................................................354
Give Each Application its Own Volume Group ....................................................................354
Use Multiple Destinations for SNA Applications ..................................................................354
Avoid File Locking ...........................................................................................................354
Using a Relocatable Address as the Source Address for an Application that is Bound to INADDR_ANY .....................................................................................................................355
Restoring Client Connections .................................................................................................356
Handling Application Failures ...............................................................................................357
Create Applications to be Failure Tolerant ..........................................................................357
Be Able to Monitor Applications .......................................................................................357
Minimizing Planned Downtime ..............................................................................................358
Reducing Time Needed for Application Upgrades and Patches .............................................358
Provide for Rolling Upgrades .......................................................................................358
Do Not Change the Data Layout Between Releases .........................................................358
Providing Online Application Reconfiguration .....................................................................359
Documenting Maintenance Operations ..............................................................................359
C Integrating HA Applications with Serviceguard..........................................360
Checklist for Integrating HA Applications ................................................................................360
Defining Baseline Application Behavior on a Single System ..................................................360
Integrating HA Applications in Multiple Systems ..................................................................361
Testing the Cluster ...........................................................................................................362
D Software Upgrades ...............................................................................363
Special Considerations for Upgrade to Serviceguard A.11.20.....................................................363
Special Considerations for Upgrade to Serviceguard A.11.19.......................................................363
How To Tell when the Cluster Re-formation Is Complete.........................................................364