Managing HP Serviceguard A.12.00.00 for Linux, June 2014

10.8.7.1 Authorization File Problems..............................................................................280
10.8.7.2 Timeout Problems...........................................................................................280
10.8.7.3 Messages.....................................................................................................281
10.8.8 Lock LUN Messages...............................................................................................281
10.9 Troubleshooting serviceguard-xdc package.......................................................................281
A Designing Highly Available Cluster Applications .......................................283
A.1 Automating Application Operation ...................................................................................283
A.1.1 Insulate Users from Outages .....................................................................................283
A.1.2 Define Application Startup and Shutdown ..................................................................284
A.2 Controlling the Speed of Application Failover ....................................................................284
A.2.1 Replicate Non-Data File Systems ...............................................................................284
A.2.2 Evaluate the Use of a Journaled Filesystem (JFS)..........................................................285
A.2.3 Minimize Data Loss ................................................................................................285
A.2.3.1 Minimize the Use and Amount of Memory-Based Data .........................................285
A.2.3.2 Keep Logs Small .............................................................................................285
A.2.3.3 Eliminate Need for Local Data .........................................................................285
A.2.4 Use Restartable Transactions ....................................................................................285
A.2.5 Use Checkpoints ....................................................................................................286
A.2.5.1 Balance Checkpoint Frequency with Performance ................................................286
A.2.6 Design for Multiple Servers .....................................................................................286
A.2.7 Design for Replicated Data Sites ..............................................................................287
A.3 Designing Applications to Run on Multiple Systems ............................................................287
A.3.1 Avoid Node Specific Information ..............................................................................287
A.3.1.1 Obtain Enough IP Addresses .............................................................................288
A.3.1.2 Allow Multiple Instances on Same System ...........................................................288
A.3.2 Avoid Using SPU IDs or MAC Addresses ...................................................................288
A.3.3 Assign Unique Names to Applications ......................................................................288
A.3.3.1 Use DNS .......................................................................................................288
A.3.4 Use uname(2) With Care ........................................................................................289
A.3.5 Bind to a Fixed Port ................................................................................................289
A.3.6 Bind to Relocatable IP Addresses .............................................................................289
A.3.6.1 Call bind() before connect() ..............................................................................290
A.3.7 Give Each Application its Own Volume Group ...........................................................290
A.3.8 Use Multiple Destinations for SNA Applications .........................................................290
A.3.9 Avoid File Locking ..................................................................................................290
A.4 Restoring Client Connections ...........................................................................................290
A.5 Handling Application Failures .........................................................................................291
A.5.1 Create Applications to be Failure Tolerant ..................................................................291
A.5.2 Be Able to Monitor Applications ..............................................................................292
A.6 Minimizing Planned Downtime ........................................................................................292
A.6.1 Reducing Time Needed for Application Upgrades and Patches .....................................292
A.6.1.1 Provide for Rolling Upgrades .............................................................................292
A.6.1.2 Do Not Change the Data Layout Between Releases ..............................................293
A.6.2 Providing Online Application Reconfiguration ............................................................293
A.6.3 Documenting Maintenance Operations .....................................................................293
B Integrating HA Applications with Serviceguard...........................................295
B.1 Checklist for Integrating HA Applications ...........................................................................295
B.1.1 Defining Baseline Application Behavior on a Single System ...........................................296
B.1.2 Integrating HA Applications in Multiple Systems ..........................................................296
B.1.3 Testing the Cluster ...................................................................................................296