Managing Serviceguard Nineteenth Edition, Reprinted June 2011
Using the cmcheckconf Command......................................................................................318
Using the cmviewconf Command.......................................................................................319
Reviewing the LAN Configuration ......................................................................................319
Solving Problems .................................................................................................................319
Serviceguard Command Hangs.........................................................................................319
Networking and Security Configuration Errors.....................................................................320
Cluster Re-formations Caused by Temporary Conditions........................................................320
Cluster Re-formations Caused by MEMBER_TIMEOUT Being Set too Low.................................320
System Administration Errors .............................................................................................321
Package Control Script Hangs or Failures ......................................................................321
Problems with Cluster File System (CFS)...............................................................................323
Problems with VxVM Disk Groups......................................................................................323
Force Import and Deport After Node Failure...................................................................324
Package Movement Errors ................................................................................................324
Node and Network Failures .............................................................................................324
Troubleshooting the Quorum Server....................................................................................325
Authorization File Problems...........................................................................................325
Timeout Problems........................................................................................................325
Messages...................................................................................................................325
A Enterprise Cluster Master Toolkit ..............................................................326
B Designing Highly Available Cluster Applications ........................................327
Automating Application Operation ........................................................................................327
Insulate Users from Outages .............................................................................................328
Define Application Startup and Shutdown ..........................................................................328
Controlling the Speed of Application Failover ..........................................................................328
Replicate Non-Data File Systems .......................................................................................329
Use Raw Volumes ...........................................................................................................329
Evaluate the Use of JFS ....................................................................................................329
Minimize Data Loss .........................................................................................................329
Minimize the Use and Amount of Memory-Based Data ....................................................329
Keep Logs Small ........................................................................................................329
Eliminate Need for Local Data .....................................................................................329
Use Restartable Transactions .............................................................................................330
Use Checkpoints .............................................................................................................330
Balance Checkpoint Frequency with Performance ...........................................................330
Design for Multiple Servers ..............................................................................................330
Design for Replicated Data Sites .......................................................................................331
Designing Applications to Run on Multiple Systems ..................................................................331
Avoid Node-Specific Information .......................................................................................331
Obtain Enough IP Addresses .......................................................................................332
Allow Multiple Instances on Same System ......................................................................332
Avoid Using SPU IDs or MAC Addresses ............................................................................332
Assign Unique Names to Applications ...............................................................................332
Use DNS ..................................................................................................................332
Use uname(2) With Care .................................................................................................333
Bind to a Fixed Port .........................................................................................................333
Bind to Relocatable IP Addresses ......................................................................................333
Call bind() before connect() .........................................................................................334
Give Each Application its Own Volume Group ....................................................................334
Use Multiple Destinations for SNA Applications ..................................................................334
Avoid File Locking ...........................................................................................................334
Using a Relocatable Address as the Source Address for an Application that is Bound to INADDR_ANY.....................................................................................................................335
Restoring Client Connections .................................................................................................336