HP Serviceguard A.11.20 - Managing Serviceguard, Twentieth Edition, August 2011

Replacing Disks....................................................................................................................324
Replacing a Faulty Array Mechanism..................................................................................324
Replacing a Faulty Mechanism in an HA Enclosure..............................................................324
Replacing a Lock Disk.......................................................................................................325
Replacing a Lock LUN......................................................................................................325
Online Hardware Maintenance with In-line SCSI Terminator .................................................326
Replacing I/O Cards............................................................................................................326
Replacing SCSI Host Bus Adapters.....................................................................................326
Replacing LAN or Fibre Channel Cards...................................................................................327
Offline Replacement.........................................................................................................327
Online Replacement.........................................................................................................327
After Replacing the Card..................................................................................................328
Replacing a Failed Quorum Server System...............................................................................328
Troubleshooting Approaches .................................................................................................329
Reviewing Package IP Addresses .......................................................................................329
Reviewing the System Log File ...........................................................................................329
Sample System Log Entries ...........................................................................................330
Reviewing Object Manager Log Files .................................................................................330
Reviewing Serviceguard Manager Log Files ........................................................................331
Reviewing the System Multi-node Package Files....................................................................331
Reviewing Configuration Files ...........................................................................................331
Reviewing the Package Control Script ................................................................................331
Using the cmcheckconf Command......................................................................................331
Using the cmviewconf Command.......................................................................................332
Reviewing the LAN Configuration ......................................................................................332
Solving Problems .................................................................................................................332
Serviceguard Command Hangs.........................................................................................332
Networking and Security Configuration Errors.....................................................................333
Cluster Re-formations Caused by Temporary Conditions........................................................333
Cluster Re-formations Caused by MEMBER_TIMEOUT Being Set too Low.................................333
System Administration Errors .............................................................................................334
Package Control Script Hangs or Failures ......................................................................334
Problems with Cluster File System (CFS)...............................................................................336
Problems with VxVM Disk Groups......................................................................................337
Force Import and Deport After Node Failure...................................................................337
Package Movement Errors ................................................................................................337
Node and Network Failures .............................................................................................337
Troubleshooting the Quorum Server....................................................................................338
Authorization File Problems...........................................................................................338
Timeout Problems........................................................................................................338
Messages...................................................................................................................339
A Enterprise Cluster Master Toolkit ..............................................................340
B Designing Highly Available Cluster Applications ........................................341
Automating Application Operation ........................................................................................341
Insulate Users from Outages .............................................................................................342
Define Application Startup and Shutdown ..........................................................................342
Controlling the Speed of Application Failover ..........................................................................342
Replicate Non-Data File Systems .......................................................................................343
Use Raw Volumes ...........................................................................................................343
Evaluate the Use of JFS ....................................................................................................343
Minimize Data Loss .........................................................................................................343
Minimize the Use and Amount of Memory-Based Data ....................................................343
Keep Logs Small ........................................................................................................343
Eliminate Need for Local Data .....................................................................................343