Managing Serviceguard 12th Edition, March 2006
Troubleshooting Your Cluster
Replacement of I/O Cards
Chapter 8
Replacement of SCSI Host Bus Adapters
After a SCSI Host Bus Adapter (HBA) card failure, you can replace the
card using the following steps.
Normally, disconnecting any portion of the SCSI bus leaves the bus in an
unterminated state, which causes I/O errors for the other nodes connected
to that bus. In that case, the cluster must be halted before any portion
of the bus is disconnected. However, you do not need to bring the cluster
down if you are using a SCSI configuration that allows a portion of the
bus to be disconnected without losing termination.
SCSI bus configurations that use SCSI inline terminators or Y cables at
each node, or that use a SCSI device which auto-terminates its ports when
disconnected (such as the MSA30 MI), can allow online repair.
1. Halt the node. In Serviceguard Manager, select the node; from the
Actions menu, choose Administering Serviceguard -> Halt node.
Or, from the Serviceguard command line, use the cmhaltnode
command. Packages should fail over normally to other nodes.
2. Remove the SCSI cable from the card.
3. Using SAM, select the option to do an online replacement of an I/O
card.
4. Remove the defective SCSI card.
5. Install the new SCSI card. The new card must be exactly the same
card type, and it must be installed in the same slot as the card you
removed. You must set the SCSI ID of the new card to match the SCSI
ID of the card it is replacing.
6. In SAM, select the option to attach the new SCSI card.
7. Add the node back into the cluster. In Serviceguard Manager, select
the node; from the Actions menu, choose Administering Serviceguard
-> Run Node. Or, from the Serviceguard command line, issue the
cmrunnode command.
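
The command-line parts of this procedure (steps 1 and 7) can be sketched
as a small shell helper. This is an illustrative sketch, not part of the
Serviceguard product: the function names halt_node and rejoin_node, the
NODE and DRY_RUN variables, and the node name node1 are hypothetical. The
use of cmhaltnode -f (halt the node even if packages are running on it)
and of cmviewcl -v to check status are assumptions about how you might
choose to run the steps; check the man pages on your system before
relying on them. By default the helper only prints the commands.

```shell
#!/bin/sh
# Sketch of steps 1 and 7 from the command line.
# NODE and DRY_RUN are hypothetical names introduced for this example.
NODE="${NODE:-node1}"      # hypothetical node name; substitute your own
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands, 0 = execute them

run() {
    # Print the command; execute it only when DRY_RUN is 0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

halt_node() {
    # Step 1: halt the node. -f (an assumption for this sketch) halts the
    # node even if packages are running, so they can fail over.
    run cmhaltnode -f "$1"
    # Review cluster and package status before disconnecting hardware.
    run cmviewcl -v
}

rejoin_node() {
    # Step 7: once the new card is installed and attached, rejoin the cluster.
    run cmrunnode "$1"
    run cmviewcl -v
}
```

A typical session would call halt_node "$NODE", carry out steps 2
through 6 at the hardware and in SAM, and then call rejoin_node "$NODE".
Set DRY_RUN=0 only after reviewing the printed commands.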