Managing Serviceguard A.11.20, March 2013

Normally, disconnecting any portion of the SCSI bus leaves the bus in an unterminated state,
which causes I/O errors for other nodes connected to that bus, so the cluster would need to be
halted before you disconnect anything. However, you do not need to bring the cluster down if
you are using a SCSI configuration that allows a portion of the bus to be disconnected without
losing termination.
SCSI bus configurations using SCSI in-line terminators or Y cables at each node, or using a SCSI
device which auto-terminates its ports when disconnected (such as the MSA30 MI), can allow
online repair.
1. Halt the node. You can use Serviceguard Manager to do this, or use the cmhaltnode
command. Packages should fail over normally to other nodes.
2. Remove the SCSI cable from the card.
3. Remove the defective SCSI card.
4. Install the new SCSI card. The new card must be exactly the same card type, and it must be
installed in the same slot as the card you removed. You must set the SCSI ID for the new card
to be the same as the card it is replacing.
5. Reattach the SCSI cable to the new SCSI card.
6. Add the node back into the cluster. You can use Serviceguard Manager to do this, or use the
cmrunnode command.
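The software side of steps 1 and 6 can be sketched as a small set of shell functions. This is a
minimal sketch under stated assumptions, not a procedure taken from this manual: the node name
node2 is hypothetical, the -f option of cmhaltnode (halt the node even if packages are running
on it) and the use of cmviewcl -v to check the result are assumed to be available on your
release, and the DRY_RUN switch is purely illustrative. With DRY_RUN=1, the default here, the
commands are only printed.

```shell
#!/bin/sh
# Sketch only: wrap the cluster commands used in steps 1 and 6.
# DRY_RUN=1 (the default) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Step 1: halt the node; packages should fail over to other nodes.
# -f is an assumption about your release; check cmhaltnode(1m).
halt_node() {
    run cmhaltnode -f "$1"
}

# Step 6: add the node back into the cluster, then display cluster status.
rejoin_node() {
    run cmrunnode "$1"
    run cmviewcl -v
}
```

For example, halt_node node2 before step 2 and rejoin_node node2 after step 5. Printing the
commands first gives you a chance to confirm the node name before anything is halted.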
Revoking Persistent Reservations after a Failure
For information about persistent reservations (PR) and how they work, see “iSCSI Storage and
Persistent Reservations” (page 89).
Under normal circumstances, Serviceguard clears all persistent reservations when a package halts.
In the case of a package failure or a cluster failure, however, you may need to do the cleanup
yourself as part of the recovery. To do this, run the /usr/sbin/pr_cleanup script provided by
Serviceguard, specifying either a list of LUNs or volume groups. For more information, see the
pr_cleanup(1m) man page.
Examples
The following command clears all the PR registrations and reservations using the key key01 on
the set of LUNs listed in the file /tmp/pr_device_list.
pr_cleanup lun -k key01 -f /tmp/pr_device_list
/tmp/pr_device_list contains entries such as the following:
/dev/rdsk/c3t0d0
/dev/rdsk/c7t0d0
Alternatively, you can enter the device-file names on the command line:
pr_cleanup lun -k key01 /dev/rdsk/c3t0d0 /dev/rdsk/c7t0d0
The following command clears all the PR registrations and reservations using the PR key key02
on the underlying LUNs of the volume group vg01:
pr_cleanup -k key02 vg01
NOTE: Because the keyword lun is not included, the device is assumed to be a volume group.
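The examples above can be prepared with a few shell helper functions. This is a minimal sketch:
the function names write_device_list, lun_cleanup_cmd, and vg_cleanup_cmd are hypothetical and
are not part of Serviceguard. The helpers only write the device list and print the pr_cleanup
command line in the forms shown above, so that you can review it before running it during
recovery.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not Serviceguard commands.

# write_device_list FILE DEVICE...
# Write one device file name per line, as in /tmp/pr_device_list above.
write_device_list() {
    list_file=$1
    shift
    printf '%s\n' "$@" > "$list_file"
}

# lun_cleanup_cmd KEY FILE
# Print the LUN form of the command (keyword lun, list read with -f).
lun_cleanup_cmd() {
    echo "/usr/sbin/pr_cleanup lun -k $1 -f $2"
}

# vg_cleanup_cmd KEY VG
# Print the volume group form (no lun keyword, so VG is a volume group).
vg_cleanup_cmd() {
    echo "/usr/sbin/pr_cleanup -k $1 $2"
}
```

For example, write_device_list /tmp/pr_device_list /dev/rdsk/c3t0d0 /dev/rdsk/c7t0d0 followed
by lun_cleanup_cmd key01 /tmp/pr_device_list prints the first command shown above.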
Replacing LAN or Fibre Channel Cards
If a LAN or Fibre Channel card fails and has to be replaced, you can replace it online or
offline, depending on the type of hardware and operating system you are running. It is not
necessary to bring the cluster down to do this.