User guide
DataDirect Networks SFA™ OS 2.0.0 Release Notes Revision A1
• To improve failover time and to prevent I/O errors, the following settings in
multipath.conf are recommended. They can be set in the defaults section, where
they apply to all devices, or only under the entries for the SFA devices:
checker_timeout 5
dev_loss_tmo 10
fast_io_fail_tmo 5
These settings are included in the DDN multipath package version 1.5-5 and above.
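As an illustration, the recommended values could be placed in the defaults section as sketched below. This is a minimal fragment, not a complete multipath.conf: any entries already present in the defaults section should be kept, and the same three lines could instead be placed in the device entry for the SFA devices.

```
# Sketch of a defaults section carrying the recommended values.
# Merge with any existing defaults entries rather than replacing them.
defaults {
    checker_timeout   5
    dev_loss_tmo      10
    fast_io_fail_tmo  5
}
```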
• When the controller is preparing to shut down, it first puts all of its pools into
write-through mode and attempts to flush all dirty cache. On SFA platforms with
multiple RAID processors (RPs), if one RP finishes flushing its cache before the
other, that RP will not service I/O from the host until the SFA reboots. This may
cause I/O errors on the host and cause applications on the hosts to hang and
eventually time out. To work around this issue, reduce the I/O load during planned
maintenance activities, such as firmware upgrades and controller reboots, so that
the flush completes quickly.
• With RHEL6.2 and OFED 1.5.4.0 in an IB switch attached environment, it is possible
that a virtual disk on a controller may not be added back to the multipath device map
after a failover.
To find the offline device, issue the command:
lsscsi | awk -F/ '{print $NF}' | while read a; do printf "%s " $a; cat /sys/block/$a/device/state; done
To bring the device back online, issue the commands:
# echo running > /sys/block/<sd??>/device/state
# multipath -r
where <sd??> is the appropriate sd device (for example, sdaf) found with the
previous command.
A workaround for this issue is to update these packages:
o device-mapper: 1.02.74-10.el6
o device-mapper-multipath: 0.4.9-56.el6_3.1.x86_64
DDN recommends that you install these as soon as possible.
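The manual recovery steps above can be sketched as two small shell functions. This is an illustrative sketch, not a DDN-supplied tool: the function names list_not_running and set_running are hypothetical, and the optional directory argument exists only so that the functions can be exercised against a test directory instead of the real /sys/block.

```shell
#!/bin/sh
# Print "<device> <state>" for every sd device whose state is not "running".
# The optional first argument (default /sys/block) is a hypothetical override.
list_not_running() {
    sysfs_block="${1:-/sys/block}"
    for state_file in "$sysfs_block"/sd*/device/state; do
        # Skip the unexpanded glob when no sd devices exist.
        [ -r "$state_file" ] || continue
        state=$(cat "$state_file")
        if [ "$state" != "running" ]; then
            dev=${state_file%/device/state}   # strip the trailing path
            dev=${dev##*/}                    # keep only the device name
            printf '%s %s\n' "$dev" "$state"
        fi
    done
}

# Write "running" to the state file of one device, for example sdaf.
# Requires root when used against the real /sys/block.
set_running() {
    dev="$1"
    sysfs_block="${2:-/sys/block}"
    echo running > "$sysfs_block/$dev/device/state"
}
```

On an affected host, run list_not_running as root, call set_running for each device it reports, and then issue multipath -r as described above.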
• In an InfiniBand switch environment, there is a small chance that if a cable between
the switch and the controller is pulled, both the physical link and the logical link will
be lost. This has occurred with a Mellanox 6025F Switch and a Mellanox HCA.
• In an InfiniBand switch environment running RHEL 5.7, if a cable is pulled from
either an initiator or a target, a failover occurs as expected; however, once the
connection is reestablished, it does not fail back to the original controller.
o To resolve this issue, issue the command:
# udevtrigger
o The Linux man pages state that the command will simply “request kernel
devices events for coldplug”. This forces udev to send a notification for the
newly discovered path, which allows the multipath daemon to detect that the
path has returned. After the multipath discovery takes place, I/O can be
rebalanced back onto the preferred paths.
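The failback step above can be sketched as a small helper that selects the coldplug command available on the host. This is a sketch under stated assumptions: the function name coldplug_command is hypothetical, and the fallback to "udevadm trigger" is an addition for later distributions where udevtrigger is no longer a standalone command; the release note itself covers only udevtrigger on RHEL 5.7.

```shell
#!/bin/sh
# Print the command that requests coldplug events on this host.
# udevtrigger is the standalone tool named in the note above (RHEL 5.x);
# "udevadm trigger" is an assumed fallback for newer udev releases.
coldplug_command() {
    if command -v udevtrigger >/dev/null 2>&1; then
        echo "udevtrigger"
    elif command -v udevadm >/dev/null 2>&1; then
        echo "udevadm trigger"
    else
        # Neither tool was found in PATH.
        return 1
    fi
}
```

As root, the selected command can be run with cmd=$(coldplug_command) && $cmd, after which multipath -ll shows whether the returned path has been added back.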