Specifications

DATA CENTER BEST PRACTICES
SAN Design and Best Practices 21 of 84
Sources of high latencies include:
• Storage devices that are not optimized or where performance has deteriorated over time
• Distance links where the number of allocated buffers has been miscalculated or where the average frame
size of the flows traversing the links has changed over time
• Hosts where the application performance has deteriorated to the point that the host can no longer respond to
incoming frames in a sufficiently timely manner
• Incorrectly configured HBAs
• Massive oversubscription on target ports and ISLs
• Tape devices
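As a rough illustration of the distance-link item above, the sketch below estimates how many buffer credits a long link needs to stay full, and why a drop in average frame size can leave a previously adequate allocation too small. It is a back-of-the-envelope approximation, not the calculation used by Brocade FOS. The function name, the 5 microseconds per kilometer fiber delay, and the 10 bits per byte (8b/10b encoding) figure are assumptions introduced here for illustration.

```python
import math

# Assumptions for illustration only (not taken from this document):
FIBER_DELAY_US_PER_KM = 5.0   # approximate one-way propagation delay in optical fiber
WIRE_BITS_PER_BYTE = 10       # 8b/10b encoding: each byte occupies 10 bits on the wire


def estimate_credits(distance_km, line_rate_gbaud, avg_frame_bytes):
    """Estimate the buffer credits needed to keep a distance link full.

    A credit is held from the moment a frame is sent until the receiver's
    acknowledgment returns, so the sender needs roughly one credit per frame
    that fits into one round trip.
    """
    if distance_km <= 0 or line_rate_gbaud <= 0 or avg_frame_bytes <= 0:
        raise ValueError("all arguments must be positive")
    round_trip_us = 2 * distance_km * FIBER_DELAY_US_PER_KM
    # Time to serialize one frame, in microseconds (1 Gbaud = 1000 bits per microsecond).
    frame_time_us = avg_frame_bytes * WIRE_BITS_PER_BYTE / (line_rate_gbaud * 1000.0)
    return math.ceil(round_trip_us / frame_time_us)


if __name__ == "__main__":
    # Same 50 km link at 8.5 Gbaud, with full-size frames and with short frames.
    print(estimate_credits(50, 8.5, 2148))
    print(estimate_credits(50, 8.5, 512))
```

Under these assumptions, the credit requirement scales inversely with average frame size, so a link sized for full-size frames can run out of credits when the traffic mix shifts toward short frames.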
Other contributors to frame congestion include behaviors where short frames are generated in large numbers
such as:
•Clustering software that veries the integrity of attached storage
•Clustering software that uses control techniques such as SCSI RESERVE/RELEASE to serialize access to
shared le systems
•Host-based mirroring software that routinely sends SCSI control frames for mirror integrity checks
•Virtualizing environments, both workload and storage, that use in-band Fibre Channel for other
control purposes
Mitigating Congestion
Frame congestion cannot be corrected in the fabric. Devices exhibiting high latencies, whether servers or storage
arrays, must be examined and the source of poor performance eliminated. Since these are the major sources of
frame congestion, eliminating them typically addresses the vast majority of cases of frame congestion in fabrics.
Brocade has introduced a new control mechanism in an attempt to minimize the effect of some latencies in
the fabric. Edge Hold Time (EHT) is a new timeout value that can cause some blocked frames to be discarded
earlier by an ASIC in an edge switch, where the devices typically are provisioned. EHT is available in Brocade
FOS v6.3.1b and later and allows frames to be dropped after timeout intervals shorter than the 500 milliseconds
typically defined in the Fibre Channel standard. EHT accepts values from 500 all the way down to 80 milliseconds.
The EHT default setting for F_Ports is 220 milliseconds and the default EHT setting for E_Ports is 500 milliseconds.
Note that an I/O retry is required for each of the dropped frames, so this solution does not completely address
high-latency device issues.
EHT applies to all the F_Ports on a switch and all the E_Ports that share the same ASIC as F_Ports. It is a good
practice to place servers and ISLs on different ASICs since the EHT value applies to the entire ASIC, and it is
recommended that the ISL EHT stay at 500 ms.
Note: EHT applies to the switch and is activated on any ASIC that contains an F_Port. For example, if EHT is set to
250 ms and the ASIC contains both F_Ports and E_Ports, the timeout value for all the ports on that ASIC is 250 ms.
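The per-ASIC rule described above can be expressed as a small model. This is a minimal sketch of the behavior as stated in this section, not a Brocade API: the function name, the port-type strings, and the constant names are hypothetical, and the 500 ms fallback for ASICs without F_Ports is taken from the E_Port default given above.

```python
# Values taken from the text above; the names are hypothetical.
EHT_MIN_MS = 80
EHT_MAX_MS = 500
DEFAULT_F_PORT_EHT_MS = 220
DEFAULT_HOLD_TIME_MS = 500  # default for E_Ports


def effective_hold_time_ms(asic_port_types, switch_eht_ms=DEFAULT_F_PORT_EHT_MS):
    """Return the hold time that applies to every port on one ASIC.

    asic_port_types: port types present on the ASIC, e.g. ["F_Port", "E_Port"].
    switch_eht_ms:   the EHT value configured on the switch.
    """
    if not EHT_MIN_MS <= switch_eht_ms <= EHT_MAX_MS:
        raise ValueError(f"EHT must be between {EHT_MIN_MS} and {EHT_MAX_MS} ms")
    # EHT is activated on any ASIC that contains an F_Port and then applies
    # to all ports on that ASIC, including any E_Ports sharing it.
    if "F_Port" in asic_port_types:
        return switch_eht_ms
    return DEFAULT_HOLD_TIME_MS


if __name__ == "__main__":
    print(effective_hold_time_ms(["F_Port", "E_Port"], 250))  # mixed ASIC
    print(effective_hold_time_ms(["E_Port", "E_Port"], 250))  # ISL-only ASIC
```

The model makes the placement recommendation concrete: an ISL that shares an ASIC with servers inherits the shorter EHT, while an ISL on its own ASIC keeps the 500 ms hold time.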
Behaviors that frequently generate large numbers of short frames typically cannot be changed, because they are
part of the standard behavior of some fabric-based applications or products. As long as the major latencies are
controlled, fabrics tolerate this behavior well.
Monitoring
A recent Brocade FOS feature, Bottleneck Detection, was introduced to directly identify device and link latencies
and high link utilization.
Bottleneck Detection, when applied to F_Ports (devices), detects high-latency devices and provides notification on
the nature and duration of the latency. This is a huge advantage to the storage administrator, because there is
now a centralized facility that can potentially detect storage latencies while they are still intermittent.