Guidelines
When metro node performance issues arise, compare front-end performance to baseline numbers measured from the
native host to the storage array. Poor throughput might be caused by poor storage array performance. Before adding
metro node to your environment, record your application throughput so that you have this baseline.
Front-end performance in metro node depends heavily on the available back-end storage array performance and, in metro
node Metro configurations, on the WAN performance for distributed-devices.
Any running distributed rebuilds or data migrations could negatively affect available host throughput.
Because metro node Local and Metro implement write-through caching, a small amount of write latency overhead (typically
<1 ms) is expected with metro node. This latency can affect applications that serialize their I/O and do not take
advantage of multiple outstanding operations. These types of applications may see a throughput and IOPS drop with metro
node in the data path.
Understand that in a metro node Metro environment you incur extra WAN round-trip time on your write latency, since
writes must be successfully written to each cluster's storage before the write is acknowledged to the host. Again, this
extra latency can impact the throughput and IOPS of serialized applications.
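To see why serialized applications are hit hardest, consider a rough queue-depth-1 model in Python. The latency figures below are illustrative assumptions, not metro node specifications: with one outstanding write at a time, the IOPS ceiling is simply the inverse of the write latency.

# Rough model (not a metro node tool): the IOPS ceiling for an application
# that issues one outstanding write at a time is 1 / write latency.
# All latency figures below are illustrative assumptions.

def qd1_iops(write_latency_ms: float) -> float:
    """Max IOPS for a fully serialized (queue depth 1) workload."""
    return 1000.0 / write_latency_ms

native = qd1_iops(0.5)              # assumed native array write latency
local = qd1_iops(0.5 + 1.0)         # + ~1 ms metro node write-through overhead
metro = qd1_iops(0.5 + 1.0 + 5.0)   # + assumed 5 ms WAN round-trip (Metro)

print(f"native: {native:.0f} IOPS, local: {local:.0f} IOPS, metro: {metro:.0f} IOPS")
# native: 2000 IOPS, local: 667 IOPS, metro: 154 IOPS

Applications that keep many operations in flight can hide the added latency; fully serialized applications cannot.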
Corrective actions
Check CPU busy: An overly busy CPU limits the amount of throughput metro node can provide.
Check back-end latency: If the back-end latency is high on average, or shows large spikes, the cause could be a poorly
performing back-end fabric or an unhealthy, unoptimized, or overloaded storage array. Perform a back-end fabric analysis,
and a performance analysis of all storage arrays in question.
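A quick way to distinguish uniformly slow storage from intermittent spikes is to compare a typical latency value to a high percentile. The sketch below is illustrative only; the sample values are invented, and you would feed it latency samples from your own monitoring.

import statistics

# Back-end latency samples in ms (invented values for illustration).
samples_ms = [0.8, 0.9, 1.1, 0.7, 45.0, 0.9, 1.0, 38.0, 0.8, 1.2]

median = statistics.median(samples_ms)
p99 = statistics.quantiles(samples_ms, n=100)[98]   # approximate 99th percentile

print(f"median={median:.2f} ms  p99={p99:.1f} ms")
if p99 > 10 * median:
    # A healthy median with a large p99 points at spikes rather than
    # uniformly slow storage: investigate fabric and array health.
    print("Latency spikes detected")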
Check front-end aborts: The presence of these indicates that metro node is taking too long to respond to the host. This
might indicate problems with the front-end fabric, or slow SCSI reservations.
Check back-end errors: If the metro node back end must retry an operation because it was aborted, the retry adds to the
delay in completing the operation to the host.
Check front-end queue depth: If this counter is large, it may explain larger-than-normal front-end latency. Follow the
front-end operations count corrective actions.
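Little's Law ties these counters together: outstanding I/Os = IOPS × latency. If the queue depth grows while IOPS stays flat, average latency must be rising. A minimal sketch, with illustrative numbers:

# Little's Law: in-flight I/Os = throughput (IOPS) x latency (s).
# The queue depth and IOPS below are assumptions, not measured data.

def implied_latency_ms(queue_depth: float, iops: float) -> float:
    """Average latency implied by an observed queue depth and IOPS rate."""
    return queue_depth / iops * 1000.0

# 256 outstanding operations at 10,000 IOPS imply ~25.6 ms average latency:
print(f"{implied_latency_ms(256, 10_000):.1f} ms")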
Check metro node write delta time: If the time spent within metro node is more than usual, attempt to find out why. See
corrective actions for write delta time.
Verify the front-end average I/O size, and confirm that you are sending small-block I/O if you are trying to boost IOPS
performance.
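IOPS and bandwidth are linked by the average I/O size (throughput = IOPS × I/O size), so a large-block workload can saturate a port at a modest IOPS rate. An illustrative calculation, with an assumed 20,000 IOPS front-end rate:

# Throughput (MB/s) = IOPS x I/O size (MB). Illustrative numbers only.
for io_size_kb in (4, 64, 256):
    iops = 20_000                       # assumed front-end IOPS rate
    mb_per_s = iops * io_size_kb / 1024
    print(f"{io_size_kb:>3} KB blocks @ {iops} IOPS = {mb_per_s:,.0f} MB/s")
# 4 KB -> 78 MB/s; 64 KB -> 1,250 MB/s; 256 KB -> 5,000 MB/s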
Check for metro node front-end ports that are over-provisioned on bandwidth or IOPS: Be sure to balance hosts and
LUNs across the available directors and front-end ports presented from metro node. Check the front-end fabric for
saturation or over-capacity.
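A rough port-count check, assuming the nominal usable Fibre Channel rates below (about 800/1600/3200 MB/s per direction for 8/16/32 GFC) and an arbitrary 70% saturation target; adjust both for your environment:

import math

# Nominal usable FC rates (assumption): GFC speed -> ~MB/s per direction.
USABLE_MBPS = {8: 800, 16: 1600, 32: 3200}

def ports_needed(workload_mbps: float, gfc_speed: int, target_util: float = 0.7) -> int:
    """Ports required to carry a workload without exceeding the target utilization."""
    return math.ceil(workload_mbps / (USABLE_MBPS[gfc_speed] * target_util))

print(ports_needed(6000, 16))   # ~6 x 16GFC ports for 6,000 MB/s at a 70% target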
Verify front-end Fibre Channel ports, HBAs, and switch ports are configured to the proper port speeds.
For host multipathing software, configure ports based on metro node best practices, and ensure the installed software
versions are compatible with metro node (see the metro node Simple Support Matrix).
Metro node Metro
For metro node Metro configurations, check the inter-cluster link health and maximum performance capabilities. From the
metro node Management GUI, check the observed inter-cluster WAN bandwidth. If your application throughput appears low
and only reaches roughly what the WAN bandwidth reports, chances are you are limited by the WAN. Therefore:
Make sure you have provisioned enough inter-cluster bandwidth for the desired application workload. Verify that
your WAN configuration is supported by metro node (minimum supported bandwidth, supported inter-cluster latency,
compatible WAN hardware and software).
If using Fibre Channel devices over dark fibre or DWDM, confirm that you have allocated enough buffer credits and
configured the Fibre Channel WAN ports properly on your switches; a rough sizing rule is sketched after this list. Check
for buffer-credit starvation, C3 discards, and CRC errors. Some vendors may require extended fabric licenses to enable
WAN features.
Validate your WAN performance before going live in production. Create multiple test distributed-devices and force them
to rebuild. Observe the performance of the rebuilds.
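For buffer-credit sizing, a common rule of thumb (verify it against your switch vendor's guidance) is that a full ~2 KB Fibre Channel frame occupies roughly 2 km of fibre at 1 Gbps, so keeping a link full needs approximately distance_km × speed_Gbps / 2 credits. A sketch, with assumed distance and speed:

import math

# Rule of thumb only, for full-size (~2 KB) frames; confirm with your
# switch vendor. Distance and speed below are assumptions.
def min_buffer_credits(distance_km: float, speed_gbps: float) -> int:
    return math.ceil(distance_km * speed_gbps / 2)

print(min_buffer_credits(50, 8))    # ~200 credits for 50 km at 8 Gbps

Workloads with smaller average frame sizes need proportionally more credits than this full-frame estimate.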
When troubleshooting distributed-device performance, check local device performance if feasible. Export a test LUN from
your storage array through metro node to the host, and run a test I/O workload.
Check for any unexpected local or distributed rebuilds or data migrations. There will be some performance impact to
host application traffic that relies on the same virtual volumes and storage volumes. Tune the rebuild transfer-size setting
using the CLI to limit the performance impact of rebuilds and migrations; a rough planning model follows below. Consider
scheduling migrations during off-peak hours.
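The sketch below models the rebuild and host I/O as sharing one back-end bandwidth pool; the capacity and rates are assumptions, and the transfer-size setting is what lets you shift the split between rebuild speed and host headroom:

# Illustrative only: how long a rebuild takes and what it leaves for hosts,
# assuming rebuild and host I/O share one back-end bandwidth pool.
def rebuild_hours(capacity_gb: float, rebuild_mbps: float) -> float:
    return capacity_gb * 1024 / rebuild_mbps / 3600

backend_mbps = 2000   # assumed total back-end bandwidth
rebuild_mbps = 400    # assumed rebuild rate (shaped via transfer-size)
print(f"rebuild: {rebuild_hours(2048, rebuild_mbps):.1f} h, "
      f"host headroom: {backend_mbps - rebuild_mbps} MB/s")
# rebuild: 1.5 h, host headroom: 1600 MB/s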