Specifications
© IBM Copyright, 2012 Version: January 26, 2012
www.ibm.com/support/techdocs
Summary of Best Practices for Storage Area Networks
There is an additional complication with certain types of applications, such as
backup solutions. In these cases, the application vendor will also produce a matrix
of tested, supported hardware.
10.4 Separation of Production and Test Segments
A test environment is useful in many instances. A key factor for this testbed is that it
should contain at least one of the production environment’s critical devices. It is
impossible for any hardware vendor to test every combination of hardware and
software, much less extend test scenarios to include application software. It can be
very helpful to use the testbed to check new code releases against applications
known by the staff to be particularly troublesome. The testbed can be as small as a
single server with a pair of HBAs mapped to some old, slow SAN storage and a
cast-off switch. Such a setup would be adequate to perform some internal tests on a
new failover driver.
Conversely, the test environment can be as elaborate as a miniature recreation of
the whole SAN system, including a dedicated storage system mapped from each
brand and/or model of installed storage controller, similar, albeit smaller, versions of
the switches used in the production environment, and servers “beefy” enough to run
application simulations. Such a setup can be used for application testing and
development, as well as general testing. Many shops give their internal application
development teams such setups and use them as a “sandbox” for planned changes to
their production environments.
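
One simple way to check that an elaborate testbed still mirrors production is to compare the two controller inventories. The Python sketch below is illustrative only: the function name and the model strings are hypothetical placeholders, not output from any IBM tool.

```python
def missing_from_testbed(production_models, testbed_models):
    """Return production controller models with no counterpart in the testbed."""
    # Set difference ignores duplicates; sorting gives a stable report order.
    return sorted(set(production_models) - set(testbed_models))


# Hypothetical inventories; substitute the models actually installed.
production = ["vendor-a-model-1", "vendor-b-model-2", "vendor-a-model-1"]
testbed = ["vendor-a-model-1"]

for model in missing_from_testbed(production, testbed):
    print(f"No testbed storage mapped from: {model}")
```

An empty result means every installed brand and model has at least one storage system represented in the test environment.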
Why is such a test environment necessary? There are many answers to
this question; multipath software is one example. Multipath software is pivotal
to ensuring that servers maintain redundant paths to storage. Without properly
functioning multipathing, the dual physical SAN infrastructure cannot deliver the
redundancy it was built for. IBM designs and executes test plans to ensure that during a given path
failure, no errors are reported by the operating system. However, the failover code
stream may take a few moments to verify that a path is truly down, attempting
some number of IO retries before failing the IO stream over to a different path. For this
reason, it is not uncommon for applications to suffer IO timeouts
before any OS timer “fires” and issues appropriate error messages. Certainly some
application testing takes place, but it is simply impossible for IBM to test
every combination of hardware and application family.
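
The timing problem described above can be estimated before running a test. The following Python sketch assumes a simplified model in which the failover driver issues a fixed number of retries, each waiting out a fixed per-IO timeout, before switching paths. The function names, parameters, and numeric values are all hypothetical; real drivers and applications expose different tunables, so take the actual figures from the vendor documentation and confirm them on the testbed.

```python
def worst_case_failover_seconds(retries, per_io_timeout_s, overhead_s=0.0):
    """Estimate how long the driver may spend confirming a path is down.

    Simplified assumption: each retry waits the full per-IO timeout.
    """
    return retries * per_io_timeout_s + overhead_s


def application_times_out_first(app_timeout_s, retries, per_io_timeout_s,
                                overhead_s=0.0):
    """True if the application's IO timeout expires before failover completes."""
    failover_s = worst_case_failover_seconds(retries, per_io_timeout_s, overhead_s)
    return app_timeout_s < failover_s


# Hypothetical values for illustration only.
retries = 5
per_io_timeout_s = 30.0
app_timeout_s = 60.0

print(worst_case_failover_seconds(retries, per_io_timeout_s))  # 150.0
print(application_times_out_first(app_timeout_s, retries, per_io_timeout_s))  # True
```

With these assumed numbers the application would give up after 60 seconds, well before the driver finishes its 150 seconds of retries. This is the kind of mismatch a path-pull test on the testbed is meant to expose.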