
Consider the IBM AIX VIO platform as an example to explain Unix workload virtualization. (Other vendor systems, such as Oracle/Sun Solaris and HP HP-UX, behave somewhat differently.) NPIV came late to Unix, with IBM recently adopting NPIV in AIX VIO 2.1 to improve traffic through the SCSI I/O abstraction layer. The difference is illustrated in Figure 38.
[Figure: two panels comparing I/O paths. Left, "Pre-VIO 2.1": the VIO client (AIX VM) runs a generic SCSI driver, and the VIO server provides SCSI emulation in front of the FC HBAs. Right, "VIO 2.1 with NPIV": the VIO client reaches the FC HBAs directly, with the VIO server handling HBA sharing.]
Figure 38. Before and after IBM AIX VIO 2.1.
Pre-NPIV implementations of VIO, shown on the left in Figure 38, performed SCSI I/O through generic SCSI drivers in the VM (the VIO client) in an AIX Logical Partition (LPAR). The VIO server, running in another LPAR, had actual control of the Fibre Channel adapters and provided SCSI emulation to all VIO clients. With VIO 2.1 and later versions, shown on the right, the VIO client performs I/O directly via NPIV through a virtual HBA to the physical Fibre Channel HBA, and the VIO server simply controls access to the HBAs installed in the system.
The use of NPIV significantly reduces the complexity of the I/O abstraction layer. I/O is therefore less of a bottleneck, which allows for more LPARs on each AIX hypervisor platform. More LPARs (VMs, or VIO clients) mean better consolidation ratios and the potential to save capital expense on hypervisor platforms. I/O utilization per Fibre Channel HBA also increases, perhaps necessitating additional FC adapters to accommodate the increased workload. This in turn translates to higher traffic levels and more IOPS per HBA.
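To make that sizing exercise concrete, the following minimal sketch estimates how many physical HBAs a consolidated hypervisor platform might need. All of the inputs (per-LPAR IOPS and bandwidth, per-HBA limits, and the 70 percent headroom target) are invented for illustration and are not figures from this guide; substitute measured values from your own environment.

    import math

    # Rough per-HBA load estimate as LPARs are consolidated onto a single
    # hypervisor platform. All inputs are hypothetical examples.
    def hbas_needed(lpars, iops_per_lpar, mbps_per_lpar,
                    hba_iops_limit, hba_mbps_limit, headroom=0.7):
        """Number of physical HBAs needed so each adapter stays below
        `headroom` (fractional utilization) for both IOPS and bandwidth.
        The 70% default is an assumed design target, not a Brocade figure."""
        need_by_iops = lpars * iops_per_lpar / (hba_iops_limit * headroom)
        need_by_mbps = lpars * mbps_per_lpar / (hba_mbps_limit * headroom)
        return max(1, math.ceil(max(need_by_iops, need_by_mbps)))

    # Example: the same per-LPAR workload before and after consolidation
    # (all numbers invented).
    for lpars in (10, 40):
        n = hbas_needed(lpars, iops_per_lpar=2000, mbps_per_lpar=40,
                        hba_iops_limit=50000, hba_mbps_limit=800)
        print(f"{lpars} LPARs -> {n} HBA(s)")

With these sample numbers, 10 LPARs fit on a single HBA, while consolidating to 40 LPARs pushes the platform to three HBAs; the point is that HBA count must be re-derived whenever consolidation ratios change.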
As consolidation of Unix hosts progresses, expect to see much higher activity at the edge of the fabric. As a result, you will need to monitor the fabric much more carefully to avoid both traffic and frame congestion. It is also much more likely that the hypervisors themselves will become substantial bottlenecks.
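A minimal monitoring sketch along these lines is shown below. It assumes you have already gathered per-port throughput and frame-rate samples (for example, from portPerfShow output or a management interface); the port names, thresholds, and sample values here are all invented. High frame rates with small average frames are flagged separately from raw bandwidth saturation, because frame congestion can occur on links that are far from full.

    # Flag edge ports that look congested, given per-port samples.
    # Port names, thresholds, and sample data are illustrative only.
    PORT_SPEED_MBPS = 800        # assumed 8 Gb/s FC line rate (~800 MB/s)
    UTIL_LIMIT = 0.8             # assumed traffic-congestion threshold
    FRAME_RATE_LIMIT = 300_000   # assumed frames/s hinting at frame congestion
    SMALL_FRAME_BYTES = 512      # assumed "small frame" cutoff

    samples = {                  # port -> (MB/s, frames/s), invented data
        "slot1/port4":  (700, 350_000),
        "slot1/port5":  (120,  60_000),
        "slot2/port12": (150, 400_000),
    }

    for port, (mbps, fps) in samples.items():
        utilization = mbps / PORT_SPEED_MBPS
        avg_frame = mbps * 1_000_000 / fps   # bytes per frame
        if utilization > UTIL_LIMIT:
            print(f"{port}: possible traffic congestion "
                  f"({utilization:.0%} utilized)")
        elif fps > FRAME_RATE_LIMIT and avg_frame < SMALL_FRAME_BYTES:
            print(f"{port}: possible frame congestion "
                  f"({fps} frames/s, avg {avg_frame:.0f} B/frame)")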
Design Guidelines
•With the higher levels of I/O potentially occurring at each edge port in the fabric, you must ensure that there is sufficient bandwidth and sufficient paths across the fabric to accommodate the load. Consider generous use of trunked ISLs and lower subscription ratios on the ISLs, if at all possible (a worked subscription-ratio example follows this list). Remember that many flows are partially hidden due to the increased use of NPIV.
•Frame congestion is also a greater possibility. Many of the VMs may still be in clusters and may require careful configuration. Spread the LUNs across many storage ports.
•Place the hypervisors on separate directors and, certainly, keep them separate from storage ports. This allows you to very easily apply controls through Brocade Fabric Watch classes without affecting storage.
•Determine what latencies are tolerable to both hosts (VMs) and storage, and consider setting Brocade FOS thresholds accordingly.
•Port Fencing is a powerful tool, but once many applications are running in VMs on a single physical platform, take care to ensure that Port Fencing does not disable ports too quickly.
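The first guideline above refers to ISL subscription ratios; the sketch below shows the underlying arithmetic. It divides aggregate edge-port bandwidth by aggregate ISL (trunk) bandwidth to produce a ratio, then computes how many ISLs a target ratio would require. All port counts, speeds, and the target ratio are hypothetical; use your own fabric's values.

    import math

    def isl_subscription(edge_ports, edge_gbps, isls, isl_gbps):
        """Edge-to-ISL subscription ratio (e.g., 12.0 means 12:1)."""
        return (edge_ports * edge_gbps) / (isls * isl_gbps)

    def isls_for_target(edge_ports, edge_gbps, isl_gbps, target_ratio):
        """Minimum number of ISLs (trunk members) to meet a target ratio."""
        return math.ceil(edge_ports * edge_gbps / (isl_gbps * target_ratio))

    # Example (invented): 96 edge ports at 8 Gb/s, four 16 Gb/s ISLs.
    print(f"current ratio: {isl_subscription(96, 8, 4, 16):.0f}:1")
    print(f"ISLs needed for 7:1: {isls_for_target(96, 8, 16, 7)}")

In this invented example the fabric is running at 12:1, and reaching a 7:1 target means growing the trunk from four ISLs to seven; rerun the arithmetic whenever edge-port counts or speeds change.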