
Consider the IBM AIX VIO platform as an example to explain Unix workload virtualization. (Other vendor systems, such as Oracle/Sun Solaris and HP HP-UX, behave somewhat differently.) NPIV came late to Unix, with IBM recently adopting NPIV in AIX VIO 2.1 to improve traffic through the SCSI I/O abstraction layer. The difference is illustrated in Figure 38.
[Figure: two panels comparing I/O paths. Left, "Pre-VIO 2.1": the VIO client (AIX VM) runs a generic SCSI driver, and the VIO server provides SCSI emulation in front of the FC HBAs. Right, "VIO 2.1 with NPIV": the VIO client reaches the FC HBAs directly, with the VIO server handling HBA sharing.]
Figure 38. Before and after IBM AIX VIO 2.1.
Pre-NPIV implementations of VIO, shown on the left in Figure 38, performed SCSI I/O through generic SCSI drivers in the VM (the VIO client) in an AIX Logical Partition (LPAR). The VIO server, running in another LPAR, had actual control of the Fibre Channel adapters and provided SCSI emulation to all VIO clients. With VIO 2.1 and later versions, shown on the right, the VIO client performs I/O directly via NPIV through a virtual HBA to the physical Fibre Channel HBA, and the VIO server simply controls access to the HBAs installed in the system.
The use of NPIV significantly reduces the complexity of the I/O abstraction layer. I/O is therefore less of a bottleneck, which allows for more LPARs on each AIX hypervisor platform. More LPARs (VMs, or VIO clients) mean better consolidation ratios and the potential to save capital expense on hypervisor platforms. I/O utilization per Fibre Channel HBA also increases, perhaps necessitating additional FC adapters to accommodate the increased workload. This in turn translates to higher traffic levels and more IOPS per HBA.
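To make that sizing exercise concrete, the following minimal sketch estimates how many physical HBAs a consolidated hypervisor platform might need. All of the inputs (per-LPAR IOPS and bandwidth, per-HBA limits, and the 70 percent headroom target) are invented for illustration and are not figures from this guide; substitute measured values from your own environment.

    import math

    # Rough per-HBA load estimate as LPARs are consolidated onto a single
    # hypervisor platform. All inputs are hypothetical examples.
    def hbas_needed(lpars, iops_per_lpar, mbps_per_lpar,
                    hba_iops_limit, hba_mbps_limit, headroom=0.7):
        """Number of physical HBAs needed so each adapter stays below
        `headroom` (fractional utilization) for both IOPS and bandwidth.
        The 70% default is an assumed design target, not a Brocade figure."""
        need_by_iops = lpars * iops_per_lpar / (hba_iops_limit * headroom)
        need_by_mbps = lpars * mbps_per_lpar / (hba_mbps_limit * headroom)
        return max(1, math.ceil(max(need_by_iops, need_by_mbps)))

    # Example: the same per-LPAR workload before and after consolidation
    # (all numbers invented).
    for lpars in (10, 40):
        n = hbas_needed(lpars, iops_per_lpar=2000, mbps_per_lpar=40,
                        hba_iops_limit=50000, hba_mbps_limit=800)
        print(f"{lpars} LPARs -> {n} HBA(s)")

With these sample numbers, 10 LPARs fit on a single HBA, while consolidating to 40 LPARs pushes the platform to three HBAs; the point is that HBA count must be re-derived whenever consolidation ratios change.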
As consolidation of Unix hosts progresses, expect to see much higher activity at the edge of the fabric. As a result, you will need to monitor the fabric much more carefully to avoid both traffic and frame congestion. It is also much more likely that the hypervisors themselves will become substantial bottlenecks.
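A minimal monitoring sketch along these lines is shown below. It assumes you have already gathered per-port throughput and frame-rate samples (for example, from portPerfShow output or a management interface); the port names, thresholds, and sample values here are all invented. High frame rates with small average frames are flagged separately from raw bandwidth saturation, because frame congestion can occur on links that are far from full.

    # Flag edge ports that look congested, given per-port samples.
    # Port names, thresholds, and sample data are illustrative only.
    PORT_SPEED_MBPS = 800        # assumed 8 Gb/s FC line rate (~800 MB/s)
    UTIL_LIMIT = 0.8             # assumed traffic-congestion threshold
    FRAME_RATE_LIMIT = 300_000   # assumed frames/s hinting at frame congestion
    SMALL_FRAME_BYTES = 512      # assumed "small frame" cutoff

    samples = {                  # port -> (MB/s, frames/s), invented data
        "slot1/port4":  (700, 350_000),
        "slot1/port5":  (120,  60_000),
        "slot2/port12": (150, 400_000),
    }

    for port, (mbps, fps) in samples.items():
        utilization = mbps / PORT_SPEED_MBPS
        avg_frame = mbps * 1_000_000 / fps   # bytes per frame
        if utilization > UTIL_LIMIT:
            print(f"{port}: possible traffic congestion "
                  f"({utilization:.0%} utilized)")
        elif fps > FRAME_RATE_LIMIT and avg_frame < SMALL_FRAME_BYTES:
            print(f"{port}: possible frame congestion "
                  f"({fps} frames/s, avg {avg_frame:.0f} B/frame)")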
Design Guidelines
•With the higher levels of I/O potentially occurring at each edge port in the fabric, you must ensure that there is sufficient bandwidth and sufficient paths across the fabric to accommodate the load. Consider generous use of trunked ISLs and lower subscription ratios on the ISLs, if at all possible (a worked subscription-ratio example follows this list). Remember that many flows are partially hidden due to the increased use of NPIV.
•Frame congestion is also a greater possibility. Many of the VMs may still be in clusters and may require careful configuration. Spread the LUNs across many storage ports.
•Place the hypervisors on separate directors and, certainly, keep them separate from storage ports. This allows you to very easily apply controls through Brocade Fabric Watch classes without affecting storage.
•Determine what latencies are tolerable to both hosts (VMs) and storage, and consider setting Brocade FOS thresholds accordingly.
•Port Fencing is a powerful tool, but once many applications are running in VMs on a single physical platform, take care to ensure that Port Fencing does not disable ports too quickly.
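The first guideline above refers to ISL subscription ratios; the sketch below shows the underlying arithmetic. It divides aggregate edge-port bandwidth by aggregate ISL (trunk) bandwidth to produce a ratio, then computes how many ISLs a target ratio would require. All port counts, speeds, and the target ratio are hypothetical; use your own fabric's values.

    import math

    def isl_subscription(edge_ports, edge_gbps, isls, isl_gbps):
        """Edge-to-ISL subscription ratio (e.g., 12.0 means 12:1)."""
        return (edge_ports * edge_gbps) / (isls * isl_gbps)

    def isls_for_target(edge_ports, edge_gbps, isl_gbps, target_ratio):
        """Minimum number of ISLs (trunk members) to meet a target ratio."""
        return math.ceil(edge_ports * edge_gbps / (isl_gbps * target_ratio))

    # Example (invented): 96 edge ports at 8 Gb/s, four 16 Gb/s ISLs.
    print(f"current ratio: {isl_subscription(96, 8, 4, 16):.0f}:1")
    print(f"ISLs needed for 7:1: {isls_for_target(96, 8, 16, 7)}")

In this invented example the fabric is running at 12:1, and reaching a 7:1 target means growing the trunk from four ISLs to seven; rerun the arithmetic whenever edge-port counts or speeds change.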