TRANSPORT_NAME[1]=tcp
NDD_NAME[1]=tcp_xmit_hiwater_def
NDD_VALUE[1]=262144
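For context, the nddconf excerpt above sets the HP-UX default TCP transmit high-water mark (the send buffer size) to 256 KB. The receive-side tunable, tcp_recv_hiwater_def, is typically raised in the same way; the entry below is only an illustration of the nddconf format, and its index and value are assumptions rather than part of the original configuration:
TRANSPORT_NAME[2]=tcp
NDD_NAME[2]=tcp_recv_hiwater_def
NDD_VALUE[2]=262144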
In addition to HP-UX clients, this engagement involved many Linux NFS clients, so it was also necessary to tune the TCP send and receive windows on those systems. Linux clients use the sysctl(8) utility to tune kernel parameters. The parameters that control the TCP receive and send buffer sizes are net.ipv4.tcp_rmem and net.ipv4.tcp_wmem, respectively. Each parameter takes three values: the minimum, default, and maximum buffer size.
The following lines were added to the /etc/sysctl.conf file on the Linux NFS clients:
net.ipv4.tcp_rmem = 4096 1048576 4194304
net.ipv4.tcp_wmem = 4096 1048576 4194304
These values set the default TCP send and receive buffer sizes on the Linux clients to 1 MB, with a maximum of 4 MB. The increased values allow the Linux clients to negotiate larger TCP window sizes with the HP-UX NFS server.
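These entries take effect automatically when /etc/sysctl.conf is processed at boot, but the same tuning can also be applied to and verified on a running client. The commands below are standard Linux administration steps rather than part of the original procedure:
sysctl -p
sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem
The first command reloads /etc/sysctl.conf, and the second prints the current values of both parameters so the change can be confirmed. Note that TCP connections established before the change may not pick up the new default sizes.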
Hardware Configuration Considerations
Thus far this paper has discussed the importance of optimally configuring the various software components of the system (e.g. operating system, filesystems, networking). To achieve maximum throughput from any system, it is equally important to configure the hardware properly. In a server such as this one, where performance is heavily tied to network throughput, special care should be taken to ensure the network interfaces are configured for maximum efficiency.
Tip: The Importance of CPU and Cell Locality
Some of the biggest throughput gains achieved during this engagement
were the result of increasing the CPU and Cell locality of the NFS service
threads. This was achieved by allowing nfsktcpd threads to bind to specific
CPUs and then configuring those CPUs to service the interrupts of the
network interface cards where the NFS/TCP requests arrive. That way a
single CPU is involved with processing the inbound Ethernet frame,
servicing the NFS request, and sending the NFS reply back to the client.
By doing this we greatly reduce the amount of CPU cache-to-cache memory
traffic as well as cross-cell memory transfers.
The concepts involved with implementing these CPU/Cell locality
improvements will be explained in further detail in the following sections.
Optimally Assign Ethernet Card Interrupt Handling Duties
By default, interrupts for network interface cards are assigned to CPUs at system boot time in a
somewhat random fashion. These CPU/interrupt assignments may be displayed and modified via the
intctl(1M) command. Figure 2 shows an example of intctl output listing the current CPU
interrupt assignments for the installed Ethernet adapters.
In this example the server has eight CPUs (numbered 0 through 7), and they are randomly assigned to service the interrupts of the eight “iether” Gigabit Ethernet interfaces. The example also shows that the CPUs in a given CPU cell are not necessarily assigned to service the NICs in the correspondingly numbered card cell. Ideally, the CPUs in a given CPU cell should service the interrupts of the network interfaces in the corresponding card cell. This remapping is done with the intctl command.
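As a rough sketch of that step, the current assignments can be listed and an individual interrupt migrated with commands along the following lines. The hardware path and CPU number are purely illustrative, and the exact option syntax should be confirmed against the intctl(1M) manpage for the installed HP-UX release:
intctl
intctl -M -H 0/2/1/0 -c 2
Here -M requests a migration, -H identifies the hardware path of the interface whose interrupt is being moved, and -c names the destination CPU.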