TRANSPORT_NAME[1]=tcp
NDD_NAME[1]=tcp_xmit_hiwater_def
NDD_VALUE[1]=262144
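For context, the nddconf excerpt above sets the HP-UX default TCP transmit high-water mark (the send buffer size) to 256 KB. The receive-side tunable, tcp_recv_hiwater_def, is typically raised in the same way; the entry below is only an illustration of the nddconf format, and its index and value are assumptions rather than part of the original configuration:
TRANSPORT_NAME[2]=tcp
NDD_NAME[2]=tcp_recv_hiwater_def
NDD_VALUE[2]=262144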
In addition to HP-UX clients, this engagement involved many Linux NFS clients, so it was also necessary to tune the TCP send and receive windows on those systems. Linux clients use the sysctl(8) utility to tune kernel parameters. The parameters that control the TCP receive and send buffer sizes are net.ipv4.tcp_rmem and net.ipv4.tcp_wmem, respectively. Each parameter takes three values: the minimum, default, and maximum buffer size.
The following lines were added to the /etc/sysctl.conf file on the Linux NFS clients:
net.ipv4.tcp_rmem = 4096 1048576 4194304
net.ipv4.tcp_wmem = 4096 1048576 4194304
These values set the default TCP send and receive buffer sizes on the Linux clients to 1 MB, with a maximum of 4 MB. The increased values allow the Linux clients to negotiate larger TCP window sizes with the HP-UX NFS server.
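These entries take effect automatically when /etc/sysctl.conf is processed at boot, but the same tuning can also be applied to and verified on a running client. The commands below are standard Linux administration steps rather than part of the original procedure:
sysctl -p
sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem
The first command reloads /etc/sysctl.conf, and the second prints the current values of both parameters so the change can be confirmed. Note that TCP connections established before the change may not pick up the new default sizes.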
Hardware Configuration Considerations
Thus far this paper has discussed the importance of optimally configuring the various software components of the system (e.g. operating system, filesystems, networking). To achieve maximum throughput from any system, it is equally important to configure the hardware properly. In a server such as this one, where performance is heavily tied to network throughput, special care should be taken to ensure the network interfaces are configured for maximum efficiency.
Tip: The Importance of CPU and Cell Locality
Some of the biggest throughput gains achieved during this engagement
were the result of increasing the CPU and Cell locality of the NFS service
threads. This was achieved by allowing nfsktcpd threads to bind to specific
CPUs and then configuring those CPUs to service the interrupts of the
network interface cards where the NFS/TCP requests arrive. That way a
single CPU is involved with processing the inbound Ethernet frame,
servicing the NFS request, and sending the NFS reply back to the client.
By doing this we greatly reduce the amount of CPU cache-to-cache memory
traffic as well as cross-cell memory transfers.
The concepts involved with implementing these CPU/Cell locality
improvements will be explained in further detail in the following sections.
Optimally Assign Ethernet Card Interrupt Handling Duties
By default, interrupts for network interface cards are assigned to CPUs at system boot time in a
somewhat random fashion. These CPU/interrupt assignments may be displayed and modified via the
intctl(1M) command. Figure 2 shows an example of intctl output listing the current CPU
interrupt assignments for the installed Ethernet adapters.
In this example the server has eight CPUs (numbered 0 through 7), and they are randomly assigned to service the interrupts of the eight “iether” Gigabit Ethernet interfaces. The example also shows that the CPUs in a given CPU cell are not necessarily assigned to service the NICs in the correspondingly numbered card cell. Ideally, the CPUs in a given CPU cell should service the interrupts of the network interfaces in the corresponding card cell. This remapping is done with the intctl command.
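As a rough sketch of that step, the current assignments can be listed and an individual interrupt migrated with commands along the following lines. The hardware path and CPU number are purely illustrative, and the exact option syntax should be confirmed against the intctl(1M) manpage for the installed HP-UX release:
intctl
intctl -M -H 0/2/1/0 -c 2
Here -M requests a migration, -H identifies the hardware path of the interface whose interrupt is being moved, and -c names the destination CPU.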