Load Balancing the Receive Activity Among CPUs
The adapter can be configured to use a maximum of eight receive (Rx) queues. Incoming
traffic is spread across all configured Rx queues.
Multiple Rx queues are available when the driver is loaded in MSI-X mode, which is the
default. In this mode, the number of queues must be specified after the driver is loaded
but before the port is configured.
The cxgbtool(8) utility provides this capability, where <intf> is the interface name
(for example, eth0):
cxgbtool <intf> qsets 8
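Once the port is configured, you can verify that the per-queue interrupt vectors are
present by inspecting /proc/interrupts, using the same pipeline the affinity script
below relies on. One line per configured Rx queue is expected; the exact label of the
queue interrupts depends on the driver version:
grep <intf> /proc/interrupts | grep queue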
Each queue must then be associated with a CPU through interrupt affinity. The following
sample shell script performs this operation:
# Collect the IRQ numbers of the per-queue interrupts for the interface.
irqs=($(cat /proc/interrupts | grep <intf> | \
        grep queue | awk '{ split($0,a,":"); print a[1] }'))
cpumask=1
for (( c=0; c < ${#irqs[@]}; c++ ));
do
    # smp_affinity expects a hexadecimal CPU bitmask.
    printf '%x' $cpumask > /proc/irq/${irqs[$c]}/smp_affinity
    cpumask=`expr $cpumask \* 2`
done
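After running the script, you can read back the affinity mask of each queue interrupt to
confirm the assignments. This is a minimal check that assumes the irqs array populated by
the script above is still in scope:
for (( c=0; c < ${#irqs[@]}; c++ ));
do
    # Print each queue IRQ and the CPU bitmask it is bound to.
    echo "IRQ ${irqs[$c]}: $(cat /proc/irq/${irqs[$c]}/smp_affinity)"
done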
Once the port is configured, receive traffic will be balanced among the CPUs associated
with the queues.
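To observe the balancing in action, watch the per-CPU interrupt counts while traffic is
flowing; each queue's counter should increase on its assigned CPU. The one-second
interval below is arbitrary:
watch -n 1 "grep <intf> /proc/interrupts"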
Latency/Throughput Tuning
The adapter is tuned by default for good latency, with the interrupt holdoff timer set to 5
usecs. This setting can result in a high interrupt load. If latency is not the primary goal,
you might want to increase the timer.
Currently, ethtool(8) is not well suited to hardware that supports multiple
receive queues. The cxgbtool(8) utility provides the facility to control the interrupt
holdoff timer on a per-receive-queue basis.
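Refer to the cxgbtool(8) man page for the exact per-queue syntax. For comparison only, the
generic ethtool(8) coalescing interface shown below applies a single holdoff value to the
device as a whole rather than to individual queues; the 50 usec value is only an
illustration, and support depends on the driver:
ethtool -C <intf> rx-usecs 50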