Maximizing File Transfer Performance Using 10Gb Ethernet and Virtualization

Figure 8 indicates that the throughput
in a VM is lower across all cases when

virtualized case, the throughput for the



scp and rsync running over standard ssh,
the throughput ranges from 300 Mbps to


Using the HPN-SSH layer to replace the

Also, disabling cryptography increases
the throughput, but not at as high a level


The same limitations that occurred in the
native case (such as standard tools not
being well threaded and cryptography
adding to the overhead) also apply in this

tools cannot take full advantage of

Most of the tools and utilities—including
ssh and scp—are single threaded; the

HPN-SSH layer to replace the OpenSSH

HPN-SSH, the cryptography operations are
multi-threaded (four threads), which boosts

threaded MAC layer, however, still creates

cryptography disabled, the performance


case with bbcp, which is multi-threaded
(using four threads by default), but the bulk

The next test uses eight parallel streams,
attempting to work around the threading

shows the receive network throughput

transfer tools when running eight parallel


Figure 8. ESX* 4.0 GA (1 VM with 8 vCPU): Various File Copy Tools (1 stream). Chart of receive throughput (Mbps, axis 0 to 10,000) and average CPU utilization (%, axis 0 to 100), native versus virtualized, for netperf, scp (SSH), rsync (SSH), scp (HPN-SSH), rsync (HPN-SSH), scp (HPN-SSH + No Crypto), rsync (HPN-SSH + No Crypto), and bbcp.
Figure 9. ESX* 4.0 GA (1 VM with 8 vCPU): Various File Copy Tools (8 streams). Chart of receive throughput (Mbps, axis 0 to 10,000) and average CPU utilization (%, axis 0 to 100), native versus virtualized, for netperf, scp (SSH), rsync (SSH), scp (HPN-SSH), rsync (HPN-SSH), scp (HPN-SSH + No Crypto), rsync (HPN-SSH + No Crypto), and bbcp.
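
The eight-stream approach described above can be approximated by splitting the file set into groups and running one copy process per group, so that each single-threaded tool instance gets its own core. The following is a minimal sketch, not the test team's actual harness: the function names, the file paths, and the destination `user@receiver:/data/` are hypothetical, and the round-robin split is an assumption about how the work might be divided.

```python
import shlex
import subprocess
from concurrent.futures import ThreadPoolExecutor


def partition(files, streams):
    """Deal files round-robin into `streams` groups, dropping empty groups."""
    groups = [files[i::streams] for i in range(streams)]
    return [g for g in groups if g]


def build_command(tool, group, destination):
    """Build one copy command: tool and its options, sources, then destination."""
    return shlex.split(tool) + list(group) + [destination]


def run_parallel(tool, files, destination, streams=8, dry_run=True):
    """Run one copy process per group.

    With dry_run=True the commands are returned without being executed;
    otherwise the exit code of each process is returned.
    """
    commands = [build_command(tool, g, destination)
                for g in partition(files, streams)]
    if dry_run:
        return commands
    with ThreadPoolExecutor(max_workers=streams) as pool:
        return list(pool.map(lambda c: subprocess.run(c).returncode, commands))


if __name__ == "__main__":
    # Hypothetical source files and destination, for illustration only.
    files = [f"/data/file{i:02d}.bin" for i in range(16)]
    for cmd in run_parallel("scp", files, "user@receiver:/data/", streams=8):
        print(" ".join(cmd))
```

By default the sketch only prints the eight commands it would run; passing `dry_run=False` starts them concurrently. The same pattern applies to any of the copy tools that accept a list of sources followed by a destination.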



that cryptography operations are a limiter


the last three cases, in which the copies

Even though these results are better
than those from the one-stream test,

rate (approximately 10 Gbps) achieved in

one VM with eight vCPUs, the test team
determined that this might be a good case
for using direct assignment of the 10G NIC
