Maximizing File Transfer Performance Using 10Gb Ethernet and Virtualization
Using the HPN-SSH layer to replace the
OpenSSH layer drives the throughput up to
threads for the cryptographic processing
and more than doubles application layer
If improving the bulk cryptographic
processing helps that much, what could
Without bulk cryptography, scp achieves
The performance gains in these cases,
however, rely on bypassing encryption
over the default OpenSSH layer that is
provided in most Linux distributions; this
According to a presentation created by
SLAC titled “P2P Data Copy Program
bbcp” (http://www.slac.stanford.edu/grp/scs/paper/chep01-7-018.pdf), bbcp
encrypts sensitive passwords and control
information but does not encrypt the
without encrypting the bulk data, this can
still be an effective trade-off for many
environments where data privacy is not
the best results so far, surpassing even
the HPN-SSH cases with no cryptographic
better than the initial test results with
evaluating the use of HPN-SSH and bbcp
of the techniques tried so far, however,
has even come close to achieving 10
Gbps of throughput—not even reaching
[Figure: Native Linux*: Various File Copy Tools (8 streams vs. 1 stream). Tools compared: netperf, scp (SSH), rsync (SSH), scp (HPN-SSH), rsync (HPN-SSH), scp (HPN-SSH + No Crypto), rsync (HPN-SSH + No Crypto), bbcp. Axes: Avg. CPU (%Util), 0 to 100; Receive Throughput (Mbps), 0 to 10000. Series: Receive Throughput, 1 stream and 8 streams; Avg. CPU (%Util), 1 stream and 8 streams.]
and FedEx engineering team focused on
identifying any bottlenecks preventing
of the tools to enhance the performance
through multi-threading, the team also
wanted to know if any other available
To gain a better understanding of the
performance issues, the engineering
streams to attempt to drive up aggregate
obtain better utilization of the platform
show that using eight parallel streams
overcomes the threading limits of these
tools and drives aggregate bulk encrypted
These results demonstrate that using
more parallelism dramatically scales up
Figure 6.
but the testing did not demonstrate eight
times the throughput when using eight
problem does not lie in simply using the
brute-force approach and running more
These results also show that bulk
encryption is expensive, in terms of
HPN-SSH, with its multi-threaded
cryptography, still provides a significant
benefit, but not as dramatic a benefit as
The results associated with the remaining
bulk cryptography, and the third case is
the eight-thread bbcp result in which bulk
results demonstrate that it is possible to
achieve nearly the same 10 Gbps line rate
throughput number as the netperf micro-
As this testing indicates, using multiple
parallel streams and disabling bulk
cryptographic processing is effective