White Papers
Ready Solutions Engineering Test Results
Copyright © 2017 Dell Inc. or its subsidiaries. All Rights Reserved. Dell, EMC, and other trademarks are trademarks of Dell Inc. or its subsidiaries. Other trademarks may be
the property of their respective owners. Published in the USA. Dell EMC believes the information in this document is accurate as of its publication date. The information is
subject to change without notice.
Figure 1: Performance comparison with and without Singularity
Singularity at Scale
We ran HPL across multiple nodes and compared the performance with and without the Singularity container. All nodes are Dell PowerEdge C4130
configuration G servers with four P100-PCIe GPUs, connected via Mellanox EDR InfiniBand. The comparison is shown in
Figure 2. The performance difference is within ±0.5%, which falls inside the normal run-to-run variation of HPL, since HPL
performance differs slightly in each run. This indicates that MPI applications such as HPL can run at scale under Singularity without
performance loss.
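The performance difference plotted in Figure 2 is the relative change of the Singularity result against the bare-metal result. A minimal sketch of that calculation, using the TFLOPS values as printed in the figure (rounded to one decimal, so the small negative 2-node difference vanishes at this precision):

```python
# HPL results from Figure 2, rounded to one decimal as printed in the chart.
# Keys are node counts; values are (bare-metal TFLOPS, Singularity TFLOPS).
results = {1: (15.9, 15.9), 2: (28.4, 28.4), 3: (43.5, 43.7), 4: (58.4, 58.5)}

def percent_diff(bare: float, container: float) -> float:
    """Relative difference of the containerized run versus bare metal, in percent."""
    return (container - bare) / bare * 100.0

diffs = {nodes: percent_diff(b, s) for nodes, (b, s) in results.items()}
for nodes, d in sorted(diffs.items()):
    print(f"{nodes} node(s): {d:+.1f}%")

# Every difference is within the +/- 0.5% run-to-run noise of HPL.
assert all(abs(d) <= 0.5 for d in diffs.values())
```

At this rounding the differences come out as 0.0%, 0.0%, +0.5%, and +0.2%; the -0.2% shown for 2 nodes in the figure reflects the unrounded measurements.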
Figure 2: HPL performance on multiple nodes
Conclusions
[Figure 1 data: Singularity over bare-metal performance ratio for Inception-V3 with the ILSVRC2012 dataset, measured for NV-Caffe2, MXNet, and TensorFlow on 1, 2, and 4 P100 GPUs; all ratios fall between 0.96 and 1.01.]
[Figure 2 data: HPL on Multiple Nodes]
Nodes   Bare-metal (TFLOPS)   Singularity (TFLOPS)   Perf diff
1       15.9                  15.9                    0.0%
2       28.4                  28.4                   -0.2%
3       43.5                  43.7                    0.5%
4       58.4                  58.5                    0.2%
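The Figure 2 numbers also support the "at scale" claim: the containerized runs scale nearly linearly with node count. A quick check, using the Singularity TFLOPS values from Figure 2 and defining parallel efficiency as the measured TFLOPS at N nodes over N times the single-node TFLOPS:

```python
# Containerized HPL TFLOPS from Figure 2, indexed by node count.
singularity_tflops = {1: 15.9, 2: 28.4, 3: 43.7, 4: 58.5}

def scaling_efficiency(nodes: int) -> float:
    """Parallel efficiency relative to perfect linear scaling from one node."""
    return singularity_tflops[nodes] / (nodes * singularity_tflops[1])

for n in sorted(singularity_tflops):
    print(f"{n} node(s): {scaling_efficiency(n):.1%}")

# Efficiency stays above 89% out to four nodes.
assert all(scaling_efficiency(n) > 0.89 for n in singularity_tflops)
```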