White Papers
Ready Solutions Engineering Test Results
Copyright © 2017 Dell Inc. or its subsidiaries. All Rights Reserved. Dell, EMC, and other trademarks are trademarks of Dell Inc. or its subsidiaries. Other trademarks may be
the property of their respective owners. Published in the USA. Dell EMC believes the information in this document is accurate as of its publication date. The information is
subject to change without notice.
Figure 1: Performance comparison with and without Singularity
Singularity at Scale
We ran HPL across multiple nodes and compared the performance with and without the Singularity container. All nodes are Dell PowerEdge C4130
configuration G servers with four P100-PCIe GPUs, connected via Mellanox EDR InfiniBand. The comparison is shown in
Figure 2. The performance difference is within ±0.5%, which falls inside the normal run-to-run variation of HPL, since HPL
performance differs slightly in each run. This indicates that MPI applications such as HPL can run at scale under Singularity without
performance loss.
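The performance difference plotted in Figure 2 is the relative change of the Singularity result against the bare-metal result. A minimal sketch of that calculation, using the TFLOPS values as printed in the figure (rounded to one decimal, so the small negative 2-node difference vanishes at this precision):

```python
# HPL results from Figure 2, rounded to one decimal as printed in the chart.
# Keys are node counts; values are (bare-metal TFLOPS, Singularity TFLOPS).
results = {1: (15.9, 15.9), 2: (28.4, 28.4), 3: (43.5, 43.7), 4: (58.4, 58.5)}

def percent_diff(bare: float, container: float) -> float:
    """Relative difference of the containerized run versus bare metal, in percent."""
    return (container - bare) / bare * 100.0

diffs = {nodes: percent_diff(b, s) for nodes, (b, s) in results.items()}
for nodes, d in sorted(diffs.items()):
    print(f"{nodes} node(s): {d:+.1f}%")

# Every difference is within the +/- 0.5% run-to-run noise of HPL.
assert all(abs(d) <= 0.5 for d in diffs.values())
```

At this rounding the differences come out as 0.0%, 0.0%, +0.5%, and +0.2%; the -0.2% shown for 2 nodes in the figure reflects the unrounded measurements.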
Figure 2: HPL performance on multiple nodes
Conclusions
[Figure 1 data: Singularity over bare-metal performance ratio for Inception-V3 with the ILSVRC2012 dataset, measured for NV-Caffe2, MXNet, and TensorFlow on 1, 2, and 4 P100 GPUs; all ratios fall between 0.96 and 1.01.]
[Figure 2 data: HPL on Multiple Nodes]
Nodes   Bare-metal (TFLOPS)   Singularity (TFLOPS)   Perf diff
1       15.9                  15.9                    0.0%
2       28.4                  28.4                   -0.2%
3       43.5                  43.7                    0.5%
4       58.4                  58.5                    0.2%
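The Figure 2 numbers also support the "at scale" claim: the containerized runs scale nearly linearly with node count. A quick check, using the Singularity TFLOPS values from Figure 2 and defining parallel efficiency as the measured TFLOPS at N nodes over N times the single-node TFLOPS:

```python
# Containerized HPL TFLOPS from Figure 2, indexed by node count.
singularity_tflops = {1: 15.9, 2: 28.4, 3: 43.7, 4: 58.5}

def scaling_efficiency(nodes: int) -> float:
    """Parallel efficiency relative to perfect linear scaling from one node."""
    return singularity_tflops[nodes] / (nodes * singularity_tflops[1])

for n in sorted(singularity_tflops):
    print(f"{n} node(s): {scaling_efficiency(n):.1%}")

# Efficiency stays above 89% out to four nodes.
assert all(scaling_efficiency(n) > 0.89 for n in singularity_tflops)
```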