White Papers

Deep Learning Performance: Scale-up vs Scale-out
Architectures & Technologies Dell EMC | Infrastructure Solutions Group
Figure 31: Training with PowerEdge C4140-K-V100-16&32GB-SXM2 (8 GPUs) multi-node versus non-Dell EMC SN_8x-V100-16GB-SXM2
Neural model    SN_8X V100-16GB-SXM2    MN PowerEdge C4140-K-V100-SXM2 (16GB & 32GB), Intel Xeon 4116    % Diff
------------    --------------------    -------------------------------------------------------------    ------
Inception-v4    1606                    1625                                                             -1.21%
VGG-19          2449                    2406                                                              1.78%
VGG-16          2762                    2820                                                             -2.03%
Inception-v3    3077                    2845                                                              8.16%
ResNet-50       4852                    4500                                                              7.81%
GoogLeNet       7894                    8754                                                             -9.82%
AlexNet         16977                   12145                                                            39.79%

Table 5: 8x GPU comparison between PowerEdge C4140-K multi-node (MN) and 8X SXM2 single node (SN); throughput in images/sec, with a positive % Diff indicating the 8X SXM2 single node is faster
As the table above shows, the multi-node PowerEdge C4140 with SXM2 delivers strong performance across the pre-trained neural models tested. The most common models, ResNet-50 and Inception-v3, come within 8% of the 8X SXM2 single node. The only exception is AlexNet, where the 8X SXM2 outperforms the multi-node PowerEdge C4140 by roughly 40%.
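The "% Diff" column can be recomputed from the two throughput columns as the relative difference of the single-node (SN) result with respect to the multi-node (MN) result. A minimal sketch, assuming that convention (the `pct_diff` helper and the dictionary layout are illustrative, not from the paper; small deviations from the published figures come from rounding in the reported throughputs):

```python
# Recompute the "% Diff" column of Table 5: positive values mean the
# single-node 8X SXM2 system is faster than the multi-node C4140-K.
def pct_diff(sn: float, mn: float) -> float:
    """Relative difference of SN versus MN throughput, in percent."""
    return (sn - mn) / mn * 100

# model: (SN_8X V100-16GB-SXM2, MN PowerEdge C4140-K) throughput, images/sec
throughput = {
    "Inception-v4": (1606, 1625),
    "VGG-19":       (2449, 2406),
    "VGG-16":       (2762, 2820),
    "Inception-v3": (3077, 2845),
    "ResNet-50":    (4852, 4500),
    "GoogLeNet":    (7894, 8754),
    "AlexNet":      (16977, 12145),
}

for model, (sn, mn) in throughput.items():
    print(f"{model:12s} {pct_diff(sn, mn):+7.2f}%")
```

Running this reproduces the table's extremes: AlexNet at +39.79% (single node clearly faster) and GoogLeNet at -9.82% (multi-node faster).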
The good performance of the PowerEdge C4140 in multi-node mode, comparable to a single-node 8x V100-16GB server, was reached after the right software stack configuration with the