White Papers

Deep Learning Performance: Scale-up vs Scale-out
Architectures & Technologies Dell EMC | Infrastructure Solutions Group
7.2.4 PowerEdge C4140-K Multi Node Training vs Non-Dell EMC 8x V100-16GB-SXM2
The Non-Dell EMC 8x V100-16GB-SXM2 system was tested on the Nimbix cloud.
Figure 31 shows the throughput of this single-node 8x SXM2 system compared with the
PowerEdge C4140-K-V100 in distributed (multi-node) mode with 8 GPUs.
Figure 31: Training with PowerEdge C4140-K-V100-16&32GB-SXM2 (8 GPUs) – multi-node versus
Non-Dell EMC SN_8x-V100-16GB-SXM2
| Model | SN_8X V100_16GB-SXM2 | MN-PowerEdge C4140-K-V100-SXM2 (16GB & 32GB)-IntelXeon4116 | % Diff |
|---|---|---|---|
| Inception-v4 | 1606 | 1625 | -1.21% |
| VGG-19 | 2449 | 2406 | 1.78% |
| VGG-16 | 2762 | 2820 | -2.03% |
| Inception-v3 | 3077 | 2845 | 8.16% |
| ResNet-50 | 4852 | 4500 | 7.81% |
| GoogLeNet | 7894 | 8754 | -9.82% |
| AlexNet | 16977 | 12145 | 39.79% |
Table 5: 8x GPU Comparison between PowerEdge C4140-K multi-node and 8X SXM2
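The % Diff column appears to be the difference of the single-node (SN) throughput relative to the multi-node (MN) throughput. The short sketch below recomputes it from the values in Table 5 under that assumption. The formula `(SN - MN) / MN` is inferred from the numbers and is not stated in the paper, and the name `percent_diff` is illustrative only. Because the table lists rounded throughputs, the recomputed values can differ from the published column by a few hundredths of a percentage point.

```python
# Throughput values copied from Table 5.
# Each entry is (SN 8x V100-16GB-SXM2, MN PowerEdge C4140-K-V100-SXM2).
throughput = {
    "Inception-v4": (1606, 1625),
    "VGG-19": (2449, 2406),
    "VGG-16": (2762, 2820),
    "Inception-v3": (3077, 2845),
    "ResNet-50": (4852, 4500),
    "GoogLeNet": (7894, 8754),
    "AlexNet": (16977, 12145),
}


def percent_diff(sn: float, mn: float) -> float:
    """Difference of SN relative to MN, in percent.

    Assumed formula, inferred from the table rather than taken from the paper.
    """
    return (sn - mn) / mn * 100.0


for model, (sn, mn) in throughput.items():
    # A positive value means the single-node system was faster.
    print(f"{model:<13} {percent_diff(sn, mn):7.2f}%")
```

A positive result means the single-node 8x SXM2 system delivered higher throughput. A negative result means the multi-node C4140-K configuration was faster.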
As the table above shows, the PowerEdge C4140-K with SXM2 in multi-node mode stays within
roughly 10% of the single-node 8x SXM2 system on six of the seven models, with AlexNet
(39.79%) as the outlier. The most common ones, i.e. ResNet-50