White Papers

Deep Learning Performance: Scale-up vs Scale-out
Architectures & Technologies Dell EMC | Infrastructure Solutions Group
Figure 31: Training with PowerEdge C4140-K-V100-16&32GB-SXM2 (8 GPUs) multi-node versus non-Dell EMC SN_8x-V100-16GB-SXM2
Neural model    SN_8X V100-16GB-SXM2    MN PowerEdge C4140-K-V100-SXM2 (16GB & 32GB), Intel Xeon 4116    % Diff
------------    --------------------    -------------------------------------------------------------    ------
Inception-v4    1606                    1625                                                             -1.21%
VGG-19          2449                    2406                                                              1.78%
VGG-16          2762                    2820                                                             -2.03%
Inception-v3    3077                    2845                                                              8.16%
ResNet-50       4852                    4500                                                              7.81%
GoogLeNet       7894                    8754                                                             -9.82%
AlexNet         16977                   12145                                                            39.79%

Table 5: 8x GPU comparison between PowerEdge C4140-K multi-node (MN) and 8X SXM2 single node (SN); throughput in images/sec, with a positive % Diff indicating the 8X SXM2 single node is faster
As the table above shows, the multi-node PowerEdge C4140 with SXM2 delivers strong performance across the pre-trained neural models tested. The most common models, ResNet-50 and Inception-v3, come within 8% of the 8X SXM2 single node. The only exception is AlexNet, where the 8X SXM2 outperforms the multi-node PowerEdge C4140 by roughly 40%.
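The "% Diff" column can be recomputed from the two throughput columns as the relative difference of the single-node (SN) result with respect to the multi-node (MN) result. A minimal sketch, assuming that convention (the `pct_diff` helper and the dictionary layout are illustrative, not from the paper; small deviations from the published figures come from rounding in the reported throughputs):

```python
# Recompute the "% Diff" column of Table 5: positive values mean the
# single-node 8X SXM2 system is faster than the multi-node C4140-K.
def pct_diff(sn: float, mn: float) -> float:
    """Relative difference of SN versus MN throughput, in percent."""
    return (sn - mn) / mn * 100

# model: (SN_8X V100-16GB-SXM2, MN PowerEdge C4140-K) throughput, images/sec
throughput = {
    "Inception-v4": (1606, 1625),
    "VGG-19":       (2449, 2406),
    "VGG-16":       (2762, 2820),
    "Inception-v3": (3077, 2845),
    "ResNet-50":    (4852, 4500),
    "GoogLeNet":    (7894, 8754),
    "AlexNet":      (16977, 12145),
}

for model, (sn, mn) in throughput.items():
    print(f"{model:12s} {pct_diff(sn, mn):+7.2f}%")
```

Running this reproduces the table's extremes: AlexNet at +39.79% (single node clearly faster) and GoogLeNet at -9.82% (multi-node faster).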
The good performance of the PowerEdge C4140 in multi-node mode, comparable to a single-node 8x V100-16GB server, was reached after the right software stack configuration with the