White Papers

Deep Learning Performance: Scale-up vs Scale-out
Architectures & Technologies Dell EMC | Infrastructure Solutions Group
Figure 20: PowerEdge C4140-V100-SXM2 Configuration-K vs PowerEdge C4140-V100-SXM2 Configuration-M
As Figure 21 below shows, the number of CPU cores does affect throughput. The difference is largest
when running AlexNet.
7.1.8 What role does the CPU play in Deep Learning?
The CPU plays a major role in data preprocessing, the initial phase of training. The steps below
form a pipeline, similar to an instruction pipeline, in which the following four operations happen
in parallel:
a. Train on batch n (on GPUs)
b. Copy batch n+1 to GPU memory
c. Transform batch n+2 (on CPU)
d. Load batch n+3 from disk (on CPU)
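The four overlapping stages above can be sketched with one thread per stage and bounded queues between them. This is a minimal illustration using only the Python standard library, not the implementation used by any particular framework: the names run_pipeline and stage, the depth parameter, and the toy load, transform, copy_to_gpu and train functions are all hypothetical stand-ins for the real work.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the batch stream


def stage(fn, inbox, outbox):
    """Apply fn to every item from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)
            return
        outbox.put(fn(item))


def run_pipeline(batch_ids, load, transform, copy_to_gpu, train, depth=1):
    """Run the four stages concurrently, each on its own thread."""
    fns = [load, transform, copy_to_gpu, train]
    # Bounded queues keep each stage at most `depth` batches ahead of the next.
    queues = [queue.Queue(maxsize=depth) for _ in fns]
    results = queue.Queue()  # unbounded, collects the training outputs
    links = queues + [results]
    threads = [
        threading.Thread(target=stage, args=(fn, links[i], links[i + 1]))
        for i, fn in enumerate(fns)
    ]
    for t in threads:
        t.start()
    for batch_id in batch_ids:
        queues[0].put(batch_id)
    queues[0].put(_DONE)
    for t in threads:
        t.join()
    out = []
    while True:
        item = results.get()
        if item is _DONE:
            return out
        out.append(item)


# Toy stand-ins for the real stages (hypothetical, for illustration only).
def load(batch_id):
    return {"id": batch_id, "data": list(range(4))}  # "read from disk"


def transform(batch):
    return {**batch, "data": [x / 3 for x in batch["data"]]}  # CPU preprocessing


def copy_to_gpu(batch):
    return batch  # placeholder for a host-to-device copy


def train(batch):
    return batch["id"]  # placeholder for one training step


print(run_pipeline(range(5), load, transform, copy_to_gpu, train))  # [0, 1, 2, 3, 4]
```

While the train thread works on batch n, the other three threads can already be handling batches n+1, n+2 and n+3. In CPython, threads overlap well only when a stage releases the global interpreter lock (disk I/O, calls into native libraries), so CPU-heavy transforms are typically spread across multiple processes or native worker threads, which is where the number of CPU cores comes in.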
The data processing loop during training is:
a. Load mini-batch
b. Preprocess mini-batch
c. Train on mini-batch
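To see why the CPU side of this loop matters, it helps to compare the per-batch cost when the three steps run back to back with the idealized cost when they overlap. The sketch below is a simplified model, not a measurement: the function names are hypothetical, the timings are made-up numbers chosen for illustration, and the pipelined case ignores startup cost and resource contention.

```python
def sequential_step_ms(load_ms, preprocess_ms, train_ms):
    """Per-batch time when the three steps run one after another."""
    return load_ms + preprocess_ms + train_ms


def pipelined_step_ms(load_ms, preprocess_ms, train_ms):
    """Idealized steady-state per-batch time when the steps fully overlap.

    The slowest step sets the pace; startup cost and contention are ignored.
    """
    return max(load_ms, preprocess_ms, train_ms)


def gpu_idle_ms(load_ms, preprocess_ms, train_ms):
    """Time per batch the GPU waits for input in the idealized pipeline."""
    return pipelined_step_ms(load_ms, preprocess_ms, train_ms) - train_ms


# Made-up timings in milliseconds, for illustration only (not measurements).
timings = dict(load_ms=20, preprocess_ms=50, train_ms=40)
print(sequential_step_ms(**timings))  # 110
print(pipelined_step_ms(**timings))   # 50
print(gpu_idle_ms(**timings))         # 10
```

In this illustrative case preprocessing is the slowest step, so even a fully overlapped pipeline leaves the GPU waiting on the CPU for part of every batch. Under this model, faster or additional CPU cores that shorten preprocessing would raise throughput until training itself becomes the slowest step.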