performance of P100-PCIe with that of CPU and K80 GPUs for this application. Within one node,
four P100-PCIe GPUs are 6.6x faster than two E5-2690 v4 CPUs and 1.4x faster than four K80 GPUs.
Figure 5: LAMMPS Performance on P100-PCIe
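The two LAMMPS speedups quoted above share the same numerator (four P100-PCIe GPUs in one node), so dividing one by the other gives a rough estimate of how four K80 GPUs compare with the two CPUs. The sketch below is only that arithmetic; the derived ratio is an inference from the two quoted figures, not a measurement reported in the paper, and the function name is hypothetical.

```python
# Speedups quoted in the text, both for 4 P100-PCIe GPUs within one node.
P100_VS_CPU = 6.6  # relative to 2 E5-2690 v4 CPUs
P100_VS_K80 = 1.4  # relative to 4 K80 GPUs


def implied_speedup(a_over_c: float, a_over_b: float) -> float:
    """Given the speedup of A over C and of A over B, return B over C.

    Assumes all three timings come from the same benchmark and input,
    which the text implies but does not state in detail.
    """
    return a_over_c / a_over_b


# Derived estimate (not a figure from the paper): 4 K80 GPUs vs. 2 CPUs.
k80_vs_cpu = implied_speedup(P100_VS_CPU, P100_VS_K80)
print(f"Implied K80 vs. CPU speedup: {k80_vs_cpu:.1f}x")
```

Under these assumptions the implied ratio is about 4.7x, which is consistent with the P100-PCIe gain over K80 being much smaller than its gain over the CPUs.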
Figure 6: Comparison between Configuration G and Configuration B
[Figure 6 diagram. Configuration G: 2 CPU / 4 GPU, 2 virtual switches, 2 GPUs per CPU, PCIe Gen3 96-lane switch. Configuration B: 2 CPU, 4:1 switched, PCIe Gen3 96-lane switch. Each panel shows GPU1 to GPU4, CPU1, CPU2, and LP slots #1 and #2, connected by x16 and x8 links.]
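The difference between the two configurations in Figure 6 can be sketched as a small lookup table. This is a hypothetical model, not something taken from the paper: the switch-domain names (`VS1`, `VS2`, `SW`) are invented for illustration, the GPU-to-CPU assignment in Configuration G is read off the figure labels ("2 GPU per CPU", "2 Virtual Switches"), and attaching Configuration B's single switch to CPU1 is an assumption suggested by the "4:1 Switched" label.

```python
from collections import Counter
from itertools import combinations

# Hypothetical model: GPU -> (switch domain, host CPU).
# Configuration G: the 96-lane switch is split into two virtual switches,
# each assumed to serve two GPUs and one CPU.
CONFIG_G = {
    "GPU1": ("VS1", "CPU1"),
    "GPU2": ("VS1", "CPU1"),
    "GPU3": ("VS2", "CPU2"),
    "GPU4": ("VS2", "CPU2"),
}

# Configuration B: all four GPUs assumed to sit behind one switch domain
# attached to a single CPU (taken here to be CPU1).
CONFIG_B = {gpu: ("SW", "CPU1") for gpu in ("GPU1", "GPU2", "GPU3", "GPU4")}


def same_switch_pairs(config):
    """Return GPU pairs that share a switch domain in this model."""
    return [
        (a, b)
        for a, b in combinations(sorted(config), 2)
        if config[a][0] == config[b][0]
    ]


def gpus_per_cpu(config):
    """Count how many GPUs are hosted by each CPU in this model."""
    return dict(Counter(cpu for _, cpu in config.values()))


for name, config in (("G", CONFIG_G), ("B", CONFIG_B)):
    print(name, gpus_per_cpu(config), len(same_switch_pairs(config)))
```

In this model, Configuration G places two GPUs under each CPU and has two GPU pairs that share a switch domain, while Configuration B places all four GPUs under one CPU and all six GPU pairs share the switch.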