White Papers

Deep Learning Performance: Scale-up vs Scale-out
Architectures & Technologies Dell EMC | Infrastructure Solutions Group
1. System bandwidth performance, i.e. the PCIe links connected to the GPUs: P2P bandwidth and latency tests
2. GPU hardware performance without any deep learning framework: Baidu DeepBench
3. Full system running GPU benchmarks: TensorFlow benchmarks
3.1 Criteria
1. To bound our testing, we picked TensorFlow as the framework, since it is well supported
and models are readily available.
2. For distributed training, we selected Uber's Horovod implementation, since it is one of the
best-performing distributed implementations [2].
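To illustrate what a distributed training implementation has to do on every step, the sketch below simulates gradient averaging with a ring-allreduce, the communication pattern that Horovod's documentation describes as the basis of its approach. This is a single-process simulation in plain Python under simplifying assumptions (equal-length gradient lists, no real network or GPU transfers). The function name `ring_allreduce_average` and its chunking scheme are hypothetical and are not part of Horovod's API.

```python
def ring_allreduce_average(worker_grads):
    """Simulate ring-allreduce averaging of per-worker gradient lists.

    worker_grads: list of equal-length lists of numbers, one per worker.
    Returns one averaged gradient list per worker (all identical).
    Hypothetical helper for illustration only; not Horovod's API.
    """
    n = len(worker_grads)
    length = len(worker_grads[0])
    if any(len(g) != length for g in worker_grads):
        raise ValueError("all workers must hold gradients of equal length")

    # Each simulated worker gets its own private buffer.
    bufs = [[float(x) for x in g] for g in worker_grads]
    if n == 1:
        return bufs

    # Split every buffer into n chunks; chunk c covers bounds[c]:bounds[c + 1].
    bounds = [c * length // n for c in range(n + 1)]

    def chunk(c):
        return slice(bounds[c], bounds[c + 1])

    # Phase 1 (scatter-reduce): in each step, worker r sends one chunk to
    # worker r + 1, which adds it to its own copy. After n - 1 steps,
    # worker r holds the complete sum of chunk (r + 1) mod n.
    for step in range(n - 1):
        # Snapshot all messages first so the sends behave as simultaneous.
        msgs = [(r, (r - step) % n) for r in range(n)]
        payloads = [bufs[r][chunk(c)] for r, c in msgs]
        for (r, c), data in zip(msgs, payloads):
            dst = (r + 1) % n
            bufs[dst][chunk(c)] = [a + b for a, b in zip(bufs[dst][chunk(c)], data)]

    # Phase 2 (allgather): the completed chunks circulate around the ring,
    # overwriting the partial sums on each receiving worker.
    for step in range(n - 1):
        msgs = [(r, (r + 1 - step) % n) for r in range(n)]
        payloads = [bufs[r][chunk(c)] for r, c in msgs]
        for (r, c), data in zip(msgs, payloads):
            bufs[(r + 1) % n][chunk(c)] = data

    # Divide by the worker count to turn the sums into averages.
    return [[x / n for x in b] for b in bufs]


grads = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
averaged = ring_allreduce_average(grads)
```

In this simulation each worker sends 2 x (n - 1) chunk-sized messages regardless of how many workers participate, which is why the ring pattern is commonly chosen for bandwidth-bound gradient exchange.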
3.2 Why TensorFlow as the framework of choice?
We selected TensorFlow because it is the most widely used framework for machine learning and
deep learning. It has broad support within the open source community, pre-trained models are
readily available, and it is well supported by the TensorFlow team.
TensorFlow is also widely used within the Dell EMC customer base and is one of the top choices
when starting new machine learning projects. Figure 6 shows how TensorFlow compares in terms
of GitHub commits, stars, and forks, which is a good indicator of its widespread adoption.