White Papers

10 CheXNet Inference with Nvidia T4 on Dell EMC PowerEdge R7425 | Document ID
2.2 Test Setup
a) For the hardware, we selected the PowerEdge R7425, which includes the Nvidia Tesla T4 GPU, the
most advanced accelerator for AI inference workloads at the time of testing. According to Nvidia, the
T4's new Turing Tensor Cores deliver more than 2x the int8 inference performance of the previous-
generation low-power offering [2].
b) For the framework and inference-optimizer tools, we selected TensorFlow, the TF-TRT integration,
and the TensorRT C++ API, since they have strong technical support and a wide variety of pre-
trained models is readily available.
c) Most of the tests were run in int8 precision mode. Although int8 has significantly lower precision
and dynamic range than fp32, it also has much lower memory and bandwidth requirements, which
allows higher throughput at lower latency.
Table 3 shows the software stack configuration on the PowerEdge R7425.
Table 3. OS and Software Stack Configuration

| Software                                | Version                                    |
|-----------------------------------------|--------------------------------------------|
| OS                                      | Ubuntu 16.04.5 LTS                         |
| Kernel                                  | GNU/Linux 4.4.0-133-generic x86_64         |
| Nvidia driver                           | 410.79                                     |
| CUDA                                    | 10.0                                       |
| TensorFlow                              | 1.10                                       |
| TensorRT™                               | 5.0                                        |
| Docker image for TensorFlow (CPU only)  | tensorflow:1.10.0-py3                      |
| Docker image for TensorFlow (GPU)       | nvcr.io/nvidia/tensorflow:18.10-py3        |
| Docker image for TF-TRT integration     | nvcr.io/nvidia/tensorflow:18.10-py3        |
| Docker image for TensorRT™ C++ API      | nvcr.io/nvidia/tensorrt:18.11-py3          |
| Script samples source                   | Samples included within the Docker images  |
| Test date                               | February 2019                              |
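The containers in Table 3 come from the Nvidia NGC registry. As a hedged sketch of how such an environment might be brought up (the mounted workspace path is illustrative, not from the paper; `--runtime=nvidia` assumes the nvidia-docker2 runtime that matched the 410.79 driver era):

```shell
# Pull the NGC TensorFlow image used for the GPU and TF-TRT tests (Table 3).
docker pull nvcr.io/nvidia/tensorflow:18.10-py3

# Launch it with GPU access; /workspace/chexnet is a hypothetical mount point.
docker run --runtime=nvidia -it --rm \
    -v "$PWD":/workspace/chexnet \
    nvcr.io/nvidia/tensorflow:18.10-py3

# The TensorRT C++ API tests use the TensorRT container instead.
docker pull nvcr.io/nvidia/tensorrt:18.11-py3
```

The script samples referenced in Table 3 ship inside these images, so no additional sample download is needed once a container is running.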