Command line to execute the benchmark:
python3 tensorrt_chest.py \
--savedmodel_dir=/home/dell/chest-x-ray/chexnet_saved_model/1541777429/ \
--image_file=image.jpg \
--native \
--output_dir=/home/dell/chest-x-ray/output_tensorrt_chexnet_1541777429/ \
--batch_size=1
Docker image for TensorFlow-GPU: nvcr.io/nvidia/tensorflow:18.10-py3
Where: --native: benchmark the model in its native FP32 precision, without TensorRT™.
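The benchmarks were run inside this container. For reference only, a typical launch (assuming nvidia-docker 2 is installed; the bind-mount path mirrors the paths used above and is illustrative) could be:

docker run --runtime=nvidia --rm -it \
    -v /home/dell/chest-x-ray:/home/dell/chest-x-ray \
    nvcr.io/nvidia/tensorflow:18.10-py3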
Script Output sample:
==========================
network: native_frozen_graph.pb, batchsize 1, steps 100
fps median: 141.8, mean: 142.1, uncertainty: 0.3, jitter: 2.3
latency median: 0.00705, mean: 0.00704, 99th_p: 0.00740, 99th_uncertainty: 0.00010
==========================
Throughput (images/sec): ~142
Latency (ms): 0.00704 s × 1000 = ~7. At batch size 1 this agrees with the throughput above, since latency ≈ 1/throughput = 1/142 ≈ 7.0 ms.
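The statistics above (fps median/mean, latency percentiles) are produced by the benchmark script, whose measurement logic is not reproduced in this paper. A minimal sketch of how such numbers can be gathered with TensorFlow 1.x (as shipped in the container above) follows; the session, feed dict, and warm-up count are assumptions for illustration, not the actual tensorrt_chest.py source:

import time
import numpy as np

def benchmark(sess, output_tensor, feed_dict, steps=100, warmup=10, batch_size=1):
    # Warm-up runs exclude one-time graph optimization and cuDNN autotuning cost.
    for _ in range(warmup):
        sess.run(output_tensor, feed_dict=feed_dict)
    times = []
    for _ in range(steps):
        start = time.time()
        sess.run(output_tensor, feed_dict=feed_dict)  # one timed inference step
        times.append(time.time() - start)
    times = np.array(times)
    # Throughput is images per second; latency is seconds per step.
    print("fps median: %.1f, mean: %.1f"
          % (batch_size / np.median(times), batch_size / times.mean()))
    print("latency median: %.5f, mean: %.5f, 99th_p: %.5f"
          % (np.median(times), times.mean(), np.percentile(times, 99)))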
5.3 CheXNet Inference TF-TRT 5.0 Integration in INT8 Precision Mode
Benchmarks ran with batch sizes 1-32 using TF-TRT 5.0 integration in INT8 precision mode, with native TensorFlow FP32 as the baseline. We ran the benchmarks within the docker image nvcr.io/nvidia/tensorflow:18.10-py3, which supports TensorFlow with GPU as well as TensorRT™ 5.0.
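The conversion code is not listed in this paper. As a rough sketch, TF-TRT INT8 conversion in the TensorFlow 1.x contrib API shipped with this container is a two-step flow: build a calibration graph, run representative images through it to collect activation ranges, then convert it to the final INT8 inference graph. The graph path and node name below are illustrative assumptions:

import tensorflow as tf
from tensorflow.contrib import tensorrt as trt

# Load the frozen FP32 graph (file name and output node are assumptions).
with tf.gfile.GFile("native_frozen_graph.pb", "rb") as f:
    frozen_graph = tf.GraphDef()
    frozen_graph.ParseFromString(f.read())

# Step 1: build a calibration graph for INT8.
calib_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=["sigmoid_tensor"],        # assumed output node name
    max_batch_size=32,                 # largest batch size benchmarked
    max_workspace_size_bytes=1 << 30,
    precision_mode="INT8")

# Step 2: run representative images (e.g. a few hundred chest X-rays)
# through calib_graph in a session to collect calibration statistics,
# then convert the calibrated graph to the final INT8 inference graph.
int8_graph = trt.calib_graph_to_infer_graph(calib_graph)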