Command line to execute the benchmark:
python3 tensorrt_chest.py \
--savedmodel_dir=/home/dell/chest-x-ray/chexnet_saved_model/1541777429/ \
--image_file=image.jpg \
--native \
--output_dir=/home/dell/chest-x-ray/output_tensorrt_chexnet_1541777429/ \
--batch_size=1
Docker image for TensorFlow-GPU: nvcr.io/nvidia/tensorflow:18.10-py3
Where: --native: benchmark the model in its native FP32 precision, without TensorRT™.
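The benchmarks were run inside this container. For reference only, a typical launch (assuming nvidia-docker 2 is installed; the bind-mount path mirrors the paths used above and is illustrative) could be:

docker run --runtime=nvidia --rm -it \
    -v /home/dell/chest-x-ray:/home/dell/chest-x-ray \
    nvcr.io/nvidia/tensorflow:18.10-py3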
Script Output sample:
==========================
network: native_frozen_graph.pb, batchsize 1, steps 100
fps median: 141.8, mean: 142.1, uncertainty: 0.3, jitter: 2.3
latency median: 0.00705, mean: 0.00704, 99th_p: 0.00740, 99th_uncertainty: 0.00010
==========================
Throughput (images/sec): ~142
Latency (ms): 0.00704 s × 1000 = ~7. At batch size 1 this agrees with the throughput above, since latency ≈ 1/throughput = 1/142 ≈ 7.0 ms.
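The statistics above (fps median/mean, latency percentiles) are produced by the benchmark script, whose measurement logic is not reproduced in this paper. A minimal sketch of how such numbers can be gathered with TensorFlow 1.x (as shipped in the container above) follows; the session, feed dict, and warm-up count are assumptions for illustration, not the actual tensorrt_chest.py source:

import time
import numpy as np

def benchmark(sess, output_tensor, feed_dict, steps=100, warmup=10, batch_size=1):
    # Warm-up runs exclude one-time graph optimization and cuDNN autotuning cost.
    for _ in range(warmup):
        sess.run(output_tensor, feed_dict=feed_dict)
    times = []
    for _ in range(steps):
        start = time.time()
        sess.run(output_tensor, feed_dict=feed_dict)  # one timed inference step
        times.append(time.time() - start)
    times = np.array(times)
    # Throughput is images per second; latency is seconds per step.
    print("fps median: %.1f, mean: %.1f"
          % (batch_size / np.median(times), batch_size / times.mean()))
    print("latency median: %.5f, mean: %.5f, 99th_p: %.5f"
          % (np.median(times), times.mean(), np.percentile(times, 99)))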
5.3 CheXNet Inference TF-TRT 5.0 Integration in INT8 Precision Mode
Benchmarks ran with batch sizes 1-32 using TF-TRT 5.0 integration in INT8 precision mode, with native TensorFlow FP32 as the baseline. We ran the benchmarks within the docker image nvcr.io/nvidia/tensorflow:18.10-py3, which supports TensorFlow with GPU as well as TensorRT™ 5.0.
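The conversion code is not listed in this paper. As a rough sketch, TF-TRT INT8 conversion in the TensorFlow 1.x contrib API shipped with this container is a two-step flow: build a calibration graph, run representative images through it to collect activation ranges, then convert it to the final INT8 inference graph. The graph path and node name below are illustrative assumptions:

import tensorflow as tf
from tensorflow.contrib import tensorrt as trt

# Load the frozen FP32 graph (file name and output node are assumptions).
with tf.gfile.GFile("native_frozen_graph.pb", "rb") as f:
    frozen_graph = tf.GraphDef()
    frozen_graph.ParseFromString(f.read())

# Step 1: build a calibration graph for INT8.
calib_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=["sigmoid_tensor"],        # assumed output node name
    max_batch_size=32,                 # largest batch size benchmarked
    max_workspace_size_bytes=1 << 30,
    precision_mode="INT8")

# Step 2: run representative images (e.g. a few hundred chest X-rays)
# through calib_graph in a session to collect calibration statistics,
# then convert the calibrated graph to the final INT8 inference graph.
int8_graph = trt.calib_graph_to_infer_graph(calib_graph)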