Figure 15. Throughput Native TensorFlow FP32 versus TF-TRT 5.0 Integration INT8
Figure 16 shows the latency curve for each inference configuration; the lower the latency, the better the performance. In this case, the TF-TRT INT8 implementation achieved the lowest inference time across all batch sizes.
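
The latency comparison above can be reproduced by timing repeated inference runs on the converted graph. The sketch below is a minimal illustration, not the paper's benchmark script: it uses the TF-TRT integration API from TensorFlow 1.x (`tensorflow.contrib.tensorrt`), and the graph file name, tensor names, and batch sizes are assumptions for the CheXNet (DenseNet-121, 224x224 input) model.

```python
# Minimal latency-measurement sketch for a TF-TRT INT8 graph (assumed names).
import time

import numpy as np
import tensorflow as tf
from tensorflow.contrib import tensorrt as trt  # TF 1.x / TensorRT 5.0 integration

# Load a frozen CheXNet graph (file name is an assumption).
with tf.gfile.GFile("chexnet_frozen.pb", "rb") as f:
    frozen_graph = tf.GraphDef()
    frozen_graph.ParseFromString(f.read())

# Convert supported subgraphs to TensorRT engines in INT8 mode.
# Note: INT8 additionally requires a calibration pass
# (trt.calib_graph_to_infer_graph) before deployment; omitted here for brevity.
trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=["predictions/Softmax"],       # assumed output tensor name
    max_batch_size=32,
    max_workspace_size_bytes=1 << 30,
    precision_mode="INT8")

with tf.Graph().as_default() as graph:
    tf.import_graph_def(trt_graph, name="")
    input_t = graph.get_tensor_by_name("input_1:0")             # assumed input name
    output_t = graph.get_tensor_by_name("predictions/Softmax:0")

with tf.Session(graph=graph) as sess:
    for batch_size in (1, 2, 4, 8, 16, 32):
        images = np.random.rand(batch_size, 224, 224, 3).astype(np.float32)
        sess.run(output_t, feed_dict={input_t: images})          # warm-up run
        start = time.time()
        for _ in range(50):
            sess.run(output_t, feed_dict={input_t: images})
        latency_ms = (time.time() - start) / 50 * 1000.0
        print("batch %2d: %.2f ms per inference call" % (batch_size, latency_ms))
```

Averaging over many timed runs after a warm-up call avoids counting one-time engine build and memory-allocation costs in the reported per-batch latency.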