Administrator Guide
Copyright © 2019 Dell Inc. or its subsidiaries. All Rights Reserved. Dell, EMC and other trademarks are trademarks of Dell Inc. or its subsidiaries
Conclusions
We presented the Intel Programmable Accelerator Card (PAC) with Arria 10 FPGA for deep-
learning inference.
We showed that, using the Intel PAC on an x86-based Dell EMC PowerEdge server, we achieved
improved ResNet-50 performance compared to the previously released ResNet-50 model.
Specifically, we examined throughput, latency, and CPU-onload performance with half-socket,
full-socket, dual-socket, and quad-socket configurations, and scaled the number of PACs up to
four. While the full-fledged, quad-socket CPU configuration achieved 330 FPS at 0.79 FPS/Watt
(a 60% efficiency reduction compared to its dual-socket counterpart), the FPGAs achieved
1,251 FPS at 6 FPS/Watt, with each PAC adding 20% to the server power budget.
These performance numbers are expected to continue to improve with ongoing optimizations to
the hardware and software system stacks.
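The ratios implied by the figures above can be worked out with a few lines of arithmetic. The following is a minimal sketch, not part of the original measurements. It assumes that FPS/Watt is simply throughput divided by the power attributed to each configuration, and that the "60% efficiency reduction" means the quad-socket efficiency is 40% of the dual-socket value. The function and variable names are illustrative only.

```python
# Figures quoted in the conclusions.
CPU_QUAD_FPS = 330.0            # quad-socket CPU throughput
CPU_QUAD_FPS_PER_WATT = 0.79    # quad-socket CPU efficiency
FPGA_FPS = 1251.0               # throughput with the PACs
FPGA_FPS_PER_WATT = 6.0         # efficiency with the PACs
QUAD_VS_DUAL_DROP = 0.60        # quoted efficiency reduction vs. dual-socket


def implied_power_watts(fps: float, fps_per_watt: float) -> float:
    """Power implied by a throughput/efficiency pair.

    Assumption: FPS/Watt = FPS / Watts for the configuration in question.
    """
    return fps / fps_per_watt


# How much faster and how much more efficient the FPGA result is.
throughput_gain = FPGA_FPS / CPU_QUAD_FPS
efficiency_gain = FPGA_FPS_PER_WATT / CPU_QUAD_FPS_PER_WATT

# Power attributed to each configuration under the assumption above.
cpu_quad_power = implied_power_watts(CPU_QUAD_FPS, CPU_QUAD_FPS_PER_WATT)
fpga_power = implied_power_watts(FPGA_FPS, FPGA_FPS_PER_WATT)

# Dual-socket efficiency implied by the quoted 60% reduction (an inference,
# not a number stated in the text).
dual_socket_fps_per_watt = CPU_QUAD_FPS_PER_WATT / (1.0 - QUAD_VS_DUAL_DROP)

print(f"Throughput gain (FPGA vs quad-socket CPU): {throughput_gain:.2f}x")
print(f"Efficiency gain (FPGA vs quad-socket CPU): {efficiency_gain:.2f}x")
print(f"Implied quad-socket CPU power: {cpu_quad_power:.1f} W")
print(f"Implied FPGA power: {fpga_power:.1f} W")
print(f"Implied dual-socket efficiency: {dual_socket_fps_per_watt:.3f} FPS/Watt")
```

Under these assumptions, the FPGA result is roughly 3.8 times the quad-socket CPU throughput at roughly 7.6 times the efficiency. The implied wattages depend on how power was attributed in the original measurements and should be read as estimates only.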
Acknowledgements
Center for Space High-Performance and Resilient Computing (SHREC), University of Florida.
References
[1] A. Singh and J. C. Príncipe, "A loss function for classification based on a robust similarity
metric," The 2010 International Joint Conference on Neural Networks (IJCNN), Barcelona, 2010.
[2] K. He, X. Zhang, S. Ren and J. Sun, "Deep Residual Learning for Image Recognition," 2016
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, 2016.