Product Data Sheet / Brochure


NVIDIA A2 TENSOR CORE GPU | DATASHEET | 2

Higher IVA Performance for Intelligent Edge

Servers equipped with A2 offer up to 1.3X more performance in intelligent edge use cases, including smart cities, manufacturing, and retail. NVIDIA A2 GPUs running IVA workloads result in more efficient deployments with up to 1.6X better price-performance and ten percent better energy efficiency than previous GPU generations.

NVIDIA A2 Brings Breakthrough NVIDIA Ampere Architecture Innovations

THIRDENERATION TENSOR ORES

The thrd-generaton Tensor ores n A2 support nteger math,

down to INT4, and floatng pont math, up to FP32, to delver

hgh AI tranng and nference performance The NVIDIA Ampere

archtecture also supports TF32 and NVIDIA’s automatc mxed

precson (AMP) capabltes

ROOT OF TRUST SEURITY

Provdng securty n edge deployments and end-ponts s crtcal

for enterprse busness operatons A2 offers secure boot through

trusted code authentcaton and hardened rollback protectons to

protect aganst malcous malware attacks

SEONDENERATION RT ORES

A2 ncludes dedcated RT ores for ray tracng that enable

groundbreakng technologes at breakthrough speed

Wth up to 2X the throughput over the prevous generaton and

the ablty to concurrently run ray tracng wth ether shadng or

denosng capabltes

HARDWARE TRANSODIN PERFORMANE

Exponental growth n vdeo applcatons demand real-tme

scalable performance, requrng the latest n hardware encode

and decode capabltes A2 PUs use dedcated hardware to fully

accelerate vdeo decodng and encodng for the most popular

codecs, ncludng H265, H264, VP9, and AV1 decode

System onguraton PU HPE DL380 en10 Plus, 2S Xeon old 6330N

22Hz, 512B DDR4 | omputer Vson EfcentDet-D0 (OO, 512x512) |

TensorRT 82, Precson INT8, BS8 (PU) | OpenVINO 20214, Precson INT8,

BS8 (PU)

6X 10X

8X2X 4X

Computer Vision (EfﬁcientDet-DO)

System onguraton PU HPE DL380 en10 Plus, 2S Xeon old 6330N

22Hz, 512B DDR4 | NLP BERT-Large (Sequence length 384, SQuAD

v11) | TensorRT 82, Precson INT8, BS1 (PU) | OpenVINO 20214,

Precson INT8, BS1 (PU)

Natural Language Processing (BERT-Large)

System onguraton PU HPE DL380 en10 Plus, 2S Xeon old 6330N

22Hz, 512B DDR4 | Text-to-Speech Tacotron2 + Waveglow end-to-end

ppelne (nput length 128) | PyTorch 19, Precson FP16, BS1 (PU) | PyTorch

19, Precson FP32, BS1 (PU)

15X 20X 25X

20X

5X 10X

Text-to-Speech (Tacotron2 + Waveglow)

A2 Improves Performance by Up to 1.3X Versus T4

IVA Performance (Normalized): relative performance (video streams, 1080p30) on MobileNet v2 and ShuffleNet v2. NVIDIA T4: 1.0X on both networks; NVIDIA A2: 1.2X and 1.3X.

System Configuration: [Supermicro SYS-1029GQ-TRT, 2S Xeon Gold 6240 2.6GHz, 512GB DDR4, 1x NVIDIA A2 OR 1x NVIDIA T4] | Measured performance with DeepStream 5.1. Networks: ShuffleNet-v2 (224x224), MobileNet-v2 (224x224) | Pipeline represents end-to-end performance with video capture and decode, pre-processing, batching, inference, and post-processing.
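A normalized figure of this kind is each device's measured stream count divided by the baseline's. The sketch below shows that arithmetic in Python; the helper name `normalize` and the stream counts are hypothetical placeholders for illustration, not measurements from this datasheet.

```python
def normalize(streams, baseline):
    """Express each device's stream count relative to the baseline device."""
    base = streams[baseline]
    return {name: round(count / base, 2) for name, count in streams.items()}


# Hypothetical 1080p30 stream counts, NOT measured values.
measured = {"NVIDIA T4": 50, "NVIDIA A2": 65}

relative = normalize(measured, "NVIDIA T4")
for name, factor in relative.items():
    print(f"{name}: {factor}X")
```

Because the baseline is divided by itself, it always reads 1.0X, which is why the T4 value is 1.0X for every network.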

Lower Power and Configurable TDP

A2 Reduces Power Consumption by Up to 40% Versus T4

TDP Operating Range (Watts), axis from 40 to 75, for NVIDIA A2 and NVIDIA T4.
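A percentage reduction like this is the power saved divided by the baseline power. The following Python sketch shows the calculation; the function name `power_reduction` and both wattages are assumptions picked from within the axis range for illustration, not specifications quoted from this datasheet.

```python
def power_reduction(baseline_watts, new_watts):
    """Fractional power saving of a new device relative to a baseline."""
    return (baseline_watts - new_watts) / baseline_watts


# Hypothetical wattages chosen only to illustrate the arithmetic.
saving = power_reduction(baseline_watts=70.0, new_watts=42.0)
print(f"Power reduction: {saving:.0%}")
```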

Inference Speedup: comparisons of one NVIDIA A2 Tensor Core GPU versus a dual-socket Xeon Gold 6330N CPU (series: NVIDIA A2, CPU).
