Concept Guide


Traditionally, IT best practice for compute-intensive (non-graphical) VM instances has been GPU pass-through, shown in the left half of Figure 1. In a VMware environment, this is referred to as the VM DirectPath I/O mode of operation. It allows the guest operating system to access the GPU device directly, bypassing the ESXi hypervisor. As a result, the performance of a GPU on vSphere is very close to its performance on a native system (within 4-5%).
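
To make the 4-5% figure concrete, the following minimal sketch converts an overhead into an expected runtime. It assumes the overhead is a reduction in throughput relative to bare metal, which is an interpretation rather than something the measurement above specifies. The function name `virtualized_runtime` and the 100-minute job are hypothetical and purely illustrative.

```python
def virtualized_runtime(native_runtime, overhead_fraction):
    """Estimate runtime on vSphere when throughput is lower than native by overhead_fraction."""
    if not 0 <= overhead_fraction < 1:
        raise ValueError("overhead_fraction must be in [0, 1)")
    # Lower throughput stretches the runtime by a factor of 1 / (1 - overhead).
    return native_runtime / (1.0 - overhead_fraction)


# Hypothetical job: 100 minutes on bare metal (an illustrative figure, not a measurement).
native_minutes = 100.0
for overhead in (0.04, 0.05):
    estimate = virtualized_runtime(native_minutes, overhead)
    print(f"{overhead:.0%} overhead -> {estimate:.1f} minutes")
```

Under these assumptions, a job that takes 100 minutes natively would take roughly 104 to 105 minutes in a VM with GPU pass-through.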

The main reasons for using the passthrough approach to expose GPUs on vSphere are:

(i) Simplicity: It is straightforward to allocate GPUs to a VM using pass-through and to offer GPU acceleration benefits to end users.
(ii) Dedicated use: There is no need to share the GPU among different VMs, because a single application will consume one or more full GPUs.
(iii) Replicating public cloud instances: Public cloud instances use GPU pass-through, and end users want the same environment in an on-premises datacenter.
(iv) Multi-GPU support: A single virtual machine can make use of multiple physical GPUs in passthrough mode.

An important point to note is that the passthrough option for GPUs works without any third-party software driver being loaded into the ESXi hypervisor.
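
Because the guest operating system accesses the device directly, the GPU driver lives in the guest rather than in ESXi, and the GPUs can be checked from inside the VM. The sketch below is one way to do that. It assumes a guest with the NVIDIA driver installed and `nvidia-smi` on the PATH. The helper names `parse_gpu_csv` and `list_guest_gpus` are hypothetical.

```python
import csv
import io
import shutil
import subprocess

# Ask nvidia-smi for one CSV row per GPU: index, product name, total memory in MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_csv(text):
    """Turn nvidia-smi CSV rows into a list of dictionaries."""
    gpus = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row:
            continue  # skip blank lines
        index, name, memory_mib = (field.strip() for field in row)
        gpus.append({"index": int(index), "name": name, "memory_mib": int(memory_mib)})
    return gpus


def list_guest_gpus():
    """Return the GPUs visible to this guest, or an empty list if nvidia-smi is absent."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)
```

Calling `list_guest_gpus()` inside a VM configured with two passthrough GPUs would be expected to return two entries. An empty list suggests that the driver or the device assignment is missing.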

The disadvantages of GPU passthrough are as follows:

(i) The entire GPU is dedicated to a single VM, and GPUs cannot be shared among the VMs on a server.
(ii) The advanced vSphere features vMotion, Distributed Resource Scheduler (DRS), and Snapshots are not supported for a virtual machine that uses a GPU in passthrough mode.
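
The no-sharing constraint can be illustrated with a toy model. This is a sketch only: `PassthroughHost` and the VM names are hypothetical and do not correspond to any VMware API. It shows that each GPU is owned by exactly one VM, that one VM may own several GPUs, and that a further VM cannot start once all GPUs are assigned.

```python
class PassthroughHost:
    """Toy model of a host whose GPUs are handed out whole to VMs (not a VMware API)."""

    def __init__(self, gpu_count):
        self.free_gpus = list(range(gpu_count))
        self.assignments = {}  # VM name -> GPU indices that VM owns exclusively

    def attach(self, vm_name, gpus_requested=1):
        """Give whole GPUs to a VM; fail if too few unassigned GPUs remain."""
        if gpus_requested > len(self.free_gpus):
            raise RuntimeError(
                f"only {len(self.free_gpus)} free GPU(s); "
                f"cannot give {gpus_requested} to {vm_name}"
            )
        granted = [self.free_gpus.pop(0) for _ in range(gpus_requested)]
        self.assignments.setdefault(vm_name, []).extend(granted)
        return granted


host = PassthroughHost(gpu_count=4)
host.attach("training-vm", gpus_requested=2)  # one VM, two physical GPUs
host.attach("inference-vm-1")
host.attach("inference-vm-2")
try:
    host.attach("inference-vm-3")  # no GPU left, and none can be shared
except RuntimeError as err:
    print(err)
```

In this model, a four-GPU server can serve at most four GPU-enabled VMs regardless of how lightly each VM uses its GPU.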

Overview of NVIDIA vGPU Platform

GPU virtualization (NVIDIA vGPU) addresses the limitations of pass-through, but it was traditionally deployed to accelerate virtualized professional graphics applications, virtual desktop instances, or remote desktop solutions. NVIDIA added support for AI, DL, and high-performance computing (HPC) workloads in GRID 9.0, which was released in summer 2019. NVIDIA also changed vGPU licensing to make it more amenable to compute use cases. GRID vPC/vApps and Quadro vDWS are licensed per concurrent user, either as a perpetual license or as a yearly subscription. Because vComputeServer is for server compute workloads, its license is tied to the GPU rather than to a user, and it is therefore licensed per GPU as a yearly subscription. For more information about NVIDIA GRID software, see http://www.nvidia.com/grid.
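
The difference between the two licensing models can be sketched as a simple count. The function `licenses_required` and its parameters are hypothetical. The sketch encodes only the rule described above (per concurrent user versus per GPU) and ignores pricing, editions, and any other terms of the actual NVIDIA license agreements.

```python
# Products licensed per concurrent user versus per GPU, as described in the text above.
PER_CONCURRENT_USER = {"GRID vPC", "GRID vApps", "Quadro vDWS"}
PER_GPU = {"vComputeServer"}


def licenses_required(product, concurrent_users=0, gpus=0):
    """Return how many licenses a deployment needs under the counting rule for product."""
    if product in PER_CONCURRENT_USER:
        return concurrent_users  # one license per concurrent user
    if product in PER_GPU:
        return gpus  # one license per physical GPU, regardless of user count
    raise ValueError(f"unknown product: {product}")


# Illustrative deployment: 40 concurrent users on a cluster with 8 GPUs.
print(licenses_required("Quadro vDWS", concurrent_users=40, gpus=8))
print(licenses_required("vComputeServer", concurrent_users=40, gpus=8))
```

For a compute cluster, the per-GPU count stays fixed as the number of users submitting jobs grows, which is why this model suits server workloads.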

Figure 2 shows the different components of the Virtual GPU software stack.