The high latency of remote memory accesses can leave the processors under-utilized, constantly waiting for
data to be transferred to the local node, and the NUMA connection can become a bottleneck for applications
with high memory bandwidth demands.
Furthermore, performance on such a system can be highly variable. It varies, for example, if an application
has memory located locally on one benchmarking run, but a subsequent run happens to place all of that
memory on a remote node. This phenomenon can make capacity planning difficult.
Some high-end UNIX systems provide support for NUMA optimizations in their compilers and
programming libraries. This support requires software developers to tune and recompile their programs for
optimal performance. Optimizations for one system are not guaranteed to work well on the next generation
of the same system. Other systems have allowed an administrator to explicitly decide on the node on which
an application should run. While this might be acceptable for certain applications that demand 100 percent
of their memory to be local, it creates an administrative burden and can lead to imbalance between nodes
when workloads change.
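As a concrete illustration of this kind of explicit, application-level tuning, the following C sketch uses the
Linux libnuma library (linked with -lnuma) to allocate a buffer on a chosen NUMA node and constrain the
calling thread to that node's processors. The node number and buffer size are arbitrary values chosen for
the example; this shows the sort of per-program tuning described above and is not specific to ESXi or to
any particular UNIX system.

/* Compile with: cc numa_local.c -lnuma */
#include <numa.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not supported on this system\n");
        return EXIT_FAILURE;
    }

    int node = 0;                       /* example node, chosen arbitrarily */
    size_t size = 64UL * 1024 * 1024;   /* 64 MB working set */

    /* Allocate the working set from the chosen node's local memory. */
    void *buf = numa_alloc_onnode(size, node);
    if (buf == NULL) {
        fprintf(stderr, "allocation on node %d failed\n", node);
        return EXIT_FAILURE;
    }

    /* Constrain this thread to the node's processors so that accesses
       to buf stay local rather than crossing the NUMA interconnect. */
    if (numa_run_on_node(node) != 0)
        perror("numa_run_on_node");

    memset(buf, 0, size);               /* touch the memory locally */

    numa_free(buf, size);
    return EXIT_SUCCESS;
}

This is exactly the per-application, per-platform effort that transparent NUMA support in the system
software is meant to make unnecessary.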
Ideally, the system software provides transparent NUMA support, so that applications can benefit
immediately without modifications. The system should maximize the use of local memory and schedule
programs intelligently without requiring constant administrator intervention. Finally, it must respond well
to changing conditions without compromising fairness or performance.
How ESXi NUMA Scheduling Works
ESXi uses a sophisticated NUMA scheduler to dynamically balance processor load and memory locality.
1 Each virtual machine managed by the NUMA scheduler is assigned a home node. A home node is one
of the system’s NUMA nodes containing processors and local memory, as indicated by the System
Resource Allocation Table (SRAT).
2 When memory is allocated to a virtual machine, the ESXi host preferentially allocates it from the home
node. The virtual CPUs of the virtual machine are constrained to run on the home node to maximize
memory locality.
3 The NUMA scheduler can dynamically change a virtual machine's home node to respond to changes in
system load. The scheduler might migrate a virtual machine to a new home node to reduce processor
load imbalance. Because this might cause more of its memory to be remote, the scheduler might migrate
the virtual machine’s memory dynamically to its new home node to improve memory locality. The
NUMA scheduler might also swap virtual machines between nodes when this improves overall
memory locality (see the sketch after this list).
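The sketch below is a toy model of this placement and rebalancing policy, written for illustration only.
The node and virtual machine structures, the vCPU-count load metric, and the imbalance threshold are all
assumptions, not ESXi data structures or algorithms. It gives each new virtual machine the least-loaded
node as its home node and re-homes one virtual machine when the load gap between nodes grows too large.

#include <stdio.h>

#define IMBALANCE_THRESHOLD 2   /* illustrative: rebalance when the busiest
                                   node has 2 or more vCPUs more than the
                                   least-loaded node */

/* Hypothetical bookkeeping structures, not ESXi data structures. */
struct numa_node {
    int id;
    int vcpu_load;          /* vCPUs of the VMs currently homed here */
};

struct vm {
    int id;
    int num_vcpus;
    int home_node;          /* index into the nodes[] array, -1 if unset */
};

/* Step 1: give a new VM the least-loaded node as its home node. Its vCPUs
   run there and its memory is preferentially allocated there (step 2). */
static int assign_home_node(struct numa_node nodes[], int num_nodes,
                            struct vm *v)
{
    int best = 0;
    for (int i = 1; i < num_nodes; i++)
        if (nodes[i].vcpu_load < nodes[best].vcpu_load)
            best = i;
    v->home_node = best;
    nodes[best].vcpu_load += v->num_vcpus;
    return best;
}

/* Step 3: when the load gap between the busiest and least-loaded node grows
   too large, re-home one VM whose move actually narrows that gap. Its memory
   would then be migrated to the new home node over time to restore locality. */
static void rebalance(struct numa_node nodes[], int num_nodes,
                      struct vm vms[], int num_vms)
{
    int busiest = 0, idlest = 0;
    for (int i = 1; i < num_nodes; i++) {
        if (nodes[i].vcpu_load > nodes[busiest].vcpu_load) busiest = i;
        if (nodes[i].vcpu_load < nodes[idlest].vcpu_load)  idlest = i;
    }
    int gap = nodes[busiest].vcpu_load - nodes[idlest].vcpu_load;
    if (gap < IMBALANCE_THRESHOLD)
        return;                             /* balanced enough already */

    for (int i = 0; i < num_vms; i++) {
        if (vms[i].home_node == busiest && vms[i].num_vcpus < gap) {
            nodes[busiest].vcpu_load -= vms[i].num_vcpus;
            nodes[idlest].vcpu_load  += vms[i].num_vcpus;
            vms[i].home_node = idlest;      /* migrate the home node */
            return;                         /* move one VM per pass */
        }
    }
}

int main(void)
{
    struct numa_node nodes[2] = { {0, 0}, {1, 0} };
    struct vm vms[3] = { {0, 4, -1}, {1, 2, -1}, {2, 2, -1} };

    for (int i = 0; i < 3; i++)             /* initial placement */
        assign_home_node(nodes, 2, &vms[i]);

    /* VM 0 powers off, leaving its old node idle; rebalance the rest. */
    nodes[vms[0].home_node].vcpu_load -= vms[0].num_vcpus;
    rebalance(nodes, 2, &vms[1], 2);

    for (int i = 1; i < 3; i++)
        printf("VM %d is homed on node %d\n", vms[i].id, vms[i].home_node);
    return 0;
}

The real scheduler also weighs memory locality and the cost of migrating memory to the new home node,
which this sketch omits.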
Some virtual machines are not managed by the ESXi NUMA scheduler. For example, if you manually set the
processor or memory affinity for a virtual machine, the NUMA scheduler might not be able to manage this
virtual machine. Virtual machines that are not managed by the NUMA scheduler still run correctly.
However, they don't benefit from ESXi NUMA optimizations.
The NUMA scheduling and memory placement policies in ESXi can manage all virtual machines
transparently, so that administrators do not need to address the complexity of balancing virtual machines
between nodes explicitly.
The optimizations work seamlessly regardless of the type of guest operating system. ESXi provides NUMA
support even to virtual machines whose guest operating systems do not support NUMA hardware, such as
Windows NT 4.0. As a result, you can take advantage of new hardware even with legacy operating systems.
A virtual machine that has more virtual processors than the number of physical processor cores available on
a single hardware node can be managed automatically. The NUMA scheduler accommodates such a virtual
machine by having it span NUMA nodes. That is, it is split up as multiple NUMA clients, each of which is
assigned to a node and then managed by the scheduler as a normal, non-spanning client. This can improve
the performance of certain memory-intensive workloads with high locality. For information on configuring
the behavior of this feature, see “Advanced Virtual Machine Attributes,” on page 118.
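To make the idea of spanning concrete, the sketch below splits a wide virtual machine's virtual CPUs into
clients no larger than a node's core count and spreads the clients across nodes round-robin. The structure
and function names and the round-robin placement are assumptions made for illustration; in ESXi the
placement of each client is decided by the NUMA scheduler as described above.

#include <stdio.h>

#define MAX_CLIENTS 16

/* Hypothetical description of one NUMA client of a wide VM. */
struct numa_client {
    int first_vcpu;     /* first vCPU index handled by this client */
    int num_vcpus;      /* number of vCPUs in this client */
    int home_node;      /* node this client is assigned to */
};

/* Split a wide VM into clients of at most cores_per_node vCPUs each and
   spread the clients across nodes round-robin. Each client can then be
   scheduled like a normal, non-spanning virtual machine. */
static int split_wide_vm(int total_vcpus, int cores_per_node, int num_nodes,
                         struct numa_client clients[MAX_CLIENTS])
{
    int count = 0;
    for (int first = 0; first < total_vcpus && count < MAX_CLIENTS;
         first += cores_per_node, count++) {
        int remaining = total_vcpus - first;
        clients[count].first_vcpu = first;
        clients[count].num_vcpus =
            remaining < cores_per_node ? remaining : cores_per_node;
        clients[count].home_node = count % num_nodes;
    }
    return count;       /* number of NUMA clients created */
}

int main(void)
{
    struct numa_client clients[MAX_CLIENTS];

    /* Example: a 12-vCPU VM on a host with two nodes of 8 cores each. */
    int n = split_wide_vm(12, 8, 2, clients);
    for (int i = 0; i < n; i++)
        printf("client %d: vCPUs %d-%d on node %d\n", i,
               clients[i].first_vcpu,
               clients[i].first_vcpu + clients[i].num_vcpus - 1,
               clients[i].home_node);
    return 0;
}

Each client produced this way is then treated like an ordinary, non-spanning virtual machine for home-node
assignment and rebalancing.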