The high latency of remote memory accesses can leave the processors under-utilized, constantly waiting for
data to be transferred to the local node, and the NUMA connection can become a bottleneck for applications
with high memory bandwidth demands.
Furthermore, performance on such a system can be highly variable. It varies, for example, if an application
has memory located locally on one benchmarking run, but a subsequent run happens to place all of that
memory on a remote node. This phenomenon can make capacity planning difficult.
Some high-end UNIX systems provide support for NUMA optimizations in their compilers and
programming libraries. This support requires software developers to tune and recompile their programs for
optimal performance. Optimizations for one system are not guaranteed to work well on the next generation
of the same system. Other systems have allowed an administrator to explicitly decide on the node on which
an application should run. While this might be acceptable for certain applications that demand 100 percent
of their memory to be local, it creates an administrative burden and can lead to imbalance between nodes
when workloads change.
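As a concrete illustration of this kind of explicit, application-level tuning, the following C sketch uses the
Linux libnuma library (linked with -lnuma) to allocate a buffer on a chosen NUMA node and constrain the
calling thread to that node's processors. The node number and buffer size are arbitrary values chosen for
the example; this shows the sort of per-program tuning described above and is not specific to ESXi or to
any particular UNIX system.

/* Compile with: cc numa_local.c -lnuma */
#include <numa.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not supported on this system\n");
        return EXIT_FAILURE;
    }

    int node = 0;                       /* example node, chosen arbitrarily */
    size_t size = 64UL * 1024 * 1024;   /* 64 MB working set */

    /* Allocate the working set from the chosen node's local memory. */
    void *buf = numa_alloc_onnode(size, node);
    if (buf == NULL) {
        fprintf(stderr, "allocation on node %d failed\n", node);
        return EXIT_FAILURE;
    }

    /* Constrain this thread to the node's processors so that accesses
       to buf stay local rather than crossing the NUMA interconnect. */
    if (numa_run_on_node(node) != 0)
        perror("numa_run_on_node");

    memset(buf, 0, size);               /* touch the memory locally */

    numa_free(buf, size);
    return EXIT_SUCCESS;
}

This is exactly the per-application, per-platform effort that transparent NUMA support in the system
software is meant to make unnecessary.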
Ideally, the system software provides transparent NUMA support, so that applications can benefit
immediately without modifications. The system should maximize the use of local memory and schedule
programs intelligently without requiring constant administrator intervention. Finally, it must respond well
to changing conditions without compromising fairness or performance.
How ESXi NUMA Scheduling Works
ESXi uses a sophisticated NUMA scheduler to dynamically balance processor load and memory locality.
1 Each virtual machine managed by the NUMA scheduler is assigned a home node. A home node is one
of the system’s NUMA nodes containing processors and local memory, as indicated by the System
Resource Allocation Table (SRAT).
2 When memory is allocated to a virtual machine, the ESXi host preferentially allocates it from the home
node. The virtual CPUs of the virtual machine are constrained to run on the home node to maximize
memory locality.
3 The NUMA scheduler can dynamically change a virtual machine's home node to respond to changes in
system load. The scheduler might migrate a virtual machine to a new home node to reduce processor
load imbalance. Because this might cause more of its memory to be remote, the scheduler might migrate
the virtual machine’s memory dynamically to its new home node to improve memory locality. The
NUMA scheduler might also swap virtual machines between nodes when this improves overall
memory locality (see the sketch after this list).
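The sketch below is a toy model of this placement and rebalancing policy, written for illustration only.
The node and virtual machine structures, the vCPU-count load metric, and the imbalance threshold are all
assumptions, not ESXi data structures or algorithms. It gives each new virtual machine the least-loaded
node as its home node and re-homes one virtual machine when the load gap between nodes grows too large.

#include <stdio.h>

#define IMBALANCE_THRESHOLD 2   /* illustrative: rebalance when the busiest
                                   node has 2 or more vCPUs more than the
                                   least-loaded node */

/* Hypothetical bookkeeping structures, not ESXi data structures. */
struct numa_node {
    int id;
    int vcpu_load;          /* vCPUs of the VMs currently homed here */
};

struct vm {
    int id;
    int num_vcpus;
    int home_node;          /* index into the nodes[] array, -1 if unset */
};

/* Step 1: give a new VM the least-loaded node as its home node. Its vCPUs
   run there and its memory is preferentially allocated there (step 2). */
static int assign_home_node(struct numa_node nodes[], int num_nodes,
                            struct vm *v)
{
    int best = 0;
    for (int i = 1; i < num_nodes; i++)
        if (nodes[i].vcpu_load < nodes[best].vcpu_load)
            best = i;
    v->home_node = best;
    nodes[best].vcpu_load += v->num_vcpus;
    return best;
}

/* Step 3: when the load gap between the busiest and least-loaded node grows
   too large, re-home one VM whose move actually narrows that gap. Its memory
   would then be migrated to the new home node over time to restore locality. */
static void rebalance(struct numa_node nodes[], int num_nodes,
                      struct vm vms[], int num_vms)
{
    int busiest = 0, idlest = 0;
    for (int i = 1; i < num_nodes; i++) {
        if (nodes[i].vcpu_load > nodes[busiest].vcpu_load) busiest = i;
        if (nodes[i].vcpu_load < nodes[idlest].vcpu_load)  idlest = i;
    }
    int gap = nodes[busiest].vcpu_load - nodes[idlest].vcpu_load;
    if (gap < IMBALANCE_THRESHOLD)
        return;                             /* balanced enough already */

    for (int i = 0; i < num_vms; i++) {
        if (vms[i].home_node == busiest && vms[i].num_vcpus < gap) {
            nodes[busiest].vcpu_load -= vms[i].num_vcpus;
            nodes[idlest].vcpu_load  += vms[i].num_vcpus;
            vms[i].home_node = idlest;      /* migrate the home node */
            return;                         /* move one VM per pass */
        }
    }
}

int main(void)
{
    struct numa_node nodes[2] = { {0, 0}, {1, 0} };
    struct vm vms[3] = { {0, 4, -1}, {1, 2, -1}, {2, 2, -1} };

    for (int i = 0; i < 3; i++)             /* initial placement */
        assign_home_node(nodes, 2, &vms[i]);

    /* VM 0 powers off, leaving its old node idle; rebalance the rest. */
    nodes[vms[0].home_node].vcpu_load -= vms[0].num_vcpus;
    rebalance(nodes, 2, &vms[1], 2);

    for (int i = 1; i < 3; i++)
        printf("VM %d is homed on node %d\n", vms[i].id, vms[i].home_node);
    return 0;
}

The real scheduler also weighs memory locality and the cost of migrating memory to the new home node,
which this sketch omits.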
Some virtual machines are not managed by the ESXi NUMA scheduler. For example, if you manually set the
processor or memory affinity for a virtual machine, the NUMA scheduler might not be able to manage this
virtual machine. Virtual machines that are not managed by the NUMA scheduler still run correctly.
However, they don't benefit from ESXi NUMA optimizations.
The NUMA scheduling and memory placement policies in ESXi can manage all virtual machines
transparently, so that administrators do not need to address the complexity of balancing virtual machines
between nodes explicitly.
The optimizations work seamlessly regardless of the type of guest operating system. ESXi provides NUMA
support even to virtual machines whose guest operating systems do not support NUMA hardware, such as
Windows NT 4.0. As a result, you can take advantage of new hardware even with legacy operating systems.
A virtual machine that has more virtual processors than the number of physical processor cores available on
a single hardware node can be managed automatically. The NUMA scheduler accommodates such a virtual
machine by having it span NUMA nodes. That is, it is split up as multiple NUMA clients, each of which is
assigned to a node and then managed by the scheduler as a normal, non-spanning client. This can improve
the performance of certain memory-intensive workloads with high locality. For information on configuring
the behavior of this feature, see “Advanced Virtual Machine Attributes,” on page 118.
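To make the idea of spanning concrete, the sketch below splits a wide virtual machine's virtual CPUs into
clients no larger than a node's core count and spreads the clients across nodes round-robin. The structure
and function names and the round-robin placement are assumptions made for illustration; in ESXi the
placement of each client is decided by the NUMA scheduler as described above.

#include <stdio.h>

#define MAX_CLIENTS 16

/* Hypothetical description of one NUMA client of a wide VM. */
struct numa_client {
    int first_vcpu;     /* first vCPU index handled by this client */
    int num_vcpus;      /* number of vCPUs in this client */
    int home_node;      /* node this client is assigned to */
};

/* Split a wide VM into clients of at most cores_per_node vCPUs each and
   spread the clients across nodes round-robin. Each client can then be
   scheduled like a normal, non-spanning virtual machine. */
static int split_wide_vm(int total_vcpus, int cores_per_node, int num_nodes,
                         struct numa_client clients[MAX_CLIENTS])
{
    int count = 0;
    for (int first = 0; first < total_vcpus && count < MAX_CLIENTS;
         first += cores_per_node, count++) {
        int remaining = total_vcpus - first;
        clients[count].first_vcpu = first;
        clients[count].num_vcpus =
            remaining < cores_per_node ? remaining : cores_per_node;
        clients[count].home_node = count % num_nodes;
    }
    return count;       /* number of NUMA clients created */
}

int main(void)
{
    struct numa_client clients[MAX_CLIENTS];

    /* Example: a 12-vCPU VM on a host with two nodes of 8 cores each. */
    int n = split_wide_vm(12, 8, 2, clients);
    for (int i = 0; i < n; i++)
        printf("client %d: vCPUs %d-%d on node %d\n", i,
               clients[i].first_vcpu,
               clients[i].first_vcpu + clients[i].num_vcpus - 1,
               clients[i].home_node);
    return 0;
}

Each client produced this way is then treated like an ordinary, non-spanning virtual machine for home-node
assignment and rebalancing.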