Locality-Optimized Resource Alignment
Application workload
Memory reference patterns are key to system performance on NUMA platforms. Some applications
exhibit strong locality of memory reference, others scatter accesses to global data, and some fall in
between. Most commercial applications exhibit sufficient reference locality to gain significant benefit
by running with LORA. Technical applications that access extremely large data sets in a uniform
manner are better suited to interleaved memory and SMP mode.
In some cases, the type of the application is not as important as the size of the data set that it
references. If an nPartition is devoted to running a single application, and the working set of that
application consumes the bulk of the available physical memory, then there are few opportunities to
exploit locality. Database management systems are often run in this manner, and so will generally
exhibit more predictable behavior when run in SMP mode. By contrast, if an nPartition is running
multiple applications or application instances each of which has a working set much smaller than the
amount of physical memory, then LORA mode will usually be able to show a benefit by aligning
processing resources.
Variability in hardware resources
HP-UX 11i is a dynamic platform: hardware resources can be added to or deleted from the operating
system while it continues to service its application workload. When the variability in the amount of
resources is large, maintaining the locality that is essential to good performance may not be possible.
For that reason, we recommend LORA only when the variability in the amount of hardware resources
is within 33% above and below the initial value. For variability beyond that range, the SMP mode
may be a better choice than LORA.
Some of the management tools and operations that can cause variability are Instant Capacity (iCAP,
including TiCAP and GiCAP), Global Workload Manager (gWLM), and Dynamic nPartitions. All of
those are fully compatible with LORA, so long as the degree of variability does not exceed 33%.
For example, a partition configured with 10 cores and limits for a minimum of 7 cores and a
maximum of 13 cores has 30% variability. A smaller minimum or a larger maximum would exceed
the 33% variability limit recommended for LORA.
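
The arithmetic behind that example can be sketched in a few lines of Python. This is an illustration only, not an HP-UX tool: the function names are hypothetical, and the sketch assumes that variability means the larger of the downward and upward deviations, taken relative to the initial core count.

```python
# Illustrative sketch; names are hypothetical and not part of any HP-UX utility.

LORA_VARIABILITY_LIMIT = 0.33  # the 33% guideline stated in the text


def variability(initial: int, minimum: int, maximum: int) -> float:
    """Return the largest relative deviation from the initial core count.

    Assumption: variability is measured against the initial value, and the
    larger of the downward and upward deviations is the one that counts.
    """
    if initial <= 0:
        raise ValueError("initial core count must be positive")
    if not minimum <= initial <= maximum:
        raise ValueError("expected minimum <= initial <= maximum")
    return max(initial - minimum, maximum - initial) / initial


def within_lora_guideline(initial: int, minimum: int, maximum: int) -> bool:
    """True if the configured limits stay within the 33% guideline."""
    return variability(initial, minimum, maximum) <= LORA_VARIABILITY_LIMIT


# The partition from the example: 10 cores, limits of 7 and 13.
print(variability(10, 7, 13))            # 0.3
print(within_lora_guideline(10, 7, 13))  # True
print(within_lora_guideline(10, 6, 13))  # False: a minimum of 6 is 40% below
```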
When to use LORA
LORA is recommended when all of the criteria mentioned above are satisfied: an Integrity server
running HP-UX 11i v3 Update 3 or later, a workload of commercial applications or virtual
partitioning, and a variability in hardware resources of no more than 33%. If any condition is not
met, then the traditional SMP mode with 100% interleaved memory may be appropriate.
In general, a system operating in SMP mode should be configured with predominantly interleaved
memory. It would be unusual to see good performance if the operating system is treating memory in
a symmetric fashion while the hardware is exposing the memory localities.
LORA configuration rules
The fundamental configuration rule for LORA is to establish ⅞ths of the memory as local memory
(leaving ⅛th as interleaved memory). While some circumstances might see a slight improvement with
a few megabytes more or less than ⅞ths, the recommended value is an excellent general choice. It
simplifies configuration and management to converge on a single value. The design center for the
HP-UX 11i v3 kernel is to tune for ⅞ths local memory.
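
As a rough illustration of the ⅞ths rule, the following Python sketch computes the split for a given amount of memory. The function name is hypothetical and the sketch works in whole megabytes as a simplifying assumption. It only does the arithmetic; applying the values to an nPartition is done with the platform's own configuration tools, which are not shown here.

```python
# Illustrative sketch; the function name is hypothetical.

def lora_memory_split(total_mb: int) -> tuple[int, int]:
    """Split memory into (local_mb, interleaved_mb) using the 7/8 rule.

    Assumption: sizes are whole megabytes. Any remainder from the integer
    division is assigned to interleaved memory so the two parts always
    add up to the total.
    """
    if total_mb < 0:
        raise ValueError("total_mb must not be negative")
    local_mb = total_mb * 7 // 8
    interleaved_mb = total_mb - local_mb
    return local_mb, interleaved_mb


# Example: 64 GB (65536 MB) of memory.
local_mb, interleaved_mb = lora_memory_split(65536)
print(local_mb, interleaved_mb)  # 57344 8192
```

For 64 GB this gives 56 GB of local memory and 8 GB of interleaved memory.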
The processor scheduling algorithms have no flexibility if there is only one processor core available in
any given locality. Therefore, the LORA configuration rules require that there be a minimum of two
processor cores in each locality.
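
A planned configuration could be checked against the two-core rule with a sketch such as the following. The function and the locality names are hypothetical; in practice the per-locality core counts would come from the system's own reporting tools.

```python
# Illustrative sketch; function and locality names are hypothetical.

MIN_CORES_PER_LOCALITY = 2  # minimum stated by the LORA configuration rules


def localities_below_minimum(
    cores_per_locality: dict[str, int],
    minimum: int = MIN_CORES_PER_LOCALITY,
) -> list[str]:
    """Return the names of localities with fewer cores than the minimum."""
    return sorted(
        name for name, cores in cores_per_locality.items() if cores < minimum
    )


# Example: three localities, one of which has only a single core.
planned = {"locality0": 4, "locality1": 1, "locality2": 2}
print(localities_below_minimum(planned))  # ['locality1']
```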