Advanced tuning
An important part of the LORA value proposition is to deliver ease-of-use along with performance.
Our goal is that LORA should work out-of-the-box, without the need for system administrators to
perform explicit tuning. Several factors make it impossible to reach this goal in every case. The
range of applications deployed across the HP-UX customer base is extremely diverse. So is the
capacity of the servers: the applications could be deployed in a virtual partition with two processor
cores and 3 GB of memory, or in a hard partition with 128 cores and 2 TB of memory. In addition,
workloads can exhibit transient spikes in demand many times greater than the steady-state average.
Here is the LORA philosophy for coping with this dilemma: provide out-of-the-box behavior that is
solid in most circumstances, but implement mechanisms to allow system administrators to adjust the
behavior to suit the idiosyncrasies of their particular workload if they desire to do so. This section
discusses some possibilities for explicit tuning to override the automatic LORA heuristics.
numa_mode
kernel tunable parameter
The numa_mode kernel tunable parameter controls the mode of the kernel with respect to NUMA
platform characteristics. Because of the close coupling between memory configuration and kernel
mode, it is recommended to accept the default value of numa_mode, which is 0, meaning to
auto-sense the mode at boot time. Systems configured in accordance with the LORA guidelines will be
auto-sensed into LORA mode; otherwise they will operate in SMP mode. As described in the
numa_mode man page, the tunable can be adjusted to override the auto-sensing logic.
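For example, kctune can be used to inspect the tunable and, if required, to override the auto-sensing.
The sketch below shows only the documented default value of 0; the non-zero values that force a
particular mode are listed in the numa_mode man page, and depending on the tunable a change may
take effect immediately or only at the next boot.

    # Display the current setting of the tunable
    kctune numa_mode

    # Restore the default behavior (0 = auto-sense the mode at boot time)
    kctune numa_mode=0

    # Non-zero values override the auto-sensing and force a specific mode;
    # see the numa_mode man page for the value corresponding to each mode.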
In LORA mode, HP-UX implements a number of heuristics for automatic workload placement to
establish good alignment between the processes executing an application and the memory that they
reference. Every process and every thread is assigned a home locality. Processes and threads may
temporarily be moved away from their home localities to balance the system load, but they are
returned back home as soon as is practical. For process memory allocations, when the allocation
policy stipulates the closest locality, the home locality of the process is used. For shared memory
objects too large to fit within a single locality, the allocation is distributed evenly across the smallest
number of localities that can accommodate the object. Any processes attaching to that shared
memory object are then re-launched so as to be distributed evenly across the localities containing the
memory.
numa_sched_launch
kernel tunable parameter
The numa_sched_launch parameter controls the default process launch policy. The launch policy
refers to the preferred locality for processes forked as children of a parent process. In LORA mode,
the default launch policy is PACKED, which places child processes in the same locality as their parent.
Setting the parameter to the value 0 forces the default launch policy to be the same as it is in SMP
mode. Individual processes can be launched with a custom policy by using the mpsched command.
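The override can be applied with kctune; this sketch uses only the value 0 described above, and the
full set of accepted values should be taken from the tunable's man page.

    # Display the current default launch policy setting
    kctune numa_sched_launch

    # Force the default launch policy to match the SMP-mode default
    kctune numa_sched_launch=0

Per-process placement remains available through the mpsched command regardless of the
system-wide default.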
numa_policy
kernel tunable parameter
The numa_policy kernel tunable parameter governs the way HP-UX 11i v3 performs memory
allocation on NUMA platforms. When the parameter is at its default value of 0, the kernel chooses
the allocation policy at boot time based on the platform memory configuration. The system
administrator can override the default choice at any time by changing the value of the parameter.
The numa_policy man page contains the full details; a brief summary appears below.
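Before that summary, a sketch of the override mechanism itself, which is the same kctune invocation
used for the other tunables. Only the documented default value of 0 is shown here; the values that
select a specific allocation policy are enumerated in the numa_policy man page.

    # Display the current memory allocation policy setting
    kctune numa_policy

    # Restore the default (0 = choose the allocation policy at boot time,
    # based on the platform memory configuration)
    kctune numa_policy=0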