Advanced tuning
An important part of the LORA value proposition is to deliver ease-of-use along with performance.
Our goal is that LORA should work out-of-the-box, without the need for system administrators to
perform explicit tuning. Several factors make it impossible to reach this goal in every case. The
range of applications deployed across the HP-UX customer base is extremely diverse. So is the
capacity of the servers: the applications could be deployed in a virtual partition with two processor
cores and 3 GB of memory, or in a hard partition with 128 cores and 2 TB of memory. In addition,
workloads can exhibit transient spikes in demand many times greater than the steady-state average.
Here is the LORA philosophy for coping with this dilemma: provide out-of-the-box behavior that is
solid in most circumstances, but implement mechanisms to allow system administrators to adjust the
behavior to suit the idiosyncrasies of their particular workload if they desire to do so. This section
discusses some possibilities for explicit tuning to override the automatic LORA heuristics.
numa_mode
kernel tunable parameter
The numa_mode kernel tunable parameter controls the mode of the kernel with respect to NUMA
platform characteristics. Because of the close coupling between memory configuration and kernel
mode, it is recommended to accept the default value of numa_mode, which is 0, meaning to
auto-sense the mode at boot time. Systems configured in accordance with the LORA guidelines will be
auto-sensed into LORA mode; otherwise they will operate in SMP mode. As described in the
numa_mode man page, the tunable can be adjusted to override the auto-sensing logic.
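For example, kctune can be used to inspect the tunable and, if required, to override the auto-sensing.
The sketch below shows only the documented default value of 0; the non-zero values that force a
particular mode are listed in the numa_mode man page, and depending on the tunable a change may
take effect immediately or only at the next boot.

    # Display the current setting of the tunable
    kctune numa_mode

    # Restore the default behavior (0 = auto-sense the mode at boot time)
    kctune numa_mode=0

    # Non-zero values override the auto-sensing and force a specific mode;
    # see the numa_mode man page for the value corresponding to each mode.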
In LORA mode, HP-UX implements a number of heuristics for automatic workload placement to
establish good alignment between the processes executing an application and the memory that they
reference. Every process and every thread is assigned a home locality. Processes and threads may
temporarily be moved away from their home localities to balance the system load, but they are
returned back home as soon as is practical. For process memory allocations, when the allocation
policy stipulates the closest locality, the home locality of the process is used. For shared memory
objects too large to fit within a single locality, the allocation is distributed evenly across the smallest
number of localities that can accommodate the object. Any processes attaching to that shared
memory object are then re-launched so as to be distributed evenly across the localities containing the
memory.
numa_sched_launch
kernel tunable parameter
The numa_sched_launch parameter controls the default process launch policy. The launch policy
refers to the preferred locality for processes forked as children of a parent process. In LORA mode,
the default launch policy is PACKED, which places child processes in the same locality as their parent.
Setting the parameter to the value 0 forces the default launch policy to be the same as it is in SMP
mode. Individual processes can be launched with a custom policy by using the mpsched command.
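The override can be applied with kctune; this sketch uses only the value 0 described above, and the
full set of accepted values should be taken from the tunable's man page.

    # Display the current default launch policy setting
    kctune numa_sched_launch

    # Force the default launch policy to match the SMP-mode default
    kctune numa_sched_launch=0

Per-process placement remains available through the mpsched command regardless of the
system-wide default.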
numa_policy
kernel tunable parameter
The numa_policy kernel tunable parameter governs the way HP-UX 11i v3 performs memory
allocation on NUMA platforms. When the parameter is at its default value of 0, the kernel chooses
the allocation policy at boot time based on the platform memory configuration. The system
administrator can override the default choice at any time by changing the value of the parameter.
The numa_policy man page contains the full details; a brief summary appears below.
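Before that summary, a sketch of the override mechanism itself, which is the same kctune invocation
used for the other tunables. Only the documented default value of 0 is shown here; the values that
select a specific allocation policy are enumerated in the numa_policy man page.

    # Display the current memory allocation policy setting
    kctune numa_policy

    # Restore the default (0 = choose the allocation policy at boot time,
    # based on the platform memory configuration)
    kctune numa_policy=0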