    max_lcpus = buf.psd_max_proc_cnt;    /* maximum number of LCPUs */
    printf("Max cores: %d; Max LCPUs: %d\n", max_cores, max_lcpus);
}
These maximum counts take into account all possible sources of additional CPU resources: the
sum of all psets, all available iCAP processors, and any cores that can be migrated from another
vPar. Note that on a Montecito system the maximum number of LCPUs is always twice the maximum
number of cores, whether or not HyperThreads are currently enabled, because the maximum must
allow for the possibility that they will be enabled.
An application may benefit from having detailed topological information about the system. The APIs
to obtain topology information are:
pstat_getprocessor(2) – returns per-processor information, including the topology of sockets,
cores, threads, logical CPUs, and locality domains (LDOMs).
pstat_getdynamic(2) – returns system-wide counters, including the maximum numbers of active
and supported processors and cores.
mpctl(2) – provides hierarchical information on sockets, cores, LDOMs, and so on.
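As an illustration, the following minimal sketch uses mpctl(2) to query the number of enabled
processors and locality domains visible to the calling partition. It uses only the MPC_GETNUMSPUS
and MPC_GETNUMLDOMS requests described in mpctl(2); the output formatting and error handling are
purely illustrative.

#include <stdio.h>
#include <sys/mpctl.h>              /* mpctl(2) request types */

int main(void)
{
    /* Number of enabled processors (logical CPUs) in this partition. */
    int nspus = mpctl(MPC_GETNUMSPUS, 0, 0);

    /* Number of locality domains (LDOMs) containing enabled processors. */
    int nldoms = mpctl(MPC_GETNUMLDOMS, 0, 0);

    if (nspus == -1 || nldoms == -1) {
        perror("mpctl");
        return 1;
    }

    printf("Enabled processors: %d; locality domains: %d\n", nspus, nldoms);
    return 0;
}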
These APIs should be used sparingly and with care. Much of the information they return is specific
to a particular architecture, CPU version, or system type, and relying on it may limit the
application to a particular subset of systems. Binding application execution to specific hardware
components may limit the ability of the OS to migrate those components to higher priority
partitions, and an application bound to specific hardware resources may find those resources
unavailable if they have been re-assigned to another partition.
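For example, the traditional way to bind a process to one processor is the MPC_SETPROCESS request
of mpctl(2), sketched below with the first enabled SPU chosen arbitrarily. This is meant only to
illustrate the kind of binding the caution above refers to, since the chosen processor may later
be migrated to another partition.

#include <stdio.h>
#include <unistd.h>
#include <sys/mpctl.h>

int main(void)
{
    /* Pick a target processor -- here simply the first enabled SPU. */
    int spu = mpctl(MPC_GETFIRSTSPU, 0, 0);

    /* Bind the calling process to that SPU. */
    if (spu == -1 || mpctl(MPC_SETPROCESS, spu, getpid()) == -1) {
        perror("mpctl");
        return 1;
    }

    printf("Process %ld bound to SPU %d\n", (long)getpid(), spu);
    return 0;
}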
Tell me when it changes
The ideal design for a VSE-enabled application adapts dynamically to changes in the resources
available to it. Consider this common scenario:
Application starts up, low system load, minimum CPU resources allocated to this partition.
Application queries OS regarding number of cores, allocates two (process) threads per core
to handle workload.
Application begins processing workload, load increases.
Workload manager senses additional load, adds more cores for the application.
Application is unable to use the additional cores because of its limited number of threads.
The performance of this application is limited by the number of threads it allocated initially;
the number of cores available at allocation time was minimal because there was no load. To
address this, consider a different approach:
Application starts up, low system load, minimum CPU resources allocated to this partition.
Application queries OS regarding maximum number of cores, allocates two threads per core
to handle workload (see the sizing sketch after this scenario).
Application begins processing workload, load increases.
Workload manager senses additional load, adds more cores for the application.
Application throughput rises with additional cores since there are enough threads to keep all
the cores busy.
Workload manager takes away cores to give to a higher priority workload.
Application now has more than two threads running per core, causing excessive context
switching between threads and sub-optimal performance.
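The following is a minimal sketch of the sizing step in the second scenario, assuming the thread
pool is sized from psd_max_proc_cnt (the maximum LCPU count, as in the earlier fragment) rather
than from the processor count at startup; the worker function and the one-thread-per-LCPU ratio
are placeholders, not a recommendation for any particular workload.

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <sys/pstat.h>              /* pstat_getdynamic(2) */

/* Placeholder worker; a real application would loop over a work queue here. */
static void *worker(void *arg)
{
    return arg;
}

int main(void)
{
    struct pst_dynamic psd;
    pthread_t *pool;
    int max_lcpus, i;

    if (pstat_getdynamic(&psd, sizeof(psd), 1, 0) == -1) {
        perror("pstat_getdynamic");
        return 1;
    }

    /* Size the pool for the maximum LCPU count the partition could be
       granted, not the (possibly minimal) count available at startup. */
    max_lcpus = (int)psd.psd_max_proc_cnt;

    pool = malloc(max_lcpus * sizeof(pthread_t));
    if (pool == NULL) {
        perror("malloc");
        return 1;
    }

    for (i = 0; i < max_lcpus; i++)
        pthread_create(&pool[i], NULL, worker, NULL);

    for (i = 0; i < max_lcpus; i++)
        pthread_join(pool[i], NULL);

    free(pool);
    return 0;
}

Sizing for the maximum lets the pool absorb added cores, but, as the last two steps of the
scenario show, it over-subscribes the remaining cores when resources are taken away, so the
number of runnable workers still needs to track the resources actually present.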