HP MLIB User's Guide Vol. 2 7th Ed.

Chapter 8 Introduction to LAPACK 647
Parallel processing
Assume the application is started on two MPI processes. With
MLIB_NUMBER_OF_THREADS set to 1, the code would run two-way parallel:
one MPI process for C = αAB + βC and another for F = αDE + βF.
Setting MLIB_NUMBER_OF_THREADS to 2 would allow nested parallelism
(two threads within each of the two MPI processes) and run the code
four-way parallel.
Default CPS library stack is too small for MLIB
In libcps, the HP Compiler Parallel Support library, a CPS thread has a
default stack size of 8 MB. For performance reasons, several subprograms
in HP MLIB place temporary arrays on the stack, and these arrays can exceed
the default size. With the default CPS stack size, these routines overwrite
neighboring stacks, resulting in errors that are difficult to diagnose.
The solution is to change the CPS thread stack size attribute to a value that
is large enough to accommodate all the MLIB subprograms the thread may
encounter. Currently, 8 MB × (the number of threads) should be sufficient for
all MLIB subprograms.
The environment variable CPS_STACK_SIZE expects values in kilobytes.
Setting the stack size as follows would be sufficient for programs that execute
on two threads (2 × 8 MB = 16384 KB):
For C shell:
% setenv CPS_STACK_SIZE 16384
For Korn shell:
% export CPS_STACK_SIZE=16384
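The sizing rule above (8 MB per thread, expressed in kilobytes) can be sketched as a small helper. This is only an illustration of the arithmetic; the function name cps_stack_size_kb and the constant CPS_KB_PER_THREAD are hypothetical and are not part of MLIB or libcps.

```c
#include <assert.h>

/* 8 MB per thread, expressed in kilobytes (the unit CPS_STACK_SIZE expects). */
#define CPS_KB_PER_THREAD 8192L

/* Hypothetical helper: returns the CPS_STACK_SIZE value suggested by the
 * "8 MB x (number of threads)" rule for the given thread count. */
long cps_stack_size_kb(int num_threads)
{
    assert(num_threads > 0);   /* a thread count must be positive */
    return CPS_KB_PER_THREAD * num_threads;
}
```

For two threads the helper returns 16384, which matches the value used in the shell commands above.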
Default Pthread library stack is too small for MLIB
The stack allocated for each new thread created by calling pthread_create
directly might not be large enough for HP MLIB. Several subprograms in HP
MLIB store temporary work arrays on the stack to improve performance. If the
stack size is not large enough, these routines overwrite neighboring stacks,
resulting in errors that are difficult to diagnose.
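One way to avoid the problem is to request a larger stack through the thread attributes before calling pthread_create. The following is a minimal sketch using the standard POSIX call pthread_attr_setstacksize. The names run_with_stack, worker, and WORK_BYTES are hypothetical, the worker only stands in for an MLIB call that keeps a large temporary array on its stack, and the stack size to request is an assumption that should be chosen for the actual application.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Illustrative size of the temporary work array (an assumption). */
#define WORK_BYTES (4u * 1024u * 1024u)

/* Hypothetical stand-in for a routine that uses a large stack array. */
static void *worker(void *arg)
{
    char work[WORK_BYTES];
    memset(work, 1, sizeof work);
    /* Touch both ends of the array so the whole range is really used. */
    *(long *)arg = (long)work[0] + (long)work[sizeof work - 1];
    return NULL;
}

/* Creates one thread with the requested stack size, waits for it to
 * finish, and returns 0 on success or a pthread error number. */
int run_with_stack(size_t stack_bytes, long *result)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc;

    assert(result != NULL);

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    /* Request the larger stack before the thread is created. */
    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, worker, result);

    pthread_attr_destroy(&attr);
    if (rc != 0)
        return rc;

    return pthread_join(tid, NULL);
}
```

A call such as run_with_stack(16u * 1024u * 1024u, &result) requests a 16 MB stack, comfortably larger than the 4 MB array used by the stand-in worker. The stack size must be set on the attribute object before pthread_create is called; it cannot be changed for a thread that is already running.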