LSF Version 7.3 - Using Platform LSF HPC

Architecture
Without the generic PJL framework, the PJL starts tasks directly on each host and
manages the job itself.
Even if the MPI job was submitted through LSF, LSF never receives information about
the individual tasks, so it cannot track job resource usage or provide job control.
The same applies if you simply replace PAM with a parallel job launcher that is not
integrated with LSF: LSF loses control of the processes and cannot monitor job
resource usage or provide job control.
With the generic PJL framework, PAM is the resource manager for the job. The key step
in the integration is to place TS in the job startup hierarchy, immediately before each
task starts. TS must be the parent process of each task so that it can collect the task
process ID (PID) and pass it to PAM.
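The parent-process requirement can be illustrated with a minimal sketch. This is not the LSF TS implementation. The function start_task and the callback report_pid are hypothetical names, and the callback only stands in for whatever mechanism actually passes the PID to PAM. The sketch shows only why the starter has to launch the task itself: the process that creates the task is the one that learns its PID.

```python
import subprocess
import sys
from typing import Callable, Sequence


def start_task(cmd: Sequence[str], report_pid: Callable[[int], None]) -> int:
    """Launch one task as a direct child, report its PID, then wait for it.

    Hypothetical illustration only; report_pid stands in for passing the
    PID to a resource manager such as PAM.
    """
    child = subprocess.Popen(cmd)  # this process becomes the task's parent
    report_pid(child.pid)          # the parent knows the PID immediately
    return child.wait()            # block until the task exits, return its status


if __name__ == "__main__":
    reported = []
    status = start_task([sys.executable, "-c", "pass"], reported.append)
    print(f"task pid={reported[0]} exit status={status}")
```

If the task were started by some other process, the starter would have no direct way to obtain the PID, which is the situation described above for a launcher that is not integrated with LSF.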
The following figure illustrates the relationship between PAM, PJL, PJL wrapper, TS,
and the parallel job tasks.
Instead of starting the PJL directly, PAM starts the specified PJL wrapper on a single
host.
[Figure: Without the generic PJL framework. The PJL starts the tasks directly on the first execution host, the second execution host, and so on.]

[Figure: With the generic PJL framework. Components shown: bsub -a pjl_type, esub.pjl_type, the master host (mbatchd, mbschd), sbatchd, mpirun.lsf, PIM, PAM, the PJL wrapper, the PJL, and RES, TS, and tasks on the first execution host, the second execution host, and so on.]