(nodes) in the system. The LSF-HPC with SLURM daemons consolidate this information into one entity,
such that these daemons present the HP XC system as one virtual LSF host.
Note:
LSF-HPC with SLURM operates only with the nodes in the SLURM lsf partition. As mentioned in the
previous paragraph, LSF-HPC with SLURM groups these nodes into one virtual LSF host, presenting the
HP XC system as a single, large SMP host. If there is no lsf partition in SLURM, then LSF-HPC with
SLURM sets the processor count to 1 and closes this single virtual HP XC host.
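For example, you can confirm that SLURM has an lsf partition by querying SLURM directly; the following command is only a sketch, and its output depends on the partitions configured on your system:
# sinfo -p lsf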
Example 15-1 shows how to use the controllsf command to determine which node is the LSF execution
host.
Example 15-1 Determining the LSF Execution Host
# controllsf show current
LSF is currently running on n16, and assigned to n16
All LSF-HPC with SLURM administration must be done from the LSF execution host. You can run the
lsadmin and badmin commands only on this host; they are not intended to be run on any other node
in the HP XC system and may produce false results if they are.
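For example, after identifying the LSF execution host (n16 in Example 15-1), you might log in to that node before checking the LSF configuration; the following command sequence is only a sketch:
# ssh n16
# lsadmin ckconfig
# badmin ckconfig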
When the LSF-HPC with SLURM scheduler determines that it is time to dispatch a job, it requests an
allocation of nodes from SLURM. After the successful allocation, LSF-HPC with SLURM prepares the job
environment with the necessary SLURM allocation variables, namely SLURM_JOBID and SLURM_NPROCS.
The SLURM_JOBID is a 32-bit integer that uniquely identifies a SLURM allocation in the system; note that
SLURM can reuse this value. The job dispatch, illustrated in the example that follows this list, depends
on the type of job:
• For a batch job:
LSF-HPC with SLURM submits the job to SLURM as a batch job and passively monitors it with the
squeue command.
• For an interactive job:
LSF-HPC with SLURM launches the user's job locally on the LSF execution host.
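The first command below sketches a batch submission and the second an interactive one; the processor count, the output file name, and the script name myjob.sh are placeholders rather than part of the default configuration:
$ bsub -n4 -o myjob.out ./myjob.sh
$ bsub -n4 -I srun hostname
In the first case, LSF-HPC with SLURM submits myjob.sh to SLURM and monitors it with the squeue command; in the second, the command is launched from the LSF execution host within the allocation.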
An LSF-HPC with SLURM job starter script for LSF queues is provided and configured by default on the
HP XC system to launch interactive jobs on the first allocated node. This ensures that interactive jobs
behave just as they would if they were batch jobs. The job starter script is discussed in more detail in “Job
Starter Scripts” (page 179).
The environment in which the job is launched contains SLURM and LSF-HPC with SLURM environment
variables that describe the job's allocation. SLURM srun commands in the user's job use the SLURM
environment variables to distribute the tasks throughout the allocation.
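As an illustration, a user job script might contain lines like the following, where ./my_app is a hypothetical executable; because SLURM_NPROCS is set in the job environment, srun typically starts one task per allocated processor without additional options:
#!/bin/sh
# Distribute the tasks across the SLURM allocation created for this job.
srun ./my_app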
The integration of LSF-HPC with SLURM has one drawback: the bsub command's -i option for providing
input to the user job is not supported. A workaround is to provide any file input directly to the job. The
SLURM srun command supports an --input option (also available in its short form as the -i option)
that provides input to all tasks.
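For example, rather than using bsub -i, a job can pass its input file to srun; the file name input.dat and the executable ./a.out are placeholders:
$ bsub -n4 -I srun --input=input.dat ./a.out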
15.2.1.1 Job Starter Scripts
LSF-HPC with SLURM dispatches all jobs locally. The default installation of LSF-HPC with SLURM on
the HP XC system provides a job starter script that is configured for use by all LSF queues. This job starter
script adjusts the LSB_HOSTS and LSB_MCPU_HOSTS environment variables to the correct resource values
in the allocation. Then, the job starter script uses the srun command to launch the user task on the first
node in the allocation.
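The following shell fragment is a simplified sketch of such a job starter script, not the script shipped with the HP XC system, and it omits the adjustment of the LSB_HOSTS and LSB_MCPU_HOSTS variables described above:
#!/bin/sh
# Launch the user command as a single task on one node of the SLURM allocation.
exec srun -n1 -N1 "$@"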
If this job starter script is not configured for a queue, user jobs begin execution locally on the LSF
execution host. In this case, it is recommended that the user job use one or more srun commands to make
use of the resources allocated to the job. Work done on the LSF execution host competes for core time with
the LSF-HPC with SLURM daemons and can affect the overall performance of LSF-HPC with SLURM
on the HP XC system.