LSF Version 7.3 - Using Platform LSF HPC
Use DJOB_RU_INTERVAL in an application profile in lsb.applications to
configure an interval in seconds used to update the resource usage for the tasks of a
parallel or distributed job. DJOB_RU_INTERVAL only applies to the blaunch
distributed application framework.
When DJOB_RU_INTERVAL is specified, the interval is scaled according to the
number of tasks in the job:
max(DJOB_RU_INTERVAL, 10) + host_factor
where host_factor = 0.01 * number of hosts allocated for the job
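The scaling formula above can be checked with a small calculation. The following is a
minimal sketch only: djob_interval is a hypothetical helper written for this example, not
an LSF command, and the input values are arbitrary.

```shell
#!/bin/sh
# Hypothetical helper (not part of LSF): evaluate
#   max(DJOB_RU_INTERVAL, 10) + 0.01 * number of hosts
# $1 = DJOB_RU_INTERVAL in seconds, $2 = number of hosts allocated for the job
djob_interval() {
    awk -v ru="$1" -v hosts="$2" 'BEGIN {
        base = (ru > 10) ? ru : 10          # max(DJOB_RU_INTERVAL, 10)
        printf "%.2f\n", base + 0.01 * hosts
    }'
}

djob_interval 5 200     # prints 12.00 (values below 10 are raised to 10)
djob_interval 60 128    # prints 61.28
```

Under this formula a configured value below 10 seconds is treated as 10, and each
allocated host adds one hundredth of a second to the interval.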
When DJOB_RU_INTERVAL is defined in an application profile, the
LSB_DJOB_RU_INTERVAL variable is set in the environment of the parallel or distributed
job. Do not manually change the value of LSB_DJOB_RU_INTERVAL.
By default, the interval is equal to SBD_SLEEP_TIME in lsb.params, where the
default value of SBD_SLEEP_TIME is 30 seconds.
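As an illustration, an application profile in lsb.applications that sets the interval
might look like the following sketch. The profile name djob_app, the description, and the
value 60 are hypothetical and chosen only for this example.

```
Begin Application
NAME             = djob_app
DESCRIPTION      = Example profile for blaunch jobs (hypothetical)
DJOB_RU_INTERVAL = 60
End Application
```

A job would then select the profile at submission with the bsub -app option, for example
bsub -app djob_app -n 4 blaunch myjob.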
How blaunch supports task geometry and process group files
The current support for task geometry in LSF requires the user submitting a job to
specify the desired task geometry by setting the environment variable
LSB_PJL_TASK_GEOMETRY in the submission environment before submitting the
job. LSF checks for LSB_PJL_TASK_GEOMETRY and modifies
LSB_MCPU_HOSTS appropriately.
The environment variable LSB_PJL_TASK_GEOMETRY is checked for all parallel
jobs. If LSB_PJL_TASK_GEOMETRY is set and a user submits a parallel job (a job that
requests more than 1 slot), LSF attempts to shape LSB_MCPU_HOSTS accordingly.
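The sketch below shows one way the variable might be set before submission. It assumes
the geometry syntax "{(task_ID,...)(task_ID,...)...}", in which each parenthesized group
lists the task IDs to be placed together; that syntax is not described in this section, so
treat it as an assumption. The geometry_summary helper is hypothetical and only counts
groups and tasks as a sanity check.

```shell
#!/bin/sh
# Assumed syntax: each (...) group lists the task IDs that run together.
# This example geometry places 8 tasks (IDs 0-7) in 4 groups.
LSB_PJL_TASK_GEOMETRY="{(2,5,7)(0,6)(1,3)(4)}"
export LSB_PJL_TASK_GEOMETRY

# Hypothetical helper (not part of LSF): summarize a geometry string.
geometry_summary() {
    # Count "(" characters to get the number of groups.
    groups=$(printf '%s' "$1" | tr -cd '(' | wc -c | tr -d ' ')
    # Replace every non-digit with a space and count the remaining words.
    tasks=$(printf '%s\n' "$1" | tr -c '0-9\n' ' ' | wc -w | tr -d ' ')
    echo "$groups groups, $tasks tasks"
}

geometry_summary "$LSB_PJL_TASK_GEOMETRY"   # prints: 4 groups, 8 tasks
```

Because the variable is exported in the submission shell, a bsub command run afterward
from the same shell sees it in the submission environment.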
Resource collection for all commands in a job script
Parallel and distributed jobs are typically launched with a job script. If your job script
runs multiple commands, you can ensure that resource usage is collected correctly for
all commands in the job script by configuring
LSF_HPC_EXTENSIONS=CUMULATIVE_RUSAGE in lsf.conf. Resource usage is then
accumulated across the commands in the job script, rather than being overwritten when
each command is executed.
Submitting jobs with blaunch
Use bsub to call blaunch, or to invoke an execution script that calls blaunch. The
blaunch command assumes that bsub -n implies one task per job slot.
◆ Submit a job:
bsub -n 4 blaunch myjob
◆ Submit a job to launch tasks on a specific host:
bsub -n 4 blaunch hostA myjob
◆ Submit a job with a host list:
bsub -n 4 blaunch -z "hostA hostB" myjob
◆ Submit a job with a host file:
bsub -n 4 blaunch -u ./hostfile myjob
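The host file used in the last example could be prepared as in the following sketch. It
assumes the host file lists one host name per line, which is not stated in this section;
hostA, hostB, and myjob are placeholder names. The script submits the job only if bsub is
found on the PATH, and otherwise prints the command it would run.

```shell
#!/bin/sh
# Write a host file with one host name per line (assumed format).
hostfile=./hostfile
printf '%s\n' hostA hostB > "$hostfile"

# Submission command from the example above.
cmd="bsub -n 4 blaunch -u $hostfile myjob"

# Submit only where LSF is installed; otherwise show the command.
if command -v bsub >/dev/null 2>&1; then
    $cmd
else
    echo "would run: $cmd"
fi
```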