LSF Version 7.3 - Platform LSF Configuration Reference
TASK_SWAPLIMIT: Enables enforcement of a virtual memory (swap) limit (bsub -v, bmod
-v, or SWAPLIMIT in lsb.queues) for individual tasks in a parallel job. If any parallel task
exceeds the swap limit, LSF terminates the entire job.
Example JOB_START events in lsb.events:
For a job submitted with
bsub -n 64 -R "span[ptile=32]" sleep 100
Without SHORT_EVENTFILE, a JOB_START event like the following would be logged in
lsb.events:
"JOB_START" "7.0" 1058989891 710 4 0 0 10.3 64 "hostA" "hostA" "hostA" "hostA" "hostA" "hostA"
"hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA"
"hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA"
"hostA" "hostA" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB"
"hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB"
"hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "" "" 0 "" 0
With SHORT_EVENTFILE, a JOB_START event would be logged in lsb.events with the
number of execution hosts (numExHosts field) changed from 64 to 2 and the execution host
list (execHosts field) shortened to "32*hostA" and "32*hostB":
"JOB_START" "7.0" 1058998174 812 4 0 0 10.3 2 "32*hostA" "32*hostB" "" "" 0 "" 0 ""
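The shortening shown above can be illustrated with a minimal Python sketch. It is not LSF code: the function name compress_exec_hosts is hypothetical, and the sketch assumes that consecutive repeats of a host name collapse into one "count*host" entry, which is the only behavior the example records show. How LSF writes a host that appears only once is not covered by the example, so the "1*host" form produced here is an assumption.

```python
from itertools import groupby


def compress_exec_hosts(hosts):
    """Collapse consecutive repeats of a host name into 'count*host' entries.

    Hypothetical helper for illustration; the '1*host' form for a single
    occurrence is an assumption, not documented LSF behavior.
    """
    return [f"{len(list(group))}*{host}" for host, group in groupby(hosts)]


# 64 slots spread over two hosts, as in the bsub -n 64 -R "span[ptile=32]" example
hosts = ["hostA"] * 32 + ["hostB"] * 32
short = compress_exec_hosts(hosts)
print(len(short), short)  # prints: 2 ['32*hostA', '32*hostB']
```

The length of the compressed list corresponds to the numExHosts value (2) and its entries to the execHosts field in the shortened record.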
Example JOB_FINISH records in lsb.acct:
For a job submitted with
bsub -n 64 -R "span[ptile=32]" sleep 100
Without SHORT_EVENTFILE, a JOB_FINISH event like the following would be logged in
lsb.acct:
"JOB_FINISH" "7.0" 1058990001 710 33054 33816578 64 1058989880 0 0 1058989891 "user1" "normal"
"span[ptile=32]" "" "" "hostA" "/scratch/user1/work" "" "" "" "1058989880.710" 0 64 "hostA"
"hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA"
"hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA"
"hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostA" "hostB" "hostB" "hostB" "hostB" "hostB"
"hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB"
"hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB" "hostB"
"hostB" "hostB" "hostB" 64 10.3 "" "sleep 100" 0.079999 0.270000 0 0 -1 0 0 0 0 0 0 0 -1 0 0 0 0
0 -1 "" "default" 0 64 "" "" 0 4304 6024 "" "" ""
With SHORT_EVENTFILE, a JOB_FINISH event like the following would be logged in
lsb.acct with the number of execution hosts (numExHosts field) changed from 64 to 2 and
the execution host list (execHosts field) shortened to "32*hostA" and "32*hostB":
"JOB_FINISH" "7.0" 1058998282 812 33054 33816578 64 1058998163 0 0 1058998174 "user1" "normal"
"span[ptile=32]" "" "" "hostA" "/scratch/user1/work" "" "" "" "1058998163.812" 0 2 "32*hostA"
"32*hostB" 64 10.3 "" "sleep 100" 0.039999 0.259999 0 0 -1 0 0 0 0 0 0 0 -1 0 0 0 0 0 -1 "" "default"
0 64 "" "" 0 4304 6024 "" "" "" "" 0
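A script that reads records written with SHORT_EVENTFILE may need to recover the per-slot host list from the shortened execHosts entries. The following Python sketch shows one way to do that. The function name expand_exec_hosts is hypothetical, and the sketch assumes every shortened entry has the form "count*host" as in the example above; entries without a numeric count are passed through unchanged.

```python
def expand_exec_hosts(entries):
    """Expand 'count*host' entries into one host name per slot.

    Hypothetical helper for illustration; entries that do not match the
    'count*host' pattern are kept as plain host names.
    """
    hosts = []
    for entry in entries:
        count, sep, name = entry.partition("*")
        if sep and count.isdigit():
            hosts.extend([name] * int(count))
        else:
            hosts.append(entry)
    return hosts


# execHosts entries taken from the shortened JOB_FINISH example
full = expand_exec_hosts(["32*hostA", "32*hostB"])
print(len(full))  # prints: 64
```

The expanded length (64) matches the number of processors requested with bsub -n 64, which gives a simple consistency check when processing shortened records.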
Example bjobs -l output without SHORT_PIDLIST:
bjobs -l displays all the PGIDs and PIDs for the job:
bjobs -l
Job <109>, User <user3>, Project <default>, Status <RUN>, Queue <normal>, Inte
ractive mode, Command <./myjob.sh>
Mon Jul 21 20:54:44: Submitted from host <hostA>, CWD <$HOME/LSF/jobs>;
RUNLIMIT
10.0 min of hostA
STACKLIMIT CORELIMIT MEMLIMIT
    5256 K   10000 K   5000 K
Mon Jul 21 20:54:51: Started on <hostA>;
Mon Jul 21 20:55:03: Resource usage collected.
MEM: 2 Mbytes; SWAP: 15 Mbytes
PGID: 256871; PIDs: 256871