LSF Version 7.3 - Platform LSF Configuration Reference
Understanding Platform LSF job exit information
Contents
• Why did my job exit?
• How LSF translates events into exit codes
• Application and system exit values
• LSF job termination reason logging
• Job termination by LSF exit information
• LSF RMS integration exit values
Why did my job exit?
LSF collects job information and reports the final status of a job. By convention, a job that
finishes normally reports a status of 0. Any non-zero status means that the job has exited
abnormally.
Most of the time, an abnormal job exit is caused by the job itself or by the system it ran on,
not by an LSF error. This document explains some of the information LSF provides about
abnormal job termination.
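The zero versus non-zero convention described above can be illustrated with a minimal Python sketch. The function names describe_exit and run_and_describe are hypothetical and are not part of LSF; the sketch only inspects the exit status of a locally run command and does not call any LSF command or API.

```python
import subprocess
import sys


def describe_exit(status: int) -> str:
    """Classify an exit status using the zero / non-zero convention."""
    if status == 0:
        return "finished normally"
    return f"exited abnormally (status {status})"


def run_and_describe(argv):
    """Run a command locally (not through LSF) and classify its exit status."""
    completed = subprocess.run(argv)
    return describe_exit(completed.returncode)


if __name__ == "__main__":
    # A child Python process that deliberately exits with status 3.
    print(run_and_describe([sys.executable, "-c", "import sys; sys.exit(3)"]))
```

A non-zero result from this check says only that the command failed; it does not say whether the job, the host, or LSF was the cause.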
How LSF translates events into exit codes
The following table summarizes LSF exit behavior for some common error conditions.
Error condition       LSF exit    Operating  System exit code  Meaning
                      code        system     equivalent
--------------------  ----------  ---------  ----------------  ----------------------------------------
Command not found     127         all        1 or 127          Command shell returns 1 if the command
                                                               is not found. If the command cannot be
                                                               found inside a job script, LSF returns
                                                               exit code 127.
Directory not         0           all        1                 LSF sends the output back to the user
available for output                                           through email if the directory is not
                                                               available for output (bsub -o).
LSF internal error    -127, 127   all        N/A               RES returns -127 or 127 for all internal
                                                               problems.
Out of memory         N/A         all        N/A               Exit code depends on the error handling
                                                               of the application itself.
LSF job states        0           all        N/A               Exit code 0 is returned for all job
                                                               states.
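As a rough illustration, the LSF exit code column of the table above can be turned into a small lookup. This is a sketch only: EXIT_CODE_HINTS and explain_lsf_exit_code are hypothetical names, and the entries simply transcribe the table rows. Several conditions share one code, so the lookup returns every candidate rather than a single cause.

```python
# Candidate explanations keyed by LSF exit code, transcribed from the table above.
# "Out of memory" has no fixed LSF exit code (N/A), so it has no entry here.
EXIT_CODE_HINTS = {
    0: [
        "Directory not available for output (output is emailed to the user)",
        "LSF job states (exit code 0 is returned for all job states)",
    ],
    127: [
        "Command not found inside a job script",
        "LSF internal error reported by RES",
    ],
    -127: [
        "LSF internal error reported by RES",
    ],
}


def explain_lsf_exit_code(code: int) -> list:
    """Return the table rows that list this exit code; empty if none do."""
    return list(EXIT_CODE_HINTS.get(code, []))


if __name__ == "__main__":
    for hint in explain_lsf_exit_code(127):
        print(hint)
```

An empty result means the code is not covered by this table; in that case the value most likely comes from the application or the operating system.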
Host failure
If an LSF server host fails, jobs running on that host are lost. No other jobs are affected. Jobs
can be submitted so that they are automatically rerun from the beginning or restarted from a
checkpoint on another host if they are lost because of a host failure.
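The submission options mentioned above can be sketched as a helper that assembles a bsub command line. This is an assumption-laden sketch: build_bsub_argv is a hypothetical helper, and it assumes that bsub -r marks a job as rerunnable and that bsub -k takes a checkpoint directory, so check both against the bsub reference for your LSF version. The helper only builds the argument list and does not submit anything.

```python
from typing import List, Optional


def build_bsub_argv(command: List[str],
                    rerunnable: bool = False,
                    checkpoint_dir: Optional[str] = None) -> List[str]:
    """Assemble a bsub argument list for a job that should survive host failure.

    Assumptions (verify against your bsub documentation):
      -r               submit the job as rerunnable
      -k <directory>   make the job checkpointable, using this directory
    """
    argv = ["bsub"]
    if rerunnable:
        argv.append("-r")
    if checkpoint_dir is not None:
        argv.extend(["-k", checkpoint_dir])
    # The job command and its arguments always come last.
    return argv + list(command)


if __name__ == "__main__":
    # "./my_job" and "/scratch/chkpnt" are placeholder values for illustration.
    print(build_bsub_argv(["./my_job"], rerunnable=True))
    print(build_bsub_argv(["./my_job"], checkpoint_dir="/scratch/chkpnt"))
```

Building the argument list separately from running it makes the chosen options easy to inspect or log before the job is submitted.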
If all of the hosts in a cluster go down, all running jobs are lost. When a host comes back up
and takes over as master, it reads the lsb.events file to get the state of all batch jobs. Jobs