LSF Version 7.3 - Administering Platform LSF
Automatic Job Requeue
When MAX_JOB_REQUEUE is set, if a job fails and its exit value falls into
REQUEUE_EXIT_VALUES, the number of times the job has been requeued is
increased by 1 and the job is requeued. When the requeue limit is reached, the
job is suspended with PSUSP status. If a job fails and its exit value is not
specified in REQUEUE_EXIT_VALUES, the default requeue behavior applies.
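The counting rule above can be sketched as a small model. This is an illustration only, not LSF code: the Job class, the handle_exit function, and the outcome strings are hypothetical names, and the sketch assumes that a job may be requeued up to MAX_JOB_REQUEUE times before the next matching failure suspends it. The exact boundary behavior should be checked against your LSF version.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical stand-in for an LSF job; not an LSF data structure."""
    requeue_count: int = 0
    status: str = "RUN"


def handle_exit(job, exit_value, requeue_exit_values, max_job_requeue):
    """Model the requeue decision for a failed job.

    Returns "requeued", "suspended", or "default".
    Assumption: the limit counts completed requeues, so a job is
    suspended on the first matching failure after the limit is used up.
    """
    if exit_value not in requeue_exit_values:
        # Exit value is not listed: the default requeue behavior applies,
        # which this sketch does not model.
        return "default"
    if job.requeue_count >= max_job_requeue:
        # Requeue limit reached: the job is suspended with PSUSP status.
        job.status = "PSUSP"
        return "suspended"
    # Matching exit value and limit not reached: count the requeue.
    job.requeue_count += 1
    job.status = "PEND"
    return "requeued"
```

With a limit of 2 and a single requeue exit value of 99, two failures with exit value 99 are requeued and the third suspends the job, while a failure with any other exit value falls through to the default behavior.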
Viewing the requeue retry limit
1 Run bjobs -l to display the job exit code and reason if the job requeue limit is
exceeded.
2 Run bhist -l to display the exit code and reason for finished jobs if the job
requeue limit is exceeded.
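The two steps above could be scripted roughly as follows. This is a minimal sketch: the function names are hypothetical, and passing a job ID after -l is an assumption about how you would target a single job. The output is returned as raw text because its layout is not described here.

```python
import shutil
import subprocess


def requeue_inspection_commands(job_id):
    """Build the two commands from the steps above for one job ID."""
    job = str(job_id)
    return [["bjobs", "-l", job], ["bhist", "-l", job]]


def run_inspection(job_id):
    """Run both commands and return their raw output keyed by command name.

    A value of None means the command was not found on PATH, for example
    on a host where the LSF environment has not been sourced.
    """
    outputs = {}
    for cmd in requeue_inspection_commands(job_id):
        if shutil.which(cmd[0]) is None:
            outputs[cmd[0]] = None
            continue
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
        outputs[cmd[0]] = result.stdout
    return outputs
```

Calling run_inspection(1234) on an LSF host returns the long-format text from both commands, which can then be searched for the exit code and reason.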
How job requeue retry limit is recovered
The job requeue limit is recovered when LSF is restarted and reconfigured. LSF
replays the job requeue limit from the JOB_STATUS event and its pending reason
in lsb.events.