LSF Version 7.3 - Administering Platform LSF
Sending a Signal to a Job
LSF uses signals to control jobs, either to enforce scheduling policies or in response to
user requests. The principal signals LSF uses are SIGSTOP to suspend a job, SIGCONT
to resume a job, and SIGKILL to terminate a job.
Occasionally, you may want to override the default actions. For example, instead of
suspending a job, you might want to kill or checkpoint it. You can override the
default job control actions by defining the JOB_CONTROLS parameter in your
queue configuration. Each queue can have its own job control actions.
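As an illustration, a queue definition that overrides the default actions might look
like the sketch below. It assumes the queue is defined in the lsb.queues file; the
queue name is hypothetical, and the signals shown are only one possible choice.

```
Begin Queue
QUEUE_NAME   = night
# Suspend with SIGTSTP instead of SIGSTOP so the job can catch the signal;
# the resume and terminate actions are spelled out for clarity.
JOB_CONTROLS = SUSPEND[SIGTSTP] RESUME[SIGCONT] TERMINATE[SIGTERM]
End Queue
```

Jobs in other queues would keep the default actions unless their queues define
JOB_CONTROLS as well.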
You can also send a signal directly to a job. You cannot send arbitrary signals to a
pending job; most signals are only valid for running jobs. However, LSF does allow
you to kill, suspend and resume pending jobs.
You must be the owner of a job or an LSF administrator to send signals to a job.
You use the bkill -s command to send a signal to a job. If you issue bkill without
the -s option, a SIGKILL signal is sent to the specified jobs to kill them. Twenty
seconds before SIGKILL is sent, SIGTERM and SIGINT are sent to give the job a
chance to catch the signals and clean up.
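A job script can use this grace period by trapping the two warning signals. The
following is a minimal sketch, not an LSF-supplied template: the scratch directory,
the file name, and the exit codes (the conventional 128 plus the signal number) are
assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical job script: the scratch directory and file names are
# illustrative, not part of LSF.
SCRATCH=$(mktemp -d)

cleanup() {
    # Remove temporary files while there is still time before SIGKILL.
    rm -rf "$SCRATCH"
}

# Catch the two warning signals; SIGKILL itself cannot be caught.
trap 'cleanup; exit 130' INT
trap 'cleanup; exit 143' TERM

# Placeholder for the real work of the job.
touch "$SCRATCH/partial.out"
```

Because SIGKILL cannot be trapped, any cleanup has to finish within the interval
between the warning signals and SIGKILL.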
On Windows, job control messages replace the SIGINT and SIGTERM signals, but
only customized applications are able to process them. Termination is implemented
by the TerminateProcess() system call.
Signals on different platforms
LSF translates signal numbers across different platforms because different host
types may have different signal numbering. The real meaning of a specific signal is
interpreted by the machine from which the bkill command is issued.
For example, if you send signal 18 from a SunOS 4.x host, it means SIGTSTP. If the
job is running on HP-UX and SIGTSTP is defined as signal number 25, LSF sends
signal 25 to the job.
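To check how the local host numbers a signal before sending it, the standard kill
utility can map a number back to a name. The sketch below assumes a POSIX shell; the
signal number and the job ID in the comment are examples only.

```shell
# Look up the name this host assigns to a signal number.
signum=18
signame=$(kill -l "$signum")
echo "On $(uname -s), signal $signum is SIG$signame"

# Passing the name instead of the number should avoid depending on the
# numbering of the host where bkill is run (job ID is hypothetical):
#   bkill -s "$signame" 3421
```

The name printed depends on the operating system of the host where the lookup runs,
which is the same host whose numbering bkill would use.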
Send a signal to a job
On most versions of UNIX, signal names and numbers are listed in the kill(1) or
signal(2) man pages. On Windows, only customized applications are able to
process job control messages specified with the -s option.
1 Run bkill -s signal job_id, where signal is either the signal name or the signal
number:
    bkill -s TSTP 3421
    Job <3421> is being signaled
The above example sends the TSTP signal to job 3421.
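When the same command is issued from scripts, a small wrapper can check its
arguments first. The helper below is a hypothetical sketch, not part of LSF: the
name send_signal and its error codes are assumptions, and it relies only on the
bkill -s form shown above.

```shell
# send_signal SIGNAL JOB_ID...
# Hypothetical helper: forwards a signal name or number to one or more jobs.
send_signal() {
    if [ "$#" -lt 2 ]; then
        echo "usage: send_signal signal job_id..." >&2
        return 2
    fi
    if ! command -v bkill >/dev/null 2>&1; then
        echo "bkill not found; is the LSF environment set up?" >&2
        return 127
    fi
    sig=$1
    shift
    # Pass the signal and the remaining arguments (job IDs) to bkill.
    bkill -s "$sig" "$@"
}
```

For instance, send_signal TSTP 3421 would issue the same command as the example
above, and send_signal CONT 3421 could later be used to send SIGCONT to the same job.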