Platform LSF Administration Guide Version 6.2
Chapter 6
Managing Jobs
Administering Platform LSF
Sending a Signal to a Job
LSF uses signals to control jobs, to enforce scheduling policies, or in response to user
requests. The principal signals LSF uses are SIGSTOP to suspend a job, SIGCONT to
resume a job, and SIGKILL to terminate a job.
Occasionally, you may want to override the default actions. For example, instead of
suspending a job, you might want to kill or checkpoint it. You can override the default
job control actions by defining the JOB_CONTROLS parameter in your queue
configuration. Each queue can have its own job control actions.
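As an illustration of such an override, a queue definition in lsb.queues might look
like the fragment below. This is a sketch under assumptions: the queue name
chkpnt_queue is hypothetical, the jobs in the queue are assumed to be checkpointable,
and the action keywords should be checked against the lsb.queues reference for your
LSF version.

```
Begin Queue
QUEUE_NAME   = chkpnt_queue
# Hypothetical example: checkpoint the job when LSF suspends it.
JOB_CONTROLS = SUSPEND[CHKPNT]
End Queue
```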
You can also send a signal directly to a job. You cannot send arbitrary signals to a
pending job; most signals are only valid for running jobs. However, LSF does allow you
to kill, suspend and resume pending jobs.
You must be the owner of a job or an LSF administrator to send signals to a job.
You use the bkill -s command to send a signal to a job. If you issue bkill without
the -s option, a SIGKILL signal is sent to the specified jobs to kill them. Twenty
seconds before SIGKILL is sent, SIGTERM and SIGINT are sent to give the job a
chance to catch the signals and clean up.
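A job can use the interval before SIGKILL only if it installs handlers for SIGTERM
and SIGINT. The following is a minimal sketch of a job script that does so. The names
SCRATCH_DIR, cleanup, and on_signal are illustrative and not part of LSF, and the
script assumes a POSIX shell.

```shell
#!/bin/sh
# Hypothetical job script: SCRATCH_DIR, cleanup, and on_signal are
# illustrative names, not part of LSF.
SCRATCH_DIR="${TMPDIR:-/tmp}/job_scratch.$$"

cleanup() {
    # Remove temporary files created by this job.
    rm -rf "$SCRATCH_DIR"
}

on_signal() {
    # Runs when SIGTERM or SIGINT arrives; SIGKILL cannot be caught.
    cleanup
    exit 1
}

# Install the handler for the two signals that precede SIGKILL.
trap on_signal TERM INT

mkdir -p "$SCRATCH_DIR"
# ... the job's real work would go here ...
```

Because SIGKILL cannot be trapped, any cleanup that has not finished when it arrives
is lost, so the handler should be kept short.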
On Windows, job control messages replace the SIGINT and SIGTERM signals, but only
customized applications are able to process them. Termination is implemented by the
TerminateProcess() system call.
Signals on different platforms
LSF translates signal numbers across different platforms because different host types
may have different signal numbering. The real meaning of a specific signal is interpreted
by the machine from which the bkill command is issued.
For example, if you send signal 18 from a SunOS 4.x host, it means SIGTSTP. If the job
is running on HP-UX and SIGTSTP is defined as signal number 25, LSF sends signal
25 to the job.
Sending a signal to a job
Run bkill -s signal job_id, where signal is either the signal name or the
signal number. For example:
% bkill -s TSTP 3421
Job <3421> is being signaled
This command sends the TSTP signal to job 3421.
On most versions of UNIX, signal names and numbers are listed in the kill(1) or
signal(2) man pages. On Windows, only customized applications are able to process
job control messages specified with the -s option.
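Because signal numbers vary between host types, it can help to confirm which signal
a number refers to on the submission host before passing it to bkill -s. The sketch
below uses the POSIX kill -l form for this. The helper name signal_name is
hypothetical and is not an LSF command.

```shell
#!/bin/sh
# signal_name is a hypothetical helper, not an LSF command.
# It prints the local host's name for a signal number, using the
# POSIX "kill -l <number>" form.
signal_name() {
    kill -l "$1"
}

# The resulting name could then be passed to bkill, for example:
#   bkill -s "$(signal_name 15)" 3421
signal_name 15
```

Passing the signal name rather than the number makes the intent explicit regardless
of which host issues the command.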