LSF Version 7.3 - Administering Platform LSF

Killing Jobs
The bkill command cancels pending batch jobs and sends signals to running jobs.
By default, on UNIX, bkill sends the SIGKILL signal to running jobs. Before
SIGKILL is sent, SIGINT and SIGTERM are sent to give the job a chance to catch
the signals and clean up. The signals are forwarded from mbatchd to sbatchd.
sbatchd waits for the job to exit before reporting the status. Because of these
delays, for a short period of time after the bkill command has been issued,
bjobs may still report that the job is running.
On Windows, job control messages replace the SIGINT and SIGTERM signals, and
termination is implemented by the TerminateProcess() system call.
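On UNIX, a job script can use the SIGINT and SIGTERM signals that precede SIGKILL to clean up after itself. The following is a minimal sketch of that idea, not an LSF-supplied script: the run_job function, the scratch-file path, and the sleep stand-in for real work are all hypothetical, and the demonstration at the end sends SIGTERM by hand instead of going through bkill.

```sh
#!/bin/sh
# Sketch only: run_job and the scratch-file path are hypothetical names.

run_job() {
    scratch=$1
    work_pid=""
    # On SIGINT or SIGTERM: stop the work, remove the scratch file, exit non-zero.
    trap '[ -n "$work_pid" ] && kill "$work_pid" 2>/dev/null; rm -f "$scratch"; exit 1' INT TERM
    : > "$scratch"
    # Stand-in for real work. Running it in the background and blocking in
    # wait lets the shell run the trap as soon as the signal arrives.
    sleep 30 &
    work_pid=$!
    wait "$work_pid"
}

# Local demonstration without LSF: start the job, wait until its scratch
# file exists (the trap is installed before the file is created), then
# send SIGTERM by hand.
scratch_file="${TMPDIR:-/tmp}/job_scratch.$$"
run_job "$scratch_file" &
job_pid=$!
while [ ! -e "$scratch_file" ]; do sleep 1; done
kill -TERM "$job_pid"
job_status=0
wait "$job_pid" || job_status=$?
echo "job exited with status $job_status"
```

Run locally, the demonstration is expected to report exit status 1 and leave no scratch file behind. SIGKILL itself cannot be caught, so any cleanup has to finish in the handler for the earlier signals.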
Kill a job
1 Run bkill job_ID. For example, the following command kills job 3421:
bkill 3421
Job <3421> is being terminated
Kill multiple jobs
1 Run bkill 0 to kill all pending jobs in the cluster or use bkill 0 with the -g,
-J, -m, -q, or -u options to kill all jobs that satisfy these options.
The following command kills all jobs dispatched to the hostA host:
bkill -m hostA 0
Job <267> is being terminated
Job <268> is being terminated
Job <271> is being terminated
The following command kills all jobs in the groupA job group:
bkill -g groupA 0
Job <2083> is being terminated
Job <2085> is being terminated
Kill a large number of jobs rapidly
Killing multiple jobs with bkill 0 and other commands is usually sufficient for
moderate numbers of jobs. However, killing a large number of jobs (more than
approximately 1000) can take a long time to finish.
1 Run bkill -b to kill a large number of jobs faster than bkill does without
this option. However, jobs killed in this manner are not logged to lsb.acct.
Local pending jobs are killed immediately and cleaned up as soon as possible,
ignoring the time interval specified by CLEAN_PERIOD in lsb.params.
Other jobs are killed as soon as possible but cleaned up normally (after the
CLEAN_PERIOD time interval).