Command Reference Guide
Platform LSF Command Reference
bchkpnt
checkpoints one or more checkpointable jobs
Synopsis
bchkpnt [-f] [-k] [-p minutes | -p 0]
job_ID | "job_ID[index_list]" ...
bchkpnt [-f] [-k] [-p minutes | -p 0] -J job_name
| -m host_name | -m host_group | -q queue_name
| -u "user_name" | -u all [0]
bchkpnt -h | -V
Description
Checkpoints the most recently submitted running or suspended checkpointable
job.
LSF administrators and root can checkpoint jobs submitted by other users.
Jobs continue to execute after they have been checkpointed.
LSF invokes the echkpnt(8) executable found in LSF_SERVERDIR to perform the
checkpoint.
Only running members of a chunk job can be checkpointed. For chunk jobs in
WAIT state, mbatchd rejects the checkpoint request.
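As an illustration of the first form in the Synopsis, the following sketch checkpoints a single job and then selected elements of a job array. The job IDs 1234 and 5678 and the index list 1-3 are hypothetical. The first command is not part of LSF: it defines a stand-in that only echoes the command line when bchkpnt is not installed, so the sketch can be tried outside a cluster.

```shell
# Stand-in for trying this outside an LSF cluster (an assumption of this sketch):
# if bchkpnt is not on PATH, define a function that only echoes the command.
command -v bchkpnt >/dev/null 2>&1 || bchkpnt() { echo "bchkpnt $*"; }

# Checkpoint one job by its job ID (1234 is a hypothetical ID).
bchkpnt 1234

# Checkpoint elements 1 to 3 of a job array (5678 is a hypothetical ID).
# The quotes keep the shell from interpreting the square brackets.
bchkpnt "5678[1-3]"
```

Quoting the array form matches the Synopsis, which shows "job_ID[index_list]" in double quotes.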
Options
0 (Zero). Checkpoints all of the jobs that satisfy other specified criteria.
-f Forces a job to be checkpointed even if non-checkpointable conditions exist (these
conditions are OS-specific).
-k Kills a job after it has been successfully checkpointed.
-p minutes | -p 0 Enables periodic checkpointing and specifies the checkpoint period, or modifies
the checkpoint period of a checkpointed job. Specify -p 0 (zero) to disable periodic
checkpointing.
Checkpointing is a resource-intensive operation. To allow your job to make
progress while still providing fault tolerance, specify a checkpoint period of 30
minutes or longer.
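The guidance on checkpoint periods can be sketched as a small wrapper around -p. The helper name set_checkpoint_period and the job ID 1234 are hypothetical, and refusing periods shorter than 30 minutes is a choice made by this sketch, not a rule enforced by LSF. The first command is a stand-in that only echoes the command line when bchkpnt is not installed.

```shell
# Stand-in for trying this outside an LSF cluster (an assumption of this sketch).
command -v bchkpnt >/dev/null 2>&1 || bchkpnt() { echo "bchkpnt $*"; }

# Hypothetical helper: set the checkpoint period of a job, in minutes.
# A period of 0 disables periodic checkpointing; any other value below 30
# is refused, following the recommendation of 30 minutes or longer.
set_checkpoint_period() {
    minutes=$1
    job_id=$2
    if [ "$minutes" -ne 0 ] && [ "$minutes" -lt 30 ]; then
        echo "period of $minutes minutes is shorter than 30" >&2
        return 1
    fi
    bchkpnt -p "$minutes" "$job_id"
}

set_checkpoint_period 120 1234   # checkpoint job 1234 every 120 minutes
set_checkpoint_period 0 1234     # disable periodic checkpointing for job 1234
```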
-J job_name Checkpoints only jobs that have the specified job name.
-m host_name | -m host_group
Checkpoints only jobs dispatched to the specified hosts.
-q queue_name
Checkpoints only jobs dispatched from the specified queue.
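The selection options can be combined with 0 as in the sketch below. The queue name night, the host name hostA, and the job name nightly_sim are hypothetical. Reading the Description together with the 0 option suggests that, without 0, only the most recently submitted matching job is checkpointed; treat that as an interpretation rather than a guarantee. The first command is a stand-in that only echoes the command line when bchkpnt is not installed.

```shell
# Stand-in for trying this outside an LSF cluster (an assumption of this sketch).
command -v bchkpnt >/dev/null 2>&1 || bchkpnt() { echo "bchkpnt $*"; }

# Checkpoint all checkpointable jobs dispatched from the queue "night";
# the trailing 0 selects every job that matches the other criteria.
bchkpnt -q night 0

# Checkpoint all checkpointable jobs dispatched to the host "hostA".
bchkpnt -m hostA 0

# No trailing 0: selects by job name only.
bchkpnt -J nightly_sim
```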
-u "user_name" | -u all