Platform LSF Administration Guide Version 6.2

Chapter 25
Job Checkpoint, Restart, and Migration
Administering Platform LSF
Making Jobs Checkpointable
Making a job checkpointable involves specifying the location of a checkpoint directory
to LSF. This can be done manually on the command line or automatically through
configuration.
Manually
Manually making a job checkpointable involves specifying the checkpoint directory on
the command line. LSF will create the directory if it does not exist. A job can be made
checkpointable at job submission or after submission.
At job submission
Use the -k "checkpoint_dir" option of bsub to specify the checkpoint directory
for a job at submission. For example, to specify my_dir as the checkpoint
directory for my_job:
% bsub -k "my_dir" my_job
Job <123> is submitted to default queue <default>.
After job submission
Use the -k "checkpoint_dir" option of bmod to specify the checkpoint directory
for a job after submission. For example, to specify my_dir as the checkpoint
directory for a job with job ID 123:
% bmod -k "my_dir" 123
Parameters of job <123> are being changed
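The two manual cases differ only in the command used: bsub for a new job, bmod for a job that already has an ID. As a minimal sketch of that choice, the helper below prints the command line without running it. The name chkpnt_cmd is hypothetical and not part of LSF, and the rule "an all-digit argument is a job ID" is an assumption made for illustration.

```shell
# Hypothetical helper (not an LSF command): print the command that would
# make a job checkpointable, without executing it.
# Usage: chkpnt_cmd checkpoint_dir job_command_or_id
chkpnt_cmd() {
    if [ "$#" -ne 2 ] || [ -z "$2" ]; then
        echo "usage: chkpnt_cmd checkpoint_dir job_command_or_id" >&2
        return 1
    fi
    case $2 in
        # Contains a non-digit character: treat it as a command to submit.
        *[!0-9]*) printf 'bsub -k "%s" %s\n' "$1" "$2" ;;
        # Digits only: assume it is the ID of an already submitted job.
        *)        printf 'bmod -k "%s" %s\n' "$1" "$2" ;;
    esac
}

chkpnt_cmd my_dir my_job   # prints: bsub -k "my_dir" my_job
chkpnt_cmd my_dir 123      # prints: bmod -k "my_dir" 123
```

A job command whose name consists only of digits would be mistaken for a job ID, so the sketch is meant for reviewing the command line, not as a replacement for calling bsub or bmod directly.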
Automatically
Automatically making a job checkpointable involves submitting the job to a queue that
is configured for checkpointable jobs. To configure a queue, edit lsb.queues and
set the CHKPNT parameter of the queue to the checkpoint directory. Unlike the
manual method, the checkpoint directory must already exist; LSF will not create it.
For example, to configure a queue for checkpointable jobs using a directory named
my_dir:
Begin Queue
...
CHKPNT=my_dir
DESCRIPTION = Make jobs checkpointable using "my_dir"
...
End Queue
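Because LSF does not create the directory named in CHKPNT, it has to be in place before jobs are submitted to the queue. The following is a minimal sketch of one way to prepare it. The function name ensure_chkpnt_dir is hypothetical, my_dir mirrors the example above, and the writability check applies only to the user running the script, not necessarily to the users who submit jobs.

```shell
# Hypothetical helper: make sure a queue's checkpoint directory exists.
ensure_chkpnt_dir() {
    dir=$1
    # mkdir -p creates missing parent directories and succeeds if the
    # directory is already present.
    mkdir -p "$dir" || return 1
    # Report success only if the path is a directory the current user can write to.
    [ -d "$dir" ] && [ -w "$dir" ]
}

# my_dir mirrors the CHKPNT value in the example queue.
if ensure_chkpnt_dir "my_dir"; then
    echo "checkpoint directory ready: my_dir"
else
    echo "checkpoint directory not usable: my_dir" >&2
fi
```

Changes to lsb.queues normally take effect only after the batch configuration is reloaded, typically with badmin reconfig.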