Platform LSF Administration Guide Version 6.2

The Checkpoint Directory
A checkpoint directory must be specified for every checkpointable job. LSF uses it to
store the files needed to restart the job. The directory must be writable by the job owner.
To restart the job on another host (job migration), the directory must be accessible from
both hosts.
LSF does not delete the checkpoint files; checkpoint file maintenance is the user’s
responsibility.
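Because old checkpoint files are never removed automatically, it can help to review the checkpoint directory from time to time. The following Python sketch is an illustration only and is not part of LSF. It assumes that each job's files sit in a subdirectory named with the numeric job ID, and the function name stale_checkpoint_dirs and the age threshold are hypothetical. It only lists candidates and deletes nothing.

```python
import os
import time

def stale_checkpoint_dirs(chkpnt_dir, max_age_days):
    """Return per-job subdirectories not modified within max_age_days.

    Assumption: per-job subdirectories are named with a numeric job ID.
    A directory's modification time changes only when entries are added
    to or removed from it, so treat the result as a list of candidates.
    """
    cutoff = time.time() - max_age_days * 86400
    stale = []
    for entry in os.scandir(chkpnt_dir):
        # Skip anything whose name is not a plain number.
        if not entry.name.isdigit():
            continue
        # Skip symbolic links and anything that is not a real directory.
        if entry.is_symlink() or not entry.is_dir(follow_symlinks=False):
            continue
        if entry.stat(follow_symlinks=False).st_mtime < cutoff:
            stale.append(entry.path)
    return sorted(stale)
```

For example, stale_checkpoint_dirs("my_dir", 30) would return the job subdirectories of my_dir that have not changed in the last 30 days. Whether they are safe to delete is for the job owner to decide.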
LSF writes the checkpoint file to a subdirectory of the checkpoint directory, named with
the job ID of the job being checkpointed. This allows LSF to checkpoint multiple
jobs to the same checkpoint directory. For example, if you specify a checkpoint
directory called my_dir and job 123 is checkpointed, LSF saves the
checkpoint file in:
my_dir/123/
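The naming rule above can be expressed as a short Python sketch. The helper name job_checkpoint_dir is hypothetical and is not an LSF function; it only shows how the per-job path is formed from the checkpoint directory and the job ID.

```python
import os

def job_checkpoint_dir(chkpnt_dir, job_id):
    """Build the per-job subdirectory path: <chkpnt_dir>/<job_id>/."""
    # The empty final component makes os.path.join add a trailing separator.
    return os.path.join(chkpnt_dir, str(job_id), "")

print(job_checkpoint_dir("my_dir", 123))  # my_dir/123/ on POSIX systems
```

Because the job ID is part of the path, two jobs that share my_dir never write to the same subdirectory.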
When LSF restarts a checkpointed job, it renames the job's checkpoint subdirectory using
the job ID of the new job and creates a symbolic link from the old checkpoint directory
to the new one. For example, if a job with job ID 123 is restarted with job ID 456, the
checkpoint directory is renamed to:
my_dir/456/
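To make the resulting layout concrete, the following Python sketch reproduces the rename and the symbolic link in a temporary directory. It is a simulation of the end state described above, not LSF's own code. The function name simulate_restart and the file name chk.dat are hypothetical, and using a relative link target is an assumption.

```python
import os
import tempfile

def simulate_restart(chkpnt_dir, old_job_id, new_job_id):
    """Rename the old job's subdirectory and link the old name to the new one."""
    old_path = os.path.join(chkpnt_dir, str(old_job_id))
    new_path = os.path.join(chkpnt_dir, str(new_job_id))
    os.rename(old_path, new_path)
    # Assumption: a relative link target; the old name now points to the new one.
    os.symlink(str(new_job_id), old_path, target_is_directory=True)
    return new_path

with tempfile.TemporaryDirectory() as my_dir:
    os.mkdir(os.path.join(my_dir, "123"))
    # Hypothetical checkpoint file name, used only for this demonstration.
    with open(os.path.join(my_dir, "123", "chk.dat"), "w") as f:
        f.write("state")
    simulate_restart(my_dir, 123, 456)
    print(sorted(os.listdir(my_dir)))                     # ['123', '456']
    print(os.path.islink(os.path.join(my_dir, "123")))    # True
    print(os.readlink(os.path.join(my_dir, "123")))       # 456
```

After the restart, paths that still refer to my_dir/123/ continue to resolve through the link, while the files themselves are under my_dir/456/.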