Platform LSF Administration Guide Version 6.2

Chapter 25
Job Checkpoint, Restart, and Migration
Checkpointing a Job
Before LSF can checkpoint a job, the job must be made checkpointable. LSF provides
automatic and manual controls to make jobs checkpointable and to checkpoint jobs.
A checkpoint directory must always be specified for a checkpointable job.
Optionally, a checkpoint period can be specified to enable periodic
checkpointing.
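As a minimal sketch of how the directory and the optional period fit together, the helper below assembles the argument for the bsub -k option and prints the resulting submission commands without running them. It assumes the -k "checkpoint_dir [checkpoint_period]" form of bsub, with the period given in minutes. The helper name chkpnt_opt, the directory my_dir, the period 240, and the job my_job are hypothetical values chosen for illustration.

```shell
# Build the argument for bsub -k from a directory and an optional period.
# chkpnt_opt is a hypothetical helper, not an LSF command.
chkpnt_opt() {
    dir=$1
    period=$2
    # The checkpoint directory is mandatory.
    if [ -z "$dir" ]; then
        echo "checkpoint directory is required" >&2
        return 1
    fi
    # The period is optional; when present it enables periodic checkpointing.
    if [ -n "$period" ]; then
        printf '%s %s\n' "$dir" "$period"
    else
        printf '%s\n' "$dir"
    fi
}

# Dry run: print the submission commands instead of executing them.
echo "bsub -k \"$(chkpnt_opt my_dir)\" my_job"
echo "bsub -k \"$(chkpnt_opt my_dir 240)\" my_job"
```

The first printed command makes the job checkpointable only; the second also requests a checkpoint every 240 minutes.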
When a job is checkpointed, LSF performs the following actions:
1. Stops the job if it is running
2. Creates the checkpoint file in the checkpoint directory
3. Restarts the job
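The sequence above is what a manual checkpoint request triggers. The following dry-run sketch assumes the bchkpnt command, which takes a job ID, and its -k option, which kills the job after a successful checkpoint instead of letting it continue. The helper name checkpoint_cmd and the job ID 1234 are hypothetical; the commands are printed, not executed.

```shell
# checkpoint_cmd is a hypothetical helper that prints a bchkpnt command line.
# Usage: checkpoint_cmd JOB_ID [kill]
checkpoint_cmd() {
    job_id=$1
    mode=$2
    # Accept only a plain numeric job ID in this sketch.
    case $job_id in
        ''|*[!0-9]*) echo "job ID must be a number" >&2; return 1 ;;
    esac
    if [ "$mode" = "kill" ]; then
        # -k: kill the job after the checkpoint succeeds instead of resuming it
        echo "bchkpnt -k $job_id"
    else
        echo "bchkpnt $job_id"
    fi
}

checkpoint_cmd 1234
checkpoint_cmd 1234 kill
```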
Prerequisites
LSF can create a checkpoint for any eligible job. Review the discussion about
“Approaches to Checkpointing” on page 395 to determine if your application and
environment are suitable for checkpointing.
In this section
“The Checkpoint Directory” on page 400
“Making Jobs Checkpointable” on page 401
“Manually Checkpointing Jobs” on page 402
“Enabling Periodic Checkpointing” on page 403
“Automatically Checkpointing Jobs” on page 404