Platform LSF Administration Guide Version 6.2
Chapter 25
Job Checkpoint, Restart, and Migration
Checkpointing a Job
Before LSF can checkpoint a job, the job must be made checkpointable. LSF provides
automatic and manual controls both for making jobs checkpointable and for
checkpointing them.
You must always specify a checkpoint directory for a checkpointable job.
Optionally, you can also specify a checkpoint period to enable periodic
checkpointing.
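As an illustration of the required directory and the optional period, the following minimal sketch builds a submission command line without running it. It assumes the `bsub -k "checkpoint_dir [checkpoint_period]"` form, with the period given in minutes. The helper name `bsub_checkpoint_args`, the directory `/share/chkpnt`, and the job command `my_job` are hypothetical and used only for illustration.

```python
import shlex


def bsub_checkpoint_args(command, chkpnt_dir, period_minutes=None):
    """Build (but do not run) a bsub argument list for a checkpointable job.

    Assumption: the job is made checkpointable with
    bsub -k "checkpoint_dir [checkpoint_period]", period in minutes.
    """
    if not chkpnt_dir:
        # A checkpoint directory must always be specified.
        raise ValueError("a checkpoint directory is required")
    k_value = chkpnt_dir
    if period_minutes is not None:
        if period_minutes <= 0:
            raise ValueError("the checkpoint period must be positive")
        # The optional period enables periodic checkpointing.
        k_value = f"{chkpnt_dir} {period_minutes}"
    return ["bsub", "-k", k_value] + shlex.split(command)


# Hypothetical directory and job command, shown as a shell-quoted string.
print(shlex.join(bsub_checkpoint_args("my_job", "/share/chkpnt", 240)))
```

Returning an argument list rather than a single string keeps the directory and period together as one `-k` value, which is why the printed form quotes them as a single argument.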
When a job is checkpointed, LSF performs the following actions:
1. Stops the job if it is running
2. Creates the checkpoint file in the checkpoint directory
3. Restarts the job
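The three actions above can be sketched as a toy model. This is not how LSF is implemented: the `ToyJob` class, its state names, and the checkpoint file name and JSON contents are all hypothetical, chosen only to make the stop, write, restart ordering visible.

```python
import json
import os
import tempfile


class ToyJob:
    """Hypothetical stand-in for a job; not an LSF data structure."""

    def __init__(self, job_id):
        self.job_id = job_id
        self.state = "RUN"
        self.progress = 0
        self.events = []

    def checkpoint(self, chkpnt_dir):
        # 1. Stop the job if it is running.
        if self.state == "RUN":
            self.state = "STOPPED"
            self.events.append("stop")
        # 2. Create the checkpoint file in the checkpoint directory.
        #    The file name and JSON layout are assumptions for this sketch.
        os.makedirs(chkpnt_dir, exist_ok=True)
        path = os.path.join(chkpnt_dir, f"job_{self.job_id}.chk")
        with open(path, "w") as f:
            json.dump({"job_id": self.job_id, "progress": self.progress}, f)
        self.events.append("write")
        # 3. Restart the job (toy simplification: it always returns to RUN).
        self.state = "RUN"
        self.events.append("restart")
        return path


with tempfile.TemporaryDirectory() as chkpnt_dir:
    job = ToyJob(101)
    job.progress = 42
    job.checkpoint(chkpnt_dir)
    print(job.events)
```

Stopping the job before writing means the saved state cannot change while the file is being created, which is the reason for the ordering of the steps.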
Prerequisites
LSF can create a checkpoint for any eligible job. Review the discussion in
“Approaches to Checkpointing” on page 395 to determine whether your application
and environment are suitable for checkpointing.
In this section
◆ “The Checkpoint Directory” on page 400
◆ “Making Jobs Checkpointable” on page 401
◆ “Manually Checkpointing Jobs” on page 402
◆ “Enabling Periodic Checkpointing” on page 403
◆ “Automatically Checkpointing Jobs” on page 404