LSF Version 7.3 - Platform LSF Configuration Reference
Configuration file Parameter and syntax Behavior
lsb.queues
CHKPNT=chkpnt_dir [chkpnt_period]
•
All jobs submitted to the queue are
checkpointable.
•
The specified checkpoint directory must
already exist. LSF will not create the
checkpoint directory.
•
The user account that submits the job
must have read and write permissions for
the checkpoint directory.
•
For the job to restart on another execution
host, both the original and new hosts must
have network connectivity to the
checkpoint directory.
•
If the queue administrator specifies a
checkpoint period (chkpnt_period, in minutes),
LSF creates a checkpoint file at that interval
during job execution.
•
If a user specifies a checkpoint directory and
checkpoint period at the job level with bsub
-k, the job-level values override the queue-
level values.
RERUNNABLE=Y
•
If the execution host becomes unavailable,
LSF reruns the job from the beginning on a
different host.
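As a rough illustration of the two lsb.queues parameters above, the
following fragment sketches one checkpointable queue and one rerunnable
queue. The queue names, the directory /share/lsf/chkpnt, and the
30-minute period are hypothetical values chosen for the example, not
defaults. The directory is assumed to exist already and to be reachable
from every execution host.

```
# lsb.queues (illustrative sketch; names and paths are assumptions)

Begin Queue
QUEUE_NAME  = chkpnt_example
# CHKPNT=chkpnt_dir [chkpnt_period]
# Directory must already exist; period is given in minutes.
CHKPNT      = /share/lsf/chkpnt 30
DESCRIPTION = Jobs in this queue are checkpointable
End Queue

Begin Queue
QUEUE_NAME  = rerun_example
# If the execution host becomes unavailable, the job is rerun
# from the beginning on a different host.
RERUNNABLE  = Y
DESCRIPTION = Jobs in this queue are rerunnable
End Queue
```

A job submitted with bsub -k and its own checkpoint directory and
period would use those job-level values in place of the ones set for
chkpnt_example.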
lsb.applications
CHKPNT_DIR=chkpnt_dir
•
Specifies the checkpoint directory for
automatic checkpointing for the application.
To enable automatic checkpointing for an
application profile, administrators must
specify a checkpoint directory in the
configuration of that application profile.
•
If CHKPNT_PERIOD,
CHKPNT_INITPERIOD, or
CHKPNT_METHOD is set in an application
profile but CHKPNT_DIR is not set, LSF
issues a warning message and ignores those
settings.
•
The checkpoint directory is the directory
where the checkpoint files are created.
Specify an absolute path or a path relative to
the current working directory for the job. Do
not use environment variables in the directory
path.
•
If checkpoint-related configuration is
specified in both the queue and an application
profile, the application profile setting
overrides the queue-level configuration.
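A minimal sketch of an application profile that satisfies the
CHKPNT_DIR requirements described above is shown here. The profile name,
the directory path, and the CHKPNT_PERIOD value are hypothetical, and
the period is assumed to be expressed in minutes, as it is for the
queue-level parameter.

```
# lsb.applications (illustrative sketch; names and paths are assumptions)

Begin Application
NAME          = chkpnt_app_example
# Absolute path, no environment variables in the directory path.
CHKPNT_DIR    = /share/lsf/chkpnt/app
# Ignored with a warning if CHKPNT_DIR is not also set.
CHKPNT_PERIOD = 30
DESCRIPTION   = Application profile with automatic checkpointing
End Application
```

If a job using this profile runs in a queue that also defines CHKPNT,
the application profile settings take precedence over the queue-level
ones.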
CHKPNT_INITPERIOD=init_chkpnt_period
Feature: Job migration