LSF Version 7.3 - Using Platform LSF HPC

Examples
Sequential FLUENT batch job with checkpoint and restart
% bsub -a fluent -k "/home/username 60" fluent 3d -g -i journal_file -lsf
Submits a job that uses the echkpnt.fluent and erestart.fluent checkpoint/restart
methods, with /home/username as the checkpoint directory and a 60-minute interval
between automatic checkpoints. FLUENT checks whether a checkpoint trigger file,
/home/username/exit or /home/username/check, exists.
% bchkpnt job_ID
echkpnt creates the checkpoint trigger file /home/username/check and waits
until the file is removed and the checkpoint is successful. FLUENT writes a case
and data file, and a restart journal file, at the end of its current iteration.
The files are saved in /home/username/job_ID and FLUENT continues to iterate.
Use bjobs to verify that the job is still running after the checkpoint.
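The trigger-file handshake described above can be pictured as a simple polling
loop. The following is a minimal sketch only, not the actual echkpnt
implementation: the function name wait_for_trigger_removal, the one-second
polling interval, and the timeout argument are all assumptions made for
illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of LSF): wait until a checkpoint trigger
# file disappears, which the text above treats as the sign that FLUENT
# has finished writing its checkpoint files.
#   $1 = trigger file path, e.g. /home/username/check
#   $2 = timeout in seconds (assumed default: 300)
# Returns 0 once the file is gone, 1 if the timeout expires first.
wait_for_trigger_removal() {
    trigger=$1
    timeout=${2:-300}
    elapsed=0
    while [ -e "$trigger" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

For example, wait_for_trigger_removal /home/username/check 600 would block
for up to ten minutes while the trigger file is still present.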
% bchkpnt -k job_ID
echkpnt creates the checkpoint trigger file /home/username/exit and waits
until the file is removed and the checkpoint is successful. FLUENT writes a case
and data file, and a restart journal file, at the end of its current iteration.
The files are saved in /home/username/job_ID and FLUENT exits.
Use bjobs to verify that the job is no longer running after the checkpoint.
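The bjobs check can be scripted. The sketch below assumes the default bjobs
listing, in which the first line is a header and the columns begin with
JOBID, USER, and STAT; the function name job_stat is hypothetical. It reads
the listing on standard input so that it can be tried without a running
cluster.

```shell
#!/bin/sh
# Hypothetical helper (not part of LSF): print the STAT column for one job.
# Assumes default bjobs output: a header line, then one line per job with
# JOBID in column 1 and STAT in column 3.
#   $1 = job ID to look up
# Prints nothing if the job ID is not listed.
job_stat() {
    awk -v id="$1" 'NR > 1 && $1 == id { print $3; exit }'
}
```

A typical use would be bjobs job_ID | job_stat job_ID. A result of RUN
suggests the job is still running after bchkpnt; any other result, or no
result, suggests it is no longer running, as expected after bchkpnt -k.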
% brestart /home/username/ job_ID
Starts a FLUENT job using the latest case and data files in
/home/username/job_ID. The restart journal file
/home/username/job_ID/#restart.inp is used to instruct FLUENT to
read the latest case and data files and continue iterating.
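Because the restart depends on the journal file #restart.inp, it can be worth
confirming that the file exists before calling brestart. The following is a
minimal sketch; the function name has_restart_journal is hypothetical, and it
checks only that the file is present, not that its contents are valid.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of LSF): succeed only if the
# restart journal file exists for the given job.
#   $1 = checkpoint directory, e.g. /home/username
#   $2 = job ID
has_restart_journal() {
    [ -f "$1/$2/#restart.inp" ]
}
```

For example, has_restart_journal /home/username job_ID && brestart
/home/username/ job_ID would attempt the restart only when the journal file
is present.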
Parallel FLUENT VMPI version batch job with checkpoint and restart on 4 CPUs
% bsub -a fluent -k "/home/username 60" -n 4 fluent 3d -t4 -pvmpi -g -i journal_file -lsf
% bchkpnt -k job_ID
Forces FLUENT to write a case and data file, and a restart journal file, at the
end of its current iteration. The files are saved in /home/username/job_ID and
FLUENT exits.
% brestart /home/username/ job_ID
Starts a FLUENT job using the latest case and data files in
/home/username/job_ID. The restart journal file
/home/username/job_ID/#restart.inp is used to instruct FLUENT to
read the latest case and data files and continue iterating.
The parallel job is restarted using the same number of processors (4) requested in
the original bsub submission.
% bmig -m hostA 0
All jobs on hostA are checkpointed and moved to another host.