LSF Version 7.3 - Using Platform LSF HPC
Using LSF with LSTC LS-Dyna
LSF is integrated with products from Livermore Software Technology Corporation
(LSTC). LS-Dyna jobs can use the checkpoint and restart features of LSF and take
advantage of both SMP and distributed MPP parallel computation.
To submit LS-Dyna jobs through LSF, you only need to make sure that your jobs are
checkpointable.
See Administering Platform LSF for more information about checkpointing in LSF.
◆ Platform LSF
◆ LS-Dyna version 960 and higher, available from LSTC
◆ Hardware vendor-supplied MPI environment for network computing
◆ LSF MPI integration
Configuring LSF for LS-Dyna jobs
During installation, lsfinstall adds the Boolean resource ls_dyna to the Resource
section of lsf.shared.
LSF also installs the echkpnt.ls_dyna and erestart.ls_dyna files in LSF_SERVERDIR.
If only some of your hosts can accept LS-Dyna jobs, configure the Host section of
lsf.cluster.cluster_name to identify those hosts.
Edit the LSF_ENVDIR/conf/lsf.cluster.cluster_name file and add the ls_dyna
resource to the hosts that can run LS-Dyna jobs:
Begin Host
HOSTNAME   model   type   server   r1m   mem   swp   RESOURCES
...
hostA      !       !      1        3.5   ()    ()    ()
hostB      !       !      1        3.5   ()    ()    (ls_dyna)
hostC      !       !      1        3.5   ()    ()    ()
...
End Host
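As a quick check of a Host section like the one above, the following Python sketch lists the hosts that advertise the ls_dyna resource. It is a minimal illustration, not an LSF tool: the function name hosts_with_resource and the inline sample text are hypothetical, and the parser assumes that the resource list is the last parenthesized field on each host line.

```python
import re

# Sample Host section, copied from the example above (without the "..." lines).
SAMPLE = """\
Begin Host
HOSTNAME model type server r1m mem swp RESOURCES
hostA ! ! 1 3.5 () () ()
hostB ! ! 1 3.5 () () (ls_dyna)
hostC ! ! 1 3.5 () () ()
End Host
"""

def hosts_with_resource(text, resource):
    """Return the host names whose last parenthesized field lists `resource`."""
    hosts = []
    in_host = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        words = [w.lower() for w in line.split()]
        if words == ["begin", "host"]:
            in_host = True
            continue
        if words == ["end", "host"]:
            in_host = False
            continue
        if not in_host or words[0] == "hostname":
            continue  # outside the section, or the column header line
        groups = re.findall(r"\(([^)]*)\)", line)
        if groups and resource in groups[-1].split():
            hosts.append(line.split()[0])
    return hosts

print(hosts_with_resource(SAMPLE, "ls_dyna"))  # prints ['hostB']
```

To check a real cluster file, read its contents into a string and pass that string in place of SAMPLE.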
LS-Dyna integration with LSF checkpointing
LS-Dyna is integrated with LSF to use the LSF checkpointing capability. The
integration uses application-level checkpointing, relying on the checkpoint
functionality implemented by LS-Dyna. At the end of each time step, LS-Dyna checks
for a checkpoint trigger file named D3KIL.
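The trigger-file pattern described above can be sketched in Python. This is a toy simulation, not LS-Dyna code: the function run_steps, the trigger_at parameter, and the checkpoint file format are hypothetical, and removing the trigger file after a checkpoint is an assumption made so that one request produces one checkpoint.

```python
import os
import tempfile

TRIGGER = "D3KIL"  # trigger file name from the text

def run_steps(workdir, n_steps, trigger_at=None):
    """Toy solver loop that polls for the trigger file after each time step."""
    checkpoints = []
    trigger_path = os.path.join(workdir, TRIGGER)
    for step in range(1, n_steps + 1):
        # ... one time step of (pretend) computation would happen here ...
        if step == trigger_at:
            # Stand-in for an external checkpoint request arriving mid-step.
            open(trigger_path, "w").close()
        if os.path.exists(trigger_path):
            # Hypothetical checkpoint format: just the step number.
            ckpt = os.path.join(workdir, f"checkpoint_{step}.txt")
            with open(ckpt, "w") as f:
                f.write(str(step))
            os.remove(trigger_path)  # assumption: consume the request
            checkpoints.append(step)
    return checkpoints

with tempfile.TemporaryDirectory() as d:
    print(run_steps(d, n_steps=5, trigger_at=3))  # prints [3]
```

Because the file is polled only at step boundaries, a request made during a step takes effect when that step finishes.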
Use the bchkpnt command to create the checkpoint trigger file, D3KIL, which
LS-Dyna reads. The file forces LS-Dyna to checkpoint, or to checkpoint and exit. The
D3KIL file and the checkpoint information that LSF writes to the checkpoint
directory specified for the job are all that LSF needs to restart the job.
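The command lines involved in this cycle can be sketched with a few Python helpers that only build and print argument lists; nothing is run. The helper names, the checkpoint directory, the job ID, and the job script ./my_dyna_job.sh are hypothetical placeholders. The sketch assumes the standard bsub -k "checkpoint_dir method=name" form, bchkpnt -k to checkpoint and then kill the job, and brestart checkpoint_dir job_ID. Check the exact submission syntax for LS-Dyna jobs against your LSF documentation before relying on it.

```python
import shlex

def bsub_cmd(chkpnt_dir, job_cmd):
    """Build a bsub argv for a checkpointable job on hosts with ls_dyna."""
    # method=ls_dyna asks LSF to use echkpnt.ls_dyna and erestart.ls_dyna.
    return ["bsub", "-R", "ls_dyna",
            "-k", f"{chkpnt_dir} method=ls_dyna"] + list(job_cmd)

def bchkpnt_cmd(job_id, kill=False):
    """Build a bchkpnt argv; with kill=True, add -k (checkpoint, then kill)."""
    return ["bchkpnt"] + (["-k"] if kill else []) + [str(job_id)]

def brestart_cmd(chkpnt_dir, job_id):
    """Build a brestart argv for a previously checkpointed job."""
    return ["brestart", chkpnt_dir, str(job_id)]

# Placeholder values, for illustration only.
for argv in (
    bsub_cmd("/scratch/chkpnt", ["./my_dyna_job.sh"]),
    bchkpnt_cmd(1234, kill=True),
    brestart_cmd("/scratch/chkpnt", 1234),
):
    print(shlex.join(argv))
```

Building argument lists rather than shell strings keeps the quoting of the -k value explicit. The same lists could be passed to subprocess.run on a host where the LSF commands are installed.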
Checkpointing and resource tracking are supported for SMP jobs.