
The log directory is moved to /var/lsf so that per-node LSF daemon logging is stored locally and is unaffected by updateimage operations. However, the logs are lost during a reimage operation. The LSF directory containing the binary files remains in /opt/hptc/lsf/top; it is imaged to all the other nodes.
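You can confirm this layout on any node with a quick check; the listing below is only a sketch, and the exact contents vary by installation:

    # Per-node LSF daemon logs are stored locally and are unaffected by updateimage:
    ls -l /var/lsf
    # The shared LSF installation that is imaged to the other nodes:
    ls -ld /opt/hptc/lsf/top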
Also, during the operation of the cluster_config utility, HP XC nodes without the compute role are configured to remain closed, with 0 job slots available for use. This is done by editing the Hosts section of the lsb.hosts file and setting MXJ (maximum job slots) to zero (0) for these hosts. You can run LSF commands from these hosts, but no jobs run on them.
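A minimal sketch of what such an entry might look like in the Hosts section of the lsb.hosts file; the host name n15 is a placeholder, and the remaining columns are shown only to illustrate the layout:

    Begin Host
    HOST_NAME    MXJ    r1m    pg    ls    tmp   DISPATCH_WINDOW  # Keywords
    n15          0      ()     ()    ()    ()    ()               # non-compute node: closed, 0 job slots
    default      !      ()     ()    ()    ()    ()
    End Host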
The LSF environment is set up automatically for the user on login, so LSF commands and their manpages are readily accessible. The profile.lsf and cshrc.lsf source files are copied from the /hptc_cluster/lsf/conf directory to the /opt/hptc/lsf/top/env directory, which is specific to each node. Then the /etc/profile.d/lsf.sh and /etc/profile.d/lsf.csh files, which reference the appropriate source file on login, are created.
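A minimal sketch of what /etc/profile.d/lsf.sh might contain, assuming it simply sources the node-local copy of profile.lsf; the file is generated for you, and its exact contents can differ:

    # /etc/profile.d/lsf.sh (illustrative sketch only)
    # Set up the LSF environment for Bourne-style shells at login.
    if [ -f /opt/hptc/lsf/top/env/profile.lsf ]; then
        . /opt/hptc/lsf/top/env/profile.lsf
    fi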
Finally, Standard LSF-HPC is configured to start when the HP XC system boots: a soft link is created from /etc/init.d/lsf to the lsf_daemons startup script provided by Standard LSF-HPC.
All this configuration optimizes the installation of Standard LSF-HPC on HP XC.
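For reference, the equivalent manual step would look something like the following, assuming the LSF_SERVERDIR environment variable points to the directory that contains the lsf_daemons script; the link is normally created for you during configuration:

    # Create the init script link so the LSF daemons start at boot (sketch only)
    ln -s $LSF_SERVERDIR/lsf_daemons /etc/init.d/lsf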
The following LSF commands are particularly useful; brief usage examples follow the list:
The bhosts command is useful for viewing LSF batch host information.
The lshosts command provides static resource information.
The lsload command provides dynamic resource information.
The bsub command is used to submit jobs to LSF.
The bjobs command provides information on batch jobs.
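A brief sketch of typical invocations; my_job and the job slot count are placeholders:

    bhosts                 # batch host status and job slot usage
    lshosts                # static resources: host type, model, CPUs, memory
    lsload                 # dynamic load indices: CPU, memory, paging
    bsub -n 4 ./my_job     # submit my_job, requesting 4 job slots
    bjobs                  # status of your batch jobs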
For more information on using Standard LSF-HPC on the HP XC system, see the Platform LSF
documentation available on the HP XC documentation disk.
15.2 LSF-HPC with SLURM
The Platform Load Sharing Facility for High Performance Computing (LSF-HPC with SLURM) product
is installed and configured as an embedded component of the HP XC system during installation. This
product has been integrated with SLURM to provide a comprehensive high-performance workload
management solution for the HP XC system.
This section describes the features of the LSF-HPC with SLURM product that differentiate it from Standard LSF-HPC.
These topics include integration with SLURM, job starter scripts, the SLURM lsf partition, the SLURM
external scheduler, and LSF-HPC with SLURM failover.
See “Troubleshooting” (page 229) for information on LSF-HPC with SLURM troubleshooting.
See “Installing LSF-HPC with SLURM into an Existing Standard LSF Cluster” (page 251) for information on extending the LSF-HPC with SLURM cluster.
15.2.1 Integration of LSF-HPC with SLURM
The LSF component of the LSF-HPC with SLURM product acts primarily as the workload scheduler and
node allocator running on top of SLURM. The SLURM component provides a job execution and monitoring
layer for LSF-HPC with SLURM. LSF-HPC with SLURM uses SLURM interfaces to do the following (rough command-line counterparts are sketched after this list):
Query system topology information for scheduling purposes.
Create allocations for user jobs.
Dispatch and launch user jobs.
Monitor user job status.
Signal user jobs and cancel allocations.
Gather user job accounting information.
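For illustration only, the following SLURM commands show roughly the same kinds of information and actions from the command line; the LSF-HPC with SLURM daemons use SLURM's interfaces internally rather than these commands, and my_job and jobid are placeholders:

    sinfo                  # partition and node information (topology, state)
    squeue                 # status of jobs and their allocations
    srun -n 4 ./my_job     # launch a job on allocated resources
    scancel jobid          # signal or cancel a job and release its allocation
    sacct -j jobid         # job accounting information, if SLURM accounting is enabled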
The major difference between LSF-HPC with SLURM and Standard LSF-HPC is that the LSF-HPC with SLURM daemons run on only one node in the HP XC system; that node is known as the LSF execution host. The LSF-HPC with SLURM daemons rely on SLURM to provide information on the other computing resources