HP XC System Software Administration Guide Version 2.1
12
LSF-HPC Administration
The Platform Load Sharing Facility for High Performance Computing (LSF-HPC) product is
installed and configured as an embedded component of the HP XC system during installation.
This product has been integrated with SLURM to provide a comprehensive high-performance
workload management solution for the HP XC system. This chapter describes the LSF-HPC
product, its installation and operation on the HP XC system with SLURM, and explains the
subtle differences between this product and Platform’s standard LSF product. Topics include
the following:
• An introductory discussion on LSF-HPC (Section 12.1)
• LSF-HPC installation details (Section 12.2)
• A description of how to start and stop LSF-HPC (Section 12.3)
• A discussion on the control of the LSF-HPC service (Section 12.4)
• Information on load indexes and resource information (Section 12.5)
• An explanation of how jobs are submitted (Section 12.6)
• A description of the commands that control LSF-HPC jobs (Section 12.7)
• Information on LSF-HPC job accounting (Section 12.8)
• A discussion on LSF-HPC Failover (Section 12.9)
• A discussion on monitoring LSF (Section 12.10)
• Information on how LSF on an HP XC system can be enhanced (Section 12.11)
• Information and procedures on extending LSF (Section 12.12)
• Procedure on configuring an external virtual hostname for LSF-HPC on an HP XC
system (Section 12.13)
See Chapter 16 for information on LSF-HPC troubleshooting.
For your convenience, the HP XC documentation CD contains LSF Version 6.0 manuals from
Platform Computing.
12.1 Integration of LSF-HPC with SLURM
LSF-HPC acts primarily as the workload scheduler and node allocator on top of SLURM.
SLURM provides a job execution and monitoring layer for LSF-HPC. LSF-HPC uses SLURM
interfaces to perform the following:
• To query system topology information.
• To make scheduling decisions.
• To create allocations.
• To signal user jobs.
The major difference between LSF-HPC for SLURM and standard LSF is that LSF-HPC
daemons run on only one node in the HP XC system; that node is known as the LSF-HPC
Execution Host. The LSF-HPC daemons rely on SLURM to provide information on the other
computing resources (nodes) in the system. The LSF-HPC daemons consolidate this information
into one entity, such that these daemons present the HP XC system as one "virtual" LSF host.
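The consolidation described above can be pictured with a short sketch. This is an illustration of the idea only, not the LSF-HPC implementation: the Node record, the consolidate function, the host name "virtual-lsf-host", and the sample node data are all hypothetical, and the rule that nodes marked "down" are left out of the totals is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical per-node record, standing in for what SLURM reports."""
    name: str
    cpus: int
    state: str  # assumed values for this sketch: "idle", "alloc", "down"


def consolidate(nodes, virtual_name="virtual-lsf-host"):
    """Fold per-node records into a single 'virtual host' summary.

    Assumption for this sketch: nodes in the "down" state are excluded.
    """
    usable = [n for n in nodes if n.state != "down"]
    return {
        "host": virtual_name,
        "ncpus": sum(n.cpus for n in usable),
        "nodes": len(usable),
    }


# Hypothetical three-node system; one node is unavailable.
nodes = [
    Node("n1", 2, "idle"),
    Node("n2", 2, "alloc"),
    Node("n3", 2, "down"),
]
virtual_host = consolidate(nodes)
print(virtual_host)
```

In this picture, a scheduler sees one host with the combined processor count, while the per-node detail stays with the layer underneath.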
Use the controllsf command to determine which node is the LSF-HPC Execution Host: