HP XC System Software Administration Guide Version 2.1
Obtain the MemTotal value from /proc/meminfo, then divide that value by 1024 to
determine the correct amount of RealMemory on a node. Use the cexec command to gather
this information from all the nodes in the HP XC system:
# cexec -a 'grep MemTotal /proc/meminfo'
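The division described above can be sketched as a small shell helper. This is an illustration only: memtotal_to_mb is a hypothetical name, not an HP XC or SLURM command, and the sample MemTotal value is assumed. It relies on /proc/meminfo reporting MemTotal in kB, so dividing by 1024 yields megabytes. The format of cexec output is not assumed here; the helper reads a plain /proc/meminfo stream.

```shell
# Sketch only: convert the MemTotal line of /proc/meminfo (kB) to whole MB.
# memtotal_to_mb is a hypothetical helper, not part of HP XC or SLURM.
memtotal_to_mb() {
    # Field 2 of the MemTotal line is the size in kB; truncate to whole MB.
    awk '/^MemTotal:/ { print int($2 / 1024) }'
}

# Assumed sample line standing in for one node's /proc/meminfo.
printf 'MemTotal:      3538944 kB\n' | memtotal_to_mb
```

On a node, the same helper could be run as memtotal_to_mb < /proc/meminfo, and the result used as a candidate RealMemory value.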
SLURM provides a great deal of flexibility in configuring the compute nodes. Here is an
example of the NodeName entries on an HP XC system with two large 4-processor SMP nodes
acting as service and login nodes for users:
NodeName=n[1-126] Procs=2 RealMemory=3456 Feature=compute
NodeName=n[127-128] Procs=4 RealMemory=4096 Feature=service
The corresponding partition configuration for this example might be:
PartitionName=lsf RootOnly=YES Shared=FORCE Nodes=n[1-126]
PartitionName=dev Default=YES Nodes=n[127-128]
This configuration allows users to submit jobs to run on the two service nodes (for example,
for development purposes), and puts the rest of the pure compute nodes under control of
LSF-HPC. The Feature setting would be useful only if all nodes were placed into one
partition. See the slurm.conf(5) and srun(1) manpages for more information on node features.
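The bracketed ranges in the NodeName and Nodes entries above stand for consecutive node names. As a rough illustration of how such a range maps to individual names, the following sketch expands a single prefix[first-last] expression. expand_nodes is a hypothetical helper, not a SLURM or HP XC utility, and it handles only this one simple form (no comma-separated lists or zero padding).

```shell
# Sketch only: expand a simple "prefix[first-last]" node range.
# expand_nodes is a hypothetical helper, not a SLURM or HP XC command.
expand_nodes() {
    spec=$1
    prefix=${spec%%\[*}      # text before the opening bracket, e.g. "n"
    range=${spec#*\[}        # drop everything up to and including "["
    range=${range%\]}        # drop the closing "]"
    first=${range%-*}        # lower bound of the range
    last=${range#*-}         # upper bound of the range
    i=$first
    while [ "$i" -le "$last" ]; do
        printf '%s%s\n' "$prefix" "$i"
        i=$((i + 1))
    done
}

# The two service nodes from the example configuration.
expand_nodes 'n[127-128]'
```

Expanding the ranges this way can help confirm that each node is assigned to the partition you intended before you update the SLURM daemons.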
After changing the node and/or partition configuration settings, run the following command to
update the SLURM daemons:
# scontrol reconfigure
The lsf partition is required by LSF-HPC to identify the nodes available for its management.
The RootOnly setting ensures that only the superuser (root) can request use of these nodes;
LSF-HPC daemons are run by root. The Shared=FORCE setting allows LSF-HPC to dispatch
more than one job to a node in this partition and to support preemption and the efficient use of
the resources for serial (single processor) jobs.
The SwitchType SLURM configuration setting is set during cluster_config and cannot
be adjusted by the installer (except manually). The cluster_config process queries
the HP XC cmdb for the HP XC system interconnect type, and if it is Quadrics Elan, the
SwitchType is set to switch/elan. Otherwise it is set to switch/none. This setting
enables or disables SLURM’s Quadrics Elan support.
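The selection rule described above amounts to a two-way choice. The following sketch expresses it as a shell function. switch_type_for is a hypothetical name, and the interconnect strings passed to it are assumed for illustration; the actual values stored in the HP XC cmdb, and the way cluster_config reads them, are not shown in this guide.

```shell
# Sketch only: map an interconnect description to a SwitchType value.
# switch_type_for and the sample strings are hypothetical; the real
# cluster_config logic and cmdb values may differ.
switch_type_for() {
    case "$1" in
        *[Ee]lan*) echo "switch/elan" ;;   # Quadrics Elan interconnect
        *)         echo "switch/none" ;;   # any other interconnect
    esac
}

switch_type_for "Quadrics Elan"
switch_type_for "some-other-interconnect"
```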
If the SwitchType setting is adjusted manually, you will need to restart SLURM:
# cexec -a service slurm restart
Although a number of options are available, the slurm.conf file is generally set up to
perform optimally on an HP XC system. However, you might want to change the node
characteristics or the assignment of nodes to partitions. You can also assign primary and backup
nodes for the SLURM control daemon.
The following sections describe the configuration of SLURM servers, nodes, and partitions.
11.2.1 Configuring SLURM System Interconnect Support
SLURM has system interconnect support for Quadrics ELAN, which assists MPI jobs with the
global exchange process during startup, when each process is establishing the communication
channels with the other processes in the job.
During the initial installation of SLURM on HP XC, the installer checks for the existence of
one or more Quadrics Elan network cards. If at least one is discovered, the installer configures
SLURM with elan switch support in the slurm.conf configuration file. This setting can be
viewed and adjusted during the SLURM portion of the cluster_config installation step.
See the HP XC System Software Installation Guide for information.