
temporary disk space and SLURM features. SLURM features are useful for distinguishing different types of
nodes. You can also modify this file to configure additional node partitions.
HP recommends that you review the /hptc_cluster/slurm/etc/slurm.conf file, particularly to
address the following configuration aspects:
Compute node characteristics
Initially, all nodes with the compute role are listed as SLURM compute nodes, and these nodes are
configured statically with a processor count of two.
A sample default compute node entry in the slurm.conf file looks like this:
NodeName=n[1-64] Procs=2
SLURM provides the ability to set several other compute node characteristics. At a minimum, you should
ensure that the processor count is accurate. You should also set the RealMemory and TmpDisk
characteristics so that SLURM can monitor those values on the nodes and users can submit jobs that
request specific values.
A sample updated entry might look like this:
NodeName=n[1-59] Procs=2 RealMemory=2048 TmpDisk=9036
NodeName=n[60-64] Procs=4 RealMemory=4096 TmpDisk=16384 Weight=2
For more information about setting compute node characteristics, see slurm.conf(5).
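To determine accurate values for these characteristics, you can query a representative node directly. For example, the following standard Linux commands (illustrative; the mount point for temporary space can differ on your system) report the processor count, the total memory in megabytes, and the temporary disk space in megabytes, corresponding to the Procs, RealMemory, and TmpDisk settings:
# grep -c '^processor' /proc/cpuinfo
# free -m
# df -m /tmp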
Compute node partition layout
Initially, all nodes are placed into one partition for exclusive management by LSF. To set aside some
nodes for non-LSF use, you must configure a second partition for those nodes.
A sample default partition entry in the slurm.conf file looks like this:
PartitionName=lsf RootOnly=YES Shared=FORCE Nodes=n[1-64]
An updated partition configuration to create a second partition for direct SLURM use by users might
look like this:
PartitionName=lsf RootOnly=YES Shared=FORCE Nodes=n[10-64]
PartitionName=srun Default=YES Nodes=n[1-9]
For more information on configuring partitions, see slurm.conf(5).
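After this configuration is active (SLURM must first be restarted, as described next), users can launch jobs directly on the non-LSF partition with the srun command. For example, assuming the sample srun partition shown above, the following command runs hostname as four parallel tasks:
$ srun -p srun -n 4 hostname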
If you make any manual changes to the slurm.conf file, restart SLURM on the head node:
# service slurm restart
You might see service startup error messages if the resource_management or compute roles are not
assigned to the head node. You can ignore these errors.
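To confirm that SLURM picked up the new configuration, you can display the resulting node and partition states, for example:
# sinfo
# scontrol show partition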
If you need more information about SLURM, consult the reference manual at:
http://www.llnl.gov/LCdocs/slurm/
Proceed to “Task 9: Run the startsys Utility To Start the System and Propagate the Golden Image” to start the system and propagate the golden image to all client nodes.
Task 9: Run the startsys Utility To Start the System and Propagate the Golden Image
The first time the entire system is started with the startsys command, power to each node is turned on,
each node boots from its network adapter, and the SystemImager automatic installation environment is
downloaded. This environment automatically installs and configures each node from the golden image. On large-scale systems, the startsys command can take several minutes to power on all the nodes.
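The exact startsys invocation depends on your system; options, such as those that control imaging behavior, are described in the startsys documentation. In its simplest form, the utility might be run from the head node as follows:
# startsys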
The number of nodes to be installed influences the amount of time it takes to complete the process. After all
nodes are installed, they automatically reboot to the login prompt. This process can take two to three hours on a system with 1024 compute nodes.
This release uses multicast file transfer technology to download software to client nodes during their image installation. Multicast file transfer provides a fast and scalable method of installing systems. Multicast imaging sends data simultaneously to the many nodes that have been previously set up to listen