Installing Standard LSF on a Subset of HP XC Nodes
Re-run cluster_config to update node roles and re-image
1. Shut down the rest of the cluster with stopsys.
2. Change directory to /opt/hptc/config/sbin and execute ./cluster_config as follows:
a. Select "Modify Nodes" and change the roles on the fat nodes to remove the
"compute" and "resource_management" roles. Ensure that at least one
"resource_management" role remains in the cluster (two "resource_management"
nodes are recommended).
b. Do not re-install LSF.
3. When ./cluster_config finishes, you may need to manually adjust the
/hptc_cluster/slurm/etc/slurm.conf file to remove the fat nodes from the NodeName and
PartitionName entries.
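As a sketch of this edit, assuming a 120-node cluster in which xc1 through xc6 are the fat nodes being removed (the file contents, node names, and partition options below are illustrative only; your real /hptc_cluster/slurm/etc/slurm.conf entries will differ):

```shell
# Create a sample slurm.conf fragment (illustrative entries only):
cat > /tmp/slurm.conf.example <<'EOF'
NodeName=xc[1-120] Procs=2
PartitionName=lsf RootOnly=YES Shared=FORCE Nodes=xc[1-120]
EOF

# Narrow the node ranges so the fat nodes xc[1-6] are no longer listed
# in the NodeName and PartitionName entries:
sed -i 's/xc\[1-120\]/xc[7-120]/g' /tmp/slurm.conf.example

cat /tmp/slurm.conf.example
```

The same result can of course be achieved by editing the file by hand; the point is simply that the fat nodes must disappear from both the NodeName and the PartitionName lines.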
4. Run scontrol reconfig to update SLURM with the changed information.
5. Restart the cluster with startsys.
6. When startsys is complete and the nodes have re-imaged, everything should be up and running:
# sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
lsf up infinite 114 idle xc[7-120]
# lshosts
HOST_NAME type model cpuf ncpus maxmem maxswp server RESOURCES
lsfhost.loc SLINUX6 Itanium2 60.0 228 1973M - Yes (slurm)
xc1 LINUX64 Itanium2 60.0 8 3456M 6143M Yes ()
xc2 LINUX64 Itanium2 60.0 8 3456M 6143M Yes ()
xc3 LINUX64 Itanium2 60.0 8 3456M 6143M Yes ()
xc4 LINUX64 Itanium2 60.0 8 3456M 6143M Yes ()
xc5 LINUX64 Itanium2 60.0 8 3456M 6143M Yes ()
xc6 LINUX64 Itanium2 60.0 8 3456M 6143M Yes ()
7. Only those nodes on which role changes were made will be re-imaged. This means that the standard
LSF binaries and the slsf script and softlink will not be present on the "thin" nodes. See the HP XC
Administration Guide for the use of the updateclient command to update the "thin" nodes with these
latest file changes.
Note that the "thin" nodes do not need to be updated with these files in order to complete this
procedure; it is only a matter of consistency among all the nodes in the cluster. The "thin" nodes can
be brought up to date with these changes at a later time. Refer to the HP XC documentation for more
information on these commands.
Verification
Change to a non-root user and test the changes by running some jobs:
$ bsub -I -n1 -R type=LINUX64 hostname
Job <176> is submitted to default queue <normal>.
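The job above requests a LINUX64 host, so it should be dispatched to one of the fat nodes running standard LSF. As a complementary check (illustrative; the type string is copied from the possibly truncated lshosts display above, and the job number and dispatch host will vary), a job can be steered to the SLURM-managed virtual host instead:

```
$ bsub -I -n1 -R type=SLINUX6 hostname
```

If both submissions run to completion, jobs are being scheduled correctly on both the standard LSF nodes and the SLURM-managed side of the cluster.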