If your system is using a QsNet II interconnect, ensure that the number of node entries in the
/opt/hptc/libelanhosts/etc/elanhosts file matches the expected number of operational
nodes in the cluster. If the numbers do not match, verify that all nodes are up and running, and then
rerun the spconfig script.
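One quick way to make this comparison (assuming that each operational node has exactly one entry in
the elanhosts file and that SLURM is already running) is to count the entries in the file and compare
that count with the number of nodes SLURM reports:
# wc -l /opt/hptc/libelanhosts/etc/elanhosts
# sinfo -Nh | wc -l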
Output from the spconfig utility looks similar to the following for all other interconnect types:
Configured unknown node n14 with 1 CPU and 4872 MB of total memory...
Restarting SLURM...
SLURM Post-Configuration Done.
5. Complete this step for systems installed with an InfiniBand interconnect; skip this step for all other
interconnect types.
Create a list of all nodes in the HP XC system. The InfiniBand diagnostic tools require this list of nodes
to function properly:
# shownode all > /usr/voltaire/scripts/HCA400-Checks/node-list
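To confirm that the node list was written as expected, you can display the file and verify that it names
every node in the system (this assumes the shownode all output lists one node per line):
# cat /usr/voltaire/scripts/HCA400-Checks/node-list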
Proceed to “Task 10: Confirm Compute Resources”.
Task 10: Confirm Compute Resources
Follow this procedure to verify that compute resources are functioning properly:
1. Begin this procedure as the root user on the head node.
2. Set up the LSF environment by sourcing the LSF profile file:
# . /opt/hptc/lsf/top/conf/profile.lsf
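If your login shell is csh or tcsh, source the C-shell equivalent of the profile file instead; assuming
the standard LSF conf directory layout shown above, the file is cshrc.lsf:
# source /opt/hptc/lsf/top/conf/cshrc.lsf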
3. Verify that the LSF profile file has been sourced by finding an LSF command:
# which lsid
/opt/hptc/lsf/top/6.1/linux2.6-glibc2.3-amd64-slurm/bin/lsid
Note
This sample output was obtained from an Opteron-based system. Thus, the directory name
linux2.6-glibc2.3-amd64-slurm is included in the path (the string amd64 signifies an
Opteron-based architecture). When an Itanium-based system is configured, the string ia64 is included
in the directory name. The string slurm exists in the path only if LSF-HPC with SLURM is configured.
4. If SLURM is configured, verify that the lsf partition exists:
# sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
lsf up infinite 3 idle n[14-16]
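To determine the number of processors in the lsf partition for the comparison in the next step, one
possible approach (assuming your version of sinfo supports the node-oriented long format) is to list
each node in the partition with its CPU count and total the CPUS column:
# sinfo -p lsf -N -l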
5. Wait a few seconds for the LSF daemons to stabilize, then run the following commands to confirm that
licensing is correct, that the number of available processors matches the number of processors in the
lsf partition, and that the status of the system is shown as ok. Command output differs depending
on which type of LSF is installed and configured:
• LSF-HPC with SLURM:
a. Verify that LSF-HPC with SLURM is running:
# lsid
Platform LSF HPC 6.1 for SLURM, LSF_build_date
Copyright 1992-2005 Platform Computing Corporation
My cluster name is hptclsf
My master name is lsfhost.localdomain
b. Verify the static resource information: