HP XC How To
Installing LSF-HPC for SLURM into an Existing Standard LSF Cluster
Version 1.
© 2005 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein. Linux is a U.S. registered trademark of Linus Torvalds.
Contents

Introduction
    Requirements
HP XC Preparation
    Ensure that LSF on HP XC is shut down
    Mount the LSF tree to XC
Revision history

Table 1 Revisions

Date      Edition   Revision
Jun 2005  V1.0      First Edition
Nov 2005  V1.
Introduction

This HP XC How To describes how to install LSF-HPC for SLURM into an existing, standard LSF cluster. An understanding of standard LSF installation and configuration procedures is required to perform this procedure. You should be familiar with the LSF installation documentation and the README file provided in the LSF installation tar file. You should also be familiar with the normal procedures for adding a node to an existing LSF cluster, such as establishing default communications (for example, rsh or ssh) between the nodes.
A summary of the IP addresses and host names used in this example follows:

plain     16.32.1.24    Non-XC node in the pre-existing LSF cluster
xc-head   16.32.2.128   Head node of the XC cluster
xc        16.32.2.130   XC cluster alias (not used by LSF)
xclsf     16.32.2.140   IP address and host name to be used as the external XC LSF alias

The XC cluster alias is mentioned here to prevent confusion between the XC cluster alias and the XC LSF alias.
HP XC preparation

You must perform the following steps to prepare to install LSF-HPC for SLURM into an existing standard LSF cluster. Read through all of these steps first to ensure that you understand what is to be done. All steps are performed from a login on the HP XC head node, and most of them involve propagating changes to the rest of the cluster.

Ensure that LSF on HP XC is shut down

Use the following procedure to shut down and remove LSF:

1.
3. Add the appropriate fstab file entry to the /hptc_cluster/etc/fstab.proto file in the section titled ALL.

4. Restart the cluster_fstab service cluster-wide, as illustrated in the sketch below.
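As an illustration of steps 3 and 4, the following is a minimal sketch only. It assumes the LSF tree is NFS-exported by the non-XC node plain from this document's example; the export path, mount point, and mount options are assumptions, not values taken from the original procedure.

# Hypothetical entry for the ALL section of /hptc_cluster/etc/fstab.proto:
plain:/shared/lsf   /shared/lsf   nfs   defaults   0 0

# One way to restart the cluster_fstab service on every node, assuming the
# pdsh utility (whose companion pdcp is used later in this procedure):
# pdsh -a service cluster_fstab restart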
You must apply the new rules to every node that might be selected to run the LSF-HPC daemons. A later step in this procedure describes how to generate a new /etc/sysconfig/iptables file on each node from your modified iptables.proto file.

Node-to-node communication

LSF can use either rsh or ssh to control the LSF daemons in the cluster. The daemons expect the selected mechanism to enable access to all nodes without a password.
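Two hedged sketches for these preparation steps follow. First, firewall rules of the kind the iptables.proto changes might contain. The chain name is the Red Hat default and the ports are the common LSF 6.x defaults (LIM 6879, RES 6878, mbatchd 6881, sbatchd 6882); both are assumptions and must match the LSF_LIM_PORT, LSF_RES_PORT, LSB_MBD_PORT, and LSB_SBD_PORT values in your lsf.conf.

# Hypothetical additions to /etc/sysconfig/iptables.proto:
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 6878:6882 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m udp --dport 6879 -j ACCEPT

Second, if you choose ssh, passwordless root access can be arranged with a shared key. The paths are the OpenSSH defaults; if /root is not shared across nodes, the key files must also be propagated (for example, with pdcp).

# ssh-keygen -t rsa -N "" -f /root/.ssh/id_rsa
# cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys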
case $PATH in
    *-slurm/etc:*) ;;
    *:/shared/lsf/*) ;;
    *)
        if [ -f /shared/lsf/conf/profile.lsf.xc ]; then
            . /shared/lsf/conf/profile.lsf.xc
        fi
        ;;
esac

[root@xc128 profile.d]# cat lsf.csh
if ( "${path}" !~ *-slurm/etc* ) then
    if ( -f /shared/lsf/conf/cshrc.lsf.xc ) then
        source /shared/lsf/conf/cshrc.lsf.xc
    endif
endif

To summarize the current state of the installation:

• The profile.lsf.xc and cshrc.lsf.xc scripts do not exist yet.
2. If no additional software (such as rsh) was installed, use the following procedure to update the golden image and propagate the file changes with minimal impact to a running cluster. (This action propagates changes only to those nodes that are up and running.)

   a. Update the golden image with the following command:

      # updateimage --gc `nodename nh` --image base_image --no-netboot

   b. Use the pdcp command to propagate the specific file changes to all nodes:

      # pdcp -a /etc/sysconfig/iptables.
Installation of XC LSF

Now that you have prepared the cluster, install XC LSF into the LSF tree as described in the following procedure.

1. Preserve the existing environment setup files. Change directory to the existing LSF_TOP/conf directory and rename the setup files by appending a unique identifier. For example:

   # cd /shared/lsf/conf
   # mv profile.lsf profile.lsf.orig
   # mv cshrc.lsf cshrc.lsf.orig

   The installation of XC LSF provides its own profile.lsf and cshrc.lsf files.
The LSF documentation and instructions mentioned at the end of the hpc_install script are generic and do not apply to an HP XC cluster. The following manual procedures describe every task that you must perform on an HP XC cluster:

1. Restore the original environment setup files. Change directory back to the existing LSF_TOP/conf directory and rename the environment setup files to distinguish the XC files and restore the original files. Using the example:

   # cd /shared/lsf/conf
   # mv profile.lsf profile.lsf.xc
   # mv cshrc.lsf cshrc.lsf.xc
   # mv profile.lsf.orig profile.lsf
   # mv cshrc.lsf.orig cshrc.lsf
   LSF_RSH=ssh

4. Save and exit the file.

Optional: Configure any special XC-specific queues. For HP XC V2.1, HP recommends that you use a JOB_STARTER script, configured for all queues on an XC system. The default installation of LSF on XC provides queue configurations in the /opt/hptc/lsf/etc/configdir/lsb.queues file. The JOB_STARTER script and its helper scripts are located in the /opt/hptc/lsf/bin directory; a hedged sketch of such a queue configuration follows.
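This sketch shows what a JOB_STARTER entry in lsb.queues might look like. The queue name, priority, and the script name job_starter.sh are hypothetical illustrations, not the actual names shipped in /opt/hptc/lsf/bin; substitute the script that your XC installation provides.

Begin Queue
QUEUE_NAME   = normal
PRIORITY     = 30
JOB_STARTER  = /opt/hptc/lsf/bin/job_starter.sh
DESCRIPTION  = Hypothetical default queue using the XC job starter
End Queue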
Use of the lshosts or bhosts commands displays the new XC node, although the node status is UNKNOWN and unavailable, respectively. You can now start LSF on XC as follows:

[root@xc128 root]# controllsf start

This command sets up the virtual LSF alias on the appropriate node and then starts the LSF daemons. It also creates a $LSF_ENVDIR/hosts file (in this example, $LSF_ENVDIR = /shared/lsf/conf). LSF uses this hosts file to map the LSF alias to the actual host name of the node in XC that is running LSF.
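As a sketch only, using the example addresses from the introduction, the generated $LSF_ENVDIR/hosts file might contain a mapping such as the following; the exact contents are produced by controllsf and may differ:

16.32.2.140   xclsf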
xc1
xc1

In this scenario, the srun command was not found because the user's $PATH did not include /opt/hptc/bin, which is specific to XC. There are several standard ways to address this if necessary. For example, you can add /opt/hptc/bin to the default $PATH on the non-XC node, or create a softlink to the srun command in /usr/bin on all the nodes in XC, as sketched below.
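A hedged sketch of the softlink approach, assuming the pdsh utility (whose companion pdcp appears earlier in this procedure) is available to run the command on every XC node:

# pdsh -a ln -s /opt/hptc/bin/srun /usr/bin/srun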
See the Platform LSF documentation for more information on these commands.

Sample installation dialog

A sample installation dialog is provided below.

Logging installation sequence in /shared/lsf/hpctmp/hpc6.0_hpcinstall/Install.log

LSF pre-installation check ...

Checking the LSF TOP directory /shared/lsf ...
... Done checking the LSF TOP directory /shared/lsf ...

LSF license is defined in "/shared/lsf/conf/lsf.conf", LSF_LICENSE is ignored ...

Checking LSF Administrators ...
LSF is already installed ...
Old version of LSF configuration files exist in /shared/lsf.
LSF configuration files under /shared/lsf/conf will be upgraded.
corplsf is an existing cluster ...
Updating PRODUCTS line in /shared/lsf/conf/lsf.cluster.corplsf ...
1. Backup /shared/lsf/conf/lsf.cluster.corplsf to /shared/lsf/conf/lsf.cluster.corplsf.old.31585
2. Enable Platform_HPC
3. Remove LSF_Data and LSF_Parallel
Setting common HPC external resources to /shared/lsf/conf/lsf.