HP XC System Software User’s Guide

Part Number: AA-RWJVB-TE
June 2005
Product Version: HP XC System Software Version 2.1

This document provides information about the HP XC user and programming environment.
© Copyright 2003–2005 Hewlett-Packard Development Company, L.P. UNIX® is a registered trademark of The Open Group. Linux® is a U.S. registered trademark of Linus Torvalds. LSF, Platform Computing, and the LSF and Platform Computing logos are trademarks or registered trademarks of Platform Computing Corporation. Intel®, the Intel logo, Itanium®, Xeon™, and Pentium® are trademarks or registered trademarks of Intel Corporation in the United States and other countries.
Contents

About This Document
1  Overview of the User Environment
2  Using the System
3  Developing Applications
4  Debugging Applications
5  Tuning Applications
6  Using SLURM
7  Using LSF
8  Using HP-MPI
9  Using HP MLIB
10 Advanced Topics
A  Examples
Glossary
About This Document This manual provides information about using the features and functions of the HP XC System Software and describes how the HP XC user and programming environments differ from standard Linux® system environments.
• Chapter 9 describes how to use MLIB on the HP XC system. • Appendix A provides examples of HP XC applications. • The Glossary provides definitions of the terms used in this manual. HP XC Information The HP XC System Software Documentation Set includes the following core documents. All XC documents, except the HP XC System Software Release Notes, are shipped on the XC documentation CD.
HP Message Passing Interface HP Message Passing Interface (MPI) is an implementation of the MPI standard for HP systems. The home page is located at the following URL: http://www.hp.com/go/mpi HP Mathematical Library The HP math libraries (MLIB) support application developers who are looking for ways to speed up development of new applications and shorten the execution time of long-running technical applications. The home page is located at the following URL: http://www.hp.
• http://www.nagios.org/ Home page for Nagios®, a system and network monitoring application. Nagios watches specified hosts and services and issues alerts when problems occur and when problems are resolved. Nagios provides the monitoring capabilities on an XC system. • http://supermon.sourceforge.net/ Home page for Supermon, a high-speed cluster monitoring system that emphasizes low perturbation, high sampling rates, and an extensible data protocol and programming interface.
Related Information This section provides pointers to the Web sites for related software products and provides references to useful third-party publications. The location of each Web site or link to a particular topic is subject to change without notice by the site provider. Related Linux Web Sites • http://www.redhat.com Home page for Red Hat®, distributors of Red Hat Enterprise Linux Advanced Server, a Linux distribution with which the HP XC operating environment is compatible. • http://www.linux.
• Linux Administration Unleashed, by Thomas Schenk, et al.
• Managing NFS and NIS, by Hal Stern, Mike Eisler, and Ricardo Labiaga (O’Reilly)
• MySQL, by Paul DuBois
• MySQL Cookbook, by Paul DuBois
• High Performance MySQL, by Jeremy Zawodny and Derek J. Balling (O’Reilly)
• Perl Cookbook, Second Edition, by Tom Christiansen and Nathan Torkington
• Perl in a Nutshell: A Desktop Quick Reference, by Ellen Siever, et al.
discover(8) A cross-reference to a manpage includes the appropriate section number in parentheses. For example, discover(8) indicates that you can find information on the discover command in Section 8 of the manpages. Ctrl/x In interactive command examples, this symbol indicates that you hold down the first named key while pressing the key or button that follows the slash ( / ). When it occurs in the body of text, the action of pressing two or more keys is shown without the box.
1 Overview of the User Environment The HP XC system is a collection of computer nodes, networks, storage, and software built into a cluster that work together to present a single system. It is designed to maximize workload and I/O performance, and provide efficient management of large, complex, and dynamic workloads. The HP XC system provides a set of integrated and supported user features, tools, and components which are described in this chapter.
different roles that can be assigned to a client node, the following roles contain services that are of special interest to the general user: login role The role most visible to users is on nodes that have the login role. Nodes with the login role are where you log in and interact with the system to perform various tasks. For example, once logged in to a node with login role, you can execute commands, build applications, or submit jobs to compute nodes for execution.
choose to use either the HP XC Administrative Network, or the XC system Interconnect, for NFS operations. The HP XC system interconnect can potentially offer higher performance, but only at the potential expense of the performance of application communications. For high-performance or high-availability file I/O, the Lustre file system is available on HP XC. The Lustre file system uses POSIX-compliant syntax and semantics.
nodes of the system. The system interconnect network is a private network within the HP XC. Typically, every node in the HP XC is connected to the system interconnect. The HP XC system interconnect can be based on either Gigabit Ethernet or Myrinet-2000 switches. The types of system interconnects that are used on HP XC systems are: • Myricom Myrinet on HP Cluster Platform 4000 (ProLiant/Opteron servers), also referred to as XC4000 in this manual.
1.2.3.1 Linux Commands The HP XC system supports the use of standard Linux user commands and tools. Standard Linux commands are not described in this document. You can access descriptions of Linux commands in Linux documentation and manpages. Linux manpages are available by invoking the Linux man command with the Linux command name. 1.2.3.2 LSF Commands HP XC supports LSF-HPC and the use of standard LSF commands, some of which operate differently in the HP XC environment from standard LSF behavior.
1.4 Run-Time Environment In the HP XC environment, LSF-HPC, SLURM, and HP-MPI work together to provide a powerful, flexible, extensive run-time environment. This section describes LSF-HPC, SLURM, and HP-MPI, and how these components work together to provide the HP XC run-time environment. 1.4.1 SLURM SLURM (Simple Linux Utility for Resource Management) is a resource management system that is integrated into the HP XC system. SLURM is suitable for use on large and small Linux clusters.
request. LSF-HPC always tries to pack multiple serial jobs on the same node, with one CPU per job. Parallel jobs and serial jobs cannot coexist on the same node. After the LSF-HPC scheduler allocates the SLURM resources for a job, the SLURM allocation information is recorded with the job. You can view this information with the bjobs and bhist commands. When LSF-HPC starts a job, it sets the SLURM_JOBID and SLURM_NPROCS environment variables in the job environment.
supported as part of the HP XC. The tested software packages include, but are not limited to, the following:
• Intel Fortran 95, C, C++ Compiler Version 7.1 and 8.0, including OpenMP, for Itanium (includes the idb debugger)
• gcc version 3.2.3 (included in the HP XC distribution)
• g77 version 3.2.3 (included in the HP XC distribution)
• Portland Group PGI Fortran 90, C, C++ Version 5.
2 Using the System This chapter describes tasks and commands that the general user must know to use the system. It contains the following topics: • Logging in to the system (Section 2.1) • Setting up the user environment (Section 2.2) • Launching and managing jobs (Section 2.3) • Performing some common user tasks (Section 2.4) • Getting help (Section 2.5) 2.1 Logging in to the System Logging in to an HP XC system is similar to logging in to any standard Linux system.
environment variables, such as PATH and MANPATH, to enable access to various installed software. One of the key features of using modules is to allow multiple versions of the same software to be used in your environment in a controlled manner. For example, two different versions of the Intel C compiler can be installed on the system at the same time – the version used is based upon which Intel C compiler modulefile is loaded. The HP XC software provides a number of modulefiles.
of shared objects. If you have multiple compilers (perhaps with incompatible shared objects) installed, it is probably wise to set MPI_CC (and others) explicitly to the commands made available by the compiler’s modulefile. The contents of the modulefiles in the modulefiles_hptc RPM use the vendor-intended location of the installed software. In many cases, this is under the /opt directory, but in a few cases (for example, the PGI compilers and TotalView) this is under the /usr directory.
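For example (a minimal sketch; the compiler command name depends on which compiler modulefile you load, and the source file name is illustrative), you can point HP-MPI's mpicc utility at the Intel C compiler explicitly:

$ module load intel/8.1 mpi/hp
$ export MPI_CC=icc
$ mpicc -o myapp myapp.c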
Table 2-1: Supplied Modulefiles (cont.)

Modulefile           Sets the HP XC User Environment:
intel/8.1            For Intel Version 8.1 compilers.
mlib/intel/7.1       For MLIB and Intel Version 7.1 compilers.
mlib/intel/8.0       For MLIB and Intel Version 8.0 compilers.
mlib/pgi/5.1         For MLIB and PGI Version 5.1 compilers.
mpi/hp               For HP-MPI.
pgi/5.1              For PGI Version 5.1 compilers.
pgi/5.2              For PGI Version 5.2 compilers.
idb/7.3              To use the Intel IDB debugger.
totalview/default    For the TotalView debugger.
If you encounter a modulefile conflict when loading a modulefile, you must unload the conflicting modulefile before you load the new modulefile. Refer to Section 2.2.8 for further information about modulefile conflicts. 2.2.6.1 Loading a Modulefile for the Current Session You can load a modulefile for your current login session as needed.
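For example, the following sequence lists the available modulefiles, loads the HP-MPI modulefile shown in Table 2-1, and then verifies which modulefiles are loaded:

$ module avail
$ module load mpi/hp
$ module list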
ifort/8.0(19):ERROR:102: Tcl command execution failed: conflict ifort/8.1

In this example, the user attempted to load the ifort/8.0 modulefile, but after issuing the command to load the modulefile, an error message occurred indicating a conflict between this modulefile and the ifort/8.1 modulefile, which is already loaded. When a modulefile conflict occurs, unload the conflicting modulefile(s) before loading the new modulefile. In the above example, you should unload the ifort/8.1 modulefile before loading the ifort/8.0 modulefile.
2.3 Launching and Managing Jobs Quick Start This section provides a brief description of some of the many ways to launch jobs, manage jobs, and get information about jobs on an HP XC system. This section is intended only as a quick overview about some basic ways of running and managing jobs. Full information and details about the HP XC job launch environment are provided in the SLURM chapter (Chapter 6) and the LSF chapter (Chapter 7) of this manual. 2.3.1 Introduction As described in Section 1.
• The LSF lshosts command displays machine-specific information for the LSF execution host node. $ lshosts Refer to Section 7.3.2 for more information about using this command and a sample of its output. • The LSF lsload command displays load information for the LSF execution host node. $ lsload Refer to Section 7.3.3 for more information about using this command and a sample of its output. 2.3.
2.3.5.2 Submitting a Non-MPI Parallel Job Submitting non-MPI parallel jobs is discussed in detail in Section 7.4.4. The LSF bsub command format to submit a simple non-MPI parallel job is: bsub -n num-procs [bsub-options] srun [srun-options] executable [executable-options] The bsub command submits the job to LSF-HPC. The -n num-procs parameter specifies the number of processors requested for the job. This parameter is required for parallel jobs.
Example 2-3: Submitting a Non-MPI Parallel Job to Run One Task per Node $ bsub -n4 -ext "SLURM[nodes=4]" -I srun hostname Job <22> is submitted to default queue <> <> n1 n2 n3 n4 2.3.5.3 Submitting an MPI Job Submitting MPI jobs is discussed in detail in Section 7.4.5.
Example 2-5: Running an MPI Job with LSF Using the External Scheduler Option (cont.)

Hello world! I’m 2 of 4 on host2
Hello world! I’m 3 of 4 on host3
Hello world! I’m 4 of 4 on host4

2.3.5.4 Submitting a Batch Job or Job Script

Submitting batch jobs is discussed in detail in Section 7.4.6. The bsub command format to submit a batch job or job script is:

bsub -n num-procs [bsub-options] script-name

The -n num-procs option specifies the number of processors the job requests.
2.3.6 Getting Information About Your Jobs You can obtain information about your running or completed jobs with the bjobs and bhist commands. bjobs Checks the status of a running job (Section 7.5.2) bhist Gets brief or full information about finished jobs (Section 7.5.3) The components of the actual SLURM allocation command can be seen with the bjobs -l and bhist -l LSF commands. 2.3.7 Stopping and Suspending Jobs You can suspend or stop your jobs with the bstop and bkill commands.
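For example (using the hypothetical job ID 24), the following commands check a job's status, review its history, and then remove it:

$ bjobs -l 24
$ bhist -l 24
$ bkill 24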
distributed with the HP XC cluster, such as HP-MPI. Manpages for third-party vendor software components may be provided as a part of the deliverables for that software component. To access manpages, type the man command with the name of a command. For example: $ man sinfo This command accesses the manpage for the SLURM sinfo command.
3 Developing Applications

This chapter discusses topics associated with developing applications in the HP XC environment. Before reading this chapter, you should read and understand Chapter 1 and Chapter 2. This chapter discusses the following topics:
• HP XC application development environment overview (Section 3.1)
• Using compilers (Section 3.2)
• Checking nodes and partitions before running jobs (Section 3.3)
• Getting system information (Section 3.4)
• Setting debugging options (Section 3.
3.2 Using Compilers You can use compilers acquired from other vendors on an HP XC system. For example, HP XC supports Intel C/C++ and Fortran compilers for the 64-bit architecture, and Portland Group C/C++ and Fortran compilers for the XC4000 platform. You can use other compilers and libraries on the HP XC system as on any other system, provided they contain single-processor routines and have no dependencies on another message-passing system. 3.2.
3.2.4 Pathscale Compilers Compilers in the Pathscale EKOPath Version 2.1 Compiler Suite are supported on HP XC4000 systems only. See the following Web site for more information: http://www.pathscale.com/ekopath.html. 3.2.5 MPI Compiler The HP XC System Software includes MPI. The MPI library on the HP XC system supports HP MPI 2.1. 3.3 Checking Nodes and Partitions Before Running Jobs Before launching an application, you can determine the availability and status of the system’s nodes and partitions.
• Section 3.6.1 describes the serial application programming model. • Section 3.6.2 discusses how to build serial applications. For further information about developing serial applications, refer to the following sections: • Section 4.1 describes how to debug serial applications. • Section 6.4 describes how to launch applications with the srun command. • Section A.1 provides examples of serial applications. 3.6.
• Launching applications with the srun command (Section 6.4) • Advanced topics related to developing parallel applications (Section 3.9) • Debugging parallel applications (Section 4.2) 3.7.1 Parallel Application Build Environment This section discusses the parallel application build environment on an HP XC system.
Compilers from GNU, Intel and PGI provide a -pthread switch to allow compilation with the Pthread library. Packages that link against Pthreads, such as MKL and MLIB, require that the application is linked using the -pthread option. The Pthread option is invoked with the following compiler-specific switches: GNU -pthread Intel -pthread PGI -lpgthread For example: $ mpicc object1.o ... -pthread -o myapp.exe 3.7.1.
The HP XC cluster comes with a modulefile for HP-MPI. The mpi modulefile is used to set up the necessary environment to use HP-MPI, such as the values of the search paths for header and library files. Refer to Chapter 8 for information and examples that show how to build and run an HP-MPI application. 3.7.1.8 Intel Fortran and C/C++Compilers Intel Fortran compilers (Version 7.x and greater) are supported on the HP XC cluster. However, the HP XC cluster does not supply a copy of Intel compilers.
3.7.1.15 Reserved Symbols and Names

The HP XC system reserves certain symbols and names for internal use. Reserved symbols and names should not be included in user code. If a reserved symbol or name is used, errors could occur.

3.7.2 Building Parallel Applications

This section describes how to build MPI and non-MPI parallel applications on an HP XC system.
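As a sketch of the typical build-and-run cycle for an MPI program (the file and program names are illustrative; the modulefile name comes from Table 2-1):

$ module load mpi/hp
$ mpicc -o hello_world hello_world.c
$ bsub -n4 -I mpirun -srun ./hello_world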
3.8 Developing Libraries This section discusses developing shared and archive libraries for HP XC applications. Building a library generally consists of two phases: • Compiling sources to objects • Assembling the objects into a library - Using the ar archive tool for archive (.a) libraries - Using the linker (possibly indirectly by means of a compiler) for shared (.so) libraries. For sufficiently small shared objects, it is often possible to combine the two steps.
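A minimal sketch of those two phases, using gcc and the libmystuff naming that appears in Example 3-1 below (the compiler and options are illustrative):

$ gcc -fPIC -c mystuff.c -o mystuff.o
$ ar rcs libmystuff.a mystuff.o
$ gcc -shared -o libmystuff.so mystuff.o

The -fPIC option produces position-independent code, which is required for objects that are placed in a shared library.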
has /opt/mypackage/lib in it, which will then be able to handle both 32-bit and 64-bit binaries that have linked against libmystuff.so. Example 3-1: Directory Structure /opt/mypackage/ include/ mystuff.h lib/ i686/ libmystuff.a libmystuff.so x86_64/ libmystuff.a libmystuff.so If you have an existing paradigm using different names, HP recommends introducing links with the above names. An example of this is shown in Example 3-2. Example 3-2: Recommended Directory Structure /opt/mypackage/ include/ mystuff.
single compilation line, so it is common to talk about concurrent compilations, though GNU make is more general. On non-cluster platforms or command nodes, matching concurrency to the number of processors often works well. It also often works well to specify a few more jobs than processors so that one job can proceed while another is waiting for I/O. On an HP XC system, there is the potential to use compute nodes to do compilations, and there are a variety of ways to make this happen.
srcdir = . HYPRE_DIRS =\ utilities\ struct_matrix_vector\ struct_linear_solvers\ test all: @ \ for i in ${HYPRE_DIRS}; \ do \ if [ -d $$i ]; \ then \ echo "Making $$i ..."; \ (cd $$i; make); \ echo ""; \ fi; \ done clean: @ \ for i in ${HYPRE_DIRS}; \ do \ if [ -d $$i ]; \ then \ echo "Cleaning $$i ..."; \ (cd $$i; make clean); \ fi; \ done veryclean: @ \ for i in ${HYPRE_DIRS}; \ do \ if [ -d $$i ]; \ then \ echo "Very-cleaning $$i ..."; \ (cd $$i; make veryclean); \ fi; \ done 3.9.1.
By modifying the makefile to reflect the changes illustrated above, we will now process each directory serially and parallelize the individual makes within each directory. The modified Makefile is invoked as follows:

$ make PREFIX='srun -n1 -N1' MAKE_J='-j4'

3.9.1.2 Example Procedure 2

Go through the directories in parallel and have the make procedure within each directory be serial. For the purpose of this exercise we are only parallelizing the “make all” component.
utilities/libHYPRE_utilities.a: $(PREFIX) $(MAKE) $(MAKE_J) -C utilities The modified Makefile is invoked as follows: $ make PREFIX=’srun -n1 -N1’ MAKE_J=’-j4’ 3.9.2 Local Disks on Compute Nodes The use of a local disk for private, temporary storage may be configured on the compute nodes of your HP XC system. Contact your system administrator to find out about the local disks configured on your system. A local disk is a temporary storage space and does not hold data across execution of applications.
3.9.4 Communication Between Nodes On the HP XC system, processes in an MPI application run on compute nodes and use the system interconnect for communication between the nodes. By default, intranode communication is done using shared memory between MPI processes. Refer to Chapter 8 for information about selecting and overriding the default system interconnect.
4 Debugging Applications This chapter describes how to debug serial and parallel applications in the HP XC development environment. In general, effective debugging of applications requires the applications to be compiled with debug symbols, typically the -g switch. Some compilers allow -g with optimization. 4.1 Debugging Serial Applications Debugging a serial application on an HP XC system is performed the same as debugging a serial application on a conventional Linux operating system.
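For example, compile the program with debug symbols and run it under a standard Linux debugger such as gdb (the program and file names are illustrative):

$ gcc -g -o myprog myprog.c
$ gdb ./myprog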
4.2.1 Debugging with TotalView You can purchase the TotalView debugger, from Etnus, Inc., for use on the HP XC cluster. TotalView is a full-featured, GUI-based debugger specifically designed to meet the requirements of parallel applications running on many processors. TotalView has been tested for use in the HP XC environment. However, it is not included with the HP XC software and technical support is not provided by HP XC. If you install and use TotalView, and have problems with it, contact Etnus, Inc.
3. If TotalView is not installed, have your administrator install it. Then either you or your administrator should set up your environment, as described in the next step. Set the DISPLAY environment variable of the system that hosts TotalView to display on your local system. Also, run the xhost command to accept data from the system that hosts TotalView; see the X(7X) manpage for more information.
4. Set up your environment to run TotalView.
4.2.1.5 Starting TotalView for the First Time This section tells you what you must do when running TotalView for the first time — before you begin to use it to debug an application. The steps in this section assume that you have already set up your environment to run TotalView, as described in Section 4.2.1.2. The first time you use TotalView, you should set up preferences. For example, you need to tell TotalView how to launch TotalView processes on all of the processors.
2. Select Preferences from the File pull-down menu of the TotalView Root Window. A Preferences window is displayed, as shown in Figure 4-2.
3. In the Preferences window, click on the Launch Strings tab.
4. In the Launch Strings tab, ensure that the Enable single debug server launch button is selected. 5. In the Launch Strings table, in the area immediately to the right of Command:, assure that the default command launch string shown is the following string: %C %R -n "%B/tvdsvr -working_directory %D -callback %L -set_pw %P -verbosity %V %F" If it is not the above string, you may be able to obtain this setting by pressing the Defaults button.
6. In the Preferences window, click on the Bulk Launch tab. Make sure that Enable debug server bulk launch is not selected. 7. Click on the OK button at the bottom-left of the Preferences window to save these changes. The file is stored in the .totalview directory in your home directory. As long as the file exists, you can omit the steps in this section for subsequent TotalView runs. 8. Exit TotalView by selecting Exit from the File pulldown menu.
3. The TotalView main control window, called the TotalView root window, is displayed. It displays the following message in the window header: Etnus TotalView Version# 4. The TotalView process window is displayed (Figure 4-3). This window contains multiple panes that provides various debugging functions and debugging information. The name of the application launcher that is being used (either srun or mpirun) is displayed in the title bar. Figure 4-3: TotalView Process Window Example 5.
7. Click Yes in this pop-up window. The TotalView root window appears and displays a line for each process being debugged. If you are running Fortran code, another pop-up window may appear with the following warning: Sourcefile initfdte.f was not found, using assembler mode. Click OK to close this pop-up window . You can safely ignore this warning. 8. 9. You can now set a breakpoint somewhere in your code. The method to do this may vary slightly between versions of TotalView. For TotalView Version 6.
5. In a few seconds, the TotalView Process Window will appear, displaying information on the srun process. In the TotalView Root Window, click Attached (Figure 4-5). Double-click one of the remote srun processes to display it in the TotalView Process Window. Figure 4-5: Attached Window 6. At this point, you should be able to debug the application as in Step 8 of Section 4.2.1.6. 4.2.1.8 Exiting TotalView It is important that you make sure your job has completed before exiting TotalView.
5 Tuning Applications This chapter discusses how to tune applications in the HP XC environment. 5.1 Using the Intel Trace Collector/Analyzer This section describes how to use the Intel Trace Collector (ITC) and Intel Trace Analyzer (ITA) with HP-MPI on an HP XC system. The Intel Trace Collector/Analyzer were formerly known as VampirTrace and Vampir, respectively. The following topics are discussed in this section: • Building a Program (Section 5.1.1) • Running a Program (Section 5.1.
CLDFLAGS = -static-libcxa -L$(VT_ROOT)/lib $(TLIB) -lvtunwind \
           -ldwarf -lnsl -lm -lelf -lpthread
FLDFLAGS = -static-libcxa -L$(VT_ROOT)/lib $(TLIB) -lvtunwind \
           -ldwarf -lnsl -lm -lelf -lpthread

In the cases where Intel compilers are used, add the -static-libcxa option to the link line. Otherwise the following type of error will occur at run-time:

$ mpirun.mpich -np 2 ~/examples_directory/vtjacobic
~/examples_directory/vtjacobic: error while loading shared libraries: libcprts.so.
6 Using SLURM 6.1 Introduction HP XC uses the Simple Linux Utility for Resource Management (SLURM) for system resource management and job scheduling. SLURM is a reliable, efficient, open source, fault-tolerant, job and compute resource manager with features that make it suitable for large-scale, high performance computing environments. SLURM can report on machine status, perform partition management, job management, and job scheduling.
Table 6-1: SLURM Commands (cont.) Command Function sinfo Reports the state of partitions and nodes managed by SLURM. It has a wide variety of filtering, sorting, and formatting options. sinfo displays a summary of available partition and node (not job) information (such as partition names, nodes/partition, and CPUs/node). scontrol Is an administrative tool used to view or modify the SLURM state. Typically, users do not need to access this command.
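For example, a typical first session might launch a trivial job and then check the queue and partition state (the hostname command stands in for a real application):

$ srun -n4 hostname
$ squeue
$ sinfo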
6.4.1.1 srun Roles

srun options allow you to submit a job by:
• Specifying the parallel environment for your job, such as the number of nodes to use, partition, distribution of processes among nodes, and maximum time.
• Controlling the behavior of your parallel job as it runs, such as by redirecting or labeling its output, sending it signals, or specifying its reporting verbosity.
This command forwards the standard output and error messages from the running job with SLURM ID 6543 to the attaching srun command to reveal the job’s current status, and (with -j) also joins the job so that you can send it signals as if this srun command had initiated the job. Omit -j for read-only attachments. Because you are attaching to a running job whose resources have already been allocated, the srun resource-allocation options (such as -N) are incompatible with -a.
If you specify a script at the end of the srun command line (not as an argument to -A), the spawned shell executes that script using the allocated resources (interactively, without a queue). See the -b option for script requirements. If you specify no script, you can then execute other instances of srun interactively, within the spawned subshell, to run multiple parallel jobs on the resources that you allocated to the subshell.
Each partition’s node limits supersede those specified by -N. Jobs that request more nodes than the partition allows never leave the PENDING state. To use a specific partition, use the srun -p option. Combinations of -n and -N control how job processes are distributed among nodes according to the following srun policies: -n/-N combinations srun infers your intended number of processes per node if you specify both the number of processes and the number of nodes for your job.
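For example, assuming two-processor compute nodes, the first command below lets SLURM decide how many nodes to use for eight processes, while the second explicitly spreads the eight processes across four nodes, two per node (hostname stands in for a real application):

$ srun -n8 hostname
$ srun -n8 -N4 hostname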
6.4.5 srun Control Options srun control options determine how a SLURM job manages its nodes and other resources, what its working features (such as job name) are, and how it gives you help. Separate "constraint" options and I/O options are available and are described in other sections of this chapter. The following types of control options are available: • Node management • Working features • Resource control • Help options 6.4.5.
-J jobname (--job-name=jobname) The -J option specifies jobname as the identifying string for this job (along with its system-supplied job ID, as stored in SLURM_JOBID) in responses to your queries about job status (the default jobname is the executable program’s name). -v (--verbose) The -v option reports verbose messages as srun executes your job. The default is program output with only overt error messages added. Using multiple -v options further increases message verbosity. 6.4.5.
commands let you choose from among any of five I/O redirection alternatives (modes) that are explained in the next section. -o mode (--output=mode) The -o option redirects standard output stdout for this job to mode, one of five alternative ways to display, capture, or subdivide the job’s I/O, explained in the next section. By default, srun collects stdout from all job tasks and line buffers it to the attached terminal.
You can use a parameterized "format string" to systematically generate unique names for (usually) multiple I/O files, each of which receives some job I/O depending on the naming scheme that you choose. You can subdivide the received I/O into separate files by job ID, step ID, node (name or sequence number), or individual task. In each case, srun opens the appropriate number of files and associates each with the appropriate subset of tasks.
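As a sketch, the following command writes each task's standard output to its own file; it assumes the %j and %t format specifiers expand to the job ID and task ID, as described in the srun(1) manpage:

$ srun -n4 -o out.%j.%t ./a.out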
--contiguous=yes|no The --contiguous option specifies whether or not your job requires a contiguous range of nodes. The default is YES, which demands contiguous nodes, while the alternative (NO) allows noncontiguous allocation. --mem=size The -mem option specifies a minimum amount of real memory per node, where size is an integer number of megabytes. See also -vmem. --mincpus=n The -mincpus option specifies a minimum number n of CPUs per node.
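For example, the constraint options described above can be combined on one command line. The following illustrative request asks for nodes that have at least two CPUs and 1024 MB of memory, allocated as a contiguous range:

$ srun -n8 --mincpus=2 --mem=1024 --contiguous=yes ./a.out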
6.4.8 srun Environment Variables Many srun options have corresponding environment variables. An srun option, if invoked, always overrides (resets) the corresponding environment variable (which contains each job feature’s default value, if there is a default). In addition, srun sets the following environment variables for each executing task on the remote compute nodes: SLURM_JOBID Specifies the job ID of the executing job. SLURM_NODEID Specifies the relative node ID of the current node.
The squeue command can report on jobs in the job queue according to their state; valid states are: pending, running, completing, completed, failed, timeout, and node_fail. Example 6-3 uses the squeue command to report on failed jobs.

Example 6-3: Reporting on Failed Jobs in the Queue

$ squeue --state=FAILED
JOBID  PARTITION  NAME      USER  ST  TIME  NODES  NODELIST
59     amt1       hostname  root  F   0:00  0

6.6 Killing Jobs with the scancel Command

The scancel command cancels a pending or running job or job step.
Example 6-8: Reporting Reasons for Downed, Drained, and Draining Nodes

$ sinfo -R
REASON           NODELIST
Memory errors    dev[0,5]
Not Responding   dev8

6.8 Job Accounting

HP XC System Software provides an extension to SLURM for job accounting. The sacct command displays job accounting data in a variety of forms for your analysis. Job accounting data is stored in a log file; the sacct command filters that log file to report on your jobs, jobsteps, status, and errors.
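For example, to see the accounting records for a single job, pass its SLURM job ID to sacct (the job ID shown is illustrative; see the sacct manpage for the full set of reporting options):

$ sacct -j 1234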
7 Using LSF The Load Sharing Facility (LSF) from Platform Computing Corporation is a batch system resource manager used on the HP XC system. LSF is included with HP XC, and is an integral part of the HP XC environment. On an HP XC system, a job is submitted to LSF, which places the job in a queue and allows it to run when the necessary resources become available. In addition to launching jobs, LSF provides extensive job management and information capabilities.
SLURM views the LSF-HPC system as one large computer with many resources available to run jobs. SLURM does not provide the same amount of information that can be obtained via standard LSF. But on HP XC systems, where the compute nodes have the same architecture and are expected to be allocated solely through LSF on a per-processor or per-node basis, the information provided by SLURM is sufficient and allows the LSF-HPC design to be more scalable and generate less overhead on the compute nodes.
To illustrate how the external scheduler is used to launch an application, consider the following command line, which launches an application on ten nodes with one task per node: $ bsub -n 10 -ext "SLURM[nodes=10]" srun my_app The following command line launches the same application, also on ten nodes, but stipulates that node n16 should not be used: $ bsub -n 10 -ext "SLURM[nodes=10;exclude=n16]" srun my_app 7.1.
queue contains the job starter script, but the unscripted queue does not have the job starter script configured. Example 7-1: Comparison of Queues and the Configuration of the Job Starter Script $ bqueues -l normal | grep JOB_STARTER JOB_STARTER: /opt/hptc/lsf/bin/job_starter.sh $ bqueues -l unscripted | grep JOB_STARTER JOB_STARTER: $ bsub -Is hostname Job <66> is submitted to the default queue . <> <
Figure 7-1: How LSF-HPC and SLURM Launch and Manage a Job

(The figure traces a job submitted with bsub -n4 -ext "SLURM[nodes=4]" -o output.out ./myscript from a login node to the LSF execution host lsfhost.localdomain, where the job_starter.sh script uses srun to start myscript with SLURM_JOBID=53 and SLURM_NPROCS=4; the srun and mpirun -srun commands inside myscript then run tasks on compute nodes n1 through n4.)
4. LSF-HPC prepares the user environment for the job on the LSF-HPC execution host node and dispatches the job with the job_starter.sh script. This user environment includes standard LSF environment variables and two SLURM-specific environment variables: SLURM_JOBID and SLURM_NPROCS. SLURM_JOBID is the SLURM job ID of the job. Note that this is not the same as the LSF jobID. SLURM_NPROCS is the number of processors allocated.
• LSF does not support chunk jobs. If a job is submitted to chunk queue, SLURM will let the job pend. • LSF does not support topology-aware advanced reservation scheduling. 7.1.6 Notes About Using LSF in the HP XC Environment This section provides some additional information that should be noted about using LSF in the HP XC Environment. 7.1.6.1 Job Startup and Job Control When LSF starts a SLURM job, it sets SLURM_JOBID to associate the job with the SLURM allocation.
The following example shows the output from the bhosts command:

$ bhosts
HOST_NAME            STATUS  JL/U  MAX  NJOBS  RUN  SSUSP  USUSP  RSV
lsfhost.localdomain  ok      -     16   0      0    0      0      0

Of note in the bhosts output:
• The HOST_NAME column displays the name of the LSF execution host.
• The MAX column displays the total processor count (usable CPUs) of all available computer nodes in the lsf partition.
• The STATUS column shows the state of LSF and displays a status of either ok or closed.
See the OUTPUT section of the lsload manpage for further information about the output of this example. In addition, refer to the Platform Computing Corporation LSF documentation and the lsload manpage for more information about the features of this command. 7.3.4 Checking LSF System Queues All jobs on the HP XC system that are submitted to LSF-HPC are placed into an LSF job queue.
The basic synopsis of the bsub command is: bsub [ bsub_options] jobname [ job_options] The HP XC system has several features that make it optimal for running parallel applications, particularly (but not exclusively) MPI applications. You can use the bsub command’s -n to request more than one CPU for a job. This option, coupled with the external SLURM scheduler, discussed in Section 7.4.2, gives you much flexibility in selecting resources and shaping how the job is executed on those resources.
additional capabilities at the job level and queue level by allowing the inclusion of several SLURM options in the LSF command line. Refer to Section 7.4.2.

7.4.2 LSF-SLURM External Scheduler

An important option that can be included when submitting parallel jobs with LSF is the external scheduler option. The external scheduler option provides application-specific external scheduling capabilities for jobs and enables the inclusion of several SLURM options in the LSF command line.
Example 7-2: Using the External Scheduler to Submit a Job to Run on Specific Nodes $ bsub -n4 -ext "SLURM[nodelist=n6,n8]" -I srun hostname Job <70> is submitted to default queue . <> <> n6 n6 n8 n8 In the previous example, the job output shows that the job was launched from the LSF execution host lsfhost.localdomain, and it ran on four nodes using the specified nodes n6 and n8 as two of the four nodes.
This example runs the job exactly the same as in Example 2, but additionally requests that node n3 not be used to run the job. Note that this command could have been written to exclude additional nodes.

7.4.3 Submitting a Serial Job

The synopsis of the bsub command to submit a serial (single-CPU) job to LSF-HPC is:

bsub [bsub-options] [srun [srun-options]] jobname [job-options]

The bsub command launches the job.
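For example, the following command submits the hostname command as a simple interactive serial job; LSF-HPC allocates one processor and srun runs the command on it:

$ bsub -I srun hostname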
The srun command, used by the mpirun command to launch the MPI tasks in parallel, determines the number of tasks to launch from the SLURM_NPROCS environment variable that was set by LSF-HPC. Recall that the value of this environment variable is equivalent to the number provided by the -n option of the bsub command. Consider an HP XC system configuration in which lsfhost.localdomain is the LSF execution host and nodes n[1-10] are compute nodes in the lsf partition.
7.4.6.1 Examples Consider an HP XC system configuration in which lsfhost.localdomain is the LSF execution host and nodes n[1-10] are compute nodes in the lsf partition. All nodes contain 2 processors, providing 20 processors for use by LSF jobs. Example 7-8 displays, then runs, a simple batch script. Example 7-8: Submitting a Batch Job Script $ cat ./myscript.sh #!/bin/sh srun hostname mpirun -srun hellompi $ bsub -n4 -I ./myscript.sh Job <78> is submitted to default queue . <
Example 7-11: Submitting a Batch job Script That Uses the srun --overcommit Option $ bsub -n4 -I ./myscript.sh "-n8 -O" Job <81> is submitted to default queue . <> <
The following example shows this resource requirement string in an LSF command: $ bsub -R "type=SLINUX64" -n4 -I srun hostname 7.5 Getting Information About Jobs There are several ways you can get information about a specific job after it has been submitted to LSF. This section briefly describes some of the commands that are available under LSF to gather information about a job. This section is not intended as complete information about this topic.
EXTERNAL MESSAGES: MSG_ID FROM 0 1 lsfadmin POST_TIME MESSAGE ATTACHMENT date and time stamp SLURM[nodes=4] N In particular, note the node and job allocation information provided in the above output: date and time stamp: Started on 4 Hosts/Processors <4*lsfhost.localdomain>; date and time stamp: slurm_id=22;ncpus=8;slurm_alloc=n[5-8]; 7.5.1.
Example 7-14: Using the bjobs Command (Long Output) $ bjobs -l 24 Job <24>, User ,Project ,Status , Queue , Interactive pseudo-terminal shell mode, Extsched , Command date and time stamp: Submitted from host , CWD <$HOME>, 4 Processors Requested, Requested Resources ; date and time stamp: Started on 4 Hosts/Processors <4*lsfhost.
To get detailed information about a finished job, add the -l option to the bhist command, shown in Example 7-16. The -l option specifies that the long format is requested.
$ bsub -Is -n4 -ext "SLURM[nodes=4]" /usr/bin/xterm Job <101> is submitted to default queue . <> <> n1 At this time an xterm terminal window appears on your display. The xterm program runs on the first node in the allocation. You can execute multiple srun and mpirun commands from this terminal; they will make use of the resources that were reserved by LSF-HPC. The following examples are from an interactive session.
Example 7-20: View Job Details in LSF (cont.) , 4 Processors Requested; date and time stamp: Dispatched to 4 Hosts/Processors <4*lsfhost.
comfortable interactive session, but every job submitted to this queue is executed on the LSF execution host instead of the first allocated node. Example 7-23 shows this subtle difference. Note that the LSF execution host in this example is n20: Example 7-23: Submitting an Interactive Shell Program on the LSF Execution Host $ bsub -Is -n4 -ext "SLURM[nodes=4]" -q noscript /bin/bash Job <96> is submitted to default queue
Table 7-2: LSF Equivalents of SLURM srun Options (cont.)

-w, --nodelist=node1,..nodeN
Request a specific list of nodes. The job will at least contain these nodes. The list may be specified as a comma-separated list of nodes or a range of nodes. By default, the job does not require specific nodes.
LSF equivalent: -ext "SLURM[nodelist=node1,..nodeN]"

-x, --exclude=node1,..nodeN
Requests that a specific list of hosts not be included in the resources allocated to this job.

-r, --relative=n
Run a job step relative to node n of the current allocation. It is about placing tasks within an allocation. Use when launching parallel tasks.

-D, --chdir=path
Specify the working directory of the job. The job will be started in the job submission directory by default.

-k, --no-kill
Do not automatically terminate a job if one of the nodes it has been allocated fails.
8 Using HP-MPI This chapter describes how to use HP-MPI in the HP XC environment. The main focus of this chapter is to help you to quickly get started using HP-MPI on an HP XC system. In this chapter, the basics of getting started are demonstrated. The semantics of building and running a simple MPI program are described for single-host and multiple-host systems. In addition, you are shown how to configure your environment before running your program.
HP-MPI on the HP XC system, last-minute changes to HP-MPI functionality, and known problems and work-arounds, refer to the HP-MPI Release Notes, which are included with the HP XC documentation.

8.2 HP-MPI Directory Structure

All HP-MPI files are stored in the /opt/hpmpi directory. The directory structure is organized as described in Table 8-1. If you move the HP-MPI installation directory from its default location in /opt/hpmpi, set the MPI_ROOT environment variable to point to the new location.
parallelism. For information about running more complex applications, refer to the HP-MPI user documentation. 8.3.2.1 Example Application hello_world To quickly become familiar with compiling and running HP-MPI programs, start with the C version of a familiar hello_world program. This program is called hello_world.c and prints out the text string “Hello world! I’m r of s on host”.
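The copy of hello_world.c shipped with HP-MPI may differ in detail, but a minimal MPI program that produces this kind of output looks like the following sketch:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size, len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* rank of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
    MPI_Get_processor_name(name, &len);     /* host running this rank */

    printf("Hello world! I'm %d of %d on %s\n", rank, size, name);

    MPI_Finalize();
    return 0;
}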
Hello world! I’m 1 of 4 on host1
Hello world! I’m 3 of 4 on host2
Hello world! I’m 0 of 4 on host1
Hello world! I’m 2 of 4 on host2

8.3.3 Using srun with HP-MPI

The SLURM srun utility (srun command) is used with the mpirun command to launch an MPI job on an HP XC system. The following is the general format of the mpirun command with srun:

mpirun [mpirun options] -srun [srun options]

This method runs with no restrictions on MPI-2 functionality.
• The following command runs a.out with four ranks, two ranks per node, ranks are block allocated, and two nodes are used:

$ mpirun -srun -n4 ./a.out
host1 rank1
host1 rank2
host2 rank3
host2 rank4

• The following command runs a.out with six ranks (oversubscribed), three ranks per node, ranks are block allocated, and two nodes are used:

$ mpirun -srun -n6 -O -N2 -m block ./a.out
host1 rank1
host1 rank2
host1 rank3
host2 rank4
host2 rank5
host2 rank6

• The following example runs a.
Example 8-1 displays how to perform a system interconnect selection.

Example 8-1: Performing System Interconnect Selection

% export MPI_IC_ORDER="elan:TCP:gm:itapi"
% export MPIRUN_SYSTEM_OPTIONS="-subnet 192.168.1.1"
% export MPIRUN_OPTIONS="-prot"
% mpirun -srun -n4 ./a.out

The command line for the above will appear to mpirun as:

$ mpirun -subnet 192.168.1.1 -prot -srun -n4 ./a.out

The system interconnect decision will look for the presence of Elan and use it if found.
Example 8-5: Allocating 12 Processors on 6 Nodes

$ bsub -I -n12 $MPI_ROOT/bin/mpirun -srun -n6 -N6 ./a.out

Note that LSF jobs can be submitted without the -I (interactive) option.

8.3.5 MPI Versioning

The mpirun command includes an option to print the version number. The -version option used with mpirun displays the major and minor version numbers. The mpi.h header includes matching constants as HP_MPI and HP_MPI_MINOR.
If you would like to see the effects of using the TCP/IP protocol over a higher-speed system interconnect, use the -TCP option and omit the -subnet option. Generally, performance as measured by Pallas will be roughly 40% to 50% slower using TCP/IP over Elan, GM, or IT-API. 8.4.
8.8 The mpirun Command Options HP-MPI on the HP XC system provides the following additional mpirun command line options: -srun The -srun option is required in mpirun command in the HP XC environment. The preferred method for startup for HP XC is: mpirun mpirun options -srun srun options Starting up directly from srun is not supported. In this context, mpirun sets a few environment variables and invokes /opt/hptc/bin/srun.
8.9 Environment Variables HP-MPI on HP XC provides the following additional environment variables: 8.9.1 MPIRUN_OPTIONS MPIRUN_OPTIONS is a mechanism for specifying additional command line arguments to mpirun. If this environment variable is set, then any mpirun command will behave as if the arguments in MPIRUN_OPTIONS had been specified on the mpirun command line. For example: % export MPIRUN_OPTIONS="-v -prot" % $MPI_ROOT/bin/mpirun -np 2 /path/to/program.
for the purpose of determining how much memory to pin for RDMA message transfers on InfiniBand and Myrinet GM. The value determined by HP-MPI can be displayed using the -dd option. If HP-MPI specifies an incorrect value for physical memory, this environment variable can be used to specify the value explicitly: % export MPI_PHYSICAL_MEMORY=1048576 The above example specifies that the system has 1GB of physical memory. 8.9.
% export MPI_USE_LIBELAN=0 8.9.10 MPI_USE_LIBELAN_SUB The use of Elan’s native collective operations may be extended to include communicators which are smaller than MPI_COMM_WORLD by setting the MPI_USE_LIBELAN_SUB environment variable to “TRUE”. By default, this functionality is disabled due to the fact that libelan memory resources are consumed and may eventually cause run-time failures when too many sub-communicators are created.
Run the resulting prog.x under MPICH. However, various problems will be encountered. First, the MPICH installation will need to be built to include shared libraries and a soft link would need to be created for libmpich.so, since their libraries might be named differently. Next an appropriate LD_LIBRARY_PATH setting must be added manually since MPICH expects the library path to be hard coded into the executable at link time by -rpath.
8.12 Additional Information, Known Problems, and Work-arounds

For additional information, as well as information about known problems and work-arounds, refer to the HP-MPI V2.1 for HP XC4000 and HP XC6000 Clusters Release Note. This document is provided on the HP XC Documentation CD.
9 Using HP MLIB The information in this section describes how to use HP MLIB Version 1.5 in the HP XC environment on HP XC4000 and HP XC6000 clusters. These are discussed in separate sections in this chapter. 9.1 Overview HP MLIB is the mathematical library supported on the HP XC system. It is installed by default.
9.1.2 MLIB and Module Files For building and running an application built against MLIB, you must have a consistent environment. Modulefiles can make it easier to access a package; therefore, if you use modulefiles, be sure to use a consistent set of modulefiles. In particular, modulefiles can be used to select a compiler, both making its command available in the PATH environment variable and making its shared objects available in the LD_LIBRARY_PATH environment variable.
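For example, before compiling against MLIB you might load a consistent set of modulefiles from Table 2-1, together with the compiler modulefile that corresponds to the MLIB build you intend to link (the versions shown are illustrative):

$ module load mpi/hp mlib/intel/8.0
$ module list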
9.2.4 Modulefiles and MLIB When building or running an application built against MLIB, it is crucial that the environment is consistent. Modulefiles can make it easier to access a package. Therefore if modulefiles are used, it is necessary to use a consistent set of modulefiles. In particular, modulefiles can be used to select a compiler, both making its command available in $PATH as well as making its shared objects available in $LD_LIBRARY_PATH.
$ mpi90 [options] file ... /opt/mlib/[intel_7.1\intel_8.0]/hpmpi_2.1/lib/64/libscalapack.a \ -openmp $ mpicc [options] file ... /opt/mlib/[intel_7.1\intel_8.0]/hpmpi_2.1/lib/64/libscalapack.a \ -openmp 9.2.6.4 Linking SuperLU_DIST For programs that link SuperLU_DIST, you can specify the entire path of the library file on the compiler command line. You can use the following commands to link SuperLU_DIST: $ mpi90 [options] file ... /opt/mlib/[intel_7.1\intel_8.0]/hpmpi_2.1/lib/64/libsuperlu_dist.
9.3.3 MPI Parallelism Internal parallelism in ScaLAPACK and SuperLU_DIST is implemented using MPI — a portable, scalable programming model that gives distributed-memory parallel programmers a simple and flexible interface for developing parallel applications. 9.3.4 Modulefiles and MLIB When building or running an application built against MLIB, it is crucial that the environment is consistent. Modulefiles can make it easier to access a package.
$ mpicc [options] file ... /opt/mlib/pgi_5.1/hpmpi_2.1/lib/64/libscalapack.a -mp -lpgf90 -lpgf90_rpml -lpgf902 -lpgf90rtl -lpgftnrtl 9.3.5.4 Linking SuperLU_DIST For programs that link SuperLU_DIST, you can specify the entire path of the library file on the compiler command line. You can use the following commands to link SuperLU_DIST: $ mpi90 [options] file ... /opt/mlib/pgi_5.1/hpmpi_2.1/lib/64/libsuperlu_dist.a -mp $ mpicc [options] file ... /opt/mlib/pgi_5.1/hpmpi_2.1/lib/64/libsuperlu_dist.
10 Advanced Topics This chapter covers topics intended for the advanced user. The following topics are discussed: • Enabling remote execution with OpenSSH (Section 10.1) • Running an X terminal session from a remote node (Section 10.2) 10.1 Enabling Remote Execution with OpenSSH To reduce the risk of network attacks and increase the security of your HP XC system, the traditional rsh, rlogin, and telnet tools are disabled by default, and OpenSSH is provided instead.
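The details depend on your site's security policy, but a common way to enable password-free ssh between nodes looks like the following sketch (the node name n15 is only an example; consult your system administrator before changing authentication settings):

$ ssh-keygen -t rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 600 ~/.ssh/authorized_keys
$ ssh n15 hostname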
Next, get the name of the local machine serving your display monitor: $ hostname mymachine Then, use the host name of your local machine to retrieve its IP address: $ host mymachine mymachine has address 14.26.206.134 Step 2. Logging in to HP XC System Next, you need to log in to a login node on the HP XC system. For example: $ ssh user@xc-node-name Once logged in to the HP XC system, you can start an X terminal session using SLURM or LSF. Both methods are described in the following sections. Step 3.
Step 4. Running an X terminal Session Using LSF This section shows how to create an X terminal session on a remote node using LSF. In this example, suppose that you want to use LSF to reserve 4 processors (2 nodes) and start an X terminal session on one of them. First, check the available nodes on the HP XC system.
A Examples This appendix provides examples that illustrate how to build and run applications on the HP XC system. The examples in this section show you how to take advantage of some of the many methods available, and demonstrate a variety of other user commands to monitor, control, or kill jobs. The examples in this section assume that you have read the information in previous chapters describing how to use the HP XC commands to build and run parallel applications.
steps through a series of commands that illustrate what occurs when you launch an interactive shell. Check LSF execution host information: $ bhosts HOST_NAME lsfhost.
View the job: $ bjobs -l 8 Job <8>, User , Project , Status , Queue , Interactive mode, Extsched , Command date and time stamp: Submitted from host , CWD <$HOME>, 2 Processors Requested; date and time stamp: Started on 2 Hosts/Processors <2*lsfhost.localdomain>; date and time stamp: slurm_id=24;ncpus=4;slurm_alloc=n[13-14]; date and time stamp: Done successfully. The CPU time used is 0.0 seconds.
steps through a series of commands that illustrate what occurs when you launch an interactive shell. Check LSF execution host information: $ bhosts HOST_NAME STATUS lsfhost.
Exit from the shell: $ exit exit Check the finished job’s information: $ bhist -l 124 Job <124>, User , Project , Interactive pseudo-terminal shell mode, Extsched , Command date and time stamp: Submitted from host , to Queue , CWD <$HOME>, 4 Processors Requested, Requested Resources ; date and time stamp: Dispatched to 4 Hosts/Processors <4*lsfhost.
<> <>
n14
n14
n16
n16
Linux n14 2.4.21-15.3hp.XCsmp #2 SMP date and time stamp ia64 ia64 ia64 GNU/Linux
Linux n14 2.4.21-15.3hp.XCsmp #2 SMP date and time stamp ia64 ia64 ia64 GNU/Linux
Linux n16 2.4.21-15.3hp.XCsmp #2 SMP date and time stamp ia64 ia64 ia64 GNU/Linux
Linux n16 2.4.21-15.3hp.XCsmp #2 SMP date and time stamp ia64 ia64 ia64 GNU/Linux
Run some commands from the pseudo-terminal: $ srun hostname n13 n13 n14 n14 n15 n15 n16 n16 $ srun -n3 hostname n13 n14 n15 Exit the pseudo-terminal: $ exit exit View the interactive jobs: $ bjobs -l 1008 Job <1008>, User smith, Project , Status , Queue , Interactive pseudo-terminal mode, Command date and time stamp: Submitted from host n16, CWD <$HOME/tar_drop1/test>, 8 Processors Requested; date and time stamp: Started on 8Hosts/Processors<8*lsfhost.
Show the environment: $ lsid Platform LSF HPC 6.0 for SLURM, Sep 23 2004 Copyright 1992-2004 Platform Computing Corporation My cluster name is penguin My master name is lsfhost.localdomain $ sinfo PARTITION lsf $ lshosts HOST_NAME lsfhost.loc AVAIL up TIMELIMIT infinite type SLINUX6 NODES 4 STATE alloc NODELIST n[13-16] model cpuf ncpus maxmem maxswp server RESOURCES DEFAULT 1.0 8 1M Yes (slurm) $ bhosts HOST_NAME STATUS lsfhost.
date and time stamp: Submitted from host , to Queue ,CWD <$HOME>, 6 Processors Requested; date and time stamp: Dispatched to 6 Hosts/Processors <6*lsfhost.localdomain>; date and time stamp: slurm_id=22;ncpus=6;slurm_alloc=n[13-15]; date and time stamp: Starting (Pid 11216); date and time stamp: Done successfully. The CPU time used is 0.
Glossary

A

Administrative Network
The private network within the XC system that is used for administrative operations.

admin branch
The half (branch) of the Administrative Network that contains all of the general-purpose admin ports to the nodes of the XC system.

B

base image
The collection of files and directories that represents the common files and configuration data that are applied to all nodes in an XC system.

branch switch
A component of the Administrative Network.

extensible firmware interface
See EFI.

external network node
A node that is connected to a network external to the XC system.

F

fairshare
An LSF job-scheduling policy that specifies how resources should be shared by competing users. A fairshare policy defines the order in which LSF attempts to place jobs that are in a queue or a host partition.

FCFS
First come first served.

image server
A node specifically designated to hold images that will be distributed to one or more client systems. In a standard XC installation, the head node acts as the image server and golden client.

Integrated Lights Out
See iLO.

interconnect
The private network within the XC system that is used primarily for user file access and for communications within applications. Provides high-speed connectivity between the nodes.

LSF master host
The overall LSF coordinator for the system. The master load information manager (LIM) and master batch daemon (mbatchd) run on the LSF master host. Each system has one master host to do all job scheduling and dispatch. If the master host goes down, another LSF server in the system becomes the master host.

LVS
Linux Virtual Server. Provides a centralized login capability for system users. LVS handles incoming login requests and directs them to a node with a login role.

P

parallel application
An application that uses a distributed programming model and can run on multiple processors. An HP XC MPI application is a parallel application. That is, all interprocessor communication within an HP XC parallel application is performed through calls to the MPI message passing library.

PXE
Preboot Execution Environment. A standard client/server interface that allows networked computers that are not yet installed with an operating system to be configured and booted remotely.

symmetric multiprocessing
See SMP.