Installation guide
Release Notes
3. Scyld ClusterWare includes the env-modules (environment-modules) package, which enables the dynamic modification
of a user’s environment via modulefiles. Each modulefile contains the information needed to configure the shell for an
application, allowing a user to easily switch between applications with a simple module switch command that resets
environment variables like PATH and LD_LIBRARY_PATH. A number of modules are already installed that configure
application builds and execution with OpenMPI, including jobs submitted through TORQUE. For details on
these modules, see the Programmer’s Guide. For more information about creating your own modules, see
http://modules.sourceforge.net, or view the manpages man module and man modulefile.
4. Scyld ClusterWare now includes pacct, a utility to generate simple reports from the verbose TORQUE log files. There
are two types of log files: the event logs, which record events from each TORQUE daemon, and the accounting logs.
The accounting log files reside by default in the /var/spool/torque/server_priv/accounting/ directory. See
http://www.nacad.ufrj.br/~bino/pbs_acct-e.html for more information about this tool. Note: the Scyld ClusterWare
version of pacct reports total core hours, rather than total node hours.
5. The mvapich package has been renamed to mvapich-scyld.
6. The Pathscale compiler is no longer supported. Accordingly, the mpich, mvapich-scyld, and openmpi-scyld
packages no longer include Pathscale libraries that previously resided in /usr/lib64/MPICH/p4/path/,
/usr/lib64/MPICH/vapi/path/, and /usr/openmpi/path/share/, respectively.
7. The mpich and mvapich-scyld libraries now explicitly limit an application to a maximum of 1000 threads. This is not a
reduction of a previous capability; it is, in fact, a bounds check that recognizes and enforces an existing limitation in the
implementation.
8. In some instances, mpirun -machine vapi was not properly linking the application to the MVAPICH libraries on a
compute node, and was instead mistakenly linking with the default gnu p4 (Ethernet) libraries. This has now been fixed,
in part by replicating the master node’s /usr/lib64/MPICH/ directory structure on each compute node at node startup.
The libraries themselves are only pulled to a compute node if and when they are actually needed.
9. Fixes a bug in which MVAPICH (InfiniBand) applications improperly left lingering application threads running after
the application was supposedly killed by a TORQUE qdel, or after some, but not all, of the application’s threads died
because they were explicitly killed (e.g., using /usr/bin/kill) or terminated abnormally (e.g., with a segmentation
violation).
10. The beonss name space functionality has improved robustness, error reporting via the syslog, and a modest performance
improvement for compute-to-master kickback communication.
11. bpsh (and process migration, in general) now communicates the current umask specification to the compute nodes.
Previously, the umask was ignored, and files created on a compute node defaulted to world-writeable permissions.
12. OpenMPI is upgraded to version 1.3.3.
13. TORQUE is upgraded to version 2.3.7.
14. Scyld ClusterWare’s default port numbers can now be overridden using the server directive in /etc/beowulf/config.
See the Section called Issues with port numbers for details.
15. Scyld ClusterWare’s beoserv daemon now responds to any DHCP request that arrives on the cluster private network.
Previously, beoserv only functioned as a DHCP server for Scyld nodes.
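Item 4 above notes that pacct reports total core hours rather than total node hours. The difference can be illustrated with a minimal Python sketch. This is not pacct's implementation. It assumes a TORQUE accounting record of the usual semicolon-delimited form with space-separated key=value attributes, and it assumes that exec_host lists one node/slot entry per allocated core. The sample record, job ID, and function names are hypothetical.

```python
def parse_record(line):
    """Split one accounting record into (record_type, job_id, attributes)."""
    # Assumed layout: timestamp;record_type;job_id;key=value key=value ...
    _timestamp, record_type, job_id, message = line.strip().split(";", 3)
    attrs = dict(field.split("=", 1) for field in message.split() if "=" in field)
    return record_type, job_id, attrs


def hms_to_hours(hms):
    """Convert an HH:MM:SS walltime string into fractional hours."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours + minutes / 60 + seconds / 3600


def usage_hours(attrs):
    """Return (core_hours, node_hours) for one job-end record."""
    # Assumption: exec_host looks like "n0/0+n0/1+n1/0", one entry per core.
    slots = attrs["exec_host"].split("+")
    cores = len(slots)
    nodes = len({slot.split("/")[0] for slot in slots})
    wall = hms_to_hours(attrs["resources_used.walltime"])
    return cores * wall, nodes * wall


# Hypothetical job-end ("E") record: 2 nodes, 2 cores each, 1.5 hours of walltime.
SAMPLE = (
    "04/30/2009 11:10:27;E;1234.master;user=alice group=users "
    "exec_host=n0/0+n0/1+n1/0+n1/1 "
    "resources_used.walltime=01:30:00"
)

record_type, job_id, attrs = parse_record(SAMPLE)
core_hours, node_hours = usage_hours(attrs)
print(job_id, "core hours:", core_hours, "node hours:", node_hours)
```

For this sample job, the core-hour total is twice the node-hour total because each node contributes two cores.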
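Item 11 above describes files defaulting to world-writeable permissions when the umask was ignored. The effect of a umask on file creation can be sketched locally in Python. This runs on a single POSIX host and does not involve bpsh or process migration. A umask of 000 is used here only to imitate the earlier behavior, and the helper name and file names are hypothetical.

```python
import os
import stat
import tempfile


def create_with_umask(directory, name, umask):
    """Create a file requesting mode 0666 under the given umask; return the resulting mode."""
    previous = os.umask(umask)
    try:
        path = os.path.join(directory, name)
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o666)
        os.close(fd)
        return stat.S_IMODE(os.stat(path).st_mode)
    finally:
        # Restore the caller's umask so the sketch has no lasting side effect.
        os.umask(previous)


with tempfile.TemporaryDirectory() as tmp:
    # A umask of 000 masks nothing, which imitates an ignored umask.
    ignored = create_with_umask(tmp, "ignored", 0o000)
    # A typical umask of 022 removes group and world write permission.
    honored = create_with_umask(tmp, "honored", 0o022)

print("umask ignored:", oct(ignored), "umask honored:", oct(honored))
```

With the umask honored, the file is created without group or world write permission, which is the behavior a user expects to carry over to the compute nodes.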
Known Issues And Workarounds
The following are significant known issues with the latest version of Scyld ClusterWare 4.9.0, together with suggested
workarounds.