
.
.
Can't propagate RLIMIT_CORE of 100000 from submit host.
For more information, see slurm.conf(5).
Restricting User Access to Nodes
Although full user authentication is required on every node so that SLURM can launch jobs, and although
this access is beneficial for users who need to debug their applications, it can be a problem because one
user could adversely affect the performance of another user's job: a user could log in to any compute node
and steal processor cycles from any job running on that node. The solution is to restrict user access to
only the nodes that they reserve.
A Pluggable Authentication Module for use with SLURM (pam_slurm) is supplied with HP XC System Software
to verify access. This module is designed to work within the Linux Pluggable Authentication Module (PAM)
framework and checks with SLURM before authorizing user access to the local node.
The pam_slurm module is disabled by default, but you can enable it to restrict the use of a particular node
to only one reserved user at a time. Before you enable this module, you must have a login role defined on
at least one node.
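For reference, pam_slurm is typically enabled by adding an account entry to the PAM configuration of the
login service on each compute node. The following sketch assumes the module is applied to the sshd
service; the exact file and module path on an HP XC system may differ, and the cluster_config utility
manages these entries for you:
# grep pam_slurm /etc/pam.d/sshd
account    required     pam_slurm.so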
During its operation, the cluster_config utility asks whether to enable the pam_slurm module and
restrict user access to the compute nodes. If you choose to enable the pam_slurm module, the
cluster_config utility confirms which nodes should have pam_slurm enabled. Make sure that login nodes
are removed from this list; otherwise, no one will be able to log in to those nodes.
Restricted access does not apply to the superuser (root user).
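When pam_slurm is enabled, a user who has no job allocation on a node is refused at login with a message
similar to the following. The node name, user name, and exact wording shown here are illustrative; the
message text depends on the pam_slurm version:
$ ssh n100
Access denied: user smith (uid=1234) has no active jobs on this node.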
If you need to disable the pam_slurm module on a node temporarily, invoke the configpam.pl script with
the unconfigure option. The following example temporarily disables pam_slurm on node n100:
# pdsh -w n100 '/opt/hptc/slurm/sbin/configpam.pl unconfigure'
You must rerun the cluster_config utility to permanently reconfigure which nodes have the pam_slurm
module enabled.
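To check which nodes currently have the module enabled, you can search the PAM configuration across a
set of nodes. The file searched here (/etc/pam.d/sshd) and the node range are assumptions; adjust them
to match where pam_slurm is configured on your system:
# pdsh -w n[100-105] 'grep pam_slurm /etc/pam.d/sshd'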
Job Accounting
SLURM on the HP XC system can collect job accounting data, store it in a log file, and display it. The
accounting data is available only at the conclusion of a job; you cannot obtain it in real time while the job
is running. You must enable this SLURM job accounting capability to support LSF-HPC on HP XC. LSF-HPC
is discussed in the next chapter.
This section briefly describes the sacct command, which you can use to display the stored job accounting
data, and discusses how to deconfigure and configure job accounting.
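For example, after a job completes, you can display its accounting data by job ID, or list the recorded
jobs for a particular user. The job ID and user name below are placeholders, and option spellings can
vary between SLURM releases; see sacct(1):
# sacct -j 1234
# sacct -u smith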
Information on jobs that are invoked with the SLURM srun command is logged automatically into the job
accounting file. Entries in the slurm.conf file enable job accounting and designate the name of the job
accounting log file. The default (and recommended) job accounting log file is
/hptc_cluster/slurm/job/jobacct.log.
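The relevant slurm.conf entries resemble the following sketch. The parameter names and values shown are
assumptions based on SLURM releases of this era and must be checked against slurm.conf(5) for the
version installed on your system:
JobAcctType=jobacct/log
JobAcctLogfile=/hptc_cluster/slurm/job/jobacct.log
JobAcctFrequency=30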
SLURM job accounting attempts to gather all the statistics available on the systems on which it is run.
The following statistics are valid for HP XC systems (an example of displaying them follows this list):
• User processor time
• System processor time
• Maximum number of minor page faults (page reclaims) for any process
• Maximum number of major page faults for any process
• Total number of processes used
• Total number of processors allocated to the job
• Job's elapsed time
• Job status or state (running, completed, failed, timed out, or node fail)
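To display these extended statistics rather than the default summary, sacct provides a long-output
option. The -l flag and job ID shown here are assumptions to verify against sacct(1) on your system:
# sacct -j 1234 -l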