HP XC System Software Administration Guide Version 3.1

Table 14-4 Output of the sinfo command for Various Transitions
Transition Cause: Transient network congestion.
  sinfo shows:  Meaning:
  alloc         The node is running a job.
  alloc*        The slurmctld daemon has lost contact with the node.
  alloc         Contact between the node and the slurmctld daemon has been restored.

Transition Cause: Node fails while no job is running on the node.
  sinfo shows:  Meaning:
  idle          The node is ready to accept a job.
  idle*         The slurmctld daemon lost contact with the node.
  down*         The slurmctld daemon has removed the node from service (see sinfo -R).
  idle          The node has been returned to service.

Transition Cause: Node fails while a job is running on the node.
  sinfo shows:  Meaning:
  alloc         The node is running a job.
  alloc*        The slurmctld daemon lost contact with the node.
  down*         The slurmctld daemon has removed the node from service (see sinfo -R).
  idle          The node has been returned to service.

Transition Cause: The System Administrator sets the node state to down.
  sinfo shows:  Meaning:
  idle          The node is ready to accept a job.
  down          The slurmctld daemon has removed the node from service.
  down*         The slurmctld daemon lost contact with the node (see sinfo -R).
  idle          The node has been returned to service.

Transition Cause: The System Administrator sets the node state to drain while a job is running on the node.
  sinfo shows:  Meaning:
  alloc         The node is running a job.
  drng          SLURM is waiting for the job or jobs to finish.
  drain         SLURM removed the node from service.
  drain*        The slurmctld daemon lost contact with the node (see sinfo -R).
  idle          The node has been returned to service.

Transition Cause: The System Administrator sets the node state to drain while no job is running on the node.
  sinfo shows:  Meaning:
  idle          The node is ready to accept a job.
  drain         SLURM removed the node from service.
  drain*        The slurmctld daemon lost contact with the node (see sinfo -R).
  idle          The node has been returned to service.

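The state strings in Table 14-4 follow a simple pattern: a base state, optionally followed by an asterisk when the slurmctld daemon has lost contact with the node. The following is a minimal sketch of how a script might decode such a string. The function name describe_state is hypothetical and is not part of SLURM or HP XC; only the states listed in the table are handled.

```shell
#!/bin/sh
# Sketch: interpret a node state string as printed by sinfo, per Table 14-4.
# describe_state is a hypothetical helper, not a SLURM or HP XC command.

describe_state() {
    state=$1
    contact="responding"
    # A trailing asterisk means slurmctld has lost contact with the node.
    case $state in
        *\*)
            contact="not responding"
            state=${state%\*}      # strip the trailing asterisk
            ;;
    esac
    # Map the base state to the meaning given in the table.
    case $state in
        alloc) base="running a job" ;;
        idle)  base="ready to accept a job" ;;
        down)  base="removed from service" ;;
        drng)  base="draining, waiting for jobs to finish" ;;
        drain) base="drained, removed from service" ;;
        *)     base="state not covered by Table 14-4" ;;
    esac
    printf '%s (%s)\n' "$base" "$contact"
}

# Example call with a literal state string:
describe_state 'alloc*'    # prints: running a job (not responding)
```

On a running system, the state string would come from sinfo output rather than a literal; how that output is split into node name and state depends on the sinfo format options in use, so that part is left out of the sketch.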
14.7 Configuring the SLURM Epilog Script
SLURM can automatically kill rogue processes at the end of a job by using an epilog script.
When configured, the SLURM epilog script is launched after the user's job on the node completes. This
script checks whether the user has another job assigned to this node and, if not, sends a SIGKILL signal to
all the processes that belong to that user on all the nodes in the user's allocation.
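The decision the epilog script makes on each node can be sketched as follows. This is an illustration only, not the epilog script shipped with HP XC: the function names host_is_excluded and epilog_action and their arguments are hypothetical, and the order of the checks is an assumption. The exclusion list stands in for the EPILOG_EXCLUDE_NODES variable described in the NOTE that follows.

```shell
#!/bin/sh
# Sketch of the epilog decision. All function names are hypothetical.

# Return 0 if the host appears in the space-separated list, 1 otherwise.
host_is_excluded() {
    host=$1
    list=$2
    for h in $list; do
        [ "$h" = "$host" ] && return 0
    done
    return 1
}

# Print the action the epilog would take on this node.
#   $1  host name of this node
#   $2  number of other jobs the user still has on this node
#   $3  space-separated exclusion list (stand-in for EPILOG_EXCLUDE_NODES)
epilog_action() {
    if host_is_excluded "$1" "$3"; then
        echo skip      # excluded node: leave the user's processes alone
    elif [ "$2" -gt 0 ]; then
        echo keep      # user has another job here: do not kill anything
    else
        echo kill      # no other job: send SIGKILL to the user's processes
    fi
}

epilog_action n5 0 'n1 n2'    # prints: kill
epilog_action n1 0 'n1 n2'    # prints: skip
```

The sketch separates the decision from the act of killing processes so that the logic can be checked without touching any real processes.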
NOTE: If the user logged in from a node that is also a compute node, the epilog script also terminates
the user's login session. You can avoid this problem by editing the EPILOG_EXCLUDE_NODES variable in the
epilog file; this variable is empty by default. Specify the host names of the login nodes, separated by
spaces, so that the epilog script does not kill the user's processes on those nodes; for example: