HP SVA V2.1 Release Notes
7. Job Launch: SLURM Epilog Script Excessive Clean Up
• Impact
Medium.
• Summary
SVA configures a SLURM epilog script, /opt/sva/sbin/sva_epilog.clean, through the
Epilog entry in slurm.conf. This epilog script runs as root on each node in a job when
the SLURM job terminates.
The epilog script terminates any rgsender processes (started by HP Remote Graphics
Software) on the nodes used in a job, removes any X lock files (for example,
/tmp/.X0-lock and /tmp/.X1-lock), and invokes the optional XC epilog script, which
kills all processes owned by the user who ran the job (unless that user is root or has
a UID less than 100).
This is an effective way of ensuring that all pieces of your job go away when your job
exits. However, any other processes you own on those nodes are also killed, even if
the job did not start them (for example, a debugging session).
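To see which of your processes on a node fall under this cleanup, a check along the
following lines may help. It is only a sketch: the function names uid_is_exempt and
list_at_risk_processes are hypothetical and are not part of SVA, XC, or SLURM, and the
UID threshold of 100 is taken from the description above, not read from the epilog
script itself.

```shell
#!/bin/sh
# Hypothetical helpers; not part of SVA, XC, or SLURM.

# Succeeds if the numeric UID is below 100 (this includes root, UID 0),
# that is, if the cleanup described above would leave its processes alone.
uid_is_exempt() {
    [ "$1" -lt 100 ]
}

# Prints the PID and command name of every process owned by the named user
# on the local node, unless that user's UID is exempt from the cleanup.
list_at_risk_processes() {
    uid=$(id -u "$1") || return 1
    if uid_is_exempt "$uid"; then
        return 0
    fi
    ps -u "$1" -o pid= -o comm=
}
```

For example, running list_at_risk_processes "$(id -un)" on a node in the job lists the
processes that would be killed when the job exits, including any that the job did not
start.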
• Solution
You can disable the epilog by changing /hptc/slurm/etc/slurm.conf and restarting
SLURM on all nodes using this command:
pdsh -a service slurm restart
Be aware that later SVA jobs might not be able to start if earlier jobs leave processes
behind that the epilog does not clean up. If you don't want the SVA-specific behavior
of killing rgsender processes and removing X server lock files, but you do want to kill
processes owned by the user who ran the job, change the following:
Epilog=/opt/sva/sbin/sva_epilog.clean
To:
Epilog=/opt/hptc/slurm/etc/slurm.epilog.clean
If you don't want any of this cleanup behavior, delete the following line:
Epilog=/opt/sva/sbin/sva_epilog.clean
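The two edits above could be scripted roughly as follows. This is a minimal sketch, not
an HP-supplied tool: the function names use_xc_epilog and remove_sva_epilog are
hypothetical, the configuration file is passed as an argument so the functions can be
tried on a copy first, and the sketch assumes the Epilog line appears exactly as shown
above, with nothing else on the line.

```shell
#!/bin/sh
# Hypothetical helpers; each takes the path of a slurm.conf file (or a copy)
# and keeps the previous version next to it with a .bak suffix.

# Replace the SVA epilog with the XC epilog, which only kills the
# job owner's processes.
use_xc_epilog() {
    sed -i.bak \
        's|^Epilog=/opt/sva/sbin/sva_epilog\.clean$|Epilog=/opt/hptc/slurm/etc/slurm.epilog.clean|' \
        "$1"
}

# Delete the SVA epilog line so that no epilog cleanup runs at all.
remove_sva_epilog() {
    sed -i.bak '/^Epilog=\/opt\/sva\/sbin\/sva_epilog\.clean$/d' "$1"
}
```

After applying either function to the real slurm.conf as root, restart SLURM on all
nodes with the pdsh command shown earlier so that the change takes effect.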
8. Job Launch: Scripts Fail in Directories with Spaces in Names
• Impact
Low.
• Summary
Scripts do not work if executed within directories containing spaces in their names.
• Solution
Change to a directory that does not have a space in its name, and run the script from
there.
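A quick way to check a directory before launching a script is sketched below. The
function name dir_has_space is hypothetical and is not part of SVA; it only tests the
path string it is given.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given path contains a space character.
dir_has_space() {
    case $1 in
        *" "*) return 0 ;;
        *)     return 1 ;;
    esac
}
```

For example, the following prints a warning when the current directory is affected:
if dir_has_space "$PWD"; then echo "directory name contains a space" >&2; fi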