LSF Version 7.3 - Administering Platform LSF

MAX_INFO_DIRS is defined in lsb.params) for use at dispatch time or if the job
is rerun. The info directory is managed by LSF and should not be modified by
anyone.
Log directory permissions and ownership
Ensure that the LSF_LOGDIR directory is writable by root.
The LSF administrator must own LSF_LOGDIR.
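The two requirements above can be checked with a short script. The following is a minimal sketch, not part of LSF: the function name check_logdir and its arguments are hypothetical, and the caller supplies the directory path and the numeric user ID of the LSF administrator. Because os.access tests the user running the script, run it as root to test root's write access (on some network file systems root may not have it).

```python
import os
import stat


def check_logdir(path, admin_uid):
    """Return a list of problems found with the log directory (empty if none)."""
    problems = []
    try:
        info = os.stat(path)
    except OSError as exc:
        return [f"cannot stat {path}: {exc}"]
    if not stat.S_ISDIR(info.st_mode):
        return [f"{path} is not a directory"]
    # Ownership: the LSF administrator is expected to own the directory.
    if info.st_uid != admin_uid:
        problems.append(
            f"{path} is owned by uid {info.st_uid}, expected {admin_uid}"
        )
    # Write access is tested for the calling user, so run as root to test root.
    if not os.access(path, os.W_OK):
        problems.append(f"{path} is not writable by uid {os.getuid()}")
    return problems
```

An empty list means both checks passed for the given directory.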
Support for UNICOS accounting
In Cray UNICOS environments, LSF writes to the Network Queuing System (NQS)
accounting data file, nqacct, on the execution host. This lets you track LSF
jobs and other jobs together, through NQS.
Support for IRIX Comprehensive System Accounting (CSA)
The IRIX 6.5.9 Comprehensive System Accounting facility (CSA) writes an
accounting record for each process in the pacct file, which is usually located
in the /var/adm/acct/day directory. IRIX system administrators then use the
csabuild command to organize and present the records on a job-by-job basis.
The LSF_ENABLE_CSA parameter in lsf.conf enables LSF to write job events to
the pacct file for processing through CSA. For LSF job accounting, records are
written to pacct at the start and end of each LSF job.
See the Platform LSF Configuration Reference for more information about the
LSF_ENABLE_CSA parameter.
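As a minimal illustration, the setting might appear in lsf.conf as shown here. The value Y is an assumption based on the usual lsf.conf convention for on/off parameters. Confirm the accepted values in the Platform LSF Configuration Reference before using it.

```
# lsf.conf fragment (illustrative; value assumed)
LSF_ENABLE_CSA=Y
```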
See the IRIX 6.5.9 resource administration documentation for information about
CSA.
Managing Error Logs
Error logs maintain important information about LSF operations. When you see
any abnormal behavior in LSF, you should first check the appropriate error logs to
find out the cause of the problem.
LSF log files grow over time. These files should occasionally be cleared, either by
hand or using automatic scripts.
Daemon error logs
LSF log files are reopened each time a message is logged, so if you rename or remove
a daemon log file, the daemons will automatically create a new log file.
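Because a renamed log file is replaced by a fresh one, log files can be archived by renaming them. The following is a minimal sketch of such a script, not an LSF tool: the function name rotate_logs is hypothetical, and the caller supplies the full paths of the log files to archive.

```python
import os
import time


def rotate_logs(paths, suffix=None):
    """Rename each existing log file to '<name>.<suffix>'; return the new paths."""
    # Default suffix is a local timestamp such as 20240131-235959.
    suffix = suffix or time.strftime("%Y%m%d-%H%M%S")
    rotated = []
    for path in paths:
        if not os.path.isfile(path):
            continue  # no log file at this path, nothing to rotate
        target = f"{path}.{suffix}"
        if os.path.exists(target):
            raise FileExistsError(target)  # never overwrite an older archive
        os.rename(path, target)
        rotated.append(target)
    return rotated
```

For example, rotate_logs(["/path/to/logdir/mbatchd.log.hostA"]) would rename that file with a timestamp suffix. The directory and host name here are placeholders. Old archives can then be compressed or deleted on whatever schedule suits the site.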
The LSF daemons log messages when they detect problems or unusual situations.
The daemons can be configured to put these messages into files.
The error log file names for the LSF system daemons are:
res.log.host_name
sbatchd.log.host_name
mbatchd.log.host_name
mbschd.log.host_name