Platform LSF Administration Guide Version 6.2

Chapter 37
Achieving Performance and Scalability
Important
For automatic tuning of the loading interval, make sure the parameter
EXINTERVAL is not defined in the lsf.cluster.cluster_name file. Do not
configure your cluster to load the information at specific intervals.
Managing the I/O performance of the info directory
In large clusters, users submit large numbers of jobs. Since each job generally has a
job file, a large number of job files is stored in the
LSB_SHAREDIR/cluster_name/logdir/info directory at any time. When the
total size of the job files reaches a certain point, you will notice a significant delay when
performing I/O operations in the info directory.
This delay is caused by a limit on the total size of files that can reside in a file server
directory. This limit depends on the file system implementation. A high load on the
file server delays the operations of the master batch daemon (mbatchd), and therefore
slows down the overall cluster throughput.
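To judge whether the info directory is large enough to be a concern, it can help to count
the job files and sum their sizes. The following is a minimal Python sketch, not an LSF
tool. The function name info_dir_usage is hypothetical, and the path you pass to it must
be the actual location of your info directory.

```python
import os

def info_dir_usage(info_dir):
    """Return (file_count, total_bytes) for regular files directly in info_dir.

    Subdirectories are not descended into, so each directory is measured on its own.
    """
    count = 0
    total = 0
    with os.scandir(info_dir) as entries:
        for entry in entries:
            # Count regular files only; skip subdirectories and symbolic links.
            if entry.is_file(follow_symlinks=False):
                count += 1
                total += entry.stat(follow_symlinks=False).st_size
    return count, total
```

Calling info_dir_usage on the info directory returns the number of job files and their
combined size in bytes. Tracking these two numbers over time is one way to see whether
the directory keeps growing.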
You can prevent this delay by creating and using subdirectories under the parent
directory. Each new subdirectory is subject to the file size limit, but the parent directory
is not subject to the total file size of its subdirectories. Since the total file size of the
info directory is divided among its subdirectories, your cluster can process more job
operations before reaching the total size limit of the job files.
If your cluster has many jobs, resulting in a large info directory, you can tune your
cluster by enabling LSF to create subdirectories in the info directory. Use
MAX_INFO_DIRS in lsb.params to create the subdirectories and enable mbatchd to
distribute the job files evenly throughout the subdirectories.
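The effect of spreading job files across subdirectories can be sketched as follows. This
is an illustration only: the text states that mbatchd distributes job files evenly but
does not describe how, so the modulo rule on the job ID used here is an assumption, and
subdir_for_job is a hypothetical helper, not part of LSF.

```python
from collections import Counter

def subdir_for_job(job_id, num_subdirs):
    """Map a job ID to a subdirectory index (assumed modulo rule, not documented LSF behavior)."""
    if not 1 <= num_subdirs <= 1024:
        raise ValueError("num_subdirs must be between 1 and 1024")
    return job_id % num_subdirs

# Spread 10,000 consecutive job IDs over 10 subdirectories and count files per subdirectory.
per_dir = Counter(subdir_for_job(job_id, 10) for job_id in range(1, 10001))
print(max(per_dir.values()))  # largest number of job files in any one subdirectory
```

Under this assumed rule, each of the 10 subdirectories holds 1,000 job files, rather than
a single directory holding all 10,000.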
Syntax
MAX_INFO_DIRS=num_subdirs
Where:
num_subdirs
Specifies the number of subdirectories that you want to create under the
LSB_SHAREDIR/cluster_name/logdir/info directory. Valid values are
integers from 1 to 1024. By default, MAX_INFO_DIRS is not defined.
Run badmin reconfig to create and use the subdirectories.
Duplicate event logging
If you enabled duplicate event logging, you must run badmin mbdrestart
instead of badmin reconfig to restart mbatchd.
Run bparams -l to display the value of the MAX_INFO_DIRS parameter.
Example
MAX_INFO_DIRS=10
mbatchd creates ten subdirectories from
LSB_SHAREDIR/cluster_name/logdir/info/0 to
LSB_SHAREDIR/cluster_name/logdir/info/9.
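The example can be reproduced as a short Python sketch that validates the value and lists
the subdirectory paths named in the text. The helper info_subdirs, the sample share
directory /shared/lsf/work, and the cluster name cluster1 are hypothetical. The sketch
only builds path strings and does not create anything on disk.

```python
import posixpath

def info_subdirs(sharedir, cluster_name, max_info_dirs):
    """List the info subdirectory paths expected for a given MAX_INFO_DIRS value."""
    if not isinstance(max_info_dirs, int) or not 1 <= max_info_dirs <= 1024:
        raise ValueError("MAX_INFO_DIRS must be an integer between 1 and 1024")
    info = posixpath.join(sharedir, cluster_name, "logdir", "info")
    # Subdirectories are numbered from 0 to max_info_dirs - 1, as in the example above.
    return [posixpath.join(info, str(i)) for i in range(max_info_dirs)]

paths = info_subdirs("/shared/lsf/work", "cluster1", 10)
print(paths[0])   # /shared/lsf/work/cluster1/logdir/info/0
print(paths[-1])  # /shared/lsf/work/cluster1/logdir/info/9
```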