Platform LSF Administration Guide Version 6.2

Chapter 37
Achieving Performance and Scalability
Administering Platform LSF
Tuning UNIX for Large Clusters
The following hardware and software specifications are requirements for a large cluster
that supports 5,000 hosts and 100,000 jobs at any one time.
Hardware recommendation
LSF master host:
3 GHz CPU speed
4 CPUs, one each for mbatchd, mbschd, lim, and the operating system
10 GB RAM
Software requirement
To meet the performance requirements of a large cluster, increase the file descriptor
limit of the operating system.
The file descriptor limit of most operating systems used to be fixed, with a limit of 1024
open files. Some operating systems, such as Linux and AIX, have removed this limit,
allowing you to increase the number of file descriptors.
Increase the file descriptor limit
For LSF to perform efficiently, follow the instructions in your operating system
documentation to increase the number of file descriptors on the LSF master host.
The following is an example configuration. The exact instructions vary by operating
system, kernel, and shell. You may have already configured the host to use the
maximum number of file descriptors allowed by the operating system. On some
operating systems, the limit is configured dynamically.
Tip
To optimize your configuration, set your file descriptor limit to a value at least as high
as the number of hosts in your cluster.
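The tip can be checked with a small shell test. The following is a minimal sketch, not part of LSF: the function name fd_limit_ok and the variable cluster_hosts are hypothetical, and the host count of 5000 is taken from the example cluster size used in this section. It compares the per-process file descriptor limit of the current shell against the number of hosts.

```shell
# Hypothetical helper (not an LSF command): succeeds when the file
# descriptor limit is at least as high as the number of hosts.
#   $1 = file descriptor limit (a number, or the word "unlimited")
#   $2 = number of hosts in the cluster
fd_limit_ok() {
    fd_limit=$1
    host_count=$2
    # Some shells report "unlimited" instead of a number.
    if [ "$fd_limit" = "unlimited" ]; then
        return 0
    fi
    [ "$fd_limit" -ge "$host_count" ]
}

# Assumed cluster size; replace with the number of hosts in your cluster.
cluster_hosts=5000

# ulimit -n prints the open-file limit that applies to this shell.
if fd_limit_ok "$(ulimit -n)" "$cluster_hosts"; then
    echo "OK: file descriptor limit covers $cluster_hosts hosts"
else
    echo "WARNING: file descriptor limit is below $cluster_hosts hosts"
fi
```

Run the check in the same environment that starts the LSF daemons, because the limit reported by ulimit -n applies only to the current shell and the processes it starts.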
Example
Your cluster has 5,000 hosts. The master host runs Linux, kernel version 2.4:
1. Log in to the LSF master host as the root user.
2. Add the following line to your /etc/rc.d/rc.local startup script:
   echo -n "5120" > /proc/sys/fs/file-max
3. Restart the operating system to apply the changes.
4. In the bash shell, instruct the operating system to use the new file limits:
   # ulimit -n unlimited
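After completing the steps above, it may help to confirm that the new values are in effect. The following is a minimal sketch that assumes a Linux host exposing /proc/sys/fs/file-max; the variable names are illustrative only. It reports the system-wide limit next to the per-process limit of the current shell, since these are two separate settings.

```shell
# System-wide maximum number of open files (Linux only).
file_max_path=/proc/sys/fs/file-max
if [ -r "$file_max_path" ]; then
    system_limit=$(cat "$file_max_path")
else
    # The file does not exist on non-Linux systems.
    system_limit="unknown"
fi

# Per-process open-file limit for this shell and its child processes.
shell_limit=$(ulimit -n)

echo "system-wide file-max: $system_limit"
echo "per-process limit (this shell): $shell_limit"
```

If either value is lower than expected, check that the rc.local line ran at startup and that ulimit was issued in the shell that starts the LSF daemons.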