Platform LSF Administration Guide Version 6.2

Tuning LSF for Large Clusters

Administering Platform LSF

To enable and sustain large clusters, you need to tune LSF for efficient querying, dispatching, and event log management.

Managing scheduling performance

For fast job dispatching in a large cluster, configure the following parameters:

◆ LSB_MAX_JOB_DISPATCH_PER_SESSION in lsf.conf

The maximum number of jobs the scheduler can dispatch in one scheduling session.

Some operating systems, such as Linux and AIX, let you increase the number of file descriptors that can be allocated on the master host. On these systems, you do not need to limit the number of file descriptors to 1024 if you want fast job dispatching. To take advantage of the greater number of file descriptors, you must set LSB_MAX_JOB_DISPATCH_PER_SESSION to a value greater than 300.

◆ MAX_SBD_CONNS in lsb.params

The maximum number of open file connections between mbatchd and sbatchd.

Set MAX_SBD_CONNS to the same value as LSB_MAX_JOB_DISPATCH_PER_SESSION.
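The constraints described for these two parameters can be checked before any daemon is restarted. The following is a minimal sketch, not part of LSF: the helper names get_param and check_dispatch_settings are hypothetical, and it assumes each parameter appears on a single line as NAME=value with a plain integer value.

```shell
#!/bin/sh
# Hypothetical helper (not an LSF tool): print the integer value assigned to
# parameter $1 in file $2. Spaces around "=" are tolerated; if the parameter
# appears more than once, the last occurrence is printed.
get_param() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p" "$2" | tail -n 1
}

# Hypothetical helper: check_dispatch_settings DISPATCH SBD_CONNS FD_LIMIT
# Prints "ok" and returns 0 only when all three rules hold:
#   1. LSB_MAX_JOB_DISPATCH_PER_SESSION is greater than 300
#   2. LSB_MAX_JOB_DISPATCH_PER_SESSION is less than the file descriptor limit
#   3. MAX_SBD_CONNS equals LSB_MAX_JOB_DISPATCH_PER_SESSION
# All three arguments must be integers (an "unlimited" ulimit is not handled).
check_dispatch_settings() {
    dispatch=$1
    conns=$2
    fd_limit=$3
    if [ "$dispatch" -le 300 ]; then
        echo "LSB_MAX_JOB_DISPATCH_PER_SESSION must be greater than 300"
        return 1
    fi
    if [ "$dispatch" -ge "$fd_limit" ]; then
        echo "LSB_MAX_JOB_DISPATCH_PER_SESSION must be less than the file descriptor limit"
        return 1
    fi
    if [ "$conns" -ne "$dispatch" ]; then
        echo "MAX_SBD_CONNS must equal LSB_MAX_JOB_DISPATCH_PER_SESSION"
        return 1
    fi
    echo "ok"
}
```

A possible invocation, with placeholder paths to be replaced by the locations of your own configuration files, is check_dispatch_settings "$(get_param LSB_MAX_JOB_DISPATCH_PER_SESSION /path/to/lsf.conf)" "$(get_param MAX_SBD_CONNS /path/to/lsb.params)" "$(ulimit -n)". Run it in the same shell in which the file descriptor limit was raised, since ulimit -n reports the limit of the current shell.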

To enable fast job dispatch

1. Increase the system-wide file descriptor limit of your operating system if you have not already done so.

2. In lsf.conf, set the parameter LSB_MAX_JOB_DISPATCH_PER_SESSION to a value greater than 300.

   For example:

   LSB_MAX_JOB_DISPATCH_PER_SESSION = 1024

   Ensure that the value of LSB_MAX_JOB_DISPATCH_PER_SESSION is less than the maximum number of allowed open file descriptors.

3. In lsb.params, set the parameter MAX_SBD_CONNS to the same value as LSB_MAX_JOB_DISPATCH_PER_SESSION.

   For example:

   MAX_SBD_CONNS = 1024

4. In the shell you used to increase the file descriptor limit, shut down the LSF batch daemons on the master host:

   badmin hshutdown

   badmin mbdrestart

5. Run badmin hstartup to restart the LSF batch daemons on the master host.

6. Run badmin hrestart all to restart every sbatchd in the cluster.

When you shut down the batch daemons on the master host, all LSF services are temporarily unavailable, but existing jobs are not affected. When mbatchd is later started by sbatchd, its previous status is restored and job scheduling continues.
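The shutdown and restart commands from the procedure can be collected in a small wrapper so that they always run in the same order. This is only a sketch: the DRY_RUN variable and the function names run and restart_batch_daemons are hypothetical. The badmin subcommands are the ones named in the procedure. By default the wrapper only prints the commands, and badmin may prompt for confirmation when run for real.

```shell
#!/bin/sh
# Hypothetical wrapper: print each command when DRY_RUN is 1 (the default),
# execute it when DRY_RUN is set to 0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run the four badmin commands in the order given by the procedure and
# stop at the first command that fails.
restart_batch_daemons() {
    # Shut down the LSF batch daemons on the master host.
    run badmin hshutdown    || return 1
    run badmin mbdrestart   || return 1
    # Restart the LSF batch daemons on the master host.
    run badmin hstartup     || return 1
    # Restart every sbatchd in the cluster.
    run badmin hrestart all || return 1
}
```

Calling restart_batch_daemons with no further setup prints the four commands prefixed with "+". Setting DRY_RUN=0 in the shell in which the file descriptor limit was raised executes them on the master host.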

Scheduling tip

In large clusters, enable the scheduler to run constantly. Define the parameter JOB_SCHEDULING_INTERVAL=0 in lsb.params: