LSF Version 7.3 - Administering Platform LSF

Achieving Performance and Scalability
Increase the job ID display length
By default, bjobs and bhist display job IDs with a maximum length of 7
characters. Job IDs greater than 9999999 are truncated on the left.
Use LSB_JOBID_DISP_LENGTH in lsf.conf to increase the width of the JOBID
column in bjobs and bhist display. When LSB_JOBID_DISP_LENGTH=10, the width
of the JOBID column in bjobs and bhist increases to 10 characters.
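The effect of the column width can be illustrated with a small shell sketch.
It does not call LSF; truncate_left is a hypothetical helper that only mimics
dropping characters from the left of an ID, and the exact way bjobs renders a
truncated ID is not reproduced here.

```shell
# Hypothetical helper (not part of LSF): keep only the rightmost WIDTH
# characters of ID, the way a fixed-width column truncates on the left.
truncate_left() {
    id=$1
    width=$2
    # Strip one leading character at a time until the ID fits.
    while [ "${#id}" -gt "$width" ]; do
        id=${id#?}
    done
    printf '%s\n' "$id"
}

truncate_left 12345678 7     # 8-digit job ID in the default 7-character column
truncate_left 12345678 10    # same ID with LSB_JOBID_DISP_LENGTH=10 in lsf.conf
```

The first call prints 2345678 and the second prints 12345678, which is why a
wider JOBID column is needed once job IDs exceed 9999999.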
Monitoring Performance Metrics in Real Time
Enable metric collection
Set SCHED_METRIC_ENABLE=Y in lsb.params to enable performance metric
collection.
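As a rough sketch, the following writes a sample Parameters section to a
scratch file and reads the metric settings back. The file is a stand-in, not
the live lsb.params; the 60-second value for SCHED_METRIC_SAMPLE_PERIOD (the
sampling-period parameter in lsb.params) is an arbitrary assumption, and
get_param is a hypothetical helper.

```shell
# Write a sample Parameters section to a scratch file (not the live lsb.params).
sample=$(mktemp)
cat > "$sample" <<'EOF'
Begin Parameters
SCHED_METRIC_ENABLE = Y
SCHED_METRIC_SAMPLE_PERIOD = 60
End Parameters
EOF

# Hypothetical helper: print the value of parameter $2 found in file $1.
get_param() {
    awk -F'=' -v key="$2" '
        /^[[:space:]]*#/ { next }                      # skip comment lines
        { k = $1; gsub(/[[:space:]]/, "", k) }         # parameter name, trimmed
        k == key { v = $2; gsub(/[[:space:]]/, "", v); print v }
    ' "$1"
}

get_param "$sample" SCHED_METRIC_ENABLE
get_param "$sample" SCHED_METRIC_SAMPLE_PERIOD
```

With the sample file above, the two calls print Y and 60.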
Start performance metric collection dynamically:
badmin perfmon start sample_period
Optionally, you can set a sampling period, in seconds. If no sample period is
specified, the default sample period set in SCHED_METRIC_SAMPLE_PERIOD in
lsb.params is used.
Stop sampling:
badmin perfmon stop
SCHED_METRIC_ENABLE and SCHED_METRIC_SAMPLE_PERIOD can be specified
independently. That is, you can specify SCHED_METRIC_SAMPLE_PERIOD without
specifying SCHED_METRIC_ENABLE. In this case, when you turn on the feature
dynamically (using badmin perfmon start), the sampling period defined in
SCHED_METRIC_SAMPLE_PERIOD is used.
badmin perfmon start and badmin perfmon stop override the
SCHED_METRIC_ENABLE setting in lsb.params. Regardless of how
SCHED_METRIC_ENABLE is set, running badmin perfmon start starts performance
metric collection, and running badmin perfmon stop stops it.
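The dynamic start and stop commands can be wrapped as in the sketch below.
The function names and the BADMIN variable are assumptions made for
illustration (setting BADMIN=echo gives a dry run on a host without LSF);
only the badmin perfmon start and badmin perfmon stop commands come from
this section.

```shell
# BADMIN is a hypothetical override; by default the real badmin is used.
BADMIN=${BADMIN:-badmin}

# Start collection. With no argument, the period is left to LSF, which uses
# SCHED_METRIC_SAMPLE_PERIOD from lsb.params.
start_sampling() {
    if [ "$#" -gt 0 ]; then
        "$BADMIN" perfmon start "$1"
    else
        "$BADMIN" perfmon start
    fi
}

# Stop collection, whatever SCHED_METRIC_ENABLE says in lsb.params.
stop_sampling() {
    "$BADMIN" perfmon stop
}

# Example use on an LSF host:
#   start_sampling 60    # sample every 60 seconds (60 is an arbitrary value)
#   stop_sampling
```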
Tune the metric sampling period
Set SCHED_METRIC_SAMPLE_PERIOD in lsb.params to specify an initial
cluster-wide performance metric sampling period.
Set a new sampling period in seconds:
badmin perfmon setperiod sample_period
Collecting and recording performance metric data may affect the performance
of LSF. Smaller sampling periods will result in the lsb.streams file growing
faster.
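Because a very small or mistyped period has a cost, a wrapper can check the
value before passing it on. This is a minimal sketch: set_sample_period and
the BADMIN variable are hypothetical, and the rule that the period must be a
positive whole number of seconds is an assumption of the sketch, not a
documented LSF constraint.

```shell
# BADMIN is a hypothetical override; set BADMIN=echo for a dry run.
BADMIN=${BADMIN:-badmin}

# Validate the period, then hand it to badmin perfmon setperiod.
set_sample_period() {
    case ${1:-} in
        ''|*[!0-9]*)
            echo "sample period must be a whole number of seconds" >&2
            return 1
            ;;
    esac
    if [ "$1" -le 0 ]; then
        echo "sample period must be greater than zero" >&2
        return 1
    fi
    "$BADMIN" perfmon setperiod "$1"
}

# Example use on an LSF host:
#   set_sample_period 120
```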
Display current performance
Run badmin perfmon view to view real-time performance metric information.
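To follow the metrics over several sample periods, the view command can be
run in a simple loop. In this sketch, watch_metrics, its arguments, and the
BADMIN variable are hypothetical; the output format of badmin perfmon view is
not assumed or parsed.

```shell
# BADMIN is a hypothetical override; set BADMIN=echo for a dry run.
BADMIN=${BADMIN:-badmin}

# Print COUNT snapshots of the current metrics, INTERVAL seconds apart,
# each preceded by a timestamp.
watch_metrics() {
    count=$1
    interval=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        date '+%Y-%m-%d %H:%M:%S'
        "$BADMIN" perfmon view
        i=$((i + 1))
        # Sleep between snapshots, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Example use on an LSF host:
#   watch_metrics 5 60    # five snapshots, one per minute
```

Choosing an interval that matches the configured sampling period avoids
printing the same sample twice.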
The following metrics are collected and recorded in each sample period:
The number of queries handled by mbatchd
The number of queries for each of jobs, queues, and hosts (bjobs, bqueues, and bhosts, as well as other daemon requests)
The number of jobs submitted (divided into job submission requests and jobs actually submitted)