LSF Version 7.3 - Administering Platform LSF
Achieving Performance and Scalability
Last Period
Last sampling value of the metric. It is calculated once per sampling period. It is
represented as the metric value per period, and normalized by the following formula.
Max
Maximum sampling value of the metric. It is re-evaluated in each sampling period by
comparing Max and Last Period. It is represented as the metric value per period.
Min
Minimum sampling value of the metric. It is re-evaluated in each sampling period by
comparing Min and Last Period. It is represented as the metric value per period.
Avg
Average sampling value of the metric. It is recalculated in each sampling period. It is
represented as the metric value per period, and normalized by the following
formula.
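The relationship between the four values can be illustrated with a minimal Python sketch. It is not the mbatchd implementation: the class name `MetricStats`, its fields, and the normalization used here (raw count divided by the observed interval, multiplied by the configured sample period) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MetricStats:
    """Running statistics for one metric, each expressed per sampling period.

    Hypothetical helper for illustration; not part of LSF.
    """
    sample_period: float                 # configured sampling period, in seconds
    last_period: Optional[float] = None  # Last Period
    maximum: Optional[float] = None      # Max
    minimum: Optional[float] = None      # Min
    total: float = 0.0                   # raw count since sampling started
    elapsed: float = 0.0                 # seconds since sampling started

    def record(self, raw_count: float, interval: float) -> None:
        """Fold in one sample: raw_count events observed over interval seconds."""
        if interval <= 0:
            raise ValueError("interval must be positive")
        # Assumed normalization: scale the raw count to one nominal period.
        self.last_period = raw_count / interval * self.sample_period
        # Max and Min are re-evaluated against the newest Last Period value.
        if self.maximum is None or self.last_period > self.maximum:
            self.maximum = self.last_period
        if self.minimum is None or self.last_period < self.minimum:
            self.minimum = self.last_period
        self.total += raw_count
        self.elapsed += interval

    @property
    def avg(self) -> Optional[float]:
        """Average per period over the whole sampling window (assumed form)."""
        if self.elapsed == 0:
            return None
        return self.total / self.elapsed * self.sample_period
```

With a 60-second period, recording 30 events and then 60 events over two full periods gives a Last Period of 60, a Max of 60, a Min of 30, and an Avg of 45 under these assumptions.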
Reconfiguring your cluster with performance metric sampling enabled
badmin mbdrestart
If performance metric sampling is enabled dynamically with badmin perfmon start,
you must enable it again after running badmin mbdrestart. If performance metric
sampling is enabled by default, StartTime is reset to the point at which mbatchd
is restarted.
badmin reconfig
If the SCHED_METRIC_ENABLE and SCHED_METRIC_SAMPLE_PERIOD parameters are
changed, badmin reconfig has the same effect as badmin mbdrestart.
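The restart behavior described above can be summarized in a short Python sketch. The names `SamplingState` and `after_mbdrestart` are hypothetical, and the model covers only the two cases stated in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SamplingState:
    """Illustrative model of metric sampling state; not an LSF structure."""
    enabled: bool
    start_time: Optional[float]  # StartTime, as seconds since the epoch


def after_mbdrestart(enabled_by_default: bool, restart_time: float) -> SamplingState:
    """Return the sampling state right after mbatchd restarts.

    enabled_by_default: True if sampling is turned on in the configuration,
    False if it was only turned on dynamically with badmin perfmon start.
    """
    if enabled_by_default:
        # Sampling resumes, but StartTime is reset to the restart time.
        return SamplingState(enabled=True, start_time=restart_time)
    # A dynamic enable does not survive the restart; it must be issued again.
    return SamplingState(enabled=False, start_time=None)
```

In practice this means that statistics gathered before a restart are not carried over in either case.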
Performance metric logging in lsb.streams
By default, collected metrics are written to lsb.streams. However, performance
metric sampling can still be turned on even if ENABLE_EVENT_STREAM=N is
defined. In this case, no metric data is logged.
◆ If EVENT_STREAM_FILE is defined and is valid, collected metrics are written to
EVENT_STREAM_FILE.
◆ If ENABLE_EVENT_STREAM=N is defined, metric data is not logged.
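The logging rules above can be expressed as a small decision function. This is a sketch only: `metric_log_target` and `DEFAULT_STREAM_FILE` are hypothetical names, the default is shown as a bare file name rather than a real path, and falling back to the default when EVENT_STREAM_FILE is defined but invalid is an assumption that the text does not confirm.

```python
from typing import Mapping, Optional

# Placeholder for the default stream file; the real location is site-specific.
DEFAULT_STREAM_FILE = "lsb.streams"


def metric_log_target(params: Mapping[str, str],
                      stream_file_is_valid: bool = True) -> Optional[str]:
    """Return the file that receives metric data, or None if nothing is logged.

    params holds configuration parameters as name/value strings.
    stream_file_is_valid stands in for whatever validity check LSF performs.
    """
    # ENABLE_EVENT_STREAM=N turns logging off; sampling itself may still run.
    if params.get("ENABLE_EVENT_STREAM", "").strip().upper() == "N":
        return None
    custom = params.get("EVENT_STREAM_FILE", "").strip()
    if custom and stream_file_is_valid:
        return custom
    # Assumption: anything else falls back to the default stream file.
    return DEFAULT_STREAM_FILE
```

For example, `metric_log_target({"ENABLE_EVENT_STREAM": "N"})` returns `None`, while an empty configuration returns the default file name.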
Job arrays
For a job array, only one submission request is counted. Each element job is
counted individually in jobs submitted, jobs dispatched, and jobs completed.
Job rerun
Job rerun occurs when an execution host becomes unavailable while a job is running.
The job is returned to its original queue and is dispatched again when a suitable
host is available. In this case, one submission request, one job submitted, n jobs
dispatched, and n jobs completed are counted (n represents the number of times the
job reruns before it finishes successfully).
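The counting rules for job arrays and rerun jobs can be written out as a brief Python sketch. The type `Counts` and both functions are hypothetical; the array case additionally assumes that every element is dispatched once and completes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Contribution of one submission to the four job-related metrics."""
    submission_requests: int
    jobs_submitted: int
    jobs_dispatched: int
    jobs_completed: int


def job_array_counts(elements: int) -> Counts:
    """One request for the whole array; each element job counts individually."""
    if elements < 1:
        raise ValueError("a job array has at least one element")
    return Counts(1, elements, elements, elements)


def rerun_job_counts(n: int) -> Counts:
    """One request and one submitted job; n dispatches and n completions,
    where n is the rerun count as defined in the text above."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return Counts(1, 1, n, n)
```

Under these rules a 100-element array adds 1 to submission requests and 100 to each of the other three metrics, while a job with n equal to 3 adds 1 request, 1 submitted job, 3 dispatches, and 3 completions.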