Platform LSF Administration Guide Version 6.2
What’s New in Platform LSF Version 6.0
Administering Platform LSF
See Chapter 3, “Working with Your Cluster”, Chapter 4, “Working with Hosts”, and
Chapter 5, “Working with Queues” for more information.
Platform LSF Reports
Understand cluster operations better, so that you can improve performance and
troubleshoot configuration problems.
Platform LSF Reports provides a lightweight reporting package for single LSF clusters.
It provides simple two-week reporting for smaller LSF clusters (about 100 hosts, 1,000
jobs/day) and shows trends for basic cluster metrics by user, project, host, resource and
queue.
LSF Reports provides the following historical information about a cluster:
◆ Cluster load
Shows trends for the LSF internal load indices: status, r15s, r1m, r15m, ut, pg, ls, it, swp, mem, tmp, and io.
◆ Cluster service level
Shows the average cluster service level using the following metrics: CPU time, memory and swap consumption, job runtime, job pending time, and job turnaround time.
◆ Cluster throughput
Shows the amount of work pushed through the cluster, using both accounting information (total number of submitted, completed, and exited jobs) and sampled information (the minimum, maximum, and average number of running and pending jobs, by state and type).
◆ Shared resource usage
Shows the total, free, and used shared resources for the cluster.
◆ Reserved resource usage
Shows the actual usage of reserved resources.
◆ License usage
Shows peak, average, minimum, and maximum license usage by feature.
◆ License consumption
Shows license minutes consumed by user, feature, vendor, and server.
See Platform LSF Reports Reference for installation and configuration instructions.
Platform LSF Reports is available as separately installable add-on packages located in
/lsf_reports/ on the Platform FTP site (ftp.platform.com/).
Run-time enhancements
Thread limit enforcement
Control the job thread limit in the same way as other resource usage limits. Use bsub -T
to limit the number of concurrent threads for the whole job. By default, there is no limit.
At the queue level, set THREADLIMIT to limit the number of concurrent threads that can be
part of a job. A job that exceeds the limit is terminated.
See Chapter 29, “Runtime Resource Usage Limits” for more information.
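As a minimal sketch of the two settings described above, the fragment below shows a queue definition in lsb.queues with a thread limit, and a matching job-level submission as a comment. The queue name, description, limit values, and job command (my_threaded_app) are hypothetical and chosen only for illustration. See Chapter 29 for the exact syntax and enforcement behavior.

```
# lsb.queues: hypothetical queue that limits each job to 6 concurrent threads
Begin Queue
QUEUE_NAME   = threaded
THREADLIMIT  = 6
DESCRIPTION  = Example queue with a per-job thread limit
End Queue

# Job-level equivalent at submission time (limit of 4 threads for the whole job):
#   bsub -T 4 -q threaded ./my_threaded_app
```

After editing lsb.queues, the change normally takes effect once the batch configuration is reloaded, for example with badmin reconfig.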