Platform LSF Administration Guide Version 6.2

Welcome
Administering Platform LSF
You use the bsla command to track the progress of your projects and see whether they
are meeting the goals of your policy.
See Chapter 17, “Goal-Oriented SLA-Driven Scheduling” for more information.
Platform LSF License Scheduler
Platform LSF License Scheduler ensures that higher priority work never has to wait for
a license. Prioritized sharing of application licenses allows you to make policies that
control the way software licenses are shared among different users in your organization.
You configure your software license distribution policy and LSF intelligently allocates
licenses to improve quality of service to your end users while increasing throughput of
high-priority work and reducing license costs.
License Scheduler has the following features:
- Applies license distribution policies fairly among multiple projects cluster-wide
- Easily configurable distribution policies; instead of assigning an equal share of
  licenses to everyone, you can give more licenses to larger or more important projects
- Guaranteed access to a minimum portion of licenses, no matter how heavily loaded
  the system is
- Controls the distribution of licenses among the jobs and tasks it manages and still
  allows users to check out licenses directly
- Preempts lower-priority jobs and releases their licenses so that higher-priority jobs
  can get the license and run
- Provides visibility of license usage with the blusers command
See Using Platform LSF License Scheduler for installation and configuration
instructions.
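As a rough illustration of the share-based distribution and guaranteed minimum described above, the following Python sketch splits a pool of licenses among projects. It is not License Scheduler's actual algorithm or configuration syntax; the function name allocate_licenses, the project names, and the rounding rule are assumptions made for this example only.

```python
def allocate_licenses(total, shares, minimums=None):
    """Split `total` licenses among projects in proportion to `shares`,
    after first reserving each project's guaranteed minimum.

    Hypothetical helper for illustration; not part of LSF.
    """
    minimums = minimums or {}
    total_share = sum(shares.values())
    if total_share <= 0:
        raise ValueError("shares must sum to a positive number")
    reserved = sum(minimums.get(p, 0) for p in shares)
    if reserved > total:
        raise ValueError("guaranteed minimums exceed total licenses")

    # Start every project at its guaranteed minimum.
    alloc = {p: minimums.get(p, 0) for p in shares}
    remaining = total - reserved

    # Distribute the rest in proportion to each share, rounding down.
    for project, share in shares.items():
        alloc[project] += remaining * share // total_share

    # Give licenses left over from rounding to the largest shares first.
    leftover = total - sum(alloc.values())
    for project in sorted(shares, key=shares.get, reverse=True)[:leftover]:
        alloc[project] += 1
    return alloc


# Ten licenses, a 3:1 share ratio, and a guaranteed minimum of 2 for "small".
result = allocate_licenses(10, {"big": 3, "small": 1}, {"small": 2})
```

In this sketch the larger project receives more licenses, while the smaller project never drops below its guaranteed minimum, however the shares are set.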
Platform LSF license-aware scheduling is available as separately installable add-on
packages located in /license_scheduler/ on the Platform FTP site (ftp.platform.com/).
Job-level exception management
Configure hosts and queues so that LSF takes appropriate action automatically when it
detects exceptional conditions while jobs are running. Customize what exceptions are
detected, and their corresponding actions.
LSF detects:
- Job exceptions:
    - Job underrun: job ends too soon (run time is less than expected). Underrun
      jobs are detected when a job exits abnormally.
    - Job overrun: job runs too long (run time is longer than expected).
    - Idle job: running job consumes less CPU time than expected (in terms of
      cputime/runtime).
- Host exceptions:
    LSF detects “black hole” or “job-eating” hosts. LSF monitors the job exit rate
    for hosts, and closes the host if the rate exceeds a threshold you configure.
    A host can still be available to accept jobs, but some other problem prevents the
    jobs from running. Typically jobs dispatched to such problem hosts exit
    abnormally.
See Chapter 4, “Working with Hosts” for more information.
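The job and host exception checks described above can be sketched in Python as follows. This is a simplified model for illustration, not how LSF implements detection. The names JobSample, job_exceptions, and host_should_close, the threshold parameters, and the per-minute exit rate are all assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class JobSample:
    """Hypothetical snapshot of one job's accounting data."""
    run_time: float                  # wall-clock seconds so far
    cpu_time: float                  # CPU seconds consumed so far
    finished: bool                   # True once the job has ended
    exited_abnormally: bool = False  # only meaningful when finished


def job_exceptions(job, underrun_limit, overrun_limit, idle_factor):
    """Return the job exceptions that apply to `job`."""
    found = []
    # Underrun: the job exited abnormally sooner than expected.
    if job.finished and job.exited_abnormally and job.run_time < underrun_limit:
        found.append("underrun")
    # Overrun: the job has run longer than expected.
    if job.run_time > overrun_limit:
        found.append("overrun")
    # Idle: a running job whose cputime/runtime ratio is below the factor.
    if (not job.finished and job.run_time > 0
            and job.cpu_time / job.run_time < idle_factor):
        found.append("idle")
    return found


def host_should_close(abnormal_exits, window_minutes, exit_rate_threshold):
    """True when the host's job exit rate (exits per minute over the
    window) exceeds the configured threshold."""
    return abnormal_exits / window_minutes > exit_rate_threshold


short_job = JobSample(run_time=30, cpu_time=25, finished=True,
                      exited_abnormally=True)
stuck_job = JobSample(run_time=7200, cpu_time=100, finished=False)

short_flags = job_exceptions(short_job, 60, 3600, 0.1)
stuck_flags = job_exceptions(stuck_job, 60, 3600, 0.1)
close_host = host_should_close(12, 5, 2.0)
```

Here the short job is flagged only as an underrun, the long-running job is flagged as both an overrun and idle, and a host with 12 abnormal exits in 5 minutes exceeds a threshold of 2 exits per minute and would be closed.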