Platform LSF Administration Guide Version 6.2
Chapter 11
Managing Software Licenses with LSF
Administering Platform LSF
If the Verilog licenses are not cluster-wide, but can only be used by some hosts in the
cluster, the resource requirement string should include the defined() tag in the select
section:
select[defined(verilog)] rusage[verilog=1]
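As a rough illustration of how the host-restricted form differs from a plain reservation, the following shell sketch builds the requirement string for a license resource. The function license_res_req and its arguments are hypothetical helpers made up for this example and are not part of LSF; only the select[defined(...)] and rusage[...] syntax comes from the text above.

```shell
#!/bin/sh
# Hypothetical helper (not an LSF command): builds a resource requirement
# string for a license resource.
#   $1 = resource name, for example verilog
#   $2 = number of licenses to reserve
#   $3 = "hosts" if only some hosts can use the license, anything else otherwise
license_res_req() {
    name=$1
    count=$2
    scope=$3
    if [ "$scope" = "hosts" ]; then
        # Restrict candidate hosts to those where the resource is defined.
        printf 'select[defined(%s)] rusage[%s=%s]\n' "$name" "$name" "$count"
    else
        # No host restriction: reserve the resource only.
        printf 'rusage[%s=%s]\n' "$name" "$count"
    fi
}

license_res_req verilog 1 hosts
license_res_req verilog 1 cluster
```

The resulting string could then be passed to the -R option of bsub, for example bsub -R "$(license_res_req verilog 1 hosts)" myjob, where myjob stands in for the licensed application.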
Preventing underutilization of licenses
One limitation of using a dedicated queue for licensed jobs is that if a job does not
actually use the license, the licenses are underutilized. This can happen if a user
mistakenly specifies that an application needs a license, or submits a non-licensed
job to a dedicated queue.
LSF assumes that each job indicating that it requires a Verilog license will actually use
it, and simply subtracts the total number of jobs requesting Verilog licenses from the
total number available to decide whether an additional job can be dispatched.
Use the duration keyword in the queue resource requirement specification to release
the shared resource after the specified number of minutes expires. This prevents
multiple jobs started in a short interval from over-using the available licenses. By limiting
the duration of the reservation and using the actual license usage as reported by the
ELIM, underutilization is also avoided and licenses used outside of LSF can be
accounted for.
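The accounting described above can be sketched with simple arithmetic. This is an illustration only and not LSF's scheduler logic; the function names and the sample numbers are invented for the example. It assumes that, without duration, every job that requested a license counts against the total for its whole run, and that, with duration, only reservations still inside their time window are subtracted from the value the ELIM reports.

```shell
#!/bin/sh
# Illustrative arithmetic only; function names and numbers are hypothetical.

# Without duration: free = total licenses - jobs that requested a license.
free_without_duration() {
    total=$1
    requesting_jobs=$2
    echo $(( total - requesting_jobs ))
}

# With duration: free = licenses the ELIM reports as available
#                       - reservations whose duration has not yet expired.
free_with_duration() {
    elim_available=$1
    active_reservations=$2
    echo $(( elim_available - active_reservations ))
}

# 10 licenses, 10 jobs asked for one, but only 6 actually checked one out.
free_without_duration 10 10   # prints 0: no further job is dispatched

# The ELIM sees 4 licenses free and every reservation has expired.
free_with_duration 4 0        # prints 4: the idle licenses can be used

# Two jobs started moments ago and have not checked out licenses yet.
free_with_duration 4 2        # prints 2: their reservations still count
```

In the second case the four idle licenses become usable again, which is the underutilization fix; in the third case the short-lived reservations keep two freshly started jobs from being counted as if they held no license.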
When interactive jobs compete for licenses
When an interactive job outside the control of LSF competes with batch jobs for a
software license, a batch job that has reserved the license may still fail to start
because an interactive job takes the license first. To handle this situation, configure
job requeue with the REQUEUE_EXIT_VALUES parameter in a queue definition in
lsb.queues. If a job exits with one of the values listed in REQUEUE_EXIT_VALUES,
LSF requeues the job.
Example
Jobs submitted to the following queue will use Verilog licenses:
Begin Queue
QUEUE_NAME = q_verilog
RES_REQ = rusage[verilog=1:duration=1]
# application exits with value 99 if it fails to get license
REQUEUE_EXIT_VALUES = 99
JOB_STARTER = lic_starter
End Queue
All jobs in the queue are started by the job starter lic_starter, which checks whether the
application failed to get a license and, if so, exits with exit code 99. This causes the
job to be requeued, and LSF attempts to reschedule it at a later time.
lic_starter job starter script
The lic_starter job starter can be coded as follows: