LSF Version 7.3 - Using Platform LSF HPC

About Platform LSF HPC for Linux/QsNet
What Platform LSF HPC for Linux/QsNet does
Assumptions and limitations
Compatibility with earlier releases
What Platform LSF HPC for Linux/QsNet does
Platform LSF HPC for Linux/QsNet combines the strengths of Platform LSF, the
Quadrics Resource Management System (RMS), and the Quadrics QsNet data network
to provide a comprehensive Distributed Resource Management (DRM) solution on Linux.
LSF acts primarily as the workload scheduler, providing policy and topology-based
scheduling and fault tolerance. RMS acts as a parallel execution subsystem for CPU
allocation and node selection.
Assumptions and limitations
A single parallel LSF job must run within a single RMS partition.
LSF uses its own access control, usage limits, and accounting mechanisms. Do not
change the default RMS configuration for these features, because configuration
changes may interfere with the correct operation of LSF. Do not configure any of
the following RMS features or use the RMS commands that control them:
Idle time out
Memory limits
Maximum and minimum number of CPUs
Time limits
Time-sliced gang scheduling
Partition queue depth
If you use the RMS_MCONT or RMS_SNODE allocation options, the ptile option
in the span section of the resource requirement string
(bsub -R "span[ptile=n]") is not supported. Use -extsched "RMS[ptile=n]"
instead of -R "span[ptile=n]" to define the locality of jobs.
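The substitution described above can be sketched as a small shell helper. This is an illustration only: the function name rms_ptile_cmd, the processor count 8, the ptile value 2, and the job name ./myjob are hypothetical, and the helper prints the command line rather than submitting it, so it can be tried on a host without LSF installed.

```shell
# Hypothetical helper (not part of LSF): prints, but does not run, a bsub
# command line that requests job locality through -extsched "RMS[ptile=n]"
# instead of -R "span[ptile=n]".
#   $1 = number of processors (passed to bsub -n)
#   $2 = ptile value
#   remaining arguments = the job command
rms_ptile_cmd() {
  nprocs=$1
  ptile=$2
  shift 2
  printf 'bsub -n %s -extsched "RMS[ptile=%s]" %s\n' "$nprocs" "$ptile" "$*"
}

# Example values are assumptions chosen for illustration.
rms_ptile_cmd 8 2 ./myjob
```

On a cluster, the printed line would be reviewed and then submitted by hand; the helper exists only to show where the ptile value goes.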
Host preference (for example, bsub -m hostA) is only supported for
RMS_SLOAD allocation. LSF host preference is not taken into account for
RMS_SNODE and RMS_MCONT allocation.
Using an exclamation point (!) in host selection to indicate a mandatory first
execution host (for example, bsub -m "hostA! hostB") is supported for
RMS_SNODE and RMS_MCONT allocation. If you specify RMS_SLOAD with a
mandatory first execution host, LSF changes the allocation type to RMS_SNODE.
Hosts are sorted by their position in the RMS partition; any host to the left of the
first execution host is ignored.
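The host selection rules above can be modeled with a few shell functions. This is a minimal sketch of the documented behavior, not LSF's implementation: the function names, the yes/no flag, and the host names are hypothetical, and the host list is assumed to be passed in RMS partition order.

```shell
# Illustrative helpers only; they model the documented rules and do not
# call LSF or RMS.

# Succeeds (exit status 0) only for RMS_SLOAD, the one allocation type for
# which LSF host preference (bsub -m hostA) is taken into account.
host_preference_honored() {
  [ "$1" = "RMS_SLOAD" ]
}

# Prints the allocation type that ends up being used.
#   $1 = requested allocation type
#   $2 = "yes" if a mandatory first execution host (hostA!) was given
effective_allocation() {
  if [ "$1" = "RMS_SLOAD" ] && [ "$2" = "yes" ]; then
    printf '%s\n' "RMS_SNODE"
  else
    printf '%s\n' "$1"
  fi
}

# Prints the hosts that remain candidates when a first execution host is
# mandatory: hosts to the left of it in partition order are dropped.
#   $1 = first execution host
#   remaining arguments = hosts in RMS partition order (assumed input)
eligible_hosts() {
  first=$1
  shift
  found=no
  out=""
  for h in "$@"; do
    if [ "$h" = "$first" ]; then
      found=yes
    fi
    if [ "$found" = "yes" ]; then
      out="${out:+$out }$h"
    fi
  done
  printf '%s\n' "$out"
}

effective_allocation RMS_SLOAD yes
eligible_hosts hostC hostA hostB hostC hostD
```

With the sample arguments, the first call prints RMS_SNODE and the second prints hostC hostD, because hostA and hostB sit to the left of hostC in the assumed partition order.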
Application-level checkpointing is supported.
User-level checkpointing is not supported.