LSF Version 7.3 - Administering Platform LSF
Goal-Oriented SLA-Driven Scheduling
EGO-enabled SLA scheduling
By default, all host management for scheduling SLA jobs is handled by LSF. Under
EGO-enabled SLA scheduling, LSF uses EGO resource allocation facilities to get the
hosts it needs to run SLA jobs. Host allocation is the responsibility of EGO, while
job management remains the responsibility of LSF.
EGO-enabled SLA scheduling is a new scheduling paradigm that replaces other
existing LSF scheduling policies. It effectively separates workload management
from resource management. Because it takes advantage of the dynamic host feature,
it allows fairsharing of compute host resources among multiple SLAs.
Hosts are assigned to a specific SLA and owned by it. Resources can be reclaimed
from consumers and reallocated to other consumers based on EGO resource policies. Each SLA
defines one EGO consumer. Consumers can share the resources from one EGO
resource group.
Attaching SLA scheduling service classes to EGO consumers provides:
◆ Two levels of fairshare:
❖ Resource fairshare between service classes using EGO policies that
determine how many hosts are to be assigned to a given SLA. This is similar
to queue-level fairshare in LSF.
❖ User-based fairshare among the users of a service class, configured by
defining the fairshare tree in lsb.users. The USER_GROUP parameter in
lsb.serviceclasses controls access to the SLA by determining which
users’ jobs can run on the allocated hosts.
◆ Simplified scheduling policies: LSF focuses on job scheduling, while resource
scheduling and distribution are managed by EGO
◆ Lending and borrowing of cluster resources among multiple consumers and
projects, including immediate reclaim by resource owners
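As a rough illustration of attaching a service class to an EGO consumer, a service class definition in lsb.serviceclasses might look like the sketch below. Only the file name and the USER_GROUP parameter come from the text above. The CONSUMER parameter, the GOALS and PRIORITY lines, and every name and value shown (sla_example, consumer_example, ugroup_example) are assumptions made for illustration, so check them against the configuration reference for your LSF version before use.

```
# lsb.serviceclasses (illustrative sketch; names and values are hypothetical)
Begin ServiceClass
NAME        = sla_example
# Assumed parameter: the EGO consumer this service class maps to
CONSUMER    = consumer_example
# Assumed goal and priority, shown only to make the section complete
GOALS       = [VELOCITY 10 timeWindow ()]
PRIORITY    = 10
# From the text above: restricts which users' jobs can run on the allocated hosts
USER_GROUP  = ugroup_example
DESCRIPTION = Example EGO-enabled service class
End ServiceClass
```

In this sketch, EGO policies would decide how many hosts the consumer receives, and the fairshare tree in lsb.users would decide how the members of ugroup_example share those hosts.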
Default EGO-enabled SLA behavior
With EGO-enabled SLA scheduling configured:
◆ Multiple LSF consumers are supported.
◆ Each LSF consumer requests exclusive host allocation.
◆ Hosts allocated to LSF are dedicated to LSF jobs; they cannot be shared with
other consumers. You can configure your resource allocation plan in EGO to
share hosts among consumers.
◆ By default, LSF schedules jobs based on the default LSF MXJ. Use
MBD_USE_EGO_MXJ in lsb.params to configure slot-based allocation.
◆ LSF can only schedule jobs on hosts allocated from EGO, and the allocation can
change with workload and demand. EGO-enabled SLA facilitates dynamic
resource sharing and scheduling based on resource ownership.
◆ Resource requirements can be specified in the service class configuration to
request specific resources.
◆ As with SLA scheduling in general, preemption and chunk job
scheduling are not supported.
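For the slot-based allocation mentioned in the list above, a minimal lsb.params fragment could look like the following sketch. The parameter name and file come from the text. The value Y is an assumption based on the usual Y/N convention for LSF parameters, so confirm it in the lsb.params reference for your version.

```
# lsb.params (illustrative sketch)
Begin Parameters
# Assumed value: use slot-based allocation instead of the default LSF MXJ
MBD_USE_EGO_MXJ = Y
End Parameters
```

Leaving the parameter unset keeps the default behavior, in which LSF schedules jobs on allocated hosts according to its own MXJ.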