Platform LSF Administration Guide Version 6.2
Administering Platform LSF
Tuning CPU Factors
CPU factors are used to differentiate the relative speed of different machines. LSF runs
jobs on the best possible machines so that response time is minimized.
To achieve this, it is important that you define correct CPU factors for each machine
model in your cluster.
How CPU factors affect performance
Incorrect CPU factors can reduce performance in the following ways:
◆ If the CPU factor for a host is too low, that host may not be selected for job
placement when a slower host is available. This means that jobs would not always
run on the fastest available host.
◆ If the CPU factor is too high, jobs are run on the fast host even when they would
finish sooner on a slower but lightly loaded host. This causes the faster host to be
overused while the slower hosts are underused.
Both of these conditions are somewhat self-correcting. If the CPU factor for a host is
too high, jobs are sent to that host until the CPU load threshold is reached. LSF then
marks that host as busy, and no further jobs will be sent there. If the CPU factor is too
low, jobs may be sent to slower hosts. This increases the load on the slower hosts, making
LSF more likely to schedule future jobs on the faster host.
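To make the effect of a wrong CPU factor concrete, here is a small Python sketch. It assumes a simplified placement model in which a job goes to the host with the lowest value of (running jobs + 1) / CPU factor. This model, the function name, and all numbers are illustrative assumptions, not LSF's actual scheduling algorithm.

```python
def pick_host(hosts):
    """Pick the host with the lowest estimated completion time.

    hosts maps a host name to (running_jobs, cpu_factor).
    Simplified, assumed model for illustration only.
    """
    return min(hosts, key=lambda h: (hosts[h][0] + 1) / hosts[h][1])


# The fast host is really twice as fast as the slow host and already runs 2 jobs.
correct = {"fast": (2, 2.0), "slow": (0, 1.0)}
# Same load, but the fast host's CPU factor is wrongly configured as 4.
too_high = {"fast": (2, 4.0), "slow": (0, 1.0)}

choice_correct = pick_host(correct)    # estimates: fast 1.5, slow 1.0
choice_too_high = pick_host(too_high)  # estimates: fast 0.75, slow 1.0
print(choice_correct, choice_too_high)
```

With the correct factor the idle slow host is chosen. With the inflated factor the loaded fast host is chosen, even though under the true factor the job would finish sooner on the slow host.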
Guidelines for setting CPU factors
CPU factors should be set based on a benchmark that reflects your workload. If there is
no such benchmark, CPU factors can be set based on raw CPU power.
The CPU factor of the slowest hosts should be set to 1, and the CPU factors of faster
hosts should be set in proportion to the slowest.
Example
Consider a cluster with two hosts: hostA and hostB. In this cluster, hostA takes 30
seconds to run a benchmark and hostB takes 15 seconds to run the same test. The CPU
factor for hostA should be 1, and the CPU factor of hostB should be 2 because it is
twice as fast as hostA.
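The calculation in this example can be sketched in Python. The function name and the dictionary of benchmark times are hypothetical and are not part of LSF. The sketch only applies the guideline that the slowest host gets a CPU factor of 1 and faster hosts scale in proportion.

```python
def cpu_factors(benchmark_seconds):
    """Return a CPU factor per host from benchmark run times in seconds.

    Hypothetical helper, not an LSF tool: the slowest host gets 1.0 and
    every other host is scaled by how much faster it ran the benchmark.
    """
    if not benchmark_seconds:
        return {}
    if any(t <= 0 for t in benchmark_seconds.values()):
        raise ValueError("benchmark times must be positive")
    slowest = max(benchmark_seconds.values())
    return {host: slowest / t for host, t in benchmark_seconds.items()}


# Times from the example: hostA takes 30 seconds, hostB takes 15 seconds.
factors = cpu_factors({"hostA": 30.0, "hostB": 15.0})
print(factors)
```

The benchmark itself should still reflect your workload, as recommended in the guidelines. The arithmetic only turns its timings into relative factors.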
Viewing normalized ratings
Run lsload -N to display normalized ratings. LSF uses a normalized CPU
performance rating to decide which host has the most available CPU power. Hosts in
your cluster are displayed in order from best to worst. Normalized CPU run queue
length values are based on an estimate of the time it would take each host to run one
additional unit of work, given that an unloaded host with CPU factor 1 runs one unit of
work in one unit of time.
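As a rough illustration of the idea behind the normalized rating, the Python sketch below orders hosts by an estimated time to run one more unit of work. The formula (run queue length + 1) / CPU factor is an assumption chosen to match the description above, not LSF's documented calculation. The host names and load values are made up.

```python
def estimated_time(run_queue_length, cpu_factor):
    """Estimated time for one additional unit of work (assumed model).

    An unloaded host (queue length 0) with CPU factor 1 yields 1.0,
    matching the baseline described in the text.
    """
    return (run_queue_length + 1) / cpu_factor


# Hypothetical hosts: (raw run queue length, CPU factor).
hosts = {
    "hostA": (0.0, 1.0),
    "hostB": (2.0, 2.0),
    "hostC": (0.5, 3.0),
}

# Sort from best (lowest estimate) to worst.
ranking = sorted(hosts, key=lambda h: estimated_time(*hosts[h]))
for name in ranking:
    print(name, estimated_time(*hosts[name]))
```

In this sketch a lightly loaded fast host ranks first, and a fast host with a long run queue can rank below an idle slow host.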
Tuning CPU factors
1 Log in as the LSF administrator on any host in the cluster.
2 Edit lsf.shared, and change the HostModel section: