LSF Version 7.3 - Administering Platform LSF
Working with Hosts
Example
In the following diagram, the job exit rate of hostA exceeds the configured threshold (EXIT_RATE for hostA in lsb.hosts). LSF monitors hostA from time t1 to time t2 (t2 = t1 + JOB_EXIT_RATE_DURATION in lsb.params). At t2, the exit rate is still high, and a host exception is detected. At t3 (t3 = t2 + EADMIN_TRIGGER_DURATION in lsb.params), LSF invokes eadmin and the host exception is handled. By default, LSF closes hostA and sends email to the LSF administrator. Since hostA is closed and cannot accept any new jobs, the exit rate drops quickly.
[Figure: hostA actual job exit rate plotted against the hostA EXIT_RATE threshold. Horizontal axis: Time, with t0, t1, t2, and t3 marked. Vertical axis: Exit rate. t2 - t1 = JOB_EXIT_RATE_DURATION; t3 - t2 = EADMIN_TRIGGER_DURATION.]
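The timing relationships in this example can be worked through with a small sketch. It is only an illustration of the arithmetic t2 = t1 + JOB_EXIT_RATE_DURATION and t3 = t2 + EADMIN_TRIGGER_DURATION, not part of LSF. The function and class names are hypothetical, the sample values are made up, and the sketch assumes that all times and durations are expressed in the same unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExceptionTimeline:
    """Key points on the time axis of the diagram (hypothetical helper)."""
    t1: float  # exit rate first exceeds the EXIT_RATE threshold
    t2: float  # host exception is detected
    t3: float  # eadmin is invoked


def exception_timeline(t1, job_exit_rate_duration, eadmin_trigger_duration):
    """Derive t2 and t3 from t1 and the two lsb.params durations.

    Assumes t1 and both durations use the same time unit.
    """
    t2 = t1 + job_exit_rate_duration   # t2 - t1 = JOB_EXIT_RATE_DURATION
    t3 = t2 + eadmin_trigger_duration  # t3 - t2 = EADMIN_TRIGGER_DURATION
    return ExceptionTimeline(t1=t1, t2=t2, t3=t3)


# Illustrative values only: the exit rate crosses the threshold at t1 = 10.
timeline = exception_timeline(t1=10, job_exit_rate_duration=5,
                              eadmin_trigger_duration=1)
print(timeline)
```

With these sample values, the exception would be detected at t2 = 15 and eadmin would be invoked at t3 = 16.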