Platform LSF Administration Guide Version 6.2

Event Generation
LSF detects events that occur during the operation of LSF daemons. LSF provides a
program that translates LSF events into SNMP traps. You can also write your own
program that runs on the master host to interpret and respond to LSF events in other
ways. For example, your program could:
- Page the system administrator
- Send email to all users
- Integrate with your existing network management software to validate and correct
  the problem
On Windows, use the Windows Event Viewer to view LSF events.
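As an illustration of a custom event program, the following Python sketch turns the arguments it receives into a one-line message and appends it to a log file. It is a minimal sketch under stated assumptions, not a supplied LSF program. The argument order (event receiver, cluster name, event number, optional event argument) is assumed here and should be checked against “Arguments passed to the LSF event program”. The function names and the choice to treat the event receiver as a log file path are hypothetical, and the descriptions cover only events 1-6 from “Events list”.

```python
import sys

# Descriptions for event numbers 1-6 as documented in "Events list".
EVENT_DESCRIPTIONS = {
    1: "LIM went down",
    2: "RES went down",
    3: "sbatchd went down",
    4: "host became unlicensed",
    5: "host became the new master",
    6: "master host stopped being the master",
}


def format_event(argv):
    """Build (receiver, message) from the event program arguments.

    ASSUMED argument order: receiver, cluster name, event number,
    optional event argument (for example, a host name).
    """
    if len(argv) < 3:
        raise ValueError("expected: receiver cluster event_number [event_arg]")
    receiver, cluster, number = argv[0], argv[1], int(argv[2])
    event_arg = argv[3] if len(argv) > 3 else ""
    description = EVENT_DESCRIPTIONS.get(number, "unrecognized event")
    message = f"[{cluster}] LSF event {number}: {description}"
    if event_arg:
        message += f" (argument: {event_arg})"
    return receiver, message


def handle_event(argv):
    """Append the formatted event to the file named by the receiver.

    Treating the receiver as a log file path is one possible convention;
    it could equally be an email address or an SNMP trap host.
    """
    receiver, message = format_event(argv)
    with open(receiver, "a", encoding="utf-8") as log:
        log.write(message + "\n")
    return message
```

A wrapper script installed as the event program would call handle_event(sys.argv[1:]). Paging or sending email instead of logging only requires replacing the body of handle_event.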
Enabling event generation
SNMP trap program
If you use the LSF SNMP trap program as the event handler, see the SNMP
documentation for instructions on how to enable event generation.
Custom event handling programs
If you use a custom program to handle the LSF events, take the following steps to enable
event generation.
1. Write a custom program to interpret the arguments passed by LSF. See “Arguments
   passed to the LSF event program” on page 551 and “Events list” on page 550 for
   more information.
2. To enable event generation, define LSF_EVENT_RECEIVER in lsf.conf. You
   must specify an event receiver even if your program ignores it.
   The event receiver maintains cluster-specific or changeable information that you do
   not want to hard-code into the event program. For example, the event receiver
   could be the path to a current log file, the email address of the cluster administrator,
   or the host to send SNMP traps to.
3. Set LSF_EVENT_PROGRAM in lsf.conf and specify the name of your custom
   event program. If you name your event program genevent (genevent.exe on
   Windows) and place it in LSF_SERVERDIR, you can skip this step.
4. Reconfigure the cluster with the commands lsadmin reconfig and
   badmin reconfig.
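The rules in the steps above can be checked mechanically before you reconfigure the cluster. The Python sketch below reads lsf.conf-style text and reports whether event generation would be enabled: LSF_EVENT_RECEIVER must be defined, and LSF_EVENT_PROGRAM falls back to genevent when it is not set. This is a minimal sketch, not an LSF tool. The function names and the sample values are hypothetical, and the parser assumes simple NAME=value lines with # comments.

```python
def parse_lsf_conf(text):
    """Return NAME=value settings from lsf.conf-style text.

    Assumes one setting per line and that '#' starts a comment
    (so values that contain '#' are not supported by this sketch).
    """
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip().strip('"')
    return settings


def event_generation_status(settings):
    """Describe whether the settings enable event generation."""
    receiver = settings.get("LSF_EVENT_RECEIVER")
    if not receiver:
        return "disabled: LSF_EVENT_RECEIVER is not defined"
    # Without LSF_EVENT_PROGRAM, the program named genevent
    # in LSF_SERVERDIR is used.
    program = settings.get("LSF_EVENT_PROGRAM", "genevent (default in LSF_SERVERDIR)")
    return f"enabled: receiver={receiver}, program={program}"


# Hypothetical sample values, for illustration only.
SAMPLE = """\
# event generation settings
LSF_EVENT_RECEIVER=lsfadmin@example.com
LSF_EVENT_PROGRAM=my_event_program
"""

print(event_generation_status(parse_lsf_conf(SAMPLE)))
# prints: enabled: receiver=lsfadmin@example.com, program=my_event_program
```

A check like this only confirms that the parameters are present. The settings take effect after lsadmin reconfig and badmin reconfig are run.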
Events list
The following daemon operations cause mbatchd or the master LIM to call the event
program to generate an event. Each LSF event is identified by a predefined number,
which is passed as an argument to the event program. Events 1-9 also return the name
of the host on which the event occurred.
1. LIM goes down (detected by the master LIM). This event may also occur if LIM
   temporarily stops communicating with the master LIM.
2. RES goes down (detected by the master LIM).
3. sbatchd goes down (detected by mbatchd).
4. An LSF server or client host becomes unlicensed (detected by the master LIM).
5. A host becomes the new master host (detected by the master LIM).
6. The master host stops being the master (detected by the master LIM).