3.5.1 Matrix Server Administration Guide

Chapter 17: Advanced Monitor Topics 273
Copyright © 1999-2007 PolyServe, Inc. All rights reserved.
Recovery script to reduce the frequency of failovers. The script could
contain the following line:
/etc/rc.d/init.d/myservice restart
When you add a recovery script to a service or device monitor, you can
set a timeout period, which is the maximum amount of time that the
monitor_agent daemon will wait for the Recovery script to complete.
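
The restart line above can be wrapped in a slightly more defensive script. The following is a minimal sketch, not a definitive implementation. The recover function and the INIT_SCRIPT variable are hypothetical names introduced for illustration, and the sketch assumes that a non-zero exit status is how the script reports that the restart did not succeed.

```shell
#!/bin/sh
# Path taken from the restart example; INIT_SCRIPT is a hypothetical
# variable so that the path can be overridden.
INIT_SCRIPT="${INIT_SCRIPT:-/etc/rc.d/init.d/myservice}"

# Attempt the restart and pass its exit status back to the caller.
recover() {
    if [ ! -x "$INIT_SCRIPT" ]; then
        echo "recovery: $INIT_SCRIPT not found or not executable" >&2
        return 1
    fi
    "$INIT_SCRIPT" restart
}
```

A Recovery script built on this sketch would call recover and exit with its status. Because the monitor_agent daemon waits only for the configured timeout period, the restart should complete well within that time.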
Start and Stop Scripts
These scripts run when a monitor is instantiated for a service, which
happens either when the ClusterPulse daemon starts or when the
configuration changes. The scripts establish the desired start/stop state:
the Start script runs on the server where the monitor will be active, and
the Stop script runs on all other servers.
Following are some typical uses of the Start and Stop scripts:
• An application requires the ownership of a shared resource to be
  effective. The Start script tries to take ownership of the resource
  (returning non-zero if it fails to do so), and the Stop script yields
  ownership. Be sure that script ordering is strict, which is the default.
  (Script ordering is an advanced configuration option and is set on the
  Scripts tab.)
• Ensure the availability of non-shared resources. For example, a Start
  script can start an auxiliary process needed by the monitored
  application if it is not already running.
• Perform cleanup tasks such as killing any unreaped children of a
  failed application process. Stop scripts can be used for this purpose.
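
The first use above, taking and yielding ownership of a shared resource, can be sketched as follows. This is an illustration under stated assumptions, not the product's own mechanism. The lock directory, the LOCK_DIR variable, and the function names are all hypothetical, and the sketch assumes that mkdir is atomic on the file system holding the lock.

```shell
#!/bin/sh
# Hypothetical lock location that stands in for the shared resource.
LOCK_DIR="${LOCK_DIR:-/tmp/myservice.lock}"

# Start script body: try to take ownership; return non-zero on failure.
take_ownership() {
    # mkdir fails if the directory already exists, so only one caller wins.
    if mkdir "$LOCK_DIR" 2>/dev/null; then
        return 0
    fi
    echo "could not take ownership of $LOCK_DIR" >&2
    return 1
}

# Stop script body: yield ownership. A resource that has already been
# released is not treated as an error.
yield_ownership() {
    rmdir "$LOCK_DIR" 2>/dev/null
    return 0
}
```

A Start script would call take_ownership and exit with its status. A Stop script would call yield_ownership and exit with zero.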
In some cases the monitored service or device is actually started by
something other than Matrix Server before ClusterPulse is started. The
Start script must be robust enough to run in this circumstance without
considering it to be an error. Similarly, Stop scripts must be robust
enough to run when the service is already stopped, without considering
this to be an error.
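
One way to meet this requirement is to check the current state before acting. The following is a minimal sketch. It assumes that the service records its process ID in a pid file. PID_FILE, SERVICE_CMD, and the function names are hypothetical and are not part of Matrix Server.

```shell
#!/bin/sh
# Hypothetical pid file and service command, for illustration only.
PID_FILE="${PID_FILE:-/tmp/myservice.pid}"
SERVICE_CMD="${SERVICE_CMD:-sleep 300}"

# True if the pid file exists and names a live process.
is_running() {
    [ -f "$PID_FILE" ] && kill -0 "$(cat "$PID_FILE")" 2>/dev/null
}

# Start script body: a service that is already running is not an error.
start_service() {
    if is_running; then
        return 0
    fi
    $SERVICE_CMD >/dev/null 2>&1 &
    echo $! > "$PID_FILE"
    is_running
}

# Stop script body: a service that is already stopped is not an error.
stop_service() {
    if is_running; then
        kill "$(cat "$PID_FILE")" 2>/dev/null
    fi
    rm -f "$PID_FILE"
    return 0
}
```

With this structure, both scripts return zero when the service is already in the requested state, so repeated runs are harmless.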
Start and Stop scripts must also handle recovery from events that could
cause them to run unsuccessfully. For example, the system might run out
of swap space while running a Start script, causing the script to fail and