3.1.2 Matrix Server Administration Guide

Chapter 12: Configure Service Monitors
Copyright © 1999-2006 PolyServe, Inc. All rights reserved.
Recovery script. Runs after a monitor probe failure is detected, in an
attempt to restore the service.
Start script. Runs as a service is becoming active on a server.
Stop script. Runs as a service is becoming inactive on a server.
When a monitor is instantiated for a service (because the ClusterPulse
daemon is starting or the configuration has changed), Matrix Server
chooses the best server on which to make the service active and runs the
Start script on that server. On all other servers configured for the
monitor, it runs the Stop script to ensure that the service is not active.
Start scripts must be robust enough to run when the service is already
started, and must not treat that condition as an error. Similarly, Stop
scripts must be robust enough to run when the service is already stopped,
and must not treat that condition as an error. In both cases, the script
should exit with a zero exit status.
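The following is a minimal sketch of a Start script that behaves this way.
It is not taken from Matrix Server. The PID file path, the SERVICE_CMD
placeholder, and the function names are assumptions chosen for
illustration, and the sketch assumes the service stays in the foreground
when launched so that the recorded PID is the service itself.

```shell
#!/bin/sh
# Sketch of an idempotent Start script. PIDFILE, SERVICE_CMD, and the
# function names are hypothetical; substitute the real service's values.
PIDFILE="${PIDFILE:-/tmp/myservice.pid}"
SERVICE_CMD="${SERVICE_CMD:-sleep 300}"   # stand-in for the real service

# Return 0 if the PID recorded in PIDFILE belongs to a live process.
is_running() {
    [ -f "$PIDFILE" ] || return 1
    pid=$(cat "$PIDFILE" 2>/dev/null)
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Start the service unless it is already running.
start_service() {
    if is_running; then
        # Already started (possibly by something other than the cluster
        # software): this is success, not an error.
        return 0
    fi
    rm -f "$PIDFILE"                  # discard any stale PID file
    # SERVICE_CMD is left unquoted on purpose so that it can carry arguments.
    $SERVICE_CMD >/dev/null 2>&1 &
    echo $! > "$PIDFILE"
    is_running                        # non-zero if the service died at once
}
```

A script built on this sketch would end with "start_service; exit $?" so
that a second invocation on a server where the service is already running
exits with status zero.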
This behavior is necessary because Matrix Server runs the Start and Stop
scripts to establish the desired started or stopped state on each server,
even though the service may already have been started by something other
than Matrix Server before ClusterPulse was started. The Start and Stop
scripts must also handle recovery from events that cause them to fail.
For example, if the system runs out of swap space while running a Start
script, the script will fail and exit non-zero. The service could then
become active on another server, causing the Stop script to run on the
original server even though the Start script did not complete successfully.
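A Stop script can allow for this case by treating a missing, empty, or
stale PID file as "already stopped" and cleaning up whatever a failed
Start script left behind. The sketch below illustrates one way to do this.
The PID file path, the STOP_TIMEOUT variable, and the function name are
assumptions made for illustration and are not part of Matrix Server.

```shell
#!/bin/sh
# Sketch of a Stop script that tolerates an incomplete Start.
# PIDFILE, STOP_TIMEOUT, and stop_service are hypothetical names.
PIDFILE="${PIDFILE:-/tmp/myservice.pid}"
STOP_TIMEOUT="${STOP_TIMEOUT:-10}"    # seconds to wait after SIGTERM

stop_service() {
    # No PID file: the service never started here, or the Start script
    # failed before recording a PID. Nothing to stop.
    if [ ! -f "$PIDFILE" ]; then
        return 0
    fi

    pid=$(cat "$PIDFILE" 2>/dev/null)

    # Empty or stale PID file: the process is gone, so only clean up.
    if [ -z "$pid" ] || ! kill -0 "$pid" 2>/dev/null; then
        rm -f "$PIDFILE"
        return 0
    fi

    # The service is running: ask it to terminate and wait for it to exit.
    kill "$pid" 2>/dev/null
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$STOP_TIMEOUT" ]; then
            return 1                  # still running: report the failure
        fi
        sleep 1
        waited=$((waited + 1))
    done
    rm -f "$PIDFILE"
    return 0
}
```

A script built on this sketch would end with "stop_service; exit $?". It
exits with status zero whether or not the earlier Start script completed,
and exits non-zero only if a running service could not be stopped.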
To configure scripts from the command line, use these options:
--recoveryScript <script>
--startScript <script>
--stopScript <script>
Event Severity
By default, Matrix Server treats the failure or timeout of a Start or Stop
script as a failure of the associated monitored service and may initiate
failover of the associated virtual hosts. Configuration errors can also
cause this behavior.