Managing Serviceguard Fifteenth Edition, reprinted May 2008
Planning and Documenting an HA Cluster
Package Configuration Planning
Using Serviceguard Commands in an External Script
You can use Serviceguard commands (such as cmmodpkg) in an external
script run from a package. These commands must not act on the package
that runs the external script, but they can act on other packages. Code
such interactions carefully.
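One way to enforce the rule against acting on the script's own package is a small guard function. The following is a minimal sketch, not part of Serviceguard: the function name disable_other_pkg and the variables OWN_PKG and CMMODPKG are hypothetical, and the script assumes you supply the owning package's name yourself.

```shell
#!/bin/sh
# Sketch: refuse to run cmmodpkg -d against the package that owns this script.
# disable_other_pkg, OWN_PKG, and CMMODPKG are hypothetical names used for
# illustration only; they are not defined by Serviceguard.

OWN_PKG="${OWN_PKG:-pkg1}"        # assumed: name of the package running this script
CMMODPKG="${CMMODPKG:-cmmodpkg}"  # overridable so the function can be dry-run

disable_other_pkg() {
    target="${1:-}"
    if [ -z "$target" ]; then
        echo "usage: disable_other_pkg <package>" >&2
        return 2
    fi
    if [ "$target" = "$OWN_PKG" ]; then
        echo "refusing to run cmmodpkg against own package $OWN_PKG" >&2
        return 1
    fi
    # cmmodpkg -d disables switching for the target package.
    "$CMMODPKG" -d "$target"
}
```

A script owned by pkg1 would then call disable_other_pkg pkg2. The guard only blocks the direct case; it does not detect the two-package command loop described next.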
If a Serviceguard command interacts with another package, take care to
avoid command loops. For example, suppose a script run by pkg1 does a
cmmodpkg -d of pkg2, and a script run by pkg2 does a cmmodpkg -d of
pkg1. If pkg1 and pkg2 start at the same time, the pkg1 script tries to
cmmodpkg pkg2, but that cmmodpkg command has to wait for pkg2 startup
to complete. Meanwhile, the pkg2 script tries to cmmodpkg pkg1, but that
cmmodpkg command has to wait for pkg1 startup to complete. Neither
startup can finish, and the result is a command loop.
To avoid this situation, it is a good idea to specify a run_script_timeout
and halt_script_timeout for all packages, especially packages that use
Serviceguard commands in their external scripts. If a timeout is not
specified and your configuration has a command loop as described above,
inconsistent results can occur, including a hung cluster.
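As an illustration, the two timeouts might appear in a package configuration file as follows. The values, in seconds, are placeholders chosen for this sketch, not recommendations; choose limits longer than the slowest legitimate start and halt of the package.

```
# Illustrative values only (assumed for this sketch)
run_script_timeout     300
halt_script_timeout    300
```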
Determining Why a Package Has Shut Down
You can use an external script (or the CUSTOMER DEFINED FUNCTIONS area
of a legacy package control script) to find out why a package has shut
down. Serviceguard sets the environment variable SG_HALT_REASON in the
package control script to one of the following values when the package
halts:
• failure - set if the package halts because of the failure of a subnet,
resource, or service it depends on
• user_halt - set if the package is halted by a cmhaltpkg or
cmhaltnode command, or by corresponding actions in Serviceguard
Manager
• automatic_halt - set if the package is failed over automatically
because of the failure of a package it depends on, or is failed back to
its primary node automatically (failback_policy = automatic)
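The values above can be checked in a script with a simple case statement. This is a minimal sketch, not Serviceguard's own code: the function name log_halt_reason and the message text are hypothetical, and the sketch assumes the script is invoked with stop as its first argument when the package halts.

```shell
#!/bin/sh
# Sketch of a halt-reason check for an external script.
# log_halt_reason is a hypothetical helper name, not part of Serviceguard.

log_halt_reason() {
    # SG_HALT_REASON is set by Serviceguard when the package halts;
    # fall back to "unset" so the function is safe to run by hand.
    case "${SG_HALT_REASON:-unset}" in
        failure)
            echo "halted: a subnet, resource, or service failed" ;;
        user_halt)
            echo "halted: cmhaltpkg, cmhaltnode, or Serviceguard Manager" ;;
        automatic_halt)
            echo "halted: automatic failover or failback" ;;
        *)
            echo "halted: reason not recognized (${SG_HALT_REASON:-unset})" ;;
    esac
}

# Assumed calling convention: the script receives "stop" as $1 on halt.
case "${1:-}" in
    stop) log_halt_reason ;;
esac
```

In a real script you would probably write the message to the package log or trigger a follow-up action rather than echo it.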