Managing Serviceguard Sixteenth Edition, March 2009
is configured, the node fails with a system reset. If a monitored data LAN interface
fails without a standby, the node fails with a system reset only if node_fail_fast_enabled
(page 264) is set to YES for the package. Otherwise any packages using that LAN interface
will be halted and moved to another node if possible (unless the LAN recovers
immediately; see “When a Service, Subnet, or Monitored Resource Fails, or a
Dependency is Not Met” (page 86)).
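The response to the loss of a monitored data LAN interface that has no standby can be summarized as a small decision function. This is only an illustrative sketch of the rule stated above, not Serviceguard code: the names LanResponse and lan_failure_response are hypothetical, and the order of the checks simply follows the wording of the paragraph.

```python
from enum import Enum


class LanResponse(Enum):
    """Possible outcomes, as described in the text (names are hypothetical)."""
    NODE_RESET = "node fails with a system reset"
    PACKAGE_FAILOVER = "package is halted and moved to another node if possible"
    NO_ACTION = "LAN recovered immediately; package is not moved"


def lan_failure_response(node_fail_fast_enabled: bool,
                         lan_recovered_immediately: bool = False) -> LanResponse:
    """Model the response for one package using the failed LAN interface.

    Assumes a monitored data LAN interface has failed and no standby exists.
    """
    # node_fail_fast_enabled set to YES for the package: the node is reset.
    if node_fail_fast_enabled:
        return LanResponse.NODE_RESET
    # Otherwise the package stays put only if the LAN recovers immediately.
    if lan_recovered_immediately:
        return LanResponse.NO_ACTION
    # Default: halt the package and move it to another node if possible.
    return LanResponse.PACKAGE_FAILOVER
```

For example, lan_failure_response(False) yields LanResponse.PACKAGE_FAILOVER, which corresponds to the "Otherwise" case above.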
Disk protection is provided by separate products, such as Mirrordisk/UX in LVM or
Veritas mirroring in VxVM and related products. In addition, separately available EMS
disk monitors allow you to notify operations personnel when a specific failure, such
as a lock disk failure, takes place. Refer to the manual Using High Availability Monitors
(HP part number B5736-90074) for additional information; you can find it at
http://docs.hp.com -> High Availability -> Event Monitoring Service and
HA Monitors -> Installation and User’s Guide.
Serviceguard does not respond directly to power failures, although a loss of power to
an individual cluster component may appear to Serviceguard like the failure of that
component, and will result in the appropriate switching behavior. Power protection is
provided by HP-supported uninterruptible power supplies (UPS), such as HP
PowerTrust.
Responses to Package and Service Failures
In the default case, the failure of a failover package, or of a service within the package,
causes the package to shut down by running the control script with the ‘stop’ parameter,
and then to restart on an alternate node. A package will also fail if it is
configured to have a dependency on another package, and that package fails. If the
package manager receives a report of an EMS (Event Monitoring Service) event showing
that a configured resource dependency is not met, the package fails and tries to restart
on the alternate node.
You can modify this default behavior by specifying that the node should halt (system
reset) before the transfer takes place. You do this by setting failfast parameters in the
package configuration file.
In cases where package shutdown might hang, leaving the node in an unknown state,
failfast options can provide a quick failover, after which the node will be cleaned up
on reboot. Remember, however, that a system reset causes all packages on the node to
halt abruptly.
The settings of the failfast parameters in the package configuration file determine the
behavior of the package and the node in the event of a package or resource failure:
• If service_fail_fast_enabled is set to yes in the package configuration file, Serviceguard
will halt the node with a system reset if there is a failure of that specific service.
• If node_fail_fast_enabled is set to yes in the package configuration file, and the
package fails, Serviceguard will halt (system reset) the node on which the package
is running.
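The effect of the two failfast parameters can be sketched as a short decision model. This is a hedged illustration of the rules listed above, not Serviceguard's implementation: PackageSettings, response_to_service_failure, and response_to_package_failure are hypothetical names, and the sketch assumes that a service failure without service failfast is handled as a failure of the package as a whole.

```python
from dataclasses import dataclass, field
from typing import Dict

# Outcome labels used by this sketch (hypothetical, for illustration only).
SYSTEM_RESET = "halt the node with a system reset"
HALT_AND_FAILOVER = "halt the package and restart it on an alternate node"


@dataclass
class PackageSettings:
    """Failfast settings as they might be read from a package configuration file."""
    node_fail_fast_enabled: bool = False
    # service_fail_fast_enabled is set per service; keys are service names.
    service_fail_fast_enabled: Dict[str, bool] = field(default_factory=dict)


def response_to_package_failure(pkg: PackageSettings) -> str:
    """Package failure: reset the node only if node_fail_fast_enabled is yes."""
    if pkg.node_fail_fast_enabled:
        return SYSTEM_RESET
    return HALT_AND_FAILOVER


def response_to_service_failure(pkg: PackageSettings, service_name: str) -> str:
    """Service failure: reset the node if that service has failfast enabled."""
    if pkg.service_fail_fast_enabled.get(service_name, False):
        return SYSTEM_RESET
    # Assumption: otherwise the service failure is treated as a package
    # failure, so node_fail_fast_enabled decides the outcome.
    return response_to_package_failure(pkg)
```

With all failfast parameters left at no, both functions return the default behavior: the package is halted and restarted on an alternate node. Keep in mind that either reset outcome halts every package on the node, not only the one that failed.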
120 Understanding Serviceguard Software Components