Specifications
Important Note: The values for both tunables MUST be the SAME on all servers in the cluster.
Example
Consider a LifeKeeper cluster in which both intervals are set to the default values. LifeKeeper sends a
heartbeat between servers every 5 seconds. If a communications problem causes the heartbeat to
skip two beats but resumes on the third heartbeat, LifeKeeper takes no action. However, if the
communications path remains dead for 3 beats, LifeKeeper will label that communications path as
dead, but will initiate a failover only if the redundant communications path is also dead.
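The decision rule in this example can be sketched in Python. This is an illustrative model only, not LifeKeeper's implementation: the function names are hypothetical, and the default of 3 heartbeats is inferred from the example above rather than stated as a specification.

```python
def path_is_dead(consecutive_missed, num_hbeats=3):
    """Return True once a path has missed num_hbeats heartbeats in a row.

    num_hbeats=3 is an assumed default, inferred from the example text.
    """
    return consecutive_missed >= num_hbeats


def should_fail_over(missed_per_path, num_hbeats=3):
    """Fail over only when every configured communications path is dead."""
    if not missed_per_path:
        return False  # no paths described, so nothing to decide
    return all(path_is_dead(m, num_hbeats) for m in missed_per_path)


# Primary path skipped two beats and then recovered: no action.
print(should_fail_over([2, 0]))                      # False
# Primary path silent for 3 beats, redundant path healthy:
# the path is marked dead, but no failover is initiated.
print(path_is_dead(3), should_fail_over([3, 0]))     # True False
# Both paths silent for 3 beats: failover.
print(should_fail_over([3, 3]))                      # True
```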
Configuring the Heartbeat
You must manually edit the file /etc/default/LifeKeeper to add each tunable and its associated value.
Normally, the defaults file contains no entry for these tunables; simply append the following lines,
substituting the desired values:
LCMHBEATTIME=x
LCMNUMHBEATS=y
If you assign a value below the minimum, LifeKeeper will ignore that value and
use the minimum value instead.
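The minimum-value rule can be illustrated with a short Python sketch that reads KEY=value lines and raises any too-low value to its minimum. This is a model of the described behavior, not LifeKeeper code: the function name is hypothetical, and the minimums used (1 second for the interval, 2 for the number of heartbeats) are those given elsewhere in this section.

```python
# Minimums as described in this section (assumed here for illustration).
MINIMUMS = {"LCMHBEATTIME": 1, "LCMNUMHBEATS": 2}


def effective_tunables(defaults_text):
    """Parse KEY=value lines and raise any too-low value to its minimum."""
    values = {}
    for line in defaults_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        key = key.strip()
        if key in MINIMUMS:
            values[key] = max(int(raw.strip()), MINIMUMS[key])
    return values


# Hypothetical file contents: the heartbeat count is below its minimum.
sample = "LCMHBEATTIME=5\nLCMNUMHBEATS=1\n"
print(effective_tunables(sample))
# {'LCMHBEATTIME': 5, 'LCMNUMHBEATS': 2}
```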
Configuration Considerations
• If you wish to set the interval to less than 5 seconds, ensure that the
communications path is configured on a private network, since values lower than 5 seconds
create a high risk of false failovers due to network interruptions.
• Testing has shown that setting the number of heartbeats to less than 2 creates a high risk of
false failovers. This is why the value has been restricted to 2 or higher.
• The values for both the interval and number of heartbeats MUST be the SAME on all servers in
the cluster in order to avoid false failovers. Because of this, LifeKeeper must be shut down on
both servers before editing these values. If you wish to edit the heartbeat tunables after
LifeKeeper is in operation with protected applications, you may use the command
/etc/init.d/lifekeeper stop-daemons, which stops LifeKeeper but does not bring
down the protected applications.
• LifeKeeper does not impose an upper limit on the LCMHBEATTIME and LCMNUMHBEATS
values. However, setting these values very high can effectively disable LifeKeeper's
ability to detect a failure. For instance, setting both values to 25 would instruct LifeKeeper to
wait 625 seconds (over 10 minutes) to detect a server failure, which may be enough time for
the server to reboot and rejoin the cluster.
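Two of the considerations above lend themselves to a quick check before editing the defaults file: the time needed to detect a failure (interval multiplied by number of heartbeats) and whether all servers carry the same values. The Python sketch below is illustrative only; the function names and server names are hypothetical, and the default of 3 heartbeats is inferred from the earlier example.

```python
def detection_time(hbeat_time, num_hbeats):
    """Seconds before a silent path is declared dead: interval x heartbeat count."""
    return hbeat_time * num_hbeats


def tunables_match(per_server):
    """Return True when every server has identical tunable values."""
    settings = list(per_server.values())
    return all(s == settings[0] for s in settings)


print(detection_time(25, 25))   # 625 seconds, over 10 minutes
print(detection_time(5, 3))     # 15 seconds with the assumed defaults

# Hypothetical two-node cluster with mismatched intervals.
cluster = {
    "server1": {"LCMHBEATTIME": 5, "LCMNUMHBEATS": 3},
    "server2": {"LCMHBEATTIME": 3, "LCMNUMHBEATS": 3},
}
print(tunables_match(cluster))  # False
```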
Note: If you are using both TTY and TCP communications paths, the value for each tunable applies
to both communications paths. The only exception is if the interval value is below 2, which is the
minimum for a TTY communications path.
For example, suppose you specify the lowest values allowed by LifeKeeper in order to detect failure
as quickly as possible:
LCMHBEATTIME=1