Implement high-availability solutions with HP Instant Capacity - easily and effectively

Split groups and failback
In the case of a split group, both Group Managers have active status and each controls a subset of the managed group
members, depending on the individual member status at the time of the failover. Control operations can be carried out
on both active Group Managers, each communicating with the members that it (and only it) controls. Groups and
members can be added or removed on both Group Managers (subject to the set of members each can command), and
sharing rights can be added on both. In some cases, this can be valuable; for instance, when two data centers each
remain functional but some intervening network link has been broken. Each isolated set of systems can proceed with
independent disaster recovery operations within its own subset of the group.
At some point, communication is restored and the split groups must be rejoined. This is accomplished by
issuing another icapmanage -Q command. It can be issued on either active Group Manager to confirm that
Group Manager as the active Group Manager and demote the other to standby status. However, doing this loses all
database changes made on the demoted Group Manager during the time that the group was split. This includes the
addition or removal of group members or whole groups and the application of codewords on the demoted Group Manager
and the group members it controlled. There is no method to merge the two databases.
Recovering from a split group situation requires deciding which of the two active Group Manager databases is to
become the only valid description of the groups controlled by the active Group Manager. Once this decision is made,
issue the icapmanage -Q command on the Group Manager that holds the target database, at a time when both Group
Managers are accessible and can exchange information. This makes the target system the sole active Group Manager.
After this has been done, the other Group Manager (now demoted to standby status) can take control of the
groups and members, if this is desired, with a second icapmanage -Q command.
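The recovery sequence described above can be pictured with a small toy model. The sketch below is not HP code and calls no iCAP commands. The GroupManager class and the take_control function are hypothetical names, and treating the group database as a set of member names is an assumption made purely for illustration. It also assumes that the demoted Group Manager ends up holding a copy of the surviving database. Here take_control stands in for issuing icapmanage -Q on the target while both Group Managers can communicate.

```python
from dataclasses import dataclass, field


@dataclass
class GroupManager:
    """Hypothetical stand-in for a Group Manager and its group database."""
    name: str
    status: str = "standby"                      # "active" or "standby"
    database: set = field(default_factory=set)   # toy database: member names


def take_control(target: GroupManager, other: GroupManager) -> None:
    """Model icapmanage -Q issued on `target` while `other` is reachable."""
    target.status = "active"
    other.status = "standby"
    # No merge takes place: changes made on the demoted side are discarded.
    # Assumption: the demoted side then mirrors the surviving database.
    other.database = set(target.database)


# Both Group Managers start from the same database and are active (split group).
gm_a = GroupManager("gm_a", "active", {"member1", "member2"})
gm_b = GroupManager("gm_b", "active", {"member1", "member2"})

# Independent changes made while the group is split.
gm_a.database.add("member3")
gm_b.database.discard("member2")

# Decision: gm_a holds the target database, so the command is issued there.
take_control(gm_a, gm_b)
lost_on_b = "member2" in gm_b.database   # True: gm_b's removal of member2 is gone

# Optional second step: gm_b takes control, now working from gm_a's database.
take_control(gm_b, gm_a)
```

In this model, the removal of member2 made on gm_b during the split does not survive the rejoin, which mirrors the point that the two databases cannot be merged.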
HP Serviceguard considerations
To automate the failover process, commands can be incorporated into Serviceguard package control scripts. As
described previously, there are two possible types of failover that can be automated:
Resource shifting from one group member to another (using core usage rights seizure)
Group control shifting to a standby Group Manager (using a take control command)
For simplicity, these are referred to as “member failover” and “Group Manager failover,” respectively. Examples of each
are presented in later sections.
Performance implications
As with the previous Serviceguard solutions, application startup time is longer than with a typical
Serviceguard package control script that does not invoke GiCAP commands. When using GiCAP, the time required to
activate a core in a GiCAP group can range from seconds to minutes, depending on the size of the group, the hardware
involved, and the network. A core usage rights seizure takes less time but generally falls within the same range.
In particular, end-to-end time for member failover in a Serviceguard/GiCAP environment consists of:
Serviceguard failover time (no change; typically 28 to 45 seconds, depending on cluster configuration and
Serviceguard version)
Usage rights movement and activation (from seconds to a maximum of 10 minutes)
Application startup and recovery time (no change; typically in minutes)
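The three components listed above add up to a failover window that can be estimated with simple arithmetic. The following sketch is illustrative only. The end_to_end_range function and the phases dictionary are hypothetical names. The Serviceguard figure is read here as a range of 28 to 45 seconds, and the lower bound for usage rights movement (5 seconds) and the application startup range (1 to 5 minutes) are placeholder assumptions, because the text gives only "seconds" and "minutes" for them. Only the 10-minute maximum for usage rights movement is taken directly from the text.

```python
def end_to_end_range(phases):
    """Sum the best-case and worst-case durations (in seconds) of all phases."""
    low = sum(lo for lo, _ in phases.values())
    high = sum(hi for _, hi in phases.values())
    return low, high


# (minimum, maximum) duration of each phase, in seconds.
phases = {
    "serviceguard_failover": (28, 45),    # assumed reading of the quoted figure
    "usage_rights_movement": (5, 600),    # lower bound assumed; maximum is 10 minutes
    "application_startup": (60, 300),     # placeholder for "typically in minutes"
}

low, high = end_to_end_range(phases)
print(f"Estimated member failover window: {low} to {high} seconds")
```

With these placeholder values the window runs from 93 to 945 seconds. Measured values from the actual cluster should replace the assumptions before the result is used for planning.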
The time needed to move usage rights is the sum of the time to locate the appropriate Group Manager (if a standby
Group Manager is defined), the time to locate available usage rights, and the time to perform the icapmodify operations.
The time required by these operations depends on the number of:
Blades
nPars
vPars
Complexes
Group Managers