User's Manual
Group Manager Failover Considerations
If the active Group Manager system becomes unavailable and a standby Group Manager has
previously been defined, the standby Group Manager can be used to take over GiCAP group
operations from the active Group Manager. If both the active Group Manager and the standby Group
Manager are unavailable, or if the active Group Manager fails and the administrator chooses not
to have the standby Group Manager take over group operations, then the usage rights and
temporary capacity remain as allocated to each group member, as described in the previous
section “Group Manager Availability (No Standby Manager)” (page 119).
However, if a standby Group Manager has been defined, an administrator can have the standby
Group Manager take control at any time using the icapmanage -Q command. While this can
be done routinely, for example to allow shutting down a functioning active Group Manager for
maintenance, normally this command is issued either when the active Group Manager has failed,
or when a network outage has made it unable to communicate with critical group members.
When a standby manager is told to take control, it attempts to update all members and the current
active Group Manager so that group operations can proceed smoothly.
However, in the case of a failure, it is possible that the icapmanage -Q command is unable to
contact the active Group Manager and some members of the groups that it now manages. When
this happens, the previously active Group Manager remains active, unaware of the change of
control. This is referred to as a “bifurcated” (or “split”) GiCAP group. Members that were
reachable by the standby Group Manager when it took control cannot accept commands from
the old active Group Manager, but unreachable members continue to treat the old Group Manager as active. Control
operations can be carried out on both active Group Managers, each communicating with the
members that it (and only it) can reach. Groups and members can be added or removed on both
(subject to the set of members each can command), and sharing rights can be added on both. In
some cases this can be valuable; for example, when two data centers each remain functional but
some intervening network link has been broken. Each isolated set of systems can proceed with
independent disaster recovery operations within its own subset of the group.
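
The split described above can be illustrated with a small toy model. This is only a sketch for reasoning about the behavior; it is not how icapmanage is implemented. The names Member, take_control, gm_active, gm_standby, and the member names are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    manager: str  # Group Manager this member currently accepts commands from


def take_control(new_manager, members, reachable):
    """Toy takeover: only members the new manager can reach switch to it.

    Returns a mapping from each manager name to the members that obey it.
    """
    for member in members:
        if member.name in reachable:
            member.manager = new_manager
    partition = {}
    for member in members:
        partition.setdefault(member.manager, []).append(member.name)
    return partition


# Hypothetical group: every member starts under the original active manager.
members = [Member(n, "gm_active") for n in ("a1", "a2", "b1", "b2")]

# Assumption: a broken network link hides a1, a2 and gm_active from the standby.
split = take_control("gm_standby", members, reachable={"b1", "b2"})
print(split)
```

In this model, more than one key in the returned mapping corresponds to a split group: each manager can command only the members listed under its own name.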
When communication is restored, the split groups must be rejoined. This is
accomplished by issuing a new icapmanage -Q command. It can be executed on either
active Group Manager to confirm that Group Manager as the active Group Manager and demote
the other to standby status. Be aware that doing this loses all database changes made on the
demoted Group Manager during the time that the group was split. There is no method to merge
the two databases, and in particular any new sharing rights applied on the Group Manager
now designated as standby are lost.
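
The consequence of rejoining can be sketched in the same toy style. This sketch assumes that each Group Manager's changes made during the split are available as a simple list of descriptions; the function rejoin, the manager names, and the change descriptions are hypothetical and are not part of the product.

```python
def rejoin(changes_by_manager, confirmed):
    """Toy rejoin: the confirmed manager's changes survive, all others are lost.

    changes_by_manager maps a manager name to the list of changes made on
    that manager while the group was split. Nothing is merged.
    """
    kept = list(changes_by_manager[confirmed])
    lost = [
        change
        for name, changes in changes_by_manager.items()
        if name != confirmed
        for change in changes
    ]
    return kept, lost


# Hypothetical changes made on each side while the group was split.
changes = {
    "gm_site1": ["add member a3"],
    "gm_site2": ["apply sharing rights to group B", "remove member b2"],
}

# Assumption: the administrator confirms gm_site1 as the active Group Manager.
kept, lost = rejoin(changes, confirmed="gm_site1")
print("kept:", kept)
print("lost:", lost)
```

Because nothing is merged, a comparison like this may help an administrator decide which Group Manager to confirm as active before rejoining.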