Cost-Effective High-Availability Solutions with HP Instant Capacity on HP-UX
The time needed to move usage rights is based on the time to locate the appropriate Group Manager
(if a standby Group Manager is defined), the time to locate available usage rights, and the time to
perform the icapmodify operations. The time required by these operations depends on the number of:
• nPartition cell boards
• nPartitions
• Virtual partitions
• Complexes
• Group Managers
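As a concrete illustration, moving usage rights between two partitions comes down to a pair of icapmodify calls: a deactivation on the lender followed by an activation on the borrower. The host names below are invented, and the script only echoes the commands (a dry run) rather than executing them on remote partitions:

```shell
#!/bin/sh
# Dry-run sketch of moving usage rights for two cores between partitions.
# Host names are invented; -d (deactivate) and -a (activate) are standard
# icapmodify options. Replace the echo wrapper with remsh/ssh on a real system.
LENDER=par1.example.com
BORROWER=par2.example.com
CORES=2

run() { echo "WOULD RUN: $*"; }

run remsh "$LENDER" icapmodify -d "$CORES"    # lender releases usage rights
run remsh "$BORROWER" icapmodify -a "$CORES"  # borrower activates cores
```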
End-to-end time for Group Manager failover consists of:
• Serviceguard failover time (unchanged from standard Serviceguard behavior; typically 28-45 seconds,
depending on cluster configuration and Serviceguard version)
• The time needed for a standby to take control (from seconds to a maximum of 15 minutes)
The time needed for a standby Group Manager to take control depends on the number of:
• Groups
• Members in the groups
• Partitions in the groups that are not contactable
Remember that during failover to a standby Group Manager, the members of the group can still
operate normally with the usage rights they have; they just cannot borrow or lend usage rights within
the group.
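As a rough back-of-the-envelope model (the bounds are the ranges quoted above; everything else is illustrative), the end-to-end failover time is simply the sum of the two components:

```python
# Bounds from the text: Serviceguard failover 28-45 s; standby takeover
# from roughly 1 s up to a maximum of 15 minutes (900 s).
sg_failover = (28, 45)       # seconds (best, worst)
standby_takeover = (1, 900)  # seconds (best, worst)

best = sg_failover[0] + standby_takeover[0]
worst = sg_failover[1] + standby_takeover[1]
print(f"end-to-end Group Manager failover: {best}-{worst} seconds")
# -> end-to-end Group Manager failover: 29-945 seconds
```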
Automation of member failover using core usage rights seizure
This is simpler to implement in an nPartition environment.
When virtual partitions are involved, there are additional complications: automation must handle cases
where only a subset of the virtual partitions in an nPartition has failed, and manual intervention is
likely needed to provide failback after any rights seizure operation.
Consider the case where a server goes down long enough for automatic failover to initiate a rights
seizure on the nPartition containing virtual partitions, but before the restore operation can be
performed, the failed server starts an automatic reboot. The reboot "commits" the rights seizure and
reduces the available usage rights, so the reboot will likely fail until the virtual partition
assignments can be adjusted. The application can continue running on the failover node, but the
server reboot is blocked until an administrator can boot to nPar mode and make the virtual partition
adjustments.
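One way such automation is commonly wired into Serviceguard is through customer-defined commands in the package control script. The sketch below is hypothetical: gicap_seize and gicap_restore are invented placeholder names standing in for whatever seizure and restore operations are issued to the Group Manager (the source does not show the actual commands), and the script only echoes its actions:

```shell
#!/bin/sh
# Hypothetical package-control-script hook; gicap_seize/gicap_restore are
# invented placeholders, not real GiCAP commands. Echo-only (dry run).
FAILED_MEMBER=db1
CORES=8

gicap_seize()   { echo "seize $2 usage rights from $1"; }
gicap_restore() { echo "restore $2 usage rights to $1"; }

# On package start on the adoptive node: seize rights from the failed member
# so enough cores can be activated to run the package.
gicap_seize "$FAILED_MEMBER" "$CORES"

# Failback after the failed member recovers is deliberately left manual (see
# text): an administrator verifies virtual partition assignments first.
# gicap_restore "$FAILED_MEMBER" "$CORES"
```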
Example: Automated (Serviceguard) member failover from a partial outage with nPartitions
The initial configuration, shown in Figure 15, includes a GiCAP group consisting of a single Group
Manager (no standby Group Manager) and two servers, each with two partitions. Partitions db1 and
db2 are also part of a Serviceguard cluster. db1 is defined to be the active node for the package,
while db2 is the adoptive (failover) node. The package requires eight active processor cores to run,
so db1 is configured with usage rights for eight cores. db2 has been configured with six inactive
cores. No temporary capacity is being used in this group.
There are 12 inactive cores (cores without usage rights) in this configuration, and 12 GiCAP sharing
rights are configured to create this group.
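The borrowing mechanics behind this example can be modeled in a few lines. This is a toy sketch: db1's counts follow the text, db2 is assumed (for illustration only) to start with zero active cores, and the model ignores temporary capacity and sharing-right accounting:

```python
class Member:
    """Toy GiCAP group member: counts of active and inactive cores."""
    def __init__(self, name, active, inactive):
        self.name, self.active, self.inactive = name, active, inactive

def move_rights(lender, borrower, n):
    """Move n usage rights: deactivate on the lender (icapmodify -d),
    then activate on the borrower (icapmodify -a)."""
    if lender.active < n or borrower.inactive < n:
        raise ValueError("not enough active cores to lend or inactive cores to fill")
    lender.active -= n; lender.inactive += n
    borrower.active += n; borrower.inactive -= n

db1 = Member("db1", active=8, inactive=0)   # runs the package (8 cores)
db2 = Member("db2", active=0, inactive=6)   # adoptive node, 6 inactive cores
move_rights(db1, db2, 6)                    # lend 6 rights to db2
print(db1.active, db2.active)               # -> 2 6
```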