Cost-Effective High-Availability Solutions with HP Instant Capacity on HP-UX

HP Serviceguard failover models

Typical HA configurations consist of multiple servers clustered together. The servers in the cluster might
run one or more applications within Serviceguard packages. The cluster configuration reflects the level
of redundancy and protection required. Depending on how applications are mapped to cluster
members, different cluster configurations are possible, each with a different risk and cost profile.

Upon a Serviceguard node failure, the Serviceguard packages protecting the applications on the failed
node are failed over to another node within the cluster as defined in the package configuration. The
following list shows several models for configuring Serviceguard clusters to handle package failovers:

• Active/Standby: One or more cluster nodes are reserved for failover use. Upon a failover,
applications maintain their level of performance by using the spare capacity provided by the
standby node.

• Rotating Standby: Upon a failover, the standby system becomes the new production system, and the
repaired system becomes the new standby system.

• Active/Active: All nodes in the cluster run different applications. Upon a failover, you have three choices:
1. Run at reduced capacity while the failed-over and existing applications share the same node.
2. Shut down less critical applications to free system resources for the failed-over application.
3. Use VSE technologies (such as Instant Capacity) to help guarantee resource entitlements on
the failover node.

• Distributed Active/Active Applications: All nodes in the cluster run an instance of the same
application, such as Oracle Real Application Clusters (RAC), which depends on having shared
read/write access to data. Upon a failure of a node (or instance), there is no failover of the
application; users are simply redirected to one of the remaining nodes.
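
The three Active/Active choices can be compared with simple arithmetic. The following Python sketch is purely illustrative: the Package class, the core counts, the priority scheme, and all function names are assumptions made for this example and are not part of Serviceguard or Instant Capacity.

```python
from dataclasses import dataclass


@dataclass
class Package:
    """Hypothetical stand-in for an application package on the failover node."""
    name: str
    cores_needed: int
    priority: int  # assumption: lower number means more critical


def reduced_capacity(node_cores, packages):
    """Choice 1: run every package, scaling each one's share proportionally."""
    demand = sum(p.cores_needed for p in packages)
    if demand == 0:
        return {}
    scale = min(1.0, node_cores / demand)
    return {p.name: p.cores_needed * scale for p in packages}


def shed_less_critical(node_cores, packages):
    """Choice 2: keep packages in priority order while they fit; stop the rest."""
    kept, used = [], 0
    for p in sorted(packages, key=lambda pkg: pkg.priority):
        if used + p.cores_needed <= node_cores:
            kept.append(p.name)
            used += p.cores_needed
    return kept


def extra_cores_required(node_cores, packages):
    """Choice 3: cores to add so every package receives its full entitlement."""
    demand = sum(p.cores_needed for p in packages)
    return max(0, demand - node_cores)


# Assumed scenario: an 8-core node already runs "reports" and receives "orders".
packages = [
    Package("reports", cores_needed=4, priority=2),
    Package("orders", cores_needed=6, priority=1),
]
print(reduced_capacity(8, packages))
print(shed_less_critical(8, packages))
print(extra_cores_required(8, packages))
```

With these assumed numbers, choice 1 scales both packages to 80 percent of their demand, choice 2 keeps only the orders package, and choice 3 calls for two additional cores on the failover node.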

In the Active/Standby and Rotating Standby failover models, the processing capacity requirements for
the standby node depend on the performance level required during active node downtime (that is,
running at reduced performance for a short time while the active node is down or running at the same
performance level as during normal operation). One advantage of these models is that you always have
a node available when a downtime event occurs. One disadvantage is the expense of adding to the
cluster another server that is used only during downtime.
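
The sizing trade-off for a standby node can be sketched numerically. The Python example below is a minimal illustration under stated assumptions: only one active node fails at a time, capacity is measured in cores, and the function names and figures are hypothetical rather than taken from any HP tool.

```python
def standby_cores(active_core_counts, required_percent):
    """Cores the standby needs to cover the largest active node at the
    required performance level (a percentage of normal operation)."""
    if not 0 <= required_percent <= 100:
        raise ValueError("required_percent must be between 0 and 100")
    largest = max(active_core_counts)
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-largest * required_percent // 100)


def idle_share(active_core_counts, standby):
    """Fraction of the cluster's cores that sit idle during normal operation."""
    total = sum(active_core_counts) + standby
    return standby / total


# Assumed cluster: three active 16-core nodes plus one standby node.
active = [16, 16, 16]
full = standby_cores(active, 100)   # same performance as normal operation
half = standby_cores(active, 50)    # reduced performance during downtime
print(full, idle_share(active, full))
print(half, idle_share(active, half))
```

Under these assumptions, a standby sized for full performance leaves a quarter of the cluster's cores idle in normal operation, while accepting half performance during downtime halves the size of the standby.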

In each of the above models, Instant Capacity can be used to move resources from a failed node to
increase the capacity of either the failover node or the remaining nodes.

Determining high availability requirements

While HP servers are designed for the highest possible availability and reliability, downtime is still
inevitable, whether for planned maintenance or because of unexpected failures. In this paper, all types
of downtime are referred to as “failover situations.”

The goal of any high-availability solution design is to minimize the planned and unplanned downtime
associated with an application so that the service maintains the level of availability that its users
require. Availability and performance goals must be established to help define the hardware
requirements for a cost-effective HA design. You should consider the following questions when
designing an HA solution:

• What is the impact and cost of planned and unplanned downtime to the business or organization?

• What applications must remain operational in the event of a failure?