to disk arrays. However, only one node at a time may access the data for a given group of disks.
In the figure, node 1 is shown with exclusive access to the top two disks (solid line), and node 2
is shown as connected without access to the top disks (dotted line). Similarly, node 2 is shown with
exclusive access to the bottom two disks (solid line), and node 1 is shown as connected without
access to the bottom disks (dotted line).
Disk arrays provide redundancy in case of disk failures. In addition, a total of four data buses are
shown for the disks that are connected to node 1 and node 2. This configuration provides the
maximum redundancy and also gives optimal I/O performance, since each package is using
different buses.
Note that the network hardware is cabled to provide redundant LAN interfaces on each node.
Serviceguard uses TCP/IP network services for reliable communication among nodes in the cluster,
including the transmission of heartbeat messages, signals from each functioning node that are
central to the operation of the cluster. TCP/IP services are also used for other types of inter-node
communication. (See “Understanding Serviceguard Software Components” (page 27) for more
information about heartbeat.)
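As an illustration only, the following excerpt from a cluster configuration file shows how two LAN
interfaces on each node might both be assigned heartbeat addresses, providing redundant heartbeat
paths. The node names, interface names, and IP addresses are examples, not required values.

    NODE_NAME node1
      NETWORK_INTERFACE eth0
        HEARTBEAT_IP 192.168.1.1
      NETWORK_INTERFACE eth1
        HEARTBEAT_IP 192.168.2.1
    NODE_NAME node2
      NETWORK_INTERFACE eth0
        HEARTBEAT_IP 192.168.1.2
      NETWORK_INTERFACE eth1
        HEARTBEAT_IP 192.168.2.2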
Failover
Under normal conditions, a fully operating Serviceguard cluster simply monitors the health of the
cluster's components while the packages are running on individual nodes. Any host system running
in the Serviceguard cluster is called an active node. When you create the package, you specify
a primary node and one or more adoptive nodes. When a node or its network communication
fails, Serviceguard can transfer control of the package to the next available adoptive node. This
situation is shown in Figure 2 (page 18).
Figure 2 Typical Cluster After Failover
After this transfer, the package typically remains on the adoptive node as long as the adoptive node
continues running. If you wish, however, you can configure the package to return to its primary
node as soon as the primary node comes back online. Alternatively, you may manually transfer
control of the package back to the primary node at the appropriate time.
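As a sketch only, a modular package configuration file might capture this behavior with entries
like the following; the package name pkg1 and the node names are illustrative:

    package_name       pkg1
    package_type       failover
    # node_name entries: the first is the primary node; the rest are adoptive nodes
    node_name          node1
    node_name          node2
    failover_policy    configured_node
    # failback_policy automatic returns the package to its primary node as soon as
    # the primary node rejoins the cluster; manual leaves it on the adoptive node
    failback_policy    manual

With failback_policy set to manual, you could, for example, move the package back at a convenient
time by running cmhaltpkg pkg1, then cmrunpkg -n node1 pkg1, and finally cmmodpkg -e pkg1 to
re-enable package switching.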
Figure 2 (page 18) does not show the power connections to the cluster, but these are important
as well. In order to remove all single points of failure from the cluster, you should provide as many