
Overview of Cluster Volume Management
Tightly coupled cluster systems have become increasingly popular in enterprise-scale, mission-critical data processing. The primary advantage of clusters is protection against hardware failure. If the primary node fails or otherwise becomes unavailable, applications can continue to run by transferring execution to standby nodes in the cluster. This ability to provide continuous availability of service by switching to redundant hardware is commonly termed failover.
Another major advantage of clustered systems is their ability to reduce contention for system resources caused by activities such as backup, decision support, and report generation. Additional value can be derived from a cluster by performing such operations on lightly loaded nodes instead of on the heavily loaded nodes that answer requests for service. This ability to perform some operations on lightly loaded nodes is commonly termed load balancing.
To implement cluster functionality, VxVM works together with the cluster monitor daemon provided by the host operating system. The cluster monitor informs VxVM of changes in cluster membership. Each node starts up independently and runs its own cluster monitor, along with its own copies of the operating system and of VxVM with cluster support enabled. When a node joins a cluster, it gains access to shared disks; when a node leaves a cluster, it loses that access. A node joins a cluster when the cluster monitor is started on that node.
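As an informal illustration of this interaction, the following Python sketch models a cluster monitor that notifies a per-node volume manager of membership changes. The sketch is purely conceptual; the class and method names are hypothetical and are not part of any VxVM or cluster monitor programming interface.

    # Illustrative model only: these class and method names are
    # hypothetical and do not correspond to the VxVM or cluster
    # monitor programming interfaces.

    class VolumeManager:
        """Stands in for the per-node VxVM instance."""
        def __init__(self, node_name):
            self.node_name = node_name
            self.shared_disks_accessible = False

        def on_membership_change(self, members):
            # Called by the cluster monitor whenever membership changes.
            if self.node_name in members and not self.shared_disks_accessible:
                self.shared_disks_accessible = True
                print(f"{self.node_name}: joined cluster; shared disks accessible")
            elif self.node_name not in members and self.shared_disks_accessible:
                self.shared_disks_accessible = False
                print(f"{self.node_name}: left cluster; shared disk access revoked")

    class ClusterMonitor:
        """Stands in for the cluster monitor daemon on each node."""
        def __init__(self):
            self.members = set()
            self.listeners = []

        def register(self, volume_manager):
            self.listeners.append(volume_manager)

        def node_join(self, node_name):
            self.members.add(node_name)
            self._notify()

        def node_leave(self, node_name):
            self.members.discard(node_name)
            self._notify()

        def _notify(self):
            # Inform every registered volume manager of the new membership.
            for vm in self.listeners:
                vm.on_membership_change(set(self.members))

    # Starting the cluster monitor's view of a node is what joins it.
    monitor = ClusterMonitor()
    vm = VolumeManager("node1")
    monitor.register(vm)
    monitor.node_join("node1")   # node1 gains access to shared disks
    monitor.node_leave("node1")  # node1 loses access to shared disks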
The figure “Example of a 4-Node Cluster” on page 99 illustrates a simple cluster arrangement consisting of four nodes with similar or identical hardware characteristics (CPUs, RAM, and host adapters) and with identical software configurations (including the operating system). The nodes are fully connected by a private network, and they are also separately connected to shared external storage (either disk arrays or JBODs, “just a bunch of disks”) over Fibre Channel. Each node has two independent paths to these disks, which are configured in one or more cluster-shareable disk groups.
The private network allows the nodes to share information about system resources and about each other’s state. Using the private network, any node can recognize which other nodes are currently active, which are joining or leaving the cluster, and which have failed. The private network requires at least two communication channels to provide redundancy in case one channel fails. If only one channel were used, its failure would be indistinguishable from node failure, a condition known as network partitioning.
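This reasoning can be made concrete with a short, purely hypothetical Python sketch (the function and channel names are invented for illustration): a peer is presumed failed only when heartbeats stop on every channel, while a heartbeat on any surviving channel indicates that only a link has failed.

    # Hypothetical sketch: channel and function names are invented.
    # With redundant channels, a dead link and a dead node can be
    # told apart; with a single channel they cannot.

    def assess_peer(heartbeat_seen):
        """heartbeat_seen maps channel name -> True if a recent
        heartbeat arrived on that channel."""
        if any(heartbeat_seen.values()):
            down = [ch for ch, ok in heartbeat_seen.items() if not ok]
            return f"peer alive; failed channel(s): {down if down else 'none'}"
        # No channel carries heartbeats. Over a single channel this
        # state is ambiguous (link failure vs. node failure); over
        # redundant channels it is strong evidence of node failure.
        return "peer presumed failed: no heartbeat on any channel"

    print(assess_peer({"link0": True,  "link1": True}))   # healthy
    print(assess_peer({"link0": False, "link1": True}))   # link0 down, peer alive
    print(assess_peer({"link0": False, "link1": False}))  # node failure suspected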