Managing Serviceguard A.11.20, March 2013


RESERVED KEYS : NONE

RESERVED FROM NODE : NONE

The physical volume /dev/rdsk/c11t0d0 holds the registered and reserved key, which is the node_pr_key (see cmviewcl(5) for an explanation of this parameter). The following command copies this key to the other physical volumes:

vg_pr_key -s -v /dev/iscsi_vg3
Volume Group: /dev/iscsi_vg3
Successfully copied registration key 0xbcb00001 to /dev/rdsk/c12t0d0
Successfully copied registration key 0xbcb00002 to /dev/rdsk/c12t0d0
Successfully copied reservation key 0xbcb00002 to /dev/rdsk/c12t0d0
Successfully copied PR keys to all physical volumes in volume group /dev/iscsi_vg3

After copying, the output looks like this:

vg_pr_key -g -v /dev/iscsi_vg3
Volume Group: /dev/iscsi_vg3
-Physical Volume: /dev/rdsk/c11t0d0
REGISTERED KEYS : 0xbcb00001,0xbcb00002
RESERVED KEYS : 0xbcb00002
RESERVED FROM NODE : jack4
-Physical Volume: /dev/rdsk/c12t0d0
REGISTERED KEYS : 0xbcb00001,0xbcb00002
RESERVED KEYS : 0xbcb00002
RESERVED FROM NODE : jack4
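
To confirm that every physical volume ended up with the same keys, the listing above can be checked mechanically. The following Python sketch is illustrative only: it assumes the output of vg_pr_key -g -v has exactly the layout shown in this section, and the helper names parse_pr_keys and keys_consistent are hypothetical, not part of Serviceguard.

```python
# Sample text copied from the listing in this section; the parser assumes
# this exact "LABEL : value" layout, which may differ in other releases.
SAMPLE = """\
Volume Group: /dev/iscsi_vg3
-Physical Volume: /dev/rdsk/c11t0d0
REGISTERED KEYS : 0xbcb00001,0xbcb00002
RESERVED KEYS : 0xbcb00002
RESERVED FROM NODE : jack4
-Physical Volume: /dev/rdsk/c12t0d0
REGISTERED KEYS : 0xbcb00001,0xbcb00002
RESERVED KEYS : 0xbcb00002
RESERVED FROM NODE : jack4
"""


def parse_pr_keys(text):
    """Map each physical volume path to the PR key fields listed under it."""
    pvs = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue  # skip blank lines and anything that is not "label: value"
        label, _, value = line.partition(":")
        label = label.strip().lstrip("-").strip()
        value = value.strip()
        if label == "Physical Volume":
            current = pvs.setdefault(value, {})
        elif current is not None and label in ("REGISTERED KEYS", "RESERVED KEYS"):
            # "NONE" means no keys; otherwise keys are comma-separated.
            current[label] = [] if value == "NONE" else [k.strip() for k in value.split(",")]
        elif current is not None and label == "RESERVED FROM NODE":
            current[label] = None if value == "NONE" else value
    return pvs


def keys_consistent(pvs):
    """Return True if every physical volume reports identical key fields."""
    entries = list(pvs.values())
    return all(entry == entries[0] for entry in entries[1:])


if __name__ == "__main__":
    volumes = parse_pr_keys(SAMPLE)
    for path, fields in volumes.items():
        print(path, fields)
    print("consistent:", keys_consistent(volumes))
```

In practice the text would come from running the command on a cluster node rather than from a hard-coded sample. A False result would suggest that the keys have not yet been copied to every physical volume.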

Responses to Failures

Serviceguard responds to different kinds of failures in specific ways. For most hardware failures, the response is not user-configurable, but for package and service failures, you can choose the system’s response, within limits.

System Reset When a Node Fails

The most dramatic response to a failure in a Serviceguard cluster is an HP-UX TOC or INIT, which is a system reset without a graceful shutdown (normally referred to in this manual simply as a system reset). This allows packages to move quickly to another node, protecting the integrity of the data.

A system reset occurs if a cluster node cannot communicate with the majority of cluster members for the predetermined time, or under other circumstances such as a kernel hang or failure of the cluster daemon (cmcld).

The timeout case is covered in more detail under “What Happens when a Node Times Out” (page 93). See also “Cluster Daemon: cmcld” (page 41).

A system reset is also initiated by Serviceguard itself under specific circumstances; see “Responses to Package and Service Failures” (page 95).

What Happens when a Node Times Out

Each node sends a heartbeat message to all other nodes at an interval equal to one-fourth of the configured MEMBER_TIMEOUT value or 1 second, whichever is less. You configure MEMBER_TIMEOUT in the cluster configuration file (see “Cluster Configuration Parameters” (page 114)); the heartbeat interval is not directly configurable. If a node fails to send a heartbeat message within the time set by MEMBER_TIMEOUT, the cluster re-forms without the node that is no longer sending heartbeat messages.
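
The interval rule above can be worked through with a short Python sketch. It is an illustration of the arithmetic only, not Serviceguard code: the function names are hypothetical, the MEMBER_TIMEOUT values are arbitrary examples, and the integer division assumes that rounding down to a whole microsecond is acceptable.

```python
ONE_SECOND_US = 1_000_000  # 1 second expressed in microseconds


def heartbeat_interval_us(member_timeout_us: int) -> int:
    """Return one-fourth of MEMBER_TIMEOUT or 1 second, whichever is less."""
    return min(member_timeout_us // 4, ONE_SECOND_US)


def node_timed_out(last_heartbeat_us: int, now_us: int, member_timeout_us: int) -> bool:
    """Return True if no heartbeat has arrived within MEMBER_TIMEOUT microseconds."""
    return now_us - last_heartbeat_us > member_timeout_us


if __name__ == "__main__":
    # Example values only: a 14-second and a 2-second MEMBER_TIMEOUT.
    for timeout_us in (14_000_000, 2_000_000):
        print(timeout_us, "->", heartbeat_interval_us(timeout_us))
    # prints:
    # 14000000 -> 1000000
    # 2000000 -> 500000
```

With a MEMBER_TIMEOUT of 4 seconds or more, the 1-second cap applies; below that, the interval is one-fourth of the timeout.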

When a node detects that another node has failed (that is, no heartbeat message has arrived within MEMBER_TIMEOUT microseconds), the following sequence of events occurs:
