HP P6000 Cluster Extension Software Administrator Guide (5697-0986, June 2011)


If you have Secure Path installed and you still cannot see the drive letter when adding the physical disk resource, use the Secure Path command spprutil device to find the mapping between the Windows physical disk number and the partition numbers shown.

You can add the disk based on this mapping and then fail over and fail back the disk. The correct drive letter appears when the disk is brought online on the node where you originally created the disk partition and drive letter.

The FC link is down (RHCS)

In RHCS, detection of a storage outage caused by the failure of all paths to the storage depends on the monitoring capability of the resources configured in the RHCS service. For example, the LVM and filesystem resource agents distributed with RHCS can detect the loss of storage and take appropriate actions.

The stop operation on a service might fail if individual resources cannot be stopped cleanly, which can happen when the paths to the storage are lost. When the stop operation on a service fails, RHCS marks the service as failed and the service does not automatically fail over to another node.

To recover from this situation, use the following procedure:

1. Remove the node that lost access to the storage by shutting down the node.

2. Follow the steps required to bring up a service in a failed state, as documented in the RHCS administration guide. This process involves disabling the service, and then enabling it on the node where the service is allowed to come online.

3. Restart the node that was shut down.
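Step 2 of the procedure above can be sketched as a small shell helper. This is an illustration under stated assumptions, not the procedure from the RHCS administration guide: it assumes the standard RHCS clusvcadm utility is available on the surviving node, and the service and node names passed to it are hypothetical placeholders. Steps 1 and 3 (shutting down and restarting the affected node) are performed on that node and are not part of the sketch. By default the helper only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: recover an RHCS service that is in the "failed" state.
# The service and node names are hypothetical placeholders.
# With DRY_RUN=1 (the default) commands are printed instead of executed.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

recover_failed_service() {
    service="$1"      # name of the failed RHCS service
    target_node="$2"  # node that still has access to the storage

    # Disable the failed service first.
    run clusvcadm -d "$service" || return 1

    # Enable the service on the node where it is allowed to come online.
    run clusvcadm -e "$service" -m "$target_node"
}
```

For example, calling recover_failed_service my_service node2 with the default DRY_RUN=1 prints the disable and enable commands so that they can be reviewed first. Setting DRY_RUN=0 executes them.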

NOTE:

The time to detect a storage outage due to failure of all paths to storage depends on the setting for no_path_retry in the multipath software configuration. With a value of fail, I/O is not queued when all paths fail, and an immediate failure is returned. For information about the recommended value for your environment, see the DM-Multipath documentation.

Some resource agents, such as LVM, offer a mechanism called self_fence that removes the node from the cluster by rebooting it when an underlying logical volume can no longer be accessed. For supported options, see the RHCS documentation.
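To see which no_path_retry values are set explicitly, a helper along the following lines might be used. It is a minimal sketch that assumes the configuration lives in /etc/multipath.conf, the usual DM-Multipath location. The function name is hypothetical. The helper reads only the file, so built-in device defaults that are not written there do not appear in its output.

```shell
#!/bin/sh
# Sketch only: list the no_path_retry values found in a multipath
# configuration file. The function name is a hypothetical helper.

show_no_path_retry() {
    # Use /etc/multipath.conf unless another path is given.
    conf="${1:-/etc/multipath.conf}"

    # Skip comment lines, then print the value of each no_path_retry entry.
    awk '!/^[[:space:]]*#/ && $1 == "no_path_retry" { print $2 }' "$conf"
}
```

Calling show_no_path_retry with no argument prints one value per matching entry, for example from the defaults section and from any device-specific sections in the file.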

A storage replication link is down (RHCS)

If a P6000 Cluster Extension configuration uses DR groups with failsafemode enabled, the array disables access to the disk when it cannot replicate the I/O to the remote array.

In this situation, if a replication link is broken, the resource agents of configured resources, such as lvm or fs, may be able to detect the condition and take appropriate actions. The stop operation on a service might fail if individual resources cannot be stopped cleanly because the disk is no longer accessible for read/write operations. When the stop operation on a service fails, RHCS marks the service as failed and the service does not automatically fail over to another node.

To recover from this situation, use the following procedure:

1. Remove the node that lost access to the storage by shutting down the node.

2. Follow the steps required to bring up a service in a failed state, as documented in the RHCS administration guide. This process involves disabling the service, and then enabling it on the node where the service is allowed to come online.
