HP Serviceguard Extended Distance Cluster for Linux A.12.00.00 Deployment Guide, March 2014


Table 4 Disaster Scenarios and Their Handling (continued)

Recovery Process | What Happens When This Disaster Occurs | Disaster Scenario

Recovery Process:

In this scenario, no attempts are made to repair the first failure until the second failure occurs. Typically, the second failure occurs before the first failure is repaired.

1. To recover from the first failure, restore the FC links between the data centers. As a result, S1 is accessible from N2.

NOTE: Manual intervention is required to add a disk back to the MD device on SUSE Linux Enterprise Server. For more information, see Troubleshooting serviceguard-xdc packages.

For the second failure, restore N1. Once it is restored, it joins the cluster and can access S1 and S2.

1. Run the following command to enable P1 to run on N1:

# cmmodpkg -e P1 -n N1
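The final recovery step can be rehearsed before it is run on a live cluster. The following is a minimal sketch, not part of the product: the `run` and `enable_pkg_on_node` helpers and the `DRY_RUN` variable are hypothetical names introduced here for illustration. It assumes `cmviewcl -v` is used to display cluster status so that the operator can confirm N1 has rejoined, and it prints the commands instead of executing them unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: prints the recovery commands unless DRY_RUN=0 is set.
# run, enable_pkg_on_node, and DRY_RUN are hypothetical helper names.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

enable_pkg_on_node() {
    pkg="$1"
    node="$2"
    # Display cluster status so the operator can confirm the node has rejoined.
    run cmviewcl -v
    # Allow the package to run on the restored node (command from the table).
    run cmmodpkg -e "$pkg" -n "$node"
}
```

For the names used in this table, the call would be `enable_pkg_on_node P1 N1`.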

What Happens When This Disaster Occurs:

The package (P1) continues to run on N1 after the first failure, with md0 consisting of only S1.

After the second failure, the package P1 fails over to N2 and starts with S2. Data that was written to S1 after the FC link failure is now lost because the RPO_TARGET was set to IGNORE.

Disaster Scenario:

This is a multiple failure scenario where the failures occur in a particular sequence in the configuration that corresponds to Figure 2, where Ethernet and FC links do not go over DWDM.

The RPO_TARGET for the package P1 is set to IGNORE.

The package is running on N1. P1 uses a mirror md0 consisting of S1 (local to node N1, /dev/hpdev/mylink-sde) and S2 (local to node N2).

The first failure occurs when all FC links between the two data centers fail, causing N1 to lose access to S2 and N2 to lose access to S1.

After some time, a second failure occurs. N1 fails (because of a power failure).

Recovery Process:

In this scenario, no attempts are made to repair the first failure until the second failure occurs. Typically, the second failure occurs before the first failure is repaired.

1. To recover from the first failure, restore the FC links between the data centers. As a result, S1 (/dev/hpdev/mylink-sde) is accessible from N2.

NOTE: Manual intervention is required to add a disk back to the MD device on SUSE Linux Enterprise Server. For more information, see Troubleshooting serviceguard-xdc packages.

For the second failure, restore N1. Once it is restored, it joins the cluster and can access S1 and S2.

1. Run the following command to enable P1 to run on N1:

# cmmodpkg -e P1 -n N1
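Before and after a disk is added back to the MD device, it can help to check whether the mirror is still running on one half. The following is a minimal sketch that only inspects status and does not perform the re-add procedure, which is described in Troubleshooting serviceguard-xdc packages. It assumes the device appears in `/proc/mdstat` in the usual Linux MD layout, where a status field such as `[UU]` means both halves are present and `[U_]` means one is missing. The function name `md_mirror_state` is hypothetical.

```shell
# Reads /proc/mdstat-style text on stdin and reports the state of the
# MD device named in $1. Prints "degraded", "healthy", or "not-found".
# md_mirror_state is a hypothetical helper name, not a product command.
md_mirror_state() {
    awk -v dev="$1" '
        # Device header line, for example: md0 : active raid1 sde[0] sdf[1]
        $1 == dev && $2 == ":" { found = 1; next }
        # First status line after the header, for example: ... [2/1] [U_]
        found && /\[[U_]+\]/ {
            if ($0 ~ /\[[U_]*_[U_]*\]/) print "degraded"; else print "healthy"
            seen = 1
            exit
        }
        END { if (!seen) print "not-found" }
    '
}
```

On a node where md0 is active, the check would be `md_mirror_state md0 < /proc/mdstat`.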

What Happens When This Disaster Occurs:

Package P1 continues to run on N1 after the first failure, with md0 consisting of only S1.

After the second failure, package P1 fails over to N2 and starts with S2. This happens because the disk S2 is non-current by less than 60 seconds, the time limit set by the RPO_TARGET parameter. Disk S2 has data that is older than the other mirror half, S1, so all data that was written to S1 after the FC link failure is lost.
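The start decision in the two scenarios can be summarized in a small model. This is a sketch under stated assumptions, not the product's implementation: the function name `rpo_allows_start` is hypothetical, the "refuse" outcome for a mirror half that is non-current by the limit or more is inferred from the wording above rather than stated in this table, and the strict "less than" comparison follows the phrase "non-current by less than 60 seconds".

```shell
# Hypothetical model of the start decision described in the two table rows.
# $1: RPO_TARGET value ("IGNORE" or a number of seconds)
# $2: seconds by which the surviving mirror half is non-current
# Prints "start" or "refuse".
rpo_allows_start() {
    rpo="$1"
    stale="$2"
    if [ "$rpo" = "IGNORE" ]; then
        # IGNORE: the package starts regardless of how old the data is.
        echo "start"
    elif [ "$stale" -lt "$rpo" ]; then
        # Non-current by less than the limit: the package starts.
        echo "start"
    else
        # Assumption: at or beyond the limit, the package does not start.
        echo "refuse"
    fi
}
```

With the values from this row, `rpo_allows_start 60 20` models the case where S2 is 20 seconds behind S1 and the package still starts on N2.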

Disaster Scenario:

This failure is the same as the previous failure except that the package (P1) is configured with RPO_TARGET set to 60 seconds.

In this case, the package (P1) is initially running on N1. P1 uses a mirror md0 consisting of S1 (local to node N1, /dev/hpdev/mylink-sde) and S2 (local to node N2).

The first failure occurs when all FC links between the two data centers fail, causing N1 to lose access to S2 and N2 to lose access to S1.

After the package resumes activity and runs for 20 seconds, a second failure occurs, causing N1 to fail, perhaps due to power failure.