Managing Serviceguard A.11.20, March 2013

/usr/sbin/route delete net default 128.17.17.1 1 source 128.17.17.17

Once the per-interface default routes have been added, netstat -rn shows output like the following, where 128.17.17.17 is the package relocatable address and 128.17.17.19 is the physical address on the same subnet:

Destination     Gateway         Flags  Refs  Interface  Pmtu
127.0.0.1       127.0.0.1       UH     0     lo0        32808
128.17.17.19    128.17.17.19    UH     0     lan5       32808
128.17.17.17    128.17.17.17    UH     0     lan5:1     32808
192.168.69.82   192.168.69.82   UH     0     lan2       32808
128.17.17.0     128.17.17.19    U      3     lan5       1500
128.17.17.0     128.17.17.17    U      3     lan5:1     1500
192.168.69.0    192.168.69.82   U      2     lan2       1500
127.0.0.0       127.0.0.1       U      0     lo0        32808
default         128.17.17.1     UG     0     lan5:1     1500
default         128.17.17.1     UG     0     lan5       1500

NOTE: If your package has more than one relocatable address on a physical interface, you must add a route statement for each relocatable address during package startup, and delete each of these routes during package halt.
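The per-address bookkeeping described in the note can be sketched as a small helper that builds one route command per relocatable address. This is only an illustration: the function name route_commands and the second address 128.17.17.18 are hypothetical, and the "add" form is assumed to mirror the "delete" command shown above. The helper only builds command strings and does not run them.

```python
ROUTE = "/usr/sbin/route"


def route_commands(action, gateway, relocatable_addrs, hop_count=1):
    """Build one per-interface default-route command per relocatable address.

    action is "add" (package start) or "delete" (package halt). The "add"
    form is an assumption that mirrors the documented "delete" command.
    """
    if action not in ("add", "delete"):
        raise ValueError("action must be 'add' or 'delete'")
    return [
        f"{ROUTE} {action} net default {gateway} {hop_count} source {addr}"
        for addr in relocatable_addrs
    ]


if __name__ == "__main__":
    # 128.17.17.18 is a hypothetical second relocatable address.
    addrs = ["128.17.17.17", "128.17.17.18"]
    for cmd in route_commands("delete", "128.17.17.1", addrs):
        print(cmd)
```

Generating both lists from the same set of addresses makes it harder for the start and halt steps to drift apart when a relocatable address is added to the package later.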

For more information about configuring modular packages, see Chapter 6 (page 232); for legacy packages, see “Configuring a Legacy Package” (page 307).

IMPORTANT: If you use a Quorum Server, make sure that you list all IP addresses or hostnames by which the nodes communicate with the Quorum Server in the authorization file /etc/cmcluster/qs_authfile.

For more information about the Quorum Server, see the latest version of the HP Serviceguard Quorum Server Release Notes at http://www.hp.com/go/hpux-serviceguard-docs -> HP Serviceguard Quorum Server Software.

Restoring Client Connections

How does a client reconnect to the server after a failure?

Write client applications so that they explicitly distinguish the loss of a connection to the server from other, application-oriented errors that might be returned. The application should take special action when the connection is lost.

One question to consider is how a client knows, after a failure, when to reconnect to the newly started server. In the typical scenario, the user must simply restart the session or log in again. However, this method depends on manual action. For example, a well-tuned hardware and application system may fail over in 5 minutes. But if users, after getting no response during the failure, give up after 2 minutes, go for coffee, and don't come back for 28 minutes, the perceived downtime is 30 minutes, not 5. Factors to consider are the number of reconnection attempts to make, the frequency of reconnection attempts, and whether or not to notify the user of the connection loss.
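The factors above (number of attempts, frequency of attempts, and user notification) can be sketched as a small client-side wrapper. This is a minimal illustration under stated assumptions, not a Serviceguard interface: call_with_reconnect, ApplicationError, and the connect, request, and notify callables are hypothetical names, and a connection loss is assumed to surface as Python's built-in ConnectionError or one of its subclasses.

```python
import time


class ApplicationError(Exception):
    """Hypothetical application-oriented error returned by the server."""


def call_with_reconnect(connect, request, max_attempts=10, delay_seconds=5.0,
                        notify=None, sleep=time.sleep):
    """Run request(conn), reconnecting when the connection is lost.

    connect       -- callable returning a new connection (assumed)
    request       -- callable taking the connection and returning a result
    max_attempts  -- number of attempts before giving up
    delay_seconds -- pause between attempts (the retry frequency)
    notify        -- optional callable(attempt, max_attempts, error) used
                     to tell the user that the connection was lost
    """
    conn = None
    for attempt in range(1, max_attempts + 1):
        try:
            if conn is None:
                conn = connect()
            return request(conn)
        except ConnectionError as exc:
            # Connection loss only: drop the connection and try again.
            # Any other exception, such as ApplicationError, is not caught
            # here and propagates to the caller unchanged.
            conn = None
            if notify is not None:
                notify(attempt, max_attempts, exc)
            if attempt < max_attempts:
                sleep(delay_seconds)
    raise ConnectionError(f"server unreachable after {max_attempts} attempts")
```

With a 5-second delay, covering a 5-minute failover takes roughly 60 attempts, so max_attempts and delay_seconds should be chosen together with the expected failover time in mind.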

There are a number of strategies to use for client reconnection:

• Design clients that continue to try to reconnect to their failed server.

Put the work into the client application rather than relying on the user to reconnect. If the server is back up and running in 5 minutes, and the client is continually retrying, then after 5 minutes the client application will reestablish the link with the server and either restart or continue the transaction. No intervention from the user is required.

• Design clients to reconnect to a different server.

If you have a server design that includes multiple active servers, the client could connect to the second server, and the user would experience only a brief delay.

The problem with this design is knowing when the client should switch to the second server. How long should a client keep retrying the first server before giving up and going to the second server? There are no definitive answers for this. The answer depends on the design of the server

356 Designing Highly Available Cluster Applications