Managing HP Serviceguard A.11.20.
Legal Notices
The information in this document is subject to change without notice. Hewlett-Packard makes no warranty of any kind with regard to this manual, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. Hewlett-Packard shall not be held liable for errors contained herein or direct, indirect, special, incidental or consequential damages in connection with the furnishing, performance, or use of this material.
Contents

Printing History .......................................................... 14
Preface ................................................................... 15
1 Serviceguard for Linux at a Glance ...................................... 17
    What is Serviceguard for Linux? ....................................... 17
    Failover
Use of the Quorum Server as a Cluster Lock ................................ 33
No Cluster Lock ........................................................... 34
What Happens when You Change the Quorum Configuration Online .............. 35
How the Package Manager Works ............................................. 35
Package Types
How Persistent Reservations Work .......................................... 66
Responses to Failures ..................................................... 66
Reboot When a Node Fails .................................................. 66
What Happens when a Node Times Out ........................................ 67
Example
Using Serviceguard Command to Set the Status/Value of a Simple/Extended
    Generic Resource ...................................................... 99
Online Reconfiguration of Generic Resources ............................... 100
About Package Dependencies ................................................ 100
Simple Dependencies
Restarting Networking ..................................................... 127
Viewing the Configuration ................................................. 127
Implementing Channel Bonding (SUSE) ....................................... 128
Restarting Networking ..................................................... 129
Setting up a Lock LUN
Base Package Modules ...................................................... 155
Optional Package Modules .................................................. 156
Package Parameter Explanations ............................................ 158
package_name .............................................................. 159
module_name
user_host ................................................................. 173
user_name ................................................................. 173
user_role ................................................................. 173
Additional Parameters Used Only by Legacy Packages
Moving a Failover Package ................................................. 197
Changing Package Switching Behavior ....................................... 197
Maintaining a Package: Maintenance Mode ................................... 197
Characteristics of a Package Running in Maintenance Mode or
    Partial-Startup Maintenance Mode
Renaming or Replacing an External Script Used by a Running Package ........ 218
Reconfiguring a Package on a Halted Cluster ............................... 218
Adding a Package to a Running Cluster ..................................... 218
Deleting a Package from a Running Cluster ................................. 219
Resetting the Service Restart Counter
Minimize the Use and Amount of Memory-Based Data .......................... 241
Keep Logs Small ........................................................... 241
Eliminate Need for Local Data ............................................. 241
Use Restartable Transactions .............................................. 241
Use Checkpoints
Aggregatable Global Unicast Addresses ..................................... 259
Link-Local Addresses ...................................................... 259
Site-Local Addresses ...................................................... 259
Multicast Addresses
Printing History

Table 1 Printing History
Printing Date    Part Number    Edition
November 2001    B9903-90005    First
November 2002    B9903-90012    First
December 2002    B9903-90012    Second
November 2003    B9903-90033    Third
February 2005    B9903-90043    Fourth
June 2005        B9903-90046    Fifth
August 2006      B9903-90050    Sixth
July 2007        B9903-90054    Seventh
March 2008       B9903-90060    Eighth
April 2009       B9903-90068    Ninth
July 2009        B9903-90073    Tenth
June 2012        701460-001

The last printing date and part number indicate the current edition.
Preface
This guide describes how to configure and manage Serviceguard for Linux on HP ProLiant servers under the Linux operating system. It is intended for experienced Linux system administrators. (For Linux system administration tasks that are not specific to Serviceguard, use the system administration documentation and manpages for your distribution of Linux.)
IMPORTANT: SUSE Linux Enterprise Server (SLES) is not supported in Serviceguard A.11.20.
Information about supported configurations is in the HP Serviceguard for Linux Configuration Guide. For updated information on supported hardware and Linux distributions refer to the HP Serviceguard for Linux Certification Matrix. Both documents are available at: http://www.hp.com/info/sglx Problem Reporting If you have any problems with the software or documentation, please contact your local Hewlett-Packard Sales Office or Customer Service Center.
1 Serviceguard for Linux at a Glance This chapter introduces Serviceguard for Linux and shows where to find different kinds of information in this book. It includes the following topics: • What is Serviceguard for Linux? • Using Serviceguard Manager (page 19) • Configuration Roadmap (page 20) If you are ready to start setting up Serviceguard clusters, skip ahead to Chapter 4 (page 70). Specific steps for setup are in Chapter 5 (page 121).
to disk arrays. However, only one node at a time may access the data for a given group of disks. In the figure, node 1 is shown with exclusive access to the top two disks (solid line), and node 2 is shown as connected without access to the top disks (dotted line). Similarly, node 2 is shown with exclusive access to the bottom two disks (solid line), and node 1 is shown as connected without access to the bottom disks (dotted line). Disk arrays provide redundancy in case of disk failures.
separate power circuits as needed to prevent a single point of failure of your nodes, disks, and disk mirrors. Each power circuit should be protected by an uninterruptible power source. For more details, see "Power Supply Planning" (page 74).
Configuration Roadmap This manual presents the tasks you need to perform in order to create a functioning HA cluster using Serviceguard. These tasks are shown in Figure 3. Figure 3 Tasks in Configuring a Serviceguard Cluster HP recommends that you gather all the data that is needed for configuration before you start. See Chapter 4 (page 70) for tips on gathering data.
2 Understanding Hardware Configurations for Serviceguard for Linux This chapter gives a broad overview of how the server hardware components operate with Serviceguard for Linux. The following topics are presented: • Redundant Cluster Components • Redundant Network Components (page 21) • Redundant Disk Storage (page 25) • Redundant Power Supplies (page 26) Refer to the next chapter for information about Serviceguard software components.
Rules and Restrictions • A single subnet cannot be configured on different network interfaces (NICs) on the same node. • In the case of subnets that can be used for communication between cluster nodes, the same network interface must not be used to route more than one subnet configured on the same node. • For IPv4 subnets, Serviceguard does not support different subnets on the same LAN interface. ◦ For IPv6, Serviceguard supports up to two subnets per LAN interface (site-local and global).
Figure 4 Redundant LANs In Linux configurations, the use of symmetrical LAN configurations is strongly recommended, with the use of redundant hubs or switches to connect Ethernet segments. The software bonding configuration should be identical on each node, with the active interfaces connected to the same hub or switch. Cross-Subnet Configurations As of Serviceguard A.11.
• You should not use the wildcard (*) for node_name in the package configuration file, as this could allow the package to fail over across subnets when a node on the same subnet is eligible; failing over across subnets can take longer than failing over on the same subnet. List the nodes in order of preference instead of using the wildcard. • You should configure IP monitoring for each subnet; see “Monitoring LAN Interfaces and Detecting Failure: IP Level” (page 58).
IMPORTANT: Although cross-subnet topology can be implemented on a single site, it is most commonly used by extended-distance clusters. For more information about such clusters, see the latest edition of HP Serviceguard Extended Distance Cluster for Linux Deployment Guide at http://www.hp.com/go/linux-serviceguard-docs.
Figure 5 Mirrored Disks Connected for High Availability Redundant Power Supplies You can extend the availability of your hardware by providing battery backup to your nodes and disks. HP-supported uninterruptible power supplies (UPS) can provide this protection from momentary power loss. Disks should be attached to power circuits in such a way that disk array copies are attached to different power sources. The boot disk should be powered from the same circuit as its corresponding node.
3 Understanding Serviceguard Software Components This chapter gives a broad overview of how the Serviceguard software components work.
• cmlogd—cluster system log daemon
• cmdisklockd—cluster lock LUN daemon
• cmresourced—Serviceguard Generic Resource Assistant Daemon
• cmserviced—Service Assistant daemon
• qs—Quorum Server daemon
• cmlockd—utility daemon
• cmsnmpd—cluster SNMP subagent (optionally running)
• cmwbemd—WBEM daemon
• cmproxyd—proxy daemon
Each of these daemons logs to the Linux system logging files. The quorum server daemon logs to a user-specified log file, such as /usr/local/qs/log/qs.
Log Daemon: cmlogd
cmlogd is used by cmcld to write messages to the system log file. Any message written to the system log by cmcld is written through cmlogd. This prevents delays in writing to syslog from affecting the timing of cmcld. The path for this daemon is $SGLBIN/cmlogd.
Network Manager Daemon: cmnetd
This daemon monitors the health of cluster networks. It also handles the addition and deletion of relocatable package IPs, for both IPv4 and IPv6 addresses.
Serviceguard Quorum Server release notes at http://www.hp.com/go/hpux-serviceguard-docs -> HP Serviceguard Quorum Server Software. See also "Use of the Quorum Server as a Cluster Lock" (page 33). The path for this daemon is:
• For SUSE: /opt/qs/bin/qs
• For Red Hat: /usr/local/qs/bin/qs
Utility Daemon: cmlockd
Runs on every node on which cmcld is running. It maintains the active and pending cluster resource locks.
heartbeat, cluster lock information, and timing parameters (discussed in detail in Chapter 4 (page 70) ). Cluster parameters are entered by editing the cluster configuration file (see “Configuring the Cluster” (page 138)). The parameters you enter are used to build a binary configuration file which is propagated to all nodes in the cluster. This binary cluster configuration file must be the same on all the nodes in the cluster.
Automatic cluster startup will take place if the flag AUTOSTART_CMCLD is set to 1 in the $SGCONF/ cmcluster.rc file. When any node reboots with this parameter set to 1, it will rejoin an existing cluster, or if none exists it will attempt to form a new cluster. Dynamic Cluster Re-formation A dynamic re-formation is a temporary change in cluster membership that takes place as nodes join or leave a running cluster.
Use of a Lock LUN as the Cluster Lock A lock LUN can be used for clusters up to and including four nodes in size. The cluster lock LUN is a special piece of storage (known as a partition) that is shareable by all nodes in the cluster. When a node obtains the cluster lock, this partition is marked so that other nodes will recognize the lock as “taken.
Figure 8 Quorum Server Operation
A quorum server can provide quorum services for multiple clusters. Figure 9 illustrates quorum server use across four clusters.
Figure 9 Quorum Server to Cluster Distribution
IMPORTANT: For more information about the quorum server, see the latest version of the HP Serviceguard Quorum Server release notes at http://www.hp.com/go/hpux-serviceguard-docs -> HP Serviceguard Quorum Server Software.
In a cluster with four or more nodes, you may not need a cluster lock since the chance of the cluster being split into two halves of equal size is very small. However, be sure to configure your cluster to prevent the failure of exactly half the nodes at one time. For example, make sure there is no potential single point of failure such as a single LAN between equal numbers of nodes, and that you don’t have exactly half of the nodes on a single power circuit.
The rest of this section describes failover packages. Failover Packages A failover package starts up on an appropriate node (see node_name (page 159)) when the cluster starts. In the case of a service, network, or other resource or dependency failure, package failover takes place. A package failover involves both halting the existing package and starting the new instance of the package on a new node.
behavior. These are the auto_run parameter, the failover_policy parameter, and the failback_policy parameter.
Figure 11 Before Package Switching In Figure 12, node1 has failed and pkg1 has been transferred to node2. pkg1's IP address was transferred to node2 along with the package. pkg1 continues to be available and is now running on node2. Also note that node2 now has access both to pkg1's disk and pkg2's disk. NOTE: For design and configuration information about clusters that span subnets, see the documents listed under “Cross-Subnet Configurations” (page 23).
Figure 12 After Package Switching Failover Policy The Package Manager selects a node for a failover package to run on based on the priority list included in the package configuration file together with the failover_policy parameter, also in the configuration file. The failover policy governs how the package manager selects which node to run a package on when a specific node has not been identified and the package needs to be started.
Table 2 Package Configuration Data Package Name NODE_NAME List FAILOVER_POLICY pkgA node1, node2, node3, node4 min_package_node pkgB node2, node3, node4, node1 min_package_node pkgC node3, node4, node1, node2 min_package_node When the cluster starts, each package starts as shown in Figure 13.
Figure 14 Rotating Standby Configuration after Failover NOTE: Under the min_package_node policy, when node2 is repaired and brought back into the cluster, it will then be running the fewest packages, and thus will become the new standby node. If these packages had been set up using the configured_node failover policy, they would start initially as in Figure 13, but the failure of node2 would cause the package to start on node3, as shown in Figure 15.
Figure 15 configured_node Policy Packages after Failover If you use configured_node as the failover policy, the package will start up on the highest-priority eligible node in its node list. When a failover occurs, the package will move to the next eligible node in the list, in the configured order of priority.
Figure 16 Automatic Failback Configuration before Failover Table 3 Node Lists in Sample Cluster Package Name NODE_NAME List FAILOVER POLICY FAILBACK POLICY pkgA node1, node4 configured_node automatic pkgB node2, node4 configured_node automatic pkgC node3, node4 configured_node automatic node1 panics, and after the cluster reforms, pkgA starts running on node4: How the Package Manager Works 43
Figure 17 Automatic Failback Configuration After Failover After rebooting, node1 rejoins the cluster. At that point, pkgA will be automatically stopped on node4 and restarted on node1.
NOTE: Setting the failback_policy to automatic can result in a package failback and application outage during a critical production period. If you are using automatic failback, you may want to wait to add the package’s primary node back into the cluster until you can allow the package to be taken out of service temporarily while it switches back to the primary node. Serviceguard automatically chooses a primary node for a package when the NODE_NAME is set to '*'.
If there is a common generic resource that needs to be monitored as a part of multiple packages, then the monitoring script for that resource can be launched as part of one package and all other packages can use the same monitoring script. There is no need to launch multiple monitors for a common resource. If the package that has started the monitoring script fails or is halted, then all the other packages that are using this common resource also fail.
What Makes a Package Run?
There are three types of packages:
• The failover package is the most common type of package. It runs on one node at a time. If a failure occurs, it can switch to another node listed in its configuration file. If switching is enabled for several nodes, the package manager will use the failover policy to determine where to start the package.
• A system multi-node package runs on all the active cluster nodes at the same time.
Figure 19 Legacy Package Time Line Showing Important Events
The following are the most important moments in a package’s life:
1. Before the control script starts. (For modular packages, this is the master control script.)
2. During run script execution. (For modular packages, during control script execution to start the package.)
3. While services are running.
4. If there is a generic resource configured and it fails, then the package will be halted.
During Run Script Execution
Once the package manager has determined that the package can start on a particular node, it launches the script that starts the package (that is, a package’s control script or master control script is executed with the start parameter). This script carries out the following steps:
1. Executes any external_pre_scripts (modular packages only; see "About External Scripts" (page 114)).
2. Activates volume groups or disk groups.
3. Mounts file systems.
Normal starts are recorded in the log, together with error messages or warnings related to starting the package. NOTE: After the package run script has finished its work, it exits, which means that the script is no longer executing once the package is running normally. After the script exits, the PIDs of the services started by the script are monitored by the package manager directly.
While Services are Running During the normal operation of cluster services, the package manager continuously monitors the following: • Process IDs of the services • Subnets configured for monitoring in the package configuration file • Generic resources configured for monitoring in the package configuration file If a service fails but the restart parameter for that service is set to a value greater than 0, the service will restart, up to the configured number of restarts, without halting the package.
script is executed with the stop parameter. This script carries out the following steps (also shown in Figure 21):
1. Halts all package services.
2. Executes any customer-defined halt commands (legacy packages only) or external_scripts (modular packages only; see "external_script" (page 173)).
3. Removes package IP addresses from the LAN card on the node.
4. Unmounts file systems.
5. Deactivates volume groups.
Normal and Abnormal Exits from the Halt Script The package’s ability to move to other nodes is affected by the exit conditions on leaving the halt script. The following are the possible exit codes: • 0—normal exit. The package halted normally, so all services are down on this node. • 1—abnormal exit, also known as no_restart exit. The package did not halt normally. Services are killed, and the package is disabled globally. It is not disabled on the current node, however.
How the Network Manager Works The purpose of the network manager is to detect and recover from network card failures so that network services remain highly available to clients. In practice, this means assigning IP addresses for each package to LAN interfaces on the node where the package is running and monitoring the health of all interfaces, switching them when necessary. NOTE: Serviceguard monitors the health of the network interfaces (NICs) and can monitor the IP level (layer 3) network.
IMPORTANT: Any subnet that is used by a package for relocatable addresses should be configured into the cluster via NETWORK_INTERFACE and either STATIONARY_IP or HEARTBEAT_IP in the cluster configuration file. For more information about those parameters, see “Cluster Configuration Parameters ” (page 80). For more information about configuring relocatable addresses, see the descriptions of the package ip_ parameters (page 165).
Once bonding is enabled, each interface can be viewed as a single logical link of multiple physical ports with only one IP and MAC address. There is no limit to the number of slaves (ports) per bond, and the number of bonds per system is limited to the number of Linux modules you can load. You can bond the ports within a multi-ported networking card (cards with up to four ports are currently available). Alternatively, you can bond ports from different cards. HP recommends that you use ports from different cards.
In the bonding model, individual Ethernet interfaces are slaves, and the bond is the master. In the basic high availability configuration (mode 1), one slave in a bond assumes an active role, while the others remain inactive until a failure is detected. (In Figure 24, both eth0 slave interfaces are active.) It is important that during configuration, the active slave interfaces on all nodes are connected to the same hub.
Figure 25 Bonded NICs Configured for Load Balancing Monitoring LAN Interfaces and Detecting Failure: Link Level At regular intervals, determined by the NETWORK_POLLING_INTERVAL (see “Cluster Configuration Parameters ” (page 80)), Serviceguard polls all the network interface cards specified in the cluster configuration file (both bonded and non-bonded).
◦ Inbound failures ◦ Errors that prevent packets from being received but do not affect the link-level health of an interface IMPORTANT: You should configure the IP Monitor in a cross-subnet configuration, because IP monitoring will detect some errors that link-level monitoring will not. See also “Cross-Subnet Configurations” (page 23).
NOTE: This is the default if cmquerycl detects a gateway for the subnet in question; see SUBNET under "Cluster Configuration Parameters" (page 80) for more information.
IMPORTANT: By default, cmquerycl does not verify that the gateways it detects will work correctly for monitoring. But if you use the -w full option, cmquerycl will validate them as polling targets.
SUBNET 192.168.1.0
IP_MONITOR ON
POLLING_TARGET 192.168.1.
The following constraints apply to peer polling when there are only two interfaces on a subnet: • If one interface fails, both interfaces and the entire subnet will be marked down on each node, unless bonding is configured and there is a working standby. • If the node that has one of the interfaces goes down, the subnet on the other node will be marked down. • In a 2-node cluster, there is only a single peer for polling.
When a package switch occurs, TCP connections are lost. TCP applications must reconnect to regain connectivity; this is not handled automatically. Note that if the package is dependent on multiple subnets (specified as monitored_subnets in the package configuration file), all those subnets must normally be available on the target node before the package will be started.
Additional Heartbeat Requirements
VLAN technology allows great flexibility in network configuration. To maintain Serviceguard’s reliability and availability in such an environment, the heartbeat rules are tightened as follows when the cluster is using VLANs:
1. VLAN heartbeat networks must be configured on separate physical NICs or Channel Bonds, to avoid single points of failure.
2. Heartbeats are still recommended on all cluster networks, including VLANs.
Figure 26 Physical Disks Combined into LUNs NOTE: LUN definition is normally done using utility programs provided by the disk array manufacturer. Since arrays vary considerably, you should refer to the documentation that accompanies your storage unit. For information about configuring multipathing, see “Multipath for Storage ” (page 73). Monitoring Disks Each package configuration includes information about the disks that are to be activated by the package at startup.
Unlike exclusive activation for volume groups, which does not prevent unauthorized access to the underlying LUNs, PR controls access at the LUN level. Registration and reservation information is stored on the device and enforced by its firmware; this information persists across device resets and system reboots. NOTE: Persistent Reservations coexist with, and are independent of, activation protection of volume groups.
How Persistent Reservations Work You do not need to do any configuration to enable or activate PR, and in fact you cannot enable it or disable it, either at the cluster or the package level; Serviceguard makes the decision for each cluster and package on the basis of the Rules and Limitations described above. When you run cmapplyconf (1m) to configure a new cluster, or add a new node, Serviceguard sets the variable cluster_pr_mode to either pr_enabled or pr_disabled.
A reboot is done if a cluster node cannot communicate with the majority of cluster members for the pre-determined time, or under other circumstances such as a kernel hang or failure of the cluster daemon (cmcld). When this happens, you may see the following message on the console: DEADMAN: Time expired, initiating system restart. The case is covered in more detail under “What Happens when a Node Times Out”. See also “Cluster Daemon: cmcld” (page 28).
Responses to Hardware Failures If a serious system problem occurs, such as a system panic or physical disruption of the SPU's circuits, Serviceguard recognizes a node failure and transfers the packages currently running on that node to an adoptive node elsewhere in the cluster. (System multi-node and multi-node packages do not fail over.) The new location for each package is determined by that package's configuration file, which lists primary and alternate nodes for the package.
For failover packages, the package is halted on the node where the resource failure occurred and started on an available alternative node. For multi-node packages, failure of a generic resource causes the package to be halted only on the node where the failure occurred.
• In case of simple resources, failure of a resource must trigger the monitoring script to set the status of the resource to 'down' using the cmsetresource command.
4 Planning and Documenting an HA Cluster Building a Serviceguard cluster begins with a planning phase in which you gather and record information about all the hardware and software components of the configuration.
your cluster without having to bring it down, you need to plan the initial configuration carefully. Use the following guidelines: • Set the Maximum Configured Packages parameter (described later in this chapter under “Cluster Configuration Planning ” (page 76)) high enough to accommodate the additional packages you plan to add. • Networks should be pre-configured into the cluster configuration if they will be needed for packages you will add later while the cluster is running.
For more information, see the white paper Using Serviceguard for Linux with VMware Virtual Machines at http://www.hp.com/go/linux-serviceguard-docs. Hardware Planning Hardware planning requires examining the physical hardware itself. One useful procedure is to sketch the hardware configuration in a diagram that shows adapter cards and buses, cabling, disks and peripherals.
Shared Storage SCSI can be used for up to four-node clusters; FibreChannel can be used for clusters of up to 16 nodes. FibreChannel FibreChannel cards can be used to connect up to 16 nodes to a disk array containing storage. After installation of the cards and the appropriate driver, the LUNs configured on the storage unit are presented to the operating system as device files, which can be used to build LVM volume groups.
You can obtain information about available disks by using the following commands; your system may provide other utilities as well. • ls /dev/sd* (Smart Array cluster storage) • ls /dev/hd* (non-SCSI/FibreChannel disks) • ls /dev/sd* (SCSI and FibreChannel disks) • du • df • mount • vgdisplay -v • lvdisplay -v See the manpages for these commands for information about specific usage. The commands should be issued from all nodes after installing the hardware and rebooting the system.
Be sure to follow UPS, power circuit, and cabinet power limits as well as SPU power limits. Power Supply Configuration Worksheet The Power Supply Planning worksheet (page 253) will help you organize and record your specific power supply configuration. Make as many copies as you need. Cluster Lock Planning The purpose of the cluster lock is to ensure that only one new cluster is formed in the event that exactly half of the previously clustered nodes try to form a new cluster.
Supported Node Names The name (39 characters or fewer) of each cluster node that will be supported by this quorum server. These entries will be entered into qs_authfile on the system that is running the quorum server process. Volume Manager Planning When designing your disk layout using LVM, you should consider the following: • The volume groups that contain high availability applications, services, or data must be on a bus or buses available to the primary node and all adoptive nodes.
If the cluster has only a single heartbeat network, and a network card on that network fails, heartbeats will be lost while the failure is being detected and the IP address is being switched to a standby interface. The cluster may treat these lost heartbeats as a failure and re-form without one or more nodes. To prevent this, a minimum MEMBER_TIMEOUT value of 14 seconds is required for clusters with a single heartbeat network.
NOTE: How the clients of IPv6-only cluster applications handle hostname resolution is a matter for the discretion of the system or network administrator; there are no HP requirements or recommendations specific to this case. In IPv6-only mode, all Serviceguard daemons will normally use IPv6 addresses for communication among the nodes, although local (intra-node) communication may occur on the IPv4 loopback address. For more information about IPv6, see Appendix D (page 257).
• Cross-subnet configurations are not supported in IPv6-only mode. • Virtual machines are not supported. You cannot have a virtual machine that is either a node or a package if HOSTNAME_ADDRESS_FAMILY is set to ANY or IPV6. Recommendations for IPv6-Only Mode IMPORTANT: Check the latest Serviceguard for Linux release notes for the latest instructions and recommendations. • If you decide to migrate the cluster to IPv6-only mode, you should plan to do so while the cluster is down.
Cluster Configuration Parameters You need to define a set of cluster parameters. These are stored in the binary cluster configuration file, which is distributed to each node in the cluster. You configure these parameters by editing the cluster configuration template file created by means of the cmquerycl command, as described under “Configuring the Cluster” (page 138). NOTE: See “Reconfiguring a Cluster” (page 202) for a summary of changes you can make while the cluster is running.
QS_HOST The fully-qualified hostname or IP address of a host system outside the current cluster that is providing quorum server functionality. It must be (or resolve to) an IPv4 address on Red Hat 5. On SLES 11, it can be (or resolve to) either an IPv4 or an IPv6 address if HOSTNAME_ADDRESS_FAMILY is set to ANY, but otherwise must match the setting of HOSTNAME_ADDRESS_FAMILY. This parameter is used only when you employ a quorum server for tie-breaking services in the cluster.
300,000,000 microseconds (5 minutes). Minimum is 10,000,000 (10 seconds). Maximum is 2,147,483,647 (approximately 35 minutes). Can be changed while the cluster is running; see “What Happens when You Change the Quorum Configuration Online” (page 35) for important information.
Can be changed while the cluster is running; see “Updating the Cluster Lock LUN Configuration Online” (page 209). See also “What Happens when You Change the Quorum Configuration Online” (page 35) for important information. NETWORK_INTERFACE The name of each LAN that will be used for heartbeats or for user data on the node identified by the preceding NODE_NAME. An example is eth0. See also HEARTBEAT_IP, STATIONARY_IP, and “About Hostname Address Families: IPv4-Only, IPv6-Only, and Mixed Mode” (page 77).
given subnet must all be of the same type: IPv4 or IPv6 site-local or IPv6 global. For information about changing the configuration online, see “Changing the Cluster Networking Configuration while the Cluster Is Running” (page 207).
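For illustration, the heartbeat and data-network entries for one node might look like this in the cluster configuration file (the node name, interface names, and addresses are examples only):

NODE_NAME ftsys9
  NETWORK_INTERFACE eth0
  HEARTBEAT_IP 192.168.1.1
  NETWORK_INTERFACE eth1
  STATIONARY_IP 15.13.172.158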
NOTE: IPv6 heartbeat subnets are not supported in a cross-subnet configuration. NOTE: The use of a private heartbeat network is not advisable if you plan to use Remote Procedure Call (RPC) protocols and services. RPC assumes that each network adapter device or I/O card is connected to a route-able network. An isolated or private heartbeat LAN is not route-able, and could cause an RPC request-reply, directed to that LAN, to timeout without being serviced.
package weight to determine if the package can run on that node. CAPACITY_NAME name can be any string that starts and ends with an alphanumeric character, and otherwise contains only alphanumeric characters, dot (.), dash (-), or underscore (_). Maximum length is 39 characters. CAPACITY_NAME must be unique in the cluster. CAPACITY_VALUE specifies a value for the CAPACITY_NAME that precedes it. It must be a floating-point value between 0 and 1000000.
If you enter a value greater than 60 seconds (60,000,000 microseconds), cmcheckconf and cmapplyconf will note the fact, as confirmation that you intend to use a large value. Minimum supported values:
• 3 seconds for a cluster with more than one heartbeat subnet.
• 14 seconds for a cluster that has only one heartbeat LAN.
With the lowest supported value of 3 seconds, a failover time of 4 to 5 seconds can be achieved.
A value greater than the default will lead to slower re-formations than the default. A value in this range is appropriate for most installations. See also "What Happens when a Node Times Out" (page 67), "Cluster Daemon: cmcld" (page 28), and the white paper Optimizing Failover Time in a Serviceguard Environment (version A.11.19 and later) on docs.hp.com under High Availability —> Serviceguard —> White Papers. Can be changed while the cluster is running.
The following are the failure/recovery detection times for different values of Network Polling Interval (NPI) for an IP-monitored Ethernet interface:
Table 5 Failure/Recovery Detection Times for an IP-Monitored Ethernet Interface
NPI (seconds)    Failure/Recovery Detection Time (seconds)
1                ~ NPI x 8 to NPI x 9
2                ~ NPI x 4 to NPI x 5
3                ~ NPI x 3 to NPI x 4
>= 4             ~ NPI x 2 to NPI x 3
IMPORTANT: HP strongly recommends using the default.
and the cluster nodes, sum the values for each path and use the largest number. CAUTION: Serviceguard supports NFS-mounted file systems only over switches and routers that support MBTD. If you are using NFS-mounted file systems, you must set CONFIGURED_IO_TIMEOUT_EXTENSION as described here. For a fuller discussion of MBTD, see the white paper Support for NFS as a filesystem type with HP Serviceguard A.11.20 on HP-UX and Linux available at http://www.hp.com/go/hpux-serviceguard-docs.
to use peer polling instead, set IP_MONITOR to ON for this SUBNET, but do not use POLLING_TARGET (comment out or delete any POLLING_TARGET entries that are already there). If a network interface in this subnet fails at the IP level and IP_MONITOR is set to ON, the interface will be marked down. If it is set to OFF, failures that occur only at the IP-level will not be detected. Can be changed while the cluster is running; must be removed if the preceding SUBNET entry is removed.
NOTE: A weight (WEIGHT_NAME, WEIGHT_DEFAULT) has no meaning on a node unless a corresponding capacity (CAPACITY_NAME, CAPACITY_VALUE) is defined for that node. For the reserved weight and capacity package_limit, the default weight is always one. This default cannot be changed in the cluster configuration file, but it can be overridden for an individual package in the package configuration file.
Logical Volume and File System Planning Use logical volumes in volume groups as the storage infrastructure for package operations on a cluster. When the package moves from one node to another, it must still be able to access the same data on the same disk as it did when it was running on the previous node. This is accomplished by activating the volume group and mounting the file system that resides on it.
For information about creating, exporting, and importing volume groups, see “Creating the Logical Volume Infrastructure ” (page 130). Planning for NFS-mounted File Systems As of Serviceguard A.11.20.00, you can use NFS-mounted (imported) file systems as shared storage in packages. The same package can mount more than one NFS-imported file system, and can use both cluster-local shared storage and NFS imports. The following rules and restrictions apply.
NOTE: If network connectivity to the NFS Server is lost, the applications using the imported file system may hang and it may not be possible to kill them. If the package attempts to halt at this point, it may not halt successfully. • Do not use the automounter; otherwise package startup may fail. • If storage is directly connected to all the cluster nodes and shared, configure it as a local file system rather than using NFS.
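As a sketch, the corresponding entries in a modular package configuration file might look like the following; the server name, export path, mount point, and options here are hypothetical, so check the fs_ parameter descriptions in the package configuration template for the exact syntax:

fs_name       nfs-server1:/exports/data
fs_type       "nfs"
fs_directory  /mnt/data
fs_mount_opt  ""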
Table 6 Package Failover Behavior

Switching Behavior: Package switches normally after detection of service or network failure, generic resource failure, or when a configured dependency is not met. Halt script runs before switch takes place. (Default)

Parameters in Configuration File:
• node_fail_fast_enabled set to no. (Default)
• service_fail_fast_enabled set to no for all services. (Default)
• auto_run set to yes for the package.
extended resource. This parameter requires an operator and a value. The operators ==, !=, >, <, >=, and <= are allowed. Values must be positive integer values ranging from 1 to 2147483647. The following is an example of how to configure simple and extended resources.
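For instance, entries along the following lines could appear in a package configuration file (the resource names, evaluation types, and criterion shown here are illustrative; check the package configuration template for the exact syntax):

# Simple generic resource: only its up/down status is evaluated
generic_resource_name              sfm_disk
generic_resource_evaluation_type   during_package_start

# Extended generic resource: its current value is compared with up_criteria
generic_resource_name              sfm_cpu
generic_resource_evaluation_type   before_package_start
generic_resource_up_criteria       < 50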
NOTE: Generic resources must be configured to use the monitoring script. It is the monitoring script that contains the logic to monitor the resource and set the status of a generic resource accordingly by using cmsetresource(1m). These scripts must be written by end-users according to their requirements. The monitoring script must be configured as a service in the package if the monitoring of the resource is required to be started and stopped as a part of the package.
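The following is a minimal sketch of such a monitoring script. The health check, resource name, and polling interval are placeholders, and the exact cmsetresource options should be verified against the cmsetresource(1m) manpage:

#!/bin/bash
# Hypothetical monitor for the simple generic resource "sfm_disk".
# Configured as a package service, so it runs for the life of the package.
while true
do
    if [ -b /dev/sdc1 ]                   # placeholder health check
    then
        cmsetresource -r sfm_disk -s up
    else
        cmsetresource -r sfm_disk -s down
    fi
    sleep 30                              # polling interval (placeholder)
done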
ATTRIBUTE_NAME    ATTRIBUTE_VALUE
Style             modular
Priority          no_priority

The cmviewcl -v -f line output (snippet) will be as follows:

cmviewcl -v -f line -p pkg1 | grep generic_resource
generic_resource:sfm_disk|name=sfm_disk
generic_resource:sfm_disk|evaluation_type=during_package_start
generic_resource:sfm_disk|up_criteria="N/A"
generic_resource:sfm_disk|node:node1|status=unknown
generic_resource:sfm_disk|node:node1|current_value=0
generic_resource:sfm_disk|node:node2|status=unknown
generic_resource:sfm_disk|n
Online Reconfiguration of Generic Resources
Online operations such as addition, deletion, and modification of generic resources in packages are supported. The following operations can be performed online:
• Addition of a generic resource of generic_resource_evaluation_type set to during_package_start, whose status is not down. While adding a generic resource, ensure that the equivalent monitor is available; if it is not, add the monitor at the same time.
NOTE: pkg1 can depend on more than one other package, and pkg2 can depend on another package or packages; we are assuming only two packages in order to make the rules as clear as possible. • pkg1 will not start on any node unless pkg2 is running on that node. • pkg1’s package_type (page 159) and failover_policy (page 162) constrain the type and characteristics of pkg2, as follows: ◦ If pkg1 is a multi-node package, pkg2 must be a multi-node or system multi-node package.
pkg1 will wait forever for pkg3). You can modify this behavior by means of the successor_halt_timeout parameter (page 161)). (The successor of a package depends on that package; in our example, pkg1 is a successor of pkg2; conversely pkg2 can be referred to as a predecessor of pkg1.
If pkg1 depends on pkg2, and pkg1’s priority is lower than or equal to pkg2’s, pkg2’s node order dominates. Assuming pkg2’s node order is node1, node2, node3, then: • On startup: ◦ • pkg2 will start on node1, or node2 if node1 is not available or does not at present meet all of its dependencies, etc. – pkg1 will start on whatever node pkg2 has started on (no matter where that node appears on pkg1’s node_name list) provided all of pkg1’s other dependencies are met there.
Note that the nodes will be tried in the order of pkg1’s node_name list, and pkg2 will be dragged to the first suitable node on that list whether or not it is currently running on another node. • • On failover: ◦ If pkg1 fails on node1, pkg1 will select node2 to fail over to (or node3 if it can run there and node2 is not available or does not meet all of its dependencies; etc.) ◦ pkg2 will be dragged to whatever node pkg1 has selected, and restart there; then pkg1 will restart there.
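Expressed in pkg1’s package configuration file, such a simple same-node dependency on pkg2 would look something like this (the dependency label is arbitrary; check the package configuration template for the exact syntax):

dependency_name       pkg2_dep
dependency_condition  pkg2 = up
dependency_location   same_node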
Extended Dependencies To the capabilities provided by Simple Dependencies (page 100), extended dependencies add the following: • You can specify whether the package depended on must be running or must be down. You define this condition by means of the dependency_condition, using one of the literals UP or DOWN (the literals can be upper or lower case). We'll refer to the requirement that another package be down as an exclusionary dependency; see “Rules for Exclusionary Dependencies” (page 105).
• dependency_location must be either same_node or all_nodes, and must be the same for both packages. • Both packages must be failover packages whose failover_policy (page 162) is configured_node. Rules for different_node and any_node Dependencies These rules apply to packages whose dependency_condition is UP and whose dependency_location is different_node or any_node.
package to halt after the successor_halt_timeout number of seconds whether or not the dependent packages have completed their halt scripts. 2. Halts the failing package. After the successor halt timer has expired or the dependent packages have all halted, Serviceguard starts the halt script of the failing package, regardless of whether the dependents' halts succeeded, failed, or timed out. 3. Halts packages the failing package depends on, starting with the package this package immediately depends on.
Configuring Weights and Capacities You can configure multiple capacities for nodes, and multiple corresponding weights for packages, up to four capacity/weight pairs per cluster. This allows you considerable flexibility in managing package use of each node's resources — but it may be more flexibility than you need. For this reason Serviceguard provides two methods for configuring capacities and weights: a simple method and a comprehensive method. The subsections that follow explain each of these methods.
weight_value 6 • For pkg3: weight_name package_limit weight_value 3 Now node1, which has a CAPACITY_VALUE of 10 for the reserved CAPACITY_NAME package_limit, can run any two of the packages at one time, but not all three. If in addition you wanted to ensure that the larger packages, pkg2 and pkg3, did not run on node1 at the same time, you could raise the weight_value of one or both so that the combination exceeded 10 (or reduce node1's capacity to 8).
of the names you assign to node capacities and package weights are outside the scope of Serviceguard. Serviceguard simply ensures that for each capacity configured for a node, the combined weight of packages currently running on that node does not exceed that capacity.
NOTE: You do not have to define capacities for every node in the cluster. If any capacity is not defined for any node, Serviceguard assumes that node has an infinite amount of that capacity.
NOTE: Option 4 means that the package is “weightless” as far as this particular capacity is concerned, and can run even on a node on which this capacity is completely consumed by other packages. (You can make a package “weightless” for a given capacity even if you have defined a cluster-wide default weight; simply set the corresponding weight to zero in the package's cluster configuration file.
(page 113)). This is true whenever a package has a weight that exceeds the available amount of the corresponding capacity on the node. Rules and Guidelines The following rules and guidelines apply to both the Simple Method (page 108) and the Comprehensive Method (page 109) of configuring capacities and weights. • You can define a maximum of four capacities, and corresponding weights, throughout the cluster.
package that has no priority. Between two down packages without priority, Serviceguard will decide which package to start if it cannot start them both because there is not enough node capacity to support their weight. Example 1 • pkg1 is configured to run on nodes turkey and griffon. It has a weight of 1 and a priority of 10. It is down and has switching disabled. • pkg2 is configured to run on nodes turkey and griffon. It has a weight of 1 and a priority of 20.
Each external script must have three entry points: start, stop, and validate, and should exit with one of the following values: • 0 - indicating success. • 1 - indicating the package will be halted, and should not be restarted, as a result of failure in this script. • 2 - indicating the package will be restarted on another node, or halted if no other node is available.
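A minimal skeleton of an external script, assuming the entry point is passed as the script’s first argument, might look like this (the function bodies are placeholders for application-specific logic):

#!/bin/bash
# Skeleton external script (illustrative only)

sg_start()    { return 0; }    # application-specific startup steps
sg_stop()     { return 0; }    # application-specific shutdown steps
sg_validate() { return 0; }    # configuration checks

case "$1" in
    start)    sg_start ;;
    stop)     sg_stop ;;
    validate) sg_validate ;;
    *)        exit 1 ;;
esac
exit $?

The fragment below, taken from a validate function, checks that the monitoring service expected by the package is configured: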
    ret=1
fi

# check monitoring service we are expecting for this package is configured
while (( i < ${#SG_SERVICE_NAME[*]} ))
do
    case ${SG_SERVICE_CMD[i]} in
    *monitor.
Determining Why a Package Has Shut Down You can use an external script (or CUSTOMER DEFINED FUNCTIONS area of a legacy package control script) to find out why a package has shut down.
monitored_subnet_access unconfigured for a monitored subnet is equivalent to FULL). (For legacy packages, see “Configuring Cross-Subnet Failover” (page 216)). • You should not use the wildcard (*) for node_name in the package configuration file, as this could allow the package to fail over across subnets when a node on the same subnet is eligible; failing over across subnets can take longer than failing over on the same subnet. List the nodes in order of preference instead of using the wildcard.
Assuming nodeA is pkg1’s primary node (where it normally starts), create node_name entries in the package configuration file as follows: node_name nodeA node_name nodeB node_name nodeC node_name nodeD Configuring monitored_subnet_access In order to monitor subnet 15.244.65.0 or 15.244.56.0, depending on where pkg1 is running, you would configure monitored_subnet and monitored_subnet_access in pkg1’s package configuration file as follows: monitored_subnet 15.244.65.
If you intend to remove a node from the cluster configuration while the cluster is running, ensure that the resulting cluster configuration will still conform to the rules for cluster locks described above. See “Cluster Lock Planning” (page 75) for more information. If you are planning to add a node online, and a package will run on the new node, ensure that any existing cluster-bound volume groups for the package have been imported to the new node.
5 Building an HA Cluster Configuration This chapter and the next take you through the configuration tasks required to set up a Serviceguard cluster. You carry out these procedures on one node, called the configuration node, and Serviceguard distributes the resulting binary file to all the nodes in the cluster. In the examples in this chapter, the configuration node is named ftsys9, and the sample target node is called ftsys10.
SGRUN=/opt/cmcluster/run                      # location of core dumps from daemons
SGAUTOSTART=/opt/cmcluster/conf/cmcluster.rc  # SG Autostart file

Throughout this document, system filenames are usually given with one of these location prefixes. Thus, references to $SGCONF/ can be resolved by supplying the definition of the prefix that is found in this file. For example, if SGCONF is /usr/local/cmcluster/conf, then the complete pathname for file $SGCONF/cmclconfig would be /usr/local/cmcluster/conf/cmclconfig.
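For example, a shell script can pick up these definitions by sourcing the locations file (shown here assuming the conventional /etc/cmcluster.conf path):

. /etc/cmcluster.conf
ls -l $SGCONF/cmclconfig    # resolves to, e.g., /usr/local/cmcluster/conf/cmclconfig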
# Serviceguard will not consult this file. ########################################################### The format for entries in cmclnodelist is as follows: [hostname] [user] [#Comment] For example: gryf sly bit root root root #cluster1, node1 #cluster1, node2 #cluster1, node3 This example grants root access to the node on which this cmclnodelist file resides to root users on the nodes gryf, sly, and bit.
NOTE: If you are using private IP addresses for communication within the cluster, and these addresses are not known to DNS (or the name resolution service you use) these addresses must be listed in /etc/hosts. For requirements and restrictions that apply to IPv6–only clusters and mixed-mode clusters, see “Rules and Restrictions for IPv6-Only Mode” (page 78) and “Rules and Restrictions for Mixed Mode” (page 79), respectively, and the latest version of the Serviceguard release notes.
Safeguarding against Loss of Name Resolution Services When you employ any user-level Serviceguard command (including cmviewcl), the command uses the name service you have configured (such as DNS) to obtain the addresses of all the cluster nodes. If the name service is not available, the command could hang or return an unexpected networking error message. NOTE: If such a hang or error occurs, Serviceguard and all protected applications will continue working even though the command you issued does not.
Ensuring Consistency of Kernel Configuration Make sure that the kernel configurations of all cluster nodes are consistent with the expected behavior of the cluster during failover. In particular, if you change any kernel parameters on one cluster node, they may also need to be changed on other cluster nodes that can run the same packages. Enabling the Network Time Protocol HP strongly recommends that you enable network time protocol (NTP) services on each node in the cluster.
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
For Red Hat 5 and Red Hat 6 only, add the following line to the ifcfg-bond0 file:
BONDING_OPTS='miimon=100 mode=1'
2. Create an ifcfg-ethn file for each interface in the bond. All interfaces should have SLAVE and MASTER definitions, as in the example below.
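For example, a slave definition for eth1 might look like this in ifcfg-eth1 (the device name is an example):

DEVICE=eth1
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none
USERCTL=no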
      UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
      RX packets:7224794 errors:0 dropped:0 overruns:0 frame:0
      TX packets:3286647 errors:1 dropped:0 overruns:1 carrier:0
      collisions:0 txqueuelen:0

eth0  Link encap:Ethernet  HWaddr 00:C0:F0:1F:37:B4
      inet addr:192.168.1.1  Bcast:192.168.1.255  Mask:255.255.255.
The above example configures bond0 with mii monitor equal to 100 and active-backup mode. Adjust the IP, BROADCAST, NETMASK, and NETWORK parameters to correspond to your configuration. As you can see, you are adding the configuration options BONDING_MASTER, BONDING_MODULE_OPTS, and BONDING_SLAVE. BONDING_MODULE_OPTS are the additional options you want to pass to the bonding module.
Command (m for help): n
Partition number (1-4): 1
HEX code (type L to list codes): 83
Command (m for help): 1
Command (m for help): 1
Command (m for help): p

Disk /dev/sdc: 64 heads, 32 sectors, 4067 cylinders
Units = cylinders of 2048 * 512 bytes

   Device Boot    Start    End    Blocks    Id    System
/dev/sdc               1      1      1008    83    Linux

Command (m for help): w
The partition table has been altered!

NOTE: Follow these rules:
• Do not try to use LVM to configure the lock LUN.
• The partition type must be 83.
root file systems. To provide space for application data on shared disks, create disk partitions using fdisk, and build logical volumes with LVM. You can build a cluster (next section) before or after defining volume groups for shared data storage. If you create the cluster first, information about storage can be added to the cluster and package configuration files after the volume groups are created.
In this example, the disk described by device file /dev/sda has already been partitioned for Linux, into partitions named /dev/sda1 - /dev/sda7. The second internal device /dev/sdb and the two external devices /dev/sdc and /dev/sdd have not been partitioned. NOTE: fdisk may not be available for SUSE on all platforms. In this case, using YAST2 to set up the partitions is acceptable.
Prompt                   Response    Action Performed
Command (m for help):    p           Display partition data
Command (m for help):    w           Write data to the partition table

The following example of the fdisk dialog shows the disk at device file /dev/sdc being set to the Smart Array partition type:

fdisk /dev/sdc
Command (m for help): t
Partition number (1-4): 1
HEX code (type L to list codes): 8e
Command (m for help): p
Disk /dev/sdc: 64 heads, 32 sectors, 4067 cylinders
Units = cylinders of 2048 * 512
where node is the value of uname -n. 5. Run vgscan: vgscan NOTE: At this point, the setup for volume-group activation protection is complete. Serviceguard adds a tag matching the uname -n value of the owning node to each volume group defined for a package when the package runs and deletes the tag when the package halts. The command vgs -o +tags vgname will display any tags that are set for a volume group.
Building Volume Groups and Logical Volumes
1. Use Logical Volume Manager (LVM) to create volume groups that can be activated by Serviceguard packages. For an example showing volume-group creation on LUNs, see "Building Volume Groups: Example for Smart Array Cluster Storage (MSA 2000 Series)" (page 134). (For Fibre Channel storage you would use device-file names such as those used in the section "Creating Partitions" (page 132).)
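As a quick sketch, the commands to build a volume group, a logical volume, and a file system on a shared LUN partition look like this (the device and volume names are examples):

pvcreate /dev/sdc1                # initialize the partition as an LVM physical volume
vgcreate vgpkgA /dev/sdc1         # create the volume group
lvcreate -L 1G -n lvol1 vgpkgA    # create a 1 GB logical volume
mkfs -t ext3 /dev/vgpkgA/lvol1    # build a file system on the logical volume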
NOTE: Use vgchange --deltag only if you are implementing volume-group activation protection. Remember that volume-group activation protection, if used, must be implemented on each node.
2. To get the node ftsys10 to see the new disk partitioning that was done on ftsys9, reboot:
reboot
The partition table on the rebooted node is then rebuilt using the information placed on the disks when they were partitioned on the other node.
NOTE: You must reboot at this time.
3.
2. On ftsys10, activate the volume group, mount the file system, write a date stamp on to the shared file, and then look at the content of the file:
vgchange --addtag $(uname -n) vgpkgB
vgchange -a y vgpkgB
mount /dev/vgpkgB/lvol1 /extra
echo "Written by `hostname` on `date`" >> /extra/datestamp
cat /extra/datestamp
You should see something like the following, including the date stamp written by the other node:
Written by ftsys9.mydomain on Mon Jan 22 14:23:44 PST 2006
Written by ftsys10.
NOTE: Be careful if you use YAST or YAST2 to configure volume groups, as that may cause all volume groups to be activated. After running YAST or YAST2, check that volume groups for Serviceguard packages not currently running have not been activated, and use LVM commands to deactivate any that have. For example, use the command vgchange -a n /dev/sgvg00 to deactivate the volume group sgvg00.

Red Hat
It is not necessary to prevent vgscan on Red Hat.
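On either distribution, one quick way to spot unintentionally activated volume groups is to list logical-volume attributes; this is a sketch, and an "a" in the fifth character of lv_attr marks an active volume:
lvs -o vg_name,lv_attr       # look for 'a' in the fifth attribute character
vgchange -a n /dev/sgvg00    # then deactivate any affected package volume group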
cmquerycl Options

Speeding up the Process
In a larger or more complex cluster with many nodes, networks or disks, the cmquerycl command may take several minutes to complete. To speed up the configuration process, you can direct the command to return selected information only by using the -k and -w options:
-k eliminates some disk probing, and does not return information about potential cluster lock volume groups and lock physical volumes.
• If you don't use the -h option, Serviceguard will choose the best available configuration to meet minimum requirements, preferring an IPv4 LAN over IPv6 where both are available. The resulting configuration could be IPv4 only, IPv6 only, or a mix of both. You can override Serviceguard's default choices by means of the HEARTBEAT_IP parameter, discussed under “Cluster Configuration Parameters ” (page 80); that discussion also spells out the heartbeat requirements.
Enter the QS_HOST (IPv4 or IPv6 on SLES 11; IPv4 only on Red Hat 5 and Red Hat 6), optional QS_ADDR (IPv4 or IPv6 on SLES 11; IPv4 only on Red Hat 5 and Red Hat 6), QS_POLLING_INTERVAL, and optionally a QS_TIMEOUT_EXTENSION; and also check the HOSTNAME_ADDRESS_FAMILY setting, which defaults to IPv4. See the parameter descriptions under Cluster Configuration Parameters (page 80).
15.244.65.0
   lan3 (nodeA)
   lan3 (nodeB)
15.244.56.0
   lan4 (nodeC)
   lan4 (nodeD)

IPv6:
3ffe:1111::/64
   lan3 (nodeA)
   lan3 (nodeB)
3ffe:2222::/64
   lan3 (nodeC)
   lan3 (nodeD)

Possible Heartbeat IPs:
15.13.164.0
   15.13.164.1 (nodeA)
   15.13.164.2 (nodeB)
15.13.172.0
   15.13.172.158 (nodeC)
   15.13.172.159 (nodeD)
15.13.165.0
   15.13.165.1 (nodeA)
   15.13.165.2 (nodeB)
15.13.182.0
   15.13.182.158 (nodeC)
   15.13.182.159 (nodeD)

Route connectivity (full probing performed):
1   15.13.164.0
2   15.13.172.0
3   15.13.165.0
4   15.13.182.
Modifying the MEMBER_TIMEOUT Parameter The cmquerycl command supplies a default value of 14 seconds for the MEMBER_TIMEOUT parameter. Changing this value will directly affect the cluster’s re-formation and failover times. You may need to increase the value if you are experiencing cluster node failures as a result of heavy system load or heavy network traffic; or you may need to decrease it if cluster re-formations are taking a long time. You can change MEMBER_TIMEOUT while the cluster is running.
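An online change to MEMBER_TIMEOUT can follow the same get/edit/apply pattern used for other cluster changes in this chapter (the file name is arbitrary):
cmgetconf -C clconfig.conf      # obtain the current cluster configuration
vi clconfig.conf                # edit the MEMBER_TIMEOUT value
cmcheckconf -C clconfig.conf    # validate the change
cmapplyconf -C clconfig.conf    # distribute the new binary configuration file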
Figure 27 Access Roles

Levels of Access
Serviceguard recognizes two levels of access, root and non-root:
• Root access: Full capabilities; only role allowed to configure the cluster.
As Figure 27 shows, users with root access have complete control over the configuration of the cluster and its packages. This is the only role allowed to use the cmcheckconf, cmapplyconf, cmdeleteconf, and cmmodnet -a commands.
IMPORTANT: Users on systems outside the cluster can gain Serviceguard root access privileges to configure the cluster only via a secure connection (rsh or ssh). • Non-root access: Other users can be assigned one of four roles: ◦ Full Admin: Allowed to perform cluster administration, package administration, and cluster and package view operations. These users can administer the cluster, but cannot configure or create a cluster. Full Admin includes the privileges of the Package Admin role.
Access control policies are defined by three parameters in the configuration file: • Each USER_NAME can consist either of the literal ANY_USER, or a maximum of 8 login names from the /etc/passwd file on USER_HOST. The names must be separated by spaces or tabs, for example: # Policy 1: USER_NAME john fred patrick USER_HOST bit USER_ROLE PACKAGE_ADMIN • USER_HOST is the node where USER_NAME will issue Serviceguard commands.
USER_ROLE PACKAGE_ADMIN If this policy is defined in the cluster configuration file, it grants user john the PACKAGE_ADMIN role for any package on node bit. User john also has the MONITOR role for the entire cluster, because PACKAGE_ADMIN includes MONITOR. If the policy is defined in the package configuration file for PackageA, then user john on node bit has the PACKAGE_ADMIN role only for PackageA. Plan the cluster’s roles and validate them as soon as possible.
you when you create roles for a package; use cmgetconf to get a listing of the cluster configuration file. If a role is configured for a username/hostname in the cluster configuration file, do not specify a role for the same username/hostname in the package configuration file; and note that there is no point in assigning a package administration role to a user who is root on any node in the cluster; this user already has complete control over the administration of the cluster and its packages.
Duplicate cluster lock, line 55. Quorum Server already specified. Distributing the Binary Configuration File After specifying all cluster parameters, use the cmapplyconf command to apply the configuration. This action distributes the binary configuration file to all the nodes in the cluster. HP recommends doing this separately before you configure packages (described in the next chapter).
• Start the node. You can use Serviceguard Manager or the cmrunnode command.
• Verify that the node has returned to operation. You can use Serviceguard Manager or the cmviewcl command again.
4. Bring down the cluster. You can use Serviceguard Manager or the cmhaltcl -v -f command.
See the manpages for more information about these commands. See Chapter 8: “Troubleshooting Your Cluster” (page 225) for more information about cluster testing.
Changing the System Message You may find it useful to modify the system's login message to include a statement such as the following: This system is a node in a high availability cluster. Halting this system may cause applications and services to start up on another node in the cluster. You may want to include a list of all cluster nodes in this message, together with additional cluster-specific information. The /etc/motd file may be customized to include cluster-related information.
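For example, you could append such a message with a here-document (the wording is only a suggestion):
cat >> /etc/motd <<EOF
This system is a node in a high availability cluster.
Halting this system may cause applications and services to
start up on another node in the cluster.
EOF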
1. Change the value of the server_args parameter in the file /etc/xinetd.d/hacl-cfg from -c to -c -i
2. Restart xinetd:
/etc/init.d/xinetd restart

Deleting the Cluster Configuration
You can delete a cluster configuration by means of the cmdeleteconf command. The command prompts for a verification before deleting the files unless you use the -f option. You can delete the configuration only when the cluster is down.
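For example, after halting the cluster:
cmdeleteconf        # prompts for verification before deleting the files
cmdeleteconf -f     # deletes without prompting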
6 Configuring Packages and Their Services
Serviceguard packages group together applications and the services and resources they depend on. The typical Serviceguard package is a failover package that starts on one node but can be moved (“failed over”) to another if necessary. For more information, see “What is Serviceguard for Linux? ” (page 17), “How the Package Manager Works” (page 35), and “Package Configuration Planning ” (page 92).
When you have made these decisions, you are ready to generate the package configuration file; see “Generating the Package Configuration File” (page 174).

Types of Package: Failover, Multi-Node, System Multi-Node
There are three types of packages:
• Failover packages. This is the most common type of package. Failover packages run on one node at a time.
and start the package for the first time. But if you then halt the multi-node package via cmhaltpkg, it can be re-started only by means of cmrunpkg, not cmmodpkg. • If a multi-node package is halted via cmhaltpkg, package switching is not disabled. This means that the halted package will start to run on a rebooted node, if it is configured to run on that node and its dependencies are met.
Table 8 Base Modules (continued)

Module Name   Parameters (page)                             Comments
              package_type (page 159)                       Cannot be used if package_type
              package_description (page 159) *              (page 159) is multi_node or
              node_name (page 159)                          system_multi_node.
              auto_run (page 160)
              node_fail_fast_enabled (page 160)
              run_script_timeout (page 160)
              halt_script_timeout (page 161)
              successor_halt_script_timeout (page 161) *
              script_log_file (page 161)
              operation_sequence (page 161)
              log_level (page 162) *
              failover_policy (page 162)
              failback_policy
Table 9 Optional Modules

Module Name   Parameters (page)                    Comments
dependency    dependency_name (page 163) *         Add to a base module to create a package
              dependency_condition (page 163)      that depends on one or more other
              dependency_location (page 163)       packages.

weight        weight_name (page 163) *             Add to a base module to create a package
              weight_value (page 163) *            that has weight that will be counted
                                                   against a node's capacity.
Table 9 Optional Modules (continued)

Module Name   Parameters (page)       Comments
acp           user_name (page 173)    Add to a base module to configure Access
              user_host (page 173)    Control Policies for the package.
              user_role (page 173)

all           all parameters          Use if you are creating a complex package
                                      that requires most or all of the optional
                                      parameters; or if you want to see the
                                      specifications and comments for all
                                      available parameters.
package_name
Any name, up to a maximum of 39 characters, that:
• starts and ends with an alphanumeric character
• otherwise contains only alphanumeric characters or dot (.), dash (-), or underscore (_)
• is unique among package names in this cluster
IMPORTANT: Restrictions on package names in previous Serviceguard releases were less stringent.
IMPORTANT: See “Cluster Configuration Parameters ” (page 80) for important information about node names.
See “About Cross-Subnet Failover” (page 117) for considerations when configuring cross-subnet packages, which are further explained under “Cross-Subnet Configurations” (page 23).

auto_run
Can be set to yes or no. The default is yes.
If the package does not complete its startup in the time specified by run_script_timeout, Serviceguard will terminate it and prevent it from switching to another node. In this case, if node_fail_fast_enabled is set to yes, the node will be halted (rebooted). If no timeout is specified (no_timeout), Serviceguard will wait indefinitely for the package to start. If a timeout occurs: • Switching will be disabled. • The current node will be disabled from running the package.
log_level
Determines the amount of information printed to stdout when the package is validated, and to the script_log_file when the package is started and halted. Valid values are 0 through 5, but you should normally use only the first two (0 or 1); the remainder (2 through 5) are intended for use by HP Support.
increments of 20 so as to leave gaps in the sequence; otherwise you may have to shuffle all the existing priorities when assigning priority to a new package. IMPORTANT: Because priority is a matter of ranking, a lower number indicates a higher priority (20 is a higher priority than 40). A numerical priority is higher than no_priority. New as of A.11.18 (for both modular and legacy packages). See “About Package Dependencies” (page 100) for more information.
Both parameters are optional, but if weight_value is specified, weight_name must also be specified, and must come first. You can define up to four weights, corresponding to four different capacities, per cluster. To specify more than one weight for this package, repeat weight_name and weight_value. NOTE: But if weight_name is package_limit, you can use only that one weight and capacity throughout the cluster. package_limit is a reserved value, which, if used, must be entered exactly in that form.
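As an illustration, assuming the cluster configuration defines a capacity named processing, a package might declare (the name and value here are placeholders):
weight_name processing
weight_value 10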
See also ip_subnet_node (page 166) and “About Cross-Subnet Failover” (page 117).
New for modular packages. For legacy packages, see “Configuring Cross-Subnet Failover” (page 216).

monitored_subnet_access
In cross-subnet configurations, specifies whether each monitored_subnet is accessible on all nodes in the package’s node list (see node_name (page 159)), or only some.
In a cross-subnet configuration, you also need to specify which nodes the subnet is configured on; see ip_subnet_node below. See also monitored_subnet_access (page 164) and “About Cross-Subnet Failover” (page 117). This parameter can be set for failover packages only.

ip_subnet_node
In a cross-subnet configuration, specifies which nodes an ip_subnet is configured on. If no ip_subnet_nodes are listed under an ip_subnet, it is assumed to be configured on all nodes in this package’s node_name list (page 159).
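For example, using the cross-subnet nodes discussed elsewhere in this manual, a subnet configured on only two of the four nodes might be specified like this (a sketch; adjust to your topology):
ip_subnet 15.244.65.0
ip_subnet_node nodeA
ip_subnet_node nodeB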
service_cmd
The command that runs the program or function for this service_name, for example:
/usr/bin/X11/xclock -display 15.244.58.208:0
An absolute pathname is required; neither the PATH variable nor any other environment variable is passed to the command. The default shell is /bin/sh.
NOTE: Be careful when defining service run commands. Each run command is executed in the following way:
• The cmrunserv command executes the run command.
You can configure a maximum of 100 generic resources per cluster. Each generic resource is defined by three parameters:
• generic_resource_name
• generic_resource_evaluation_type
• generic_resource_up_criteria
See the descriptions that follow. The following is an example of defining generic resource parameters:
generic_resource_name              cpu_monitor
generic_resource_evaluation_type   during_package_start
generic_resource_up_criteria       <50
See the package configuration file for more examples.
NOTE: Operators other than the ones mentioned above are not supported. This attribute does not accept more than one up criterion. For example, "> 10, < 100" is not valid.
fs_fsck_opt ""
fs_type "ext3"
A logical volume must be built on an LVM volume group. Logical volumes can be entered in any order. A gfs file system can be configured using only the fs_name, fs_directory, and fs_mount_opt parameters; see the configuration file for an example. Additional rules apply for gfs as explained under fs_type.
NOTE: Red Hat GFS is not supported in Serviceguard A.11.20.00.
For an NFS-imported file system, see the discussion under fs_name (page 170) and fs_server (page 171).
For an NFS-imported file system, the additional parameters required are fs_server, fs_directory, fs_type, and fs_mount_opt; see fs_server (page 171) for an example. CAUTION: Before configuring an NFS-imported file system into a package, make sure you have read and understood the rules and guidelines under “Planning for NFS-mounted File Systems” (page 94), and configured the cluster parameter CONFIGURED_IO_TIMEOUT_EXTENSION, described under “Cluster Configuration Parameters ” (page 80).
NOTE: A package using gfs (Red Hat Global File System, or GFS) cannot use any other file systems of a different type. vg and vgchange_cmd (page 169) are not valid for GFS file systems. For more information about using GFS with Serviceguard, see Clustering Linux Servers with the Concurrent Deployment of HP Serviceguard for Linux and Red Hat Global File Systems for RHEL5 on docs.hp.
If more than one external_pre_script is specified, the scripts will be executed on package startup in the order they are entered into the package configuration file, and in the reverse order during package shutdown. See “About External Scripts” (page 114), as well as the comments in the configuration file, for more information and examples.

external_script
The full pathname of an external script.
SUBNET Specifies the IP subnets that are to be monitored for the package.
RUN_SCRIPT and HALT_SCRIPT Use the full pathname of each script. These two parameters allow you to separate package run instructions and package halt instructions for legacy packages into separate scripts if you need to. In this case, make sure you include identical configuration information (such as node names, IP addresses, etc.) in both scripts.
NOTE: If you do not include a base module (or default or all) on the cmmakepkg command line, cmmakepkg will ignore the modules you specify and generate a default configuration file containing all the parameters. For a complex package, or if you are not yet sure which parameters you will need to set, the default may be the best choice; see the first example below. You can use the -v option with cmmakepkg to control how much information is displayed online or included in the configuration file.
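For instance, to generate a configuration file from a base module plus selected optional modules rather than the full default set, you might run something like the following (module names are illustrative; see the cmmakepkg (1m) manpage for the modules available on your system):
cmmakepkg -m sg/failover -m sg/service $SGCONF/pkg1/pkg1.conf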
1. Configure volume groups and mount points only.
2. Check and apply the configuration; see “Verifying and Applying the Package Configuration” (page 178).
3. Run the package and ensure that it can be moved from node to node.
NOTE: cmcheckconf and cmapplyconf check for missing mount points, volume groups, etc.
4. Halt the package.
5. Configure package IP addresses and application services.
6.
• If this package will depend on another package or packages, enter values for dependency_name, dependency_condition, dependency_location, and optionally priority. See “About Package Dependencies” (page 100) for more information. NOTE: The package(s) this package depends on must already be part of the cluster configuration by the time you validate this package (via cmcheckconf; see “Verifying and Applying the Package Configuration” (page 178)); otherwise validation will fail.
• If the package needs to mount LVM volumes to file systems (other than Red Hat GFS; see fs_type (page 171)), use the vg parameters to specify the names of the volume groups to be activated, and select the appropriate vgchange_cmd. Use the fs_ parameters (page 170) to specify the characteristics of file systems and how and where to mount them. See the comments in the FILESYSTEMS section of the configuration file for more information and examples.
• File systems and volume groups are valid.
• Services are executable.
• Any package that this package depends on is already part of the cluster configuration.
For more information, see the manpage for cmcheckconf (1m) and “Checking Cluster Components” (page 188).
When cmcheckconf has completed without errors, apply the package configuration, for example:
cmapplyconf -P $SGCONF/pkg1/pkg1.
7 Cluster and Package Maintenance
This chapter describes the cmviewcl command, then shows how to start and halt a cluster or an individual node, how to perform permanent reconfiguration, and how to start, halt, move, and modify packages during routine maintenance of the cluster.
• starting - The cluster is in the process of determining its active membership. At least one cluster daemon is running. • unknown - The node on which the cmviewcl command is issued cannot communicate with other nodes in the cluster. Node Status and State The status of a node is either up (active as a member of the cluster) or down (inactive in the cluster), depending on whether its cluster daemon is running or not.
• detached - A package is said to be detached from the cluster or node where it was running, when the cluster or node is halted with the -d option. Serviceguard no longer monitors this package. The last known status of the package before it is detached from the cluster was up.
• unknown - Serviceguard could not determine the status at the time cmviewcl was run.
A system multi-node package is up when it is running on all the active cluster nodes.
Package Switching Attributes cmviewcl shows the following package switching information: • AUTO_RUN: Can be enabled or disabled. For failover packages, enabled means that the package starts when the cluster starts, and Serviceguard can switch the package to another node in the event of failure. For system multi-node packages, enabled means an instance of the package can start on a new node joining the cluster (disabled means it will not).
Failover packages can also be configured with one of two values for the failback_policy parameter (page 162), and these are also displayed in the output of cmviewcl -v: • automatic: Following a failover, a package returns to its primary node when the primary node becomes available again. • manual: Following a failover, a package will run on the adoptive node until moved back to its original node by a system administrator.
NOTE: The Script_Parameters section of the PACKAGE output of cmviewcl shows the Subnet status only for the node that the package is running on. In a cross-subnet configuration, in which the package may be able to fail over to a node on another subnet, that other subnet is not shown (see “Cross-Subnet Configurations” (page 23)).
UNOWNED_PACKAGES

PACKAGE      STATUS       STATE        AUTO_RUN     NODE
pkg2         down         unowned      disabled     unowned

  Policy_Parameters:
  POLICY_NAME       CONFIGURED_VALUE
  Failover          configured_node
  Failback          manual

  Script_Parameters:
  ITEM               STATUS    NODE_NAME    NAME
  Service            down                   service2
  Generic Resource   up        ftsys9       sfm_disk1
  Subnet             up                     15.13.168.
  Generic Resource   up        ftsys10

  Node_Switching_Parameters:
  NODE_TYPE    STATUS    SWITCHING
  Primary      up        enabled
  Alternate    up        enabled
  Script_Parameters:
  ITEM               STATUS   MAX_RESTARTS   RESTARTS   NAME
  Service            up       0              0          sfm_disk_monitor
  Subnet             up                                 15.13.168.
  Generic Resource   up                                 sfm_disk

  Node_Switching_Parameters:
  NODE_TYPE    STATUS    SWITCHING    NAME
  Primary      up        enabled      ftsys10
  Alternate    up        enabled      ftsys9

NODE         STATUS
ftsys10      up
  Policy_Parameters:
  POLICY_NAME       CONFIGURED_VALUE
  Failover          min_package_node
  Failback          automatic

  Script_Parameters:
  ITEM               STATUS    NODE_NAME   NAME
  Subnet             up        manx        192.8.15.0
  Generic Resource   unknown   manx        sfm_disk
  Subnet             up        burmese     192.8.15.
  Generic Resource   unknown   burmese
  Subnet             up        tabby
  Generic Resource   unknown   tabby
  Subnet             up        persian
  Generic Resource   unknown   persian

  Node_Switching_Parameters:
  NODE_TYPE    STATUS    SWITCHING
  Primary      up        enabled
  Alternate    up        enabled
  Alternate    up        enabled
  Alternate    up        enabled
Table 10 Verifying Cluster Components (continued)

Component (Context)      Tool or Command; More Information        Comments
Lock LUN (cluster)       cmcheckconf (1m), cmapplyconf (1m)       Commands check that all nodes are accessing the same physical device and that the lock LUN device file is a block device file. To check file consistency across all nodes in the cluster, do the following:
Mount points (package)   cmcheckconf (1m), cmapplyconf (1m)       See also “Verifying the Package Configuration” (page 214).
In Serviceguard A.11.16 and later, these tasks can be performed by non-root users with the appropriate privileges. See Controlling Access to the Cluster (page 143) for more information about configuring access. You can use Serviceguard Manager or the Serviceguard command line to start or stop the cluster, or to add or halt nodes. Starting the cluster means running the cluster daemon on one or more of the nodes in a cluster.
network and find the check takes a very long time, you can use the -w none option to bypass the validation. Since the node's cluster is already running, the node joins the cluster and packages may be started, depending on the package configuration (see node_name (page 159)). If the node does not find its cluster running, or the node is not part of the cluster configuration, the command fails.
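For example, to start cluster services on node ftsys10 and confirm that it has joined the cluster:
cmrunnode -v ftsys10
cmviewcl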
Halting a Node or the Cluster while Keeping Packages Running There may be circumstances where you want to do maintenance that involves halting a node, or the entire cluster, without halting or failing over the affected packages. Such maintenance might consist of anything short of rebooting the node or nodes, but a likely case is networking changes that will disrupt the heartbeat. New command options in Serviceguard A.11.20.
• You cannot detach a package that is in maintenance mode, and you cannot place a package into maintenance mode if any of its dependent packages are detached. Also, you cannot put a detached package in maintenance mode. For more information about maintenance mode, see “Maintaining a Package: Maintenance Mode” (page 197). For more information about dependencies, see “About Package Dependencies” (page 100).
• You cannot make configuration changes to a package or a cluster in which any packages are detached.
CAUTION: Serviceguard does not check LVM volume groups, mount points, and relocatable IP addresses when re-attaching packages. • cmviewcl (1m) reports the status and state of detached packages as detached. This is true even if a problem has occurred since the package was detached and some or all of the package components are not healthy or not running. • Because Serviceguard assumes that a detached package has remained healthy, the package is considered to be UP for dependency purposes.
2. Halt the package; for example:
cmhaltpkg node1 pkg1

Halting the Cluster and Detaching its Packages
1. Make sure that the conditions spelled out under “Rules and Restrictions” (page 192) are met.
2. Halt any packages that do not qualify for Live Application Detach, such as legacy and system multi-node packages. For example:
cmhaltpkg legpak1 legpak2 legpak3 smnp1
NOTE: If you do not do this, the cmhaltcl in the next step will fail.
3.
Starting a Package Ordinarily, a package configured as part of the cluster will start up on its primary node when the cluster starts up. You may need to start a package manually after it has been halted manually. You can do this either in Serviceguard Manager, or with Serviceguard commands as described below.
Moving a Failover Package You can use Serviceguard Manager to move a failover package from one node to another, or Serviceguard commands as shown below. Before you move a failover package to a new node, it is a good idea to run cmviewcl -v -l package and look at dependencies. If the package has dependencies, be sure they can be met on the new node. To move the package, first halt it where it is running using the cmhaltpkg command. This action not only halts the package, but also disables package switching.
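The complete sequence for a hypothetical package pkg1 being moved to ftsys10 is the same one used in the integration examples later in this manual:
cmhaltpkg pkg1              # halt the package; this also disables switching
cmrunpkg -n ftsys10 pkg1    # start the package on the new node
cmmodpkg -e pkg1            # re-enable package switching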
NOTE: If you need to do maintenance that requires halting a node, or the entire cluster, you should consider Live Application Detach; see “Halting a Node or the Cluster while Keeping Packages Running” (page 192). • Maintenance mode is chiefly useful for modifying networks while the package is running. See “Performing Maintenance Using Maintenance Mode” (page 200). • Partial-startup maintenance mode allows you to work on package services, file systems, and volume groups.
◦ A script does not exist or cannot run because of file permissions ◦ A script times out ◦ The limit of a restart count is exceeded Rules for a Package in Maintenance Mode or Partial-Startup Maintenance Mode IMPORTANT: See the latest Serviceguard release notes for important information about version requirements for package maintenance. • The package must have package switching disabled before you can put it in maintenance mode. • You can put a package in maintenance mode only on one node.
Dependency Rules for a Package in Maintenance Mode or Partial-Startup Maintenance Mode You cannot configure new dependencies involving a package running in maintenance mode, and in addition the following rules apply (we'll call the package in maintenance mode pkgA). • The packages that depend on pkgA must be down and disabled when you place pkgA in maintenance mode. This applies to all types of dependency (including exclusionary dependencies) as described under “About Package Dependencies” (page 100).
Procedure
Follow this procedure to perform maintenance on a package. In this example, we'll assume a package pkg1 is running on node1, and that we want to do maintenance on the package's services.
1. Halt the package:
cmhaltpkg pkg1
2. Place the package in maintenance mode:
cmmodpkg -m on -n node1 pkg1
NOTE: The order of the first two steps can be reversed.
3. Run the package in maintenance mode.
NOTE: The full execution sequence for starting a package is:
1. The master control script itself
2. Persistent reservation

Reconfiguring a Cluster
You can reconfigure a cluster either when it is halted or while it is still running. Some operations can only be done when the cluster is halted. The table that follows shows the required cluster state for many kinds of changes.
if another, currently disabled, package is enabled, or if a package halts and cannot restart because none of the nodes on its node_list is available. Serviceguard provides two ways to do this: you can use the preview mode of Serviceguard commands, or you can use the cmeval (1m) command to simulate different cluster states. Alternatively, you might want to model changes to the cluster as a whole; cmeval allows you to do this; see “Using cmeval” (page 204).
This shows that pkg1, when enabled, will “drag” pkg2 and pkg3 to its primary node, node1. It can do this because of its higher priority; see “Dragging Rules for Simple Dependencies” (page 102). Running the preview confirms that all three packages will successfully start on node2 (assuming conditions do not change between now and when you actually enable pkg1, and there are no failures in the run scripts). NOTE: The preview cannot predict run and halt script failures.
IMPORTANT: For detailed information and examples, see the cmeval (1m) manpage.

Reconfiguring a Halted Cluster
You can make a permanent change in cluster configuration when the cluster is halted. This procedure must be used for changes marked “Cluster must not be running” in Table 11, but it can be used for any other cluster configuration changes as well.
Use the following steps:
1. Halt the cluster on all nodes.
2.
Use cmrunnode to start the new node, and, if you so decide, set the AUTOSTART_CMCLD parameter to 1 in the $SGAUTOSTART file (see “Understanding the Location of Serviceguard Files” (page 121)) to enable the new node to join the cluster automatically each time it reboots.

Removing Nodes from the Cluster while the Cluster Is Running
You can use Serviceguard Manager to delete nodes, or Serviceguard commands as shown below. The following restrictions apply:
• The node must be halted.
Changing the Cluster Networking Configuration while the Cluster Is Running

What You Can Do
Online operations you can perform include:
• Add a network interface and its HEARTBEAT_IP or STATIONARY_IP.
• Delete a network interface and its HEARTBEAT_IP or STATIONARY_IP.
• Change a HEARTBEAT_IP or STATIONARY_IP interface from IPv4 to IPv6, or vice versa.
• Change the designation of an existing interface from HEARTBEAT_IP to STATIONARY_IP, or vice versa.
• Change the NETWORK_POLLING_INTERVAL.
Examples of when you must do this include: ◦ moving a NIC from one subnet to another ◦ adding an IP address to a NIC ◦ removing an IP address from a NIC CAUTION: Do not add IP addresses to network interfaces that are configured into the Serviceguard cluster, unless those IP addresses themselves will be immediately configured into the cluster as stationary IP addresses.
4. Apply the changes to the configuration and distribute the new binary configuration file to all cluster nodes:
cmapplyconf -C clconfig.conf
If you were configuring the subnet for data instead, and wanted to add it to a package configuration, you would now need to:
1. Halt the package
2. Add the new networking information to the package configuration file
3. In the case of a legacy package, add the new networking information to the package control script if necessary
4.
Changing MAX_CONFIGURED_PACKAGES As of Serviceguard A.11.18, you can change MAX_CONFIGURED_PACKAGES while the cluster is running. The default for MAX_CONFIGURED_PACKAGES is the maximum number allowed in the cluster. You can use Serviceguard Manager to change MAX_CONFIGURED_PACKAGES, or Serviceguard commands as shown below. Use the cmgetconf command to obtain a current copy of the cluster's existing configuration, for example: cmgetconf -C clconfig.conf Edit the clconfig.
1. Create a subdirectory for each package you are configuring in the $SGCONF directory: mkdir $SGCONF/pkg1 You can use any directory names you like. (See “Understanding the Location of Serviceguard Files” (page 121) for the name of Serviceguard directories on your version of Linux.) 2. Generate a package configuration file for each package, for example: cmmakepkg -p $SGCONF/pkg1/pkg1.conf You can use any file name you like for the configuration file. 3.
• RUN_SCRIPT and HALT_SCRIPT. Specify the pathname of the package control script (described in the next section). No default is provided. Permissions on the file and directory should be set to rwxr-xr-x or r-xr-xr-x (755 or 555). (Script timeouts): Enter the run_script_timeout (page 160) and halt_script_timeout (page 161). SCRIPT_LOG_FILE. (optional). Specify the full pathname of the file where the RUN_SCRIPT and HALT_SCRIPT will log messages.
You can use a single script for both run and halt operations, or, if you wish, you can create separate scripts. Use cmmakepkg to create the control script, then edit the control script. Use the following procedure to create the template for the sample failover package pkg1. First, generate a control script template, for example: cmmakepkg -s $SGCONF/pkg1/pkg1.sh Next, customize the script; see “Customizing the Package Control Script ”.
An example of this portion of the script follows, showing the date and echo commands logging starts and halts of the package to a file.

# START OF CUSTOMER DEFINED FUNCTIONS
# This function is a place holder for customer defined functions.
# You should define all actions you want to happen here, before the service is
# started. You can create as many functions as you need.

function customer_defined_run_cmds
{
# ADD customer defined run commands.
If you are using the command line, use the following command to verify the content of the package configuration you have created: cmcheckconf -v -P $SGCONF/pkg1/pkg1.conf Errors are displayed on the standard output. If necessary, edit the file to correct any errors, then run the command again until it completes without errors. The following items are checked (whether you use Serviceguard Manager or cmcheckconf command): • Package name is valid, and at least one NODE_NAME entry is included.
NOTE: You must use cmcheckconf and cmapplyconf again any time you make changes to the cluster and package configuration files. Configuring Cross-Subnet Failover To configure a legacy package to fail over across subnets (see “Cross-Subnet Configurations” (page 23)), you need to do some additional configuration. Suppose that you want to configure a package, pkg1, so that it can fail over among all the nodes in a cluster comprising NodeA, NodeB, NodeC, and NodeD. NodeA and NodeB use subnet 15.244.65.
SUBNET[0] = 15.244.65.0
IP[1] = 15.244.65.83
SUBNET[1] = 15.244.65.0

Control-script entries for nodeC and nodeD:
IP[0] = 15.244.56.100
SUBNET[0] = 15.244.56.0
IP[1] = 15.244.56.101
SUBNET[1] = 15.244.56.0

Reconfiguring a Package
You reconfigure a package in much the same way as you originally configured it; for modular packages, see Chapter 6: “Configuring Packages and Their Services ” (page 153); for older packages, see “Configuring a Legacy Package” (page 210).
See “Allowable Package States During Reconfiguration” to determine whether this step is needed.
2. If it is not already available, you can obtain a copy of the package's configuration file by using the cmgetconf command, specifying the package name.
cmgetconf -p pkg1 pkg1.conf
3. Edit the package configuration file.
IMPORTANT: Restrictions on package names, dependency names, and service names have become more stringent as of A.11.18.
To create the package, follow the steps in the chapter Chapter 6: “Configuring Packages and Their Services ” (page 153). Then use a command such as the following to verify the configuration of the newly created pkg1 on a running cluster: cmcheckconf -P $SGCONF/pkg1/pkg1conf.conf Use a command such as the following to distribute the new package configuration to all nodes in the cluster: cmapplyconf -P $SGCONF/pkg1/pkg1conf.
CAUTION: Be extremely cautious about changing a package's configuration while the package is running.
If you reconfigure a package online (by executing cmapplyconf on a package while the package itself is running) it is possible that the package will fail, even if the cmapplyconf succeeds, validating the changes with no errors. For example, if a file system is added to the package while the package is running, cmapplyconf does various checks to verify that the file system and its mount point exist.
Table 12 Types of Changes to Packages (continued)

Change to the Package                        Required Package State
Add or delete a service: legacy package      Package must not be running.
Change service_restart: modular package      Package can be running. Serviceguard will not allow the change if the new value is less than the current restart count. (You can use cmmodpkg -R to reset the restart count if you need to.)
Change SERVICE_RESTART:                      Package must not be running.
Table 12 Types of Changes to Packages (continued)

Change to the Package                 Required Package State
                                      ...with the new options; the CAUTION under “Remove a file system: modular package” applies in this case as well. If only fs_umount_opt is being changed, the file system will not be unmounted; the new option will take effect when the package is halted or the file system is unmounted for some other reason.
Add a file system: modular package    Package can be running.
Table 12 Types of Changes to Packages (continued)

Change to the Package                         Required Package State
Remove a generic resource                     Package can be running.
Change the generic_resource_evaluation_type   Package can be running if the status of the generic resource is 'up'. Not allowed if changing the generic_resource_evaluation_type causes the package to fail.

For information on online changes to generic resources, see “Online Reconfiguration of Generic Resources” (page 100).
Single-Node Operation In a multi-node cluster, you could have a situation in which all but one node has failed, or you have shut down all but one node, leaving your cluster in single-node operation. This remaining node will probably have applications running on it. As long as the Serviceguard daemon cmcld is active, other nodes can rejoin the cluster.
8 Troubleshooting Your Cluster
This chapter describes how to verify cluster operation, how to review cluster status, how to add and replace hardware, and how to solve some typical cluster problems.
cmviewcl -v -p <package_name>
2. Set the status of the generic resource to DOWN using the following command:
cmsetresource -r <resource_name> -s down
3. To view the package status, enter
cmviewcl -v
The package should be running on the specified adoptive node.
4. Move the package back to the primary node (see “Moving a Failover Package ” (page 197)).
on all configured HA devices. The presence of errors relating to a device will show the need for maintenance.

Replacing Disks
The procedure for replacing a faulty disk mechanism depends on the type of disk configuration you are using. Refer to your Smart Array documentation for issues related to your Smart Array.

Replacing a Faulty Mechanism in a Disk Array
You can replace a failed disk mechanism by simply removing it from the array and replacing it with a new mechanism of the same type.
pr_cleanup lun -v -k <key> [-f <filename> | <list of DSFs>]
• lun, if used, specifies that a LUN, rather than a volume group, is to be operated on.
• -v, if used, specifies verbose output detailing the actions the script performs and their status.
• -k <key>, if used, specifies the key to be used in the clear operation.
• -f <filename>, if used, specifies that the names of the DSFs to be operated on are listed in the file specified by <filename>.
HP recommends that you update the new MAC address in the cluster binary configuration file by re-applying the cluster configuration. Use the following steps for online reconfiguration: 1. Use the cmgetconf command to obtain a fresh ASCII configuration file, as follows: cmgetconf config.conf 2. Use the cmapplyconf command to apply the configuration and copy the new binary file to all cluster nodes: cmapplyconf -C config.
Request for lock /sg/<cluster_name> succeeded. New lock owners: N1, N2
7. To check that the quorum server has been correctly configured and to verify the connectivity of a node to the quorum server, you can execute the following command from your cluster nodes as follows:
cmquerycl -q <quorum_server_host> -n <node1> -n <node2> ...
The command will output an error message if the specified nodes cannot communicate with the quorum server.
Interrupt:9 Base address:0xda00 eth1:2 Link encap:Ethernet HWaddr 00:50:DA:64:8A:7C inet addr:192.168.1.201 Bcast:192.168.1.255 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 Interrupt:9 Base address:0xda00 lo Link encap:Local Loopback inet addr:127.0.0.1 Bcast:192.168.1.255 Mask:255.255.255.
Reviewing Configuration Files
Review the following ASCII configuration files:
• Cluster configuration file.
• Package configuration files.
Ensure that the files are complete and correct according to your configuration planning worksheets.

Reviewing the Package Control Script
For legacy packages, ensure that the package control script is found on all nodes where the package can run and that the file is identical on all nodes. Ensure that the script is executable on all nodes.
Solving Problems
Problems with Serviceguard may be of several types. The following is a list of common categories of problem:
• Serviceguard Command Hangs
• Cluster Re-formations
• System Administration Errors
• Package Control Script Hangs
• Package Movement Errors
• Node and Network Failures
• Quorum Server Messages

Name Resolution Problems
Many Serviceguard commands, including cmviewcl, depend on name resolution services to look up the addresses of cluster nodes.
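One common safeguard is to list every cluster node in /etc/hosts on each node, so that Serviceguard commands can resolve node names even when DNS is unreachable; a sketch (the addresses are placeholders):
15.13.168.91   ftsys9.mydomain    ftsys9
15.13.168.92   ftsys10.mydomain   ftsys10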
you should solve the networking or load problem if you can. Failing that, you can increase the value of MEMBER_TIMEOUT, as described in the next section.

Cluster Re-formations Caused by MEMBER_TIMEOUT Being Set too Low
If you have set the MEMBER_TIMEOUT parameter too low, the cluster daemon, cmcld, will write warnings to syslog that indicate the problem. There are three in particular that you should watch for:
1. Warning: cmcld was unable to run for the last <n> seconds.
You can use the following commands to check the status of your disks: • df - to see if your package’s volume group is mounted. • vgdisplay -v - to see if all volumes are present. • strings /etc/lvmconf/*.conf - to ensure that the configuration is correct. • fdisk -v /dev/sdx - to display information about a disk.
vgchange -a n <volume_group>
4. Finally, re-enable the package for switching.
cmmodpkg -e <package_name>
If after cleaning up the node on which the timeout occurred it is desirable to have that node as an alternate for running the package, remember to re-enable the package to run on the node:
cmmodpkg -e -n <node_name> <package_name>
The default Serviceguard control scripts are designed to take the straightforward steps needed to get an application running or stopped.
Node and Network Failures
These failures cause Serviceguard to transfer control of a package to another node. This is the normal action of Serviceguard, but you have to be able to recognize when a transfer has taken place and decide to leave the cluster in its current condition or to restore it to its original condition.
Messages
The coordinator node in Serviceguard sometimes sends a request to the quorum server to set the lock state. (This is different from a request to obtain the lock in tie-breaking.
A Designing Highly Available Cluster Applications
This appendix describes how to create or port applications for high availability, with emphasis on the following topics:
• Automating Application Operation
• Controlling the Speed of Application Failover (page 240)
• Designing Applications to Run on Multiple Systems (page 243)
• Restoring Client Connections (page 246)
• Handling Application Failures (page 247)
• Minimizing Planned Downtime (page 248)
Designing for high availability means reducing
• Minimize the reentry of data.
• Engineer the system for reserve capacity to minimize the performance degradation experienced by users.

Define Application Startup and Shutdown
Applications must be restartable without manual intervention. If the application requires a switch to be flipped on a piece of hardware, then automated restart is impossible. Procedures for application startup, shutdown and monitoring must be created so that the HA software can perform these functions automatically.
filesystems recovery (fsck) before the data can be accessed. To help reduce this recovery time, keep these file systems small: the smaller they are, the faster the recovery will be. Therefore, it is best to keep anything that can be replicated off the data file system. For example, there should be a copy of the application executables on each system rather than having one copy of the executables on a shared file system.
A common example is a print job. Printer applications typically schedule jobs. When that job completes, the scheduler goes on to the next job.
Design for Replicated Data Sites
Replicated data sites are a benefit for both fast failover and disaster recovery. With replicated data, data disks are not shared between systems. There is no data recovery that has to take place. This makes the recovery time faster. However, there may be performance trade-offs associated with replicating data. There are a number of ways to perform data replication, which should be fully investigated by the application designer.
Obtain Enough IP Addresses
Each application receives a relocatable IP address that is separate from the stationary IP address assigned to the system itself. Therefore, a single system might have many IP addresses: one for itself and one for each of the applications that it normally runs. As a result, IP addresses in a given subnet range will be consumed faster than without high availability. It might be necessary to acquire additional IP addresses.
name for a call to gethostbyname(3) should also be avoided for the same reason. Also, the gethostbyaddr() call may return different answers over time if called with a stationary IP address. Instead, the application should always refer to the application name and relocatable IP address rather than the hostname and stationary IP address. It is appropriate for the application to call gethostbyname(3), specifying the application name rather than the hostname.
the stationary IP address rather than the relocatable application IP address. Therefore, when creating a UDP socket for listening, the application must always call bind(2) with the appropriate relocatable application IP address rather than INADDR_ANY.

Call bind() before connect()
When an application initiates its own connection, it should first call bind(2), specifying the application IP address before calling connect(2).
There are a number of strategies to use for client reconnection: • Design clients which continue to try to reconnect to their failed server. Put the work into the client application rather than relying on the user to reconnect. If the server is back up and running in 5 minutes, and the client is continually retrying, then after 5 minutes, the client application will reestablish the link with the server and either restart or continue the transaction. No intervention from the user is required.
Another alternative is for the failure of one component to still allow bringing down the other components cleanly. If a database SQL server fails, the database should still be able to be brought down cleanly so that no database recovery is necessary. The worst case is for a failure of one component to cause the entire system to fail. If one component fails and all other components need to be restarted, the downtime will be high.
Do Not Change the Data Layout Between Releases
Migration of the data to a new format can be very time intensive. It also almost guarantees that rolling upgrade will not be possible. For example, if a database is running on the first node, ideally, the second node could be upgraded to the new revision of the database. When that upgrade is completed, a brief downtime could be scheduled to move the database server from the first node to the newly upgraded second node.
B Integrating HA Applications with Serviceguard
The following is a summary of the steps you should follow to integrate an application into the Serviceguard environment:
1. Read the rest of this book, including the chapters on cluster and package configuration, and the appendix “Designing Highly Available Cluster Applications.”
2.
Defining Baseline Application Behavior on a Single System
1. Define a baseline behavior for the application on a standalone system:
• Install the application, database, and other required resources on one of the systems. Be sure to follow Serviceguard rules in doing this:
◦ Install all shared data on separate external volume groups.
◦ Use a Journaled filesystem (JFS) as appropriate.
• Perform some sort of standard test to ensure the application is running correctly.
# cmhaltpkg pkg1
# cmrunpkg -n node1 pkg1
# cmmodpkg -e pkg1
• Fail one of the systems. For example, turn off the power on node 1. Make sure the package starts up on node 2.
• Repeat failover from node 2 back to node 1.
2. Be sure to test all combinations of application load during the testing. Repeat the failover processes under different application states such as heavy user load versus no user load, batch jobs versus online transactions, etc.
3.
C Blank Planning Worksheets
This appendix reprints blank versions of the planning worksheets described in the “Planning” chapter. You can duplicate any of these worksheets that you find useful and fill them in as a part of the planning process.
Disk Unit __________________________ Power Supply _______________________ Disk Unit __________________________ Power Supply _______________________ Disk Unit __________________________ Power Supply _______________________ Disk Unit __________________________ Power Supply _______________________ Disk Unit __________________________ Power Supply _______________________ ============================================================================ Tape Backup Power: Tape Unit __________________________
Physical Volume Name: _________________ Physical Volume Name: _________________ ============================================================================= Volume Group Name: ___________________________________ Physical Volume Name: _________________ Physical Volume Name: _________________ Physical Volume Name: _________________ Cluster Configuration Worksheet =============================================================================== Name and Nodes: ==================================================
Failover Policy:_____________ Failback_policy:___________________________________ Access Policies: User:_________________ From node:_______ Role:_____________________________ User:_________________ From node:_______ Role:______________________________________________ Log level____ Log file:_______________________________________________________________________________________ Priority_____________ Successor_halt_timeout____________ dependency_name _____ dependency_condition _____ dependency_location _______
D IPv6 Network Support
This appendix describes some of the characteristics of IPv6 network addresses, specifically:
• IPv6 Address Types
• Network Configuration Restrictions (page 260)
• Configuring IPv6 on Linux (page 260)

IPv6 Address Types
Several IPv6 types of addressing schemes are specified in the RFC 2373 (IPv6 Addressing Architecture). IPv6 addresses are 128-bit identifiers for interfaces and sets of interfaces. There are various address formats for IPv6 defined by the RFC 2373.
IPv6 Address Prefix
IPv6 Address Prefix is similar to CIDR in IPv4 and is written in CIDR notation. An IPv6 address prefix is represented by the notation:
IPv6-address/prefix-length
where ipv6-address is an IPv6 address in any notation listed above and prefix-length is a decimal value representing how many of the leftmost contiguous bits of the address comprise the prefix.
Example: fec0:0:0:1::1234/64
The first 64 bits of the address, fec0:0:0:1, form the address prefix.
Table 16

80 bits   16 bits   32 bits
zeros     FFFF      IPv4 address

Example: ::ffff:192.168.0.1

Aggregatable Global Unicast Addresses
The global unicast addresses are globally unique IPv6 addresses. This address format is very well defined in the RFC 2374 (An IPv6 Aggregatable Global Unicast Address Format). The format is:

Table 17

3 bits   13 bits   8 bits   24 bits   16 bits   64 bits
FP       TLA ID    RES      NLA ID    SLA ID    Interface ID

where FP = Format prefix. Value of this is “001” for Aggregatable Global unicast addresses.
“FF” at the beginning of the address identifies the address as a multicast address. The “flags” field is a set of 4 flags “000T”. The higher order 3 bits are reserved and must be zero. The last bit ‘T’ indicates whether it is permanently assigned or not. A value of zero indicates that it is permanently assigned otherwise it is a temporary assignment. The “scop” field is a 4-bit field which is used to limit the scope of the multicast group.
Enabling IPv6 on Red Hat Linux
Add the following lines to /etc/sysconfig/network:
NETWORKING_IPV6=yes   # Enable global IPv6 initialization
IPV6FORWARDING=no     # Disable global IPv6 forwarding
IPV6_AUTOCONF=no      # Disable global IPv6 autoconfiguration
IPV6_AUTOTUNNEL=no    # Disable automatic IPv6 tunneling

Adding persistent IPv6 Addresses on Red Hat Linux
This can be done by modifying the system configuration script, for example, /etc/sysconfig/network-scripts/ifcfg-eth1:
DEVICE=eth1
BOOTPROTO=static
BROADCAST=19
BOOTPROTO=static
BROADCAST=10.0.2.255
IPADDR=10.0.2.10
NETMASK=255.255.0.0
NETWORK=0.0.2.0
REMOTE_IPADDR=""
STARTMODE=onboot
IPADDR1=3ffe::f101:10/64
IPADDR2=fec0:0:0:1::10/64
BONDING_MASTER=yes
BONDING_MODULE_OPTS="mode=active-backup miimon=100"
BONDING_SLAVE0=eth1
BONDING_SLAVE1=eth2
For each additional IPv6 address, specify an additional parameter with IPADDR in the configuration file.
E Using Serviceguard Manager
HP Serviceguard Manager is a web-based, HP System Management Homepage (HP SMH) tool that replaces the functionality of the earlier Serviceguard management tools. Serviceguard Manager allows you to monitor, administer and configure a Serviceguard cluster from any system with a supported web browser. The Serviceguard Manager Main Page provides you with a summary of the health of the cluster including the status of each node and its packages.
1. Enter the standard URL http://<hostname>:2301/. For example, http://clusternode1.cup.hp.com:2301/
2. When the System Management Homepage login screen appears, enter your login credentials and click Sign In. The System Management Homepage for the selected server appears.
3. From the Serviceguard Cluster box, click the name of the cluster.
NOTE: If a cluster is not yet configured, you will not see the Serviceguard Cluster section on this screen.
NOTE: Serviceguard Manager can be launched by HP Systems Insight Manager version 5.10 or later if Serviceguard Manager is installed on an HP Systems Insight Manager Central Management Server. For a Serviceguard A.11.19 cluster, Systems Insight Manager will attempt to launch Serviceguard Manager B.02.00 from one of the nodes in the cluster; for a Serviceguard A.11.18 cluster, Systems Insight Manager will attempt to launch Serviceguard Manager B.01.01 from one of the nodes in the cluster.
F Maximum and Minimum Values for Parameters
Table 21 shows the range of possible values for cluster configuration parameters.

Table 21 Minimum and Maximum Values of Cluster Configuration Parameters

Cluster Parameter   Minimum Value                        Maximum Value                        Default Value
Member Timeout      See MEMBER_TIMEOUT under             See MEMBER_TIMEOUT under
                    “Cluster Configuration Parameters”   “Cluster Configuration Parameters”
                    in Chapter 4.                        in Chapter 4.
G Monitoring Script for Generic Resources
Monitoring scripts are the scripts written by an end-user and must contain the core logic to monitor a resource and set the status of a generic resource. These scripts are started as a part of the package start.
• You can set the status/value of a simple/extended resource respectively using the cmsetresource(1m) command.
• You can define the monitoring interval in the script.
For resources of evaluation_type: before_package_start • Monitoring scripts can also be launched outside of the Serviceguard environment, init, rc scripts, etc. (Serviceguard does not monitor them). • The monitoring scripts for all the resources in a cluster of type before_package_start can be configured in a single multi-node package by using the services functionality and any packages that require the resources can mention the generic resource name in their package configuration file.
For example, in the multi-node package that runs the monitoring scripts (here named generic_resource_monitors):

generic_resource_name               lan1
generic_resource_evaluation_type    before_package_start

In each package that requires the resource, configure a dependency on that multi-node package:

dependency_name          generic_resource_monitors
dependency_condition     generic_resource_monitors = up
dependency_location      same_node

Thus, the monitoring scripts for all the generic resources of type before_package_start are configured in a single multi-node package, and any package that requires one of these generic resources simply configures the generic resource name.
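A hedged sketch of generating and applying such a multi-node package follows; the module names are assumptions, so consult the cmmakepkg(1m) manpage for the exact list available on your system:

# Generate a package configuration file from the modular templates;
# the module names sg/multi_node, sg/service, and sg/generic_resource
# are assumptions -- verify them on your system.
cmmakepkg -m sg/multi_node -m sg/service -m sg/generic_resource grm.conf

# Edit grm.conf: add a service_name/service_cmd pair for each monitoring
# script and a generic_resource_name/generic_resource_evaluation_type
# pair for each resource, then verify and apply the configuration.
cmcheckconf -P grm.conf
cmapplyconf -P grm.conf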
# * --------------------------------------------------------------- *
# * The following utility functions are sourced in from $SG_UTILS    *
# * ($SGCONF/scripts/mscripts/utils.sh) and available for use:       *
# *                                                                  *
# *     sg_log <log level> <log msg>                                 *
# *                                                                  *
# * By default, only log messages with a log level of 0 will         *
# * be output to the log file.                                       *
# * --------------------------------------------------------------- *

#########################################################################
#
# start_command
#
# This function should define actions to take when the package starts
#
#########################################################################

function start_command
{
    sg_log 5 "start_command"

    # ADD your service start steps here

    return 0
}

#########################################################################
#
# stop_command
#
# This function should define actions to take when the package halts
#
#########################################################################

function stop_command
{
    sg_log 5 "stop_command"

    # ADD your halt steps here

    exit 1
}

################
# main routine
################

sg_log 5 "customer defined monitor script"
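A monitoring script like the template above is typically started as a package service. A hedged sketch of the corresponding entries in the package configuration file follows; the script path and service name are assumptions:

# Service entries in the package configuration file (sketch).
# The script path and service name below are assumptions.
service_name                 generic_resource_monitor
service_cmd                  "$SGCONF/scripts/mscripts/monitor.sh"
service_restart              none
service_fail_fast_enabled    no
service_halt_timeout         300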
H HP Serviceguard Contributed Toolkit

The HP Serviceguard Contributed Toolkit Suite (Contrib toolkit) is a collection of toolkits that integrate popular applications with the Serviceguard for Linux environment. The toolkit suite includes a user guide that explains how to customize each package for your needs. For more information, see the HP Serviceguard Contributed Toolkit Suite on Linux Release Notes Version A.04.02.01 at http://www.hp.com/go/linux-serviceguard-docs.
Index A Access Control Policies, 143 active node, 18 adding a package to a running cluster, 218 adding cluster nodes advance planning, 119 adding nodes to a running cluster, 190 adding packages on a running cluster, 179 administration adding nodes to a running cluster, 190 halting a package, 196 halting the entire cluster, 191 moving a package, 197 of packages and services, 195 of the cluster, 189 reconfiguring a package while the cluster is running, 217 reconfiguring a package with the cluster offline, 218
heartbeat subnet parameter, 83 initial configuration of the cluster, 30 main functions, 30 maximum configured packages parameter, 92 member timeout parameter, 87 monitored non-heartbeat subnet, 85 network polling interval parameter, 88, 92 planning the configuration, 80 quorum server parameter, 81 testing, 226 cluster node parameter in cluster manager configuration, 80, 81, 82 cluster parameters initial configuration, 30 cluster re-formation scenario, 67 cluster startup manual, 31 cmapplyconf, 205, 215 cmap
F failback policy used by package manager, 42 FAILBACK_POLICY parameter used by package manager, 42 failover controlling the speed in applications, 240 defined, 18 failover behavior in packages, 95 failover package, 35, 154 failover policy used by package manager, 39 FAILOVER_POLICY parameter used by package manager, 39 failure kinds of responses, 66 network communication, 69 response to hardware failures, 68 responses to package and service failures, 68 restarting a service after failure, 69 failures of ap
understanding Serviceguard software, 27 IP in sample package control script, 213 IP address adding and deleting in packages, 55 for nodes and packages, 54 hardware planning, 72, 75, 76 portable, 54 reviewing for packages, 230 switching, 37, 38, 62 IP_MONITOR defined, 90 J JFS, 241 K kernel hang, and TOC, 67 safety timer, 28 kernel consistency in cluster configuration, 126 kernel interrupts and possible TOC, 87 L LAN heartbeat, 31 interface name, 72, 75 LAN failure Serviceguard behavior, 21 LAN interfaces
binding to port addresses, 245 IP addresses and naming, 243 node and package IP addresses, 54 packages using IP addresses, 244 supported types in Serviceguard, 21 writing network applications as HA services, 240 no cluster lock choosing, 34 node basic concepts, 21 halt (TOC), 67 in Serviceguard cluster, 17 IP addresses, 54 timeout and TOC example, 67 node types active, 18 primary, 18 NODE_FAIL_FAST_ENABLED effect of setting, 68 NODE_NAME parameter in cluster configuration, 82 parameter in cluster manager co
power, 74 quorum server, 75 SPU information, 72 volume groups and physical volumes, 76 worksheets, 74 planning and documenting an HA cluster, 70 planning for cluster expansion, 70 planning worksheets blanks, 253 point of failure in networking, 22 POLLING_TARGET defined, 91 ports dual and single aggregated, 57 power planning power sources, 74 worksheet, 75, 254 power supplies blank planning worksheet, 253 power supply and cluster lock, 26 UPS, 26 primary LAN interfaces defined, 21 primary node, 18 Q QS_ADDR
in monitored resource failure, 21 in software failure, 21 Serviceguard commands to configure a package, 210 Serviceguard Manager, 19 overview, 19 Serviceguard software components figure, 27 shared disks planning, 73 shutdown and startup defined for applications, 240 single point of failure avoiding, 17 single-node operation, 151, 224 size of cluster preparing for changes, 119 SMN package, 35 SNA applications, 246 software failure Serviceguard behavior, 21 software planning LVM, 76 solving problems, 233 SPU
W WEIGHT_DEFAULT defined, 91 WEIGHT_NAME defined, 91 What is Serviceguard?, 17 worksheet blanks, 253 cluster configuration, 92, 255 hardware configuration, 74, 253 package configuration, 255, 256 power supply configuration, 75, 253, 254 use in planning, 70