Configuring and Managing a Red Hat Cluster: Red Hat Cluster for Red Hat Enterprise Linux 5
Configuring and Managing a Red Hat Cluster describes the configuration and management of Red Hat cluster systems for Red Hat Enterprise Linux 5.2. It does not include information about Linux Virtual Server (LVS). Information about installing and configuring LVS is in a separate document.
Configuring and Managing a Red Hat Cluster: Red Hat Cluster for Red Hat Enterprise Linux. Copyright © 2008 Red Hat, Inc. This material may only be distributed subject to the terms and conditions set forth in the Open Publication License, V1.0 or later, with the restrictions noted below (the latest version of the OPL is presently available at http://www.opencontent.org/openpub/).
Introduction This document provides information about installing, configuring and managing Red Hat Cluster components. Red Hat Cluster components are part of Red Hat Cluster Suite and allow you to connect a group of computers (called nodes or members) to work together as a cluster. This document does not include information about installing, configuring, and managing Linux Virtual Server (LVS) software. Information about that is in a separate document.
• Global File System: Configuration and Administration — Provides information about installing, configuring, and maintaining Red Hat GFS (Red Hat Global File System).
• Using Device-Mapper Multipath — Provides information about using the Device-Mapper Multipath feature of Red Hat Enterprise Linux 5.
• Using GNBD with Global File System — Provides an overview on using Global Network Block Device (GNBD) with Red Hat GFS.
Italic Courier font represents a variable, such as an installation directory: install_dir/bin/
bold font — Bold font represents application programs and text found on a graphical interface. When shown like this: OK, it indicates a button on a graphical application interface.
Additionally, the manual uses different strategies to draw your attention to pieces of information.
If you spot a typo, or if you have thought of a way to make this manual better, we would love to hear from you. Please submit a report in Bugzilla (http://bugzilla.redhat.com/bugzilla/) against the component Documentation-cluster. Be sure to mention the manual's identifier: Cluster_Administration(EN)-5.2 (2008-06-01T17:11). By mentioning this manual's identifier, we know exactly which version of the guide you have.
Chapter 1. Red Hat Cluster Configuration and Management Overview Red Hat Cluster allows you to connect a group of computers (called nodes or members) to work together as a cluster. You can use Red Hat Cluster to suit your clustering needs (for example, setting up a cluster for sharing files on a GFS file system or setting up service failover). 1. Configuration Basics To set up a cluster, you must connect the nodes to certain cluster hardware and configure the nodes into the cluster environment.
Other options are available for storage according to the type of storage interface; for example, iSCSI or GNBD. A Fibre Channel switch can be configured to perform fencing.
• Storage — Some type of storage is required for a cluster. The type required depends on the purpose of the cluster.
Figure 1.1. Red Hat Cluster Hardware Overview
1.2. Configuring Red Hat Cluster Software
Configuring Red Hat Cluster software consists of using configuration tools to specify the relationship among the cluster components. Figure 1.2, “Cluster Configuration Structure” shows an example of the hierarchical relationship among cluster nodes, high-availability services, and resources. The cluster nodes are connected to one or more fencing devices. Nodes can be grouped into a failover domain for a cluster service. The services comprise resources such as NFS exports, IP addresses, and shared GFS partitions.
Figure 1.2. Cluster Configuration Structure
A brief overview of each configuration tool is provided in the following sections:
• Section 2, “Conga”
• Section 3, “system-config-cluster Cluster Administration GUI”
• Section 4, “Command Line Administration Tools”
In addition, information about using Conga and system-config-cluster is provided in subsequent chapters of this document. Information about the command line tools is available in the man pages for the tools.
2. Conga
To administer a cluster or storage, an administrator adds (or registers) a cluster or a computer to a luci server. When a cluster or a computer is registered with luci, the FQDN hostname or IP address of each computer is stored in a luci database. You can populate the database of one luci instance from another luci instance. That capability provides a means of replicating a luci server instance and provides an efficient upgrade and testing path.
Figure 1.5. luci storage Tab
3. system-config-cluster Cluster Administration GUI
This section provides an overview of the cluster administration graphical user interface (GUI) available with Red Hat Cluster Suite — system-config-cluster. It is for use with the cluster infrastructure and the high-availability service management components. system-config-cluster consists of two major functions: the Cluster Configuration Tool and the Cluster Status Tool.
While system-config-cluster provides several convenient tools for configuring and managing a Red Hat Cluster, the newer, more comprehensive tool, Conga, provides more convenience and flexibility than system-config-cluster.
3.1. Cluster Configuration Tool
You can access the Cluster Configuration Tool (Figure 1.6, “Cluster Configuration Tool”) through the Cluster Configuration tab in the Cluster Administration GUI.
Figure 1.6. Cluster Configuration Tool
The Cluster Configuration Tool represents cluster configuration components in the configuration file (/etc/cluster/cluster.conf) with a hierarchical graphical display in the left panel. A triangle icon to the left of a component name indicates that the component has one or more subordinate components assigned to it. Clicking the triangle icon expands and collapses the portion of the tree below a component.
Services. Using configuration buttons at the bottom of the right frame (below Properties), you can create services (when Services is selected) or edit service properties (when a service is selected).
3.2. Cluster Status Tool
You can access the Cluster Status Tool (Figure 1.7, “Cluster Status Tool”) through the Cluster Management tab in the Cluster Administration GUI.
Figure 1.7. Cluster Status Tool
The nodes and services displayed in the Cluster Status Tool are determined by the cluster configuration file (/etc/cluster/cluster.conf). You can use the Cluster Status Tool to enable, disable, restart, or relocate a high-availability service.
4. Command Line Administration Tools
Chapter 2. Before Configuring a Red Hat Cluster
Table 2.1, “Enabled IP Ports on Red Hat Cluster Nodes” lists the IP port numbers, their respective protocols, the components to which the port numbers are assigned, and references to iptables rule examples. At each cluster node, enable IP ports according to Table 2.1, “Enabled IP Ports on Red Hat Cluster Nodes”. (All examples are in Section 2.3, “Examples of iptables Rules”.)
If a cluster node is running luci, port 11111 should already have been enabled.

IP Port Number — Protocol — Component — Reference to Example of iptables Rules
8084 — TCP — luci (Conga user interface server) — Example 2.2, “Port 8084: luci (Cluster Node or Computer Running luci)”
11111 — TCP — ricci (Conga remote agent) — Example 2.3, “Port 11111: ricci (Cluster Node and Computer Running luci)”
Table 2.2. Enabled IP Ports on a Computer That Runs luci

2.3. Examples of iptables Rules
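Example 2.2 (the rule for port 8084, used by luci) is not reproduced on this page. A sketch that follows the same pattern as the ricci and gnbd rules shown below, assuming the same example 10.10.10.0/24 subnet and 10.10.10.200 interface value used in those rules, would be:

-A INPUT -i 10.10.10.200 -m state --state NEW -m multiport -p tcp -s 10.10.10.0/24 -d 10.10.10.0/24 --dports 8084 -j ACCEPT

Adjust the subnet and interface values to match your own network before adding such a rule.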
-A INPUT -i 10.10.10.200 -m state --state NEW -m multiport -p tcp -s 10.10.10.0/24 -d 10.10.10.0/24 --dports 11111 -j ACCEPT
Example 2.3. Port 11111: ricci (Cluster Node and Computer Running luci)

-A INPUT -i 10.10.10.200 -m state --state NEW -m multiport -p tcp -s 10.10.10.0/24 -d 10.10.10.0/24 --dports 14567 -j ACCEPT
Example 2.4. Port 14567: gnbd

-A INPUT -i 10.10.10.200 -m state --state NEW -m multiport -p tcp -s 10.10.10.0/24 -d 10.10.10.0/24 --dports 16851 -j ACCEPT
Example 2.5. Port 16851
-A INPUT -i 10.10.10.200 -m state --state NEW -m multiport -p udp -s 10.10.10.0/24 -d 10.10.10.0/24 --dports 50007 -j ACCEPT
Example 2.9. Port 50007: ccsd (UDP)

3. Configuring ACPI For Use with Integrated Fence Devices
If your cluster uses integrated fence devices, you must configure ACPI (Advanced Configuration and Power Interface) to ensure immediate and complete fencing.
Note
For the most current information about integrated fence devices supported by Red Hat Cluster Suite, refer to http://www.redhat.
• Changing the BIOS setting to "instant-off" or an equivalent setting that turns off the node without delay
Note
Disabling ACPI Soft-Off with the BIOS may not be possible with some computers.
• Appending acpi=off to the kernel boot command line of the /boot/grub/grub.conf file
Important
This method completely disables ACPI; some computers do not boot correctly if ACPI is completely disabled.
• chkconfig --del acpid — This command removes acpid from chkconfig management.
— OR —
• chkconfig --level 2345 acpid off — This command turns off acpid.
2. Reboot the node.
3. When the cluster is configured and running, verify that the node turns off immediately when fenced.
Tip
You can fence the node with the fence_node command or Conga.
3.2. Disabling ACPI Soft-Off with the BIOS
The preferred method of disabling ACPI Soft-Off is with chkconfig management (Section 3.1, “Disabling ACPI Soft-Off with chkconfig Management”).
may vary among computers. However, the objective of this procedure is to configure the BIOS so that the computer is turned off via the power button without delay.
4. Exit the BIOS CMOS Setup Utility program, saving the BIOS configuration.
5. When the cluster is configured and running, verify that the node turns off immediately when fenced.
Tip
You can fence the node with the fence_node command or Conga.
3.3. Disabling ACPI Completely in the grub.conf File
The preferred method of disabling ACPI Soft-Off is with chkconfig management (Section 3.1, “Disabling ACPI Soft-Off with chkconfig Management”). If the preferred method is not effective for your cluster, you can disable ACPI Soft-Off with the BIOS power management (Section 3.2, “Disabling ACPI Soft-Off with the BIOS”).
title Red Hat Enterprise Linux Server (2.6.18-36.el5)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-36.el5 ro root=/dev/VolGroup00/LogVol00 console=ttyS0,115200n8 acpi=off
        initrd /initrd-2.6.18-36.el5.img

In this example, acpi=off has been appended to the kernel boot command line — the line starting with "kernel /vmlinuz-2.6.18-36.el5".
Example 2.11. Kernel Boot Command Line with acpi=off Appended to It
4. Considerations for Configuring HA Services
For more information, refer to Appendix D, HA Resource Behavior.
5. Configuring max_luns
It is not necessary to configure max_luns in Red Hat Enterprise Linux 5. In Red Hat Enterprise Linux releases prior to Red Hat Enterprise Linux 5, if RAID storage in a cluster presents multiple LUNs, it is necessary to enable access to those LUNs by configuring max_luns (or max_scsi_luns for 2.4 kernels) in the /etc/modprobe.conf file of each node.
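For reference only (this applies to releases before Red Hat Enterprise Linux 5), such an entry in /etc/modprobe.conf typically looks like the following sketch; the value 255 is an illustrative assumption, not a recommendation:

options scsi_mod max_luns=255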
CMAN membership timeout value
The CMAN membership timeout value (the time a node needs to be unresponsive before CMAN considers that node to be dead, and not a member) should be at least two times that of the qdiskd membership timeout value. The reason is that the quorum daemon must detect failed nodes on its own, and can take much longer to do so than CMAN. The default value for CMAN membership timeout is 10 seconds.
Make sure that in your network infrastructure multicast addressing and IGMP are enabled. Without multicast and IGMP, not all nodes can participate in a cluster, causing the cluster to fail.
Note
Procedures for configuring network switches and associated networking equipment vary according to each product. Refer to the appropriate vendor documentation or other information about configuring network switches and associated networking equipment to enable multicast addresses and IGMP.
8. Considerations for Using Conga
and corrupting it. It is strongly recommended that fence devices (hardware or software solutions that remotely power, shutdown, and reboot cluster nodes) are used to guarantee data integrity under all failure conditions. Watchdog timers provide an alternative way to ensure correct operation of cluster service failover.
Ethernet channel bonding
Cluster quorum and node health are determined by communication of messages among cluster nodes via Ethernet.
Chapter 3. Configuring Red Hat Cluster With Conga
2. Starting luci and ricci
To administer Red Hat Clusters with Conga, install and run luci and ricci as follows:
1. At each node to be administered by Conga, install the ricci agent. For example:
# yum install ricci
2. At each node to be administered by Conga, start ricci. For example:
# service ricci start
Starting ricci:                                            [  OK  ]
3. Select a computer to host luci and install the luci software on that computer. For example:
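The installation command for luci is not shown on this page; following the pattern of the ricci step above, it would presumably be run on the chosen luci host as:

# yum install luci

Treat this as a sketch rather than the guide's own wording. After installation, the full procedure also initializes the luci server, typically with the luci_admin init command (an assumption here, as that page is not reproduced), before luci is started.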
5. Start luci using service luci restart. For example:
# service luci restart
Shutting down luci:                                        [  OK  ]
Starting luci: generating https SSL certificates...  done
                                                           [  OK  ]
Please, point your web browser to https://nano-01:8084 to access luci
6. At a Web browser, place the URL of the luci server into the URL address box and click Go (or the equivalent). The URL syntax for the luci server is https://luci_server_hostname:8084.
4. Global Cluster Properties
When a cluster is created, or if you select a cluster to configure, a cluster-specific page is displayed. The page provides an interface for configuring cluster-wide properties and detailed properties. You can configure cluster-wide properties with the tabbed interface below the cluster name. The interface provides the following tabs: General, Fence, Multicast, and Quorum Partition.
3. Multicast tab — This tab provides an interface for configuring these Multicast Configuration parameters: Let cluster choose the multicast address and Specify the multicast address manually. Red Hat Cluster software chooses a multicast address for cluster management communication among cluster nodes; therefore, the default setting is Let cluster choose the multicast address.
Minimum Score — The minimum score for a node to be considered "alive". If omitted or set to 0, the default function, floor((n+1)/2), is used, where n is the sum of the heuristics scores. The Minimum Score value must never exceed the sum of the heuristic scores; otherwise, the quorum disk cannot be available.
Device — The storage device the quorum daemon uses. The device must be the same on all nodes.
The following shared fence devices are available:
• APC Power Switch
• Brocade Fabric Switch
• Bull PAP
• Egenera SAN Controller
• GNBD
• IBM Blade Center
• McData SAN Switch
• QLogic SANbox2
• SCSI Fencing
• Virtual Machine Fencing
• Vixel SAN Switch
• WTI Power Switch
The following non-shared fence devices are available:
• Dell DRAC
• HP iLO
• IBM RSA II
• IPMI LAN
• RPS10 Serial Switch
This section provides procedures for the following tasks:
• Creating shared fence devices — Refer to Section 5.1, “Creating a Shared Fence Device”.
The starting point of each procedure is at the cluster-specific page that you navigate to from Choose a cluster to administer displayed on the cluster tab.
5.1. Creating a Shared Fence Device
To create a shared fence device, follow these steps:
1. At the detailed menu for the cluster (below the clusters menu), click Shared Fence Devices.
Figure 3.1. Fence Device Configuration
3. At the Add a Sharable Fence Device page, click the drop-down box under Fencing Type and select the type of fence device to configure.
4. Specify the information in the Fencing Type dialog box according to the type of fence device. Refer to Appendix B, Fence Device Parameters for more information about fence device parameters.
5. Click Add this shared fence device.
6.
5.2. Modifying or Deleting a Fence Device
To modify or delete a fence device, follow these steps:
1. At the detailed menu for the cluster (below the clusters menu), click Shared Fence Devices. Clicking Shared Fence Devices causes the display of the fence devices for a cluster and causes the display of menu items for fence device configuration: Add a Fence Device and Configure a Fence Device.
2. Click Configure a Fence Device.
Creating a cluster consists of selecting a set of nodes (or members) to be part of the cluster. Once you have completed the initial step of creating a cluster and creating fence devices, you need to configure cluster nodes. To initially configure cluster nodes after creating a new cluster, follow the steps in this section.
4. Click Submit. Clicking Submit causes the following actions:
a. Cluster software packages to be downloaded onto the added node.
b. Cluster software to be installed (or verification that the appropriate software packages are installed) onto the added node.
c. Cluster configuration file to be updated and propagated to each node in the cluster — including the added node.
d. The added node to be joined to the cluster.
1. Click the link of the node to be deleted. Clicking the link of the node to be deleted causes a page to be displayed for that link showing how that node is configured.
Note
To allow services running on a node to fail over when the node is deleted, skip the next step.
2. Disable or relocate each service that is running on the node to be deleted:
Note
Repeat this step for each service that needs to be disabled or started on another node.
a.
• Restricted — Allows you to restrict the members that can run a particular cluster service. If none of the members in a restricted failover domain are available, the cluster service cannot be started (either manually or by the cluster software).
• Unordered — When a cluster service is assigned to an unordered failover domain, the member on which the cluster service runs is chosen from the available failover domain members with no priority ordering.
• Ordered — Allows you to specify a preference order among the members of a failover domain.
7.1. Adding a Failover Domain
To add a failover domain, follow the steps in this section. The starting point of the procedure is at the cluster-specific page that you navigate to from Choose a cluster to administer displayed on the cluster tab.
1. At the detailed menu for the cluster (below the clusters menu), click Failover Domains.
displayed on the cluster tab.
1. At the detailed menu for the cluster (below the clusters menu), click Failover Domains. Clicking Failover Domains causes the display of failover domains with related services and the display of menu items for failover domains: Add a Failover Domain and Configure a Failover Domain.
2. Click Configure a Failover Domain.
9. To make additional changes to the failover domain, continue modifications at the Failover Domain Form page and click Submit when you are done.
8. Adding Cluster Resources
To add a cluster resource, follow the steps in this section. The starting point of the procedure is at the cluster-specific page that you navigate to from Choose a cluster to administer displayed on the cluster tab.
1. At the detailed menu for the cluster (below the clusters menu), click Resources.
Tip
Use a descriptive name that clearly distinguishes the service from other services in the cluster.
4. Add a resource to the service; click Add a resource to this service. Clicking Add a resource to this service causes the display of two drop-down boxes: Add a new local resource and Use an existing global resource. Adding a new local resource adds a resource that is available only to this service.
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: mtu 1356 qdisc pfifo_fast qlen 1000
    link/ether 00:05:5d:9a:d8:91 brd ff:ff:ff:ff:ff:ff
    inet 10.11.4.31/22 brd 10.11.7.255 scope global eth0
    inet6 fe80::205:5dff:fe9a:d891/64 scope link
    inet 10.11.4.240/22 scope global secondary eth0
       valid_lft forever preferred_lft forever
10. Configuring Cluster Storage
• Hard Drives
• Partitions
• Volume Groups
Each section is set up as an expandable tree, with links to property sheets for specific devices, partitions, and storage entities. Configure the storage for your cluster to suit your cluster requirements. If you are configuring Red Hat GFS, configure clustered logical volumes first, using CLVM. For more information about CLVM and GFS refer to Red Hat documentation for those products.
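Conga performs these steps through its graphical interface. Purely as an illustration of the underlying sequence (clustered volume first, then a GFS file system on it), a command line sketch might look like the following; the device name, sizes, volume names, and the cluster and file system names are hypothetical placeholders, and the journal count should reflect the number of nodes that will mount the file system:

# pvcreate /dev/sdb1
# vgcreate -c y vg_cluster /dev/sdb1
# lvcreate -L 50G -n lv_gfs vg_cluster
# gfs_mkfs -p lock_dlm -t mycluster:gfs1 -j 3 /dev/vg_cluster/lv_gfs

Here -c y marks the volume group as clustered (CLVM), and -t names the cluster and file system as cluster_name:fs_name.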
Chapter 4. Managing Red Hat Cluster With Conga
This chapter describes various administrative tasks for managing a Red Hat Cluster and consists of the following sections:
• Section 1, “Starting, Stopping, and Deleting Clusters”
• Section 2, “Managing Cluster Nodes”
• Section 3, “Managing High-Availability Services”
• Section 4, “Diagnosing and Correcting Problems in a Cluster”
1. Starting, Stopping, and Deleting Clusters
• Delete this cluster — Selecting this action halts a running cluster, disables cluster software from starting automatically, and removes the cluster configuration file from each node. You can select this action for any state the cluster is in. Deleting a cluster frees each node in the cluster for use in another cluster.
2. Select one of the functions and click Go.
3. Clicking Go causes a progress page to be displayed.
Selecting Have node leave cluster shuts down cluster software and makes the node leave the cluster. Making a node leave a cluster prevents the node from automatically joining the cluster when it is rebooted. Selecting Have node join cluster starts cluster software and makes the node join the cluster. Making a node join a cluster allows the node to automatically join the cluster when it is rebooted.
• If service is running — Configure this service, Restart this service, and Stop this service.
• If service is not running — Configure this service, Start this service, and Delete this service.
The actions of each function are summarized as follows:
• Configure this service — Configure this service is available when the service is running or not running. Selecting Configure this service causes the services configuration page for the service to be displayed.
Chapter 5. Configuring Red Hat Cluster With system-config-cluster
3. Creating fence devices. Refer to Section 4, “Configuring Fence Devices”.
4. Creating cluster members. Refer to Section 5, “Adding and Deleting Members”.
5. Creating failover domains. Refer to Section 6, “Configuring a Failover Domain”.
6. Creating resources. Refer to Section 7, “Adding Cluster Services”.
7. Creating cluster services. Refer to Section 8, “Adding a Cluster Service to the Cluster”.
8. Propagating the configuration file to the cluster nodes.
Figure 5.1. Starting a New Configuration File
Note
The Cluster Management tab for the Red Hat Cluster Suite management GUI is available after you save the configuration file with the Cluster Configuration Tool, exit, and restart the Red Hat Cluster Suite management GUI (system-config-cluster). (The Cluster Management tab displays the status of the cluster service manager, cluster nodes, and resources, and shows statistics concerning cluster service operation.
dialog box if you enable Use a Quorum disk: Interval, TKO, Votes, Minimum Score, Device, Label, and Quorum Disk Heuristic. Table 5.1, “Quorum-Disk Parameters” describes the parameters.
Important
Quorum-disk parameters and heuristics depend on the site environment and special requirements needed. To understand the use of quorum-disk parameters and heuristics, refer to the qdisk(5) man page.
Figure 5.2. Creating A New Configuration
4. When you have completed entering the cluster name and other parameters in the New Configuration dialog box, click OK. Clicking OK starts the Cluster Configuration Tool, displaying a graphical representation of the configuration (Figure 5.3, “The Cluster Configuration Tool”).
Figure 5.3. The Cluster Configuration Tool

Use a Quorum Disk — Enables quorum disk. Enables quorum-disk parameters in the New Configuration dialog box.
Interval — The frequency of read/write cycles, in seconds.
TKO — The number of cycles a node must miss in order to be declared dead.
Votes — The number of votes the quorum daemon advertises to CMAN when it has a high enough score.
Device — The storage device the quorum daemon uses. The device must be the same on all nodes.
Label — Specifies the quorum disk label created by the mkqdisk utility. If this field contains an entry, the label overrides the Device field. If this field is used, the quorum daemon reads /proc/partitions and checks for qdisk signatures on every block device found, comparing the label against the specified label.
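As an illustration of how the Device and Label parameters relate, a quorum disk is typically labeled once with mkqdisk and then referenced by that label. The device path, label, and numeric values below are hypothetical placeholders, and the quorumd stanza is only a sketch of the kind of /etc/cluster/cluster.conf entry these parameters produce; consult the qdisk(5) man page before using anything like it:

# mkqdisk -c /dev/sdb1 -l myqdisk

<quorumd interval="1" tko="10" votes="3" label="myqdisk">
        <heuristic program="ping -c1 -t1 10.10.10.254" score="1" interval="2"/>
</quorumd>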
5. Specify the Fence Daemon Properties parameters: Post-Join Delay and Post-Fail Delay.
a. The Post-Join Delay parameter is the number of seconds the fence daemon (fenced) waits before fencing a node after the node joins the fence domain. The Post-Join Delay default value is 3. A typical setting for Post-Join Delay is between 20 and 30 seconds, but can vary according to cluster and network performance.
b. The Post-Fail Delay parameter is the number of seconds the fence daemon (fenced) waits before fencing a node after the node has failed. The Post-Fail Delay default value is 0.
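These two values end up as attributes of the fence_daemon element in /etc/cluster/cluster.conf. The following one-line sketch shows the shape of that entry; the numbers are illustrative assumptions, not recommendations:

<fence_daemon post_join_delay="20" post_fail_delay="0"/>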
Figure 5.4. Fence Device Configuration
2. At the Fence Device Configuration dialog box, click the drop-down box under Add a New Fence Device and select the type of fence device to configure.
3. Specify the information in the Fence Device Configuration dialog box according to the type of fence device. Refer to Appendix B, Fence Device Parameters for more information about fence device parameters.
4. Click OK.
5.
2. At the bottom of the right frame (labeled Properties), click the Add a Cluster Node button. Clicking that button causes a Node Properties dialog box to be displayed. The Node Properties dialog box presents text boxes for Cluster Node Name and Quorum Votes (refer to Figure 5.5, “Adding a Member to a New Cluster”).
Figure 5.5. Adding a Member to a New Cluster
3. At the Cluster Node Name text box, specify a node name.
causes a Fence Configuration dialog box to be displayed.
c. At the Fence Configuration dialog box, bottom of the right frame (below Properties), click Add a New Fence Level. Clicking Add a New Fence Level causes a fence-level element (for example, Fence-Level-1, Fence-Level-2, and so on) to be displayed below the node in the left frame of the Fence Configuration dialog box.
d. Click the fence-level element.
e. At the bottom of the right frame (below Properties), click Add a New Fence to this Level.
nodes, follow these steps:
1. Add the node and configure fencing for it as in Section 5.1, “Adding a Member to a Cluster”.
2. Click Send to Cluster to propagate the updated configuration to other running nodes in the cluster.
3. Use the scp command to send the updated /etc/cluster/cluster.conf file from one of the existing cluster nodes to the new node.
4.
2. Click Send to Cluster to propagate the updated configuration to other running nodes in the cluster.
3. Use the scp command to send the updated /etc/cluster/cluster.conf file from one of the existing cluster nodes to the new node.
4. Start cluster services on the new node by running the following commands in this order:
a. service cman start
b. service clvmd start, if CLVM has been used to create clustered volumes
c. service gfs start, if you are using Red Hat GFS
d. service rgmanager start
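Step 3 above does not show the scp invocation itself. A minimal sketch, run from one of the existing cluster nodes and using a hypothetical name for the new node, would be:

# scp /etc/cluster/cluster.conf root@node-04.example.com:/etc/cluster/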
Figure 5.6. Confirm Deleting a Member
d. At that dialog box, click Yes to confirm deletion.
e. Propagate the updated configuration by clicking the Send to Cluster button. (Propagating the updated configuration automatically saves the configuration.)
4. Stop the cluster software on the remaining running nodes by running the following commands at each node in this order:
a. service rgmanager stop
b.
• Unrestricted — Allows you to specify that a subset of members are preferred, but that a cluster service assigned to this domain can run on any available member.
• Restricted — Allows you to restrict the members that can run a particular cluster service. If none of the members in a restricted failover domain are available, the cluster service cannot be started (either manually or by the cluster software).
• Section 6.1, “Adding a Failover Domain”
• Section 6.2, “Removing a Failover Domain”
• Section 6.3, “Removing a Member from a Failover Domain”
6.1. Adding a Failover Domain
To add a failover domain, follow these steps:
1. At the left frame of the Cluster Configuration Tool, click Failover Domains.
2. At the bottom of the right frame (labeled Properties), click the Create a Failover Domain button.
Figure 5.7. Failover Domain Configuration: Configuring a Failover Domain
4. Click the Available Cluster Nodes drop-down box and select the members for this failover domain.
5. To restrict failover to members in this failover domain, click (check) the Restrict Failover To This Domains Members checkbox. (With Restrict Failover To This Domains Members checked, services assigned to this failover domain fail over only to nodes in this failover domain.)
6.
Figure 5.8. Failover Domain Configuration: Adjusting Priority
b. For each node that requires a priority adjustment, click the node listed in the Member Node/Priority columns and adjust priority by clicking one of the Adjust Priority arrows. Priority is indicated by the position in the Member Node column and the value in the Priority column.
6.2. Removing a Failover Domain
To remove a failover domain, follow these steps:
1. At the left frame of the Cluster Configuration Tool, click the failover domain that you want to delete (listed under Failover Domains).
2. At the bottom of the right frame (labeled Properties), click the Delete Failover Domain button. Clicking the Delete Failover Domain button causes a warning dialog box to be displayed asking if you want to remove the failover domain.
• New cluster — If this is a new cluster, choose File => Save to save the changes to the cluster configuration.
• Running cluster — If this cluster is operational and running, and you want to propagate the change immediately, click the Send to Cluster button. Clicking Send to Cluster automatically saves the configuration change.
Figure 5.9. Adding a Cluster Service
4. If you want to restrict the members on which this cluster service is able to run, choose a failover domain from the Failover Domain drop-down box. (Refer to Section 6, “Configuring a Failover Domain” for instructions on how to configure a failover domain.)
5. Autostart This Service checkbox — This is checked by default.
For most types of services, you can leave Run Exclusive unchecked.
Note
Circumstances that require enabling Run Exclusive are rare. Enabling Run Exclusive can render a service offline if the node it is running on fails and no other nodes are empty.
7. Select a recovery policy to specify how the resource manager should recover from a service failure.
Note
To verify the existence of the IP service resource used in a cluster service, you must use the /sbin/ip addr list command on a cluster node. The following output shows the /sbin/ip addr list command executed on a node running a cluster service:

1: lo: mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.
each node or start the cluster software on each cluster node by running the following commands at each node in this order:
1. service cman start
2. service clvmd start, if CLVM has been used to create clustered volumes
Note
Shared storage for use in Red Hat Cluster Suite requires that you be running the cluster logical volume manager daemon (clvmd) or the High Availability Logical Volume Management agents (HA-LVM).
Chapter 6. Managing Red Hat Cluster With system-config-cluster
3. service clvmd stop, if CLVM has been used to create clustered volumes
4. service cman stop
Stopping the cluster services on a member causes its services to fail over to an active member.
2. Managing High-Availability Services
You can manage cluster services with the Cluster Status Tool (Figure 6.1, “Cluster Status Tool”) through the Cluster Management tab in the Cluster Administration GUI.
Figure 6.1. Cluster Status Tool
You can use the Cluster Status Tool to enable, disable, restart, or relocate a high-availability service. The Cluster Status Tool displays the current cluster status in the Services area and automatically updates the status every 10 seconds. To enable a service, you can select the service in the Services area and click Enable. To disable a service, you can select the service in the Services area and click Disable.
3. Modifying the Cluster Configuration
To modify the cluster configuration (the cluster configuration file, /etc/cluster/cluster.conf), use the Cluster Configuration Tool. For more information about using the Cluster Configuration Tool, refer to Chapter 5, Configuring Red Hat Cluster With system-config-cluster.
Warning
Do not manually edit the contents of the /etc/cluster/cluster.conf file.
3. Clicking Send to Cluster causes a Warning dialog box to be displayed. Click Yes to save and propagate the configuration.
4. Clicking Yes causes an Information dialog box to be displayed, confirming that the current configuration has been propagated to the cluster. Click OK.
5. Click the Cluster Management tab and verify that the changes have been propagated to the cluster members.
4. Backing Up and Restoring the Cluster Database
9. Propagate the updated configuration file throughout the cluster by clicking Send to Cluster.
Note
The Cluster Configuration Tool does not display the Send to Cluster button if the cluster is new and has not been started yet, or if the node from which you are running the Cluster Configuration Tool is not a member of the cluster.
At each node, run the following commands in the order shown to restart cluster software:
1. service cman start
2. service clvmd start, if CLVM has been used to create clustered volumes
3. service gfs start, if you are using Red Hat GFS
4. service rgmanager start
6. Diagnosing and Correcting Problems in a Cluster
For information about diagnosing and correcting problems in a cluster, contact an authorized Red Hat support representative.
Appendix A. Example of Setting Up Apache HTTP Server This appendix provides an example of setting up a highly available Apache HTTP Server on a Red Hat Cluster. The example describes how to set up a service to fail over an Apache HTTP Server. Variables in the example apply to this example only; they are provided to assist setting up a service that suits your requirements. Note This example uses the Cluster Configuration Tool (system-config-cluster).
This prevents multiple cluster systems from accessing the same data simultaneously, which may result in data corruption. Therefore, do not include the file systems in the /etc/fstab file.
2. Configuring Shared Storage
To set up the shared file system resource, perform the following tasks as root on one cluster system:
1. On one cluster node, use the interactive parted utility to create a partition to use for the document root directory.
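The guide performs this partitioning interactively, and the exact commands are not reproduced on this page. A non-interactive sketch of the same preparation follows; the device /dev/sdb, the ext3 file system type, and the partition size are hypothetical placeholders, while /mnt/httpdservice is the mount point used later in this example:

# parted /dev/sdb mklabel msdos
# parted /dev/sdb mkpart primary ext3 0 10GB
# mkfs -t ext3 /dev/sdb1
# mkdir /mnt/httpdservice
# mount /dev/sdb1 /mnt/httpdservice
# mkdir /mnt/httpdservice/html
# umount /mnt/httpdservice

The file system is unmounted again at the end because, as noted above, the cluster software (not /etc/fstab) is responsible for mounting it on whichever node runs the service.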
Edit the /etc/httpd/conf/httpd.conf configuration file and customize the file according to your configuration. For example:
• Specify the directory that contains the HTML files. Also specify this mount point when adding the service to the cluster configuration. It is only required to change this field if the mount point for the web site's content differs from the default setting of /var/www/html/. For example:
DocumentRoot "/mnt/httpdservice/html"
• Specify a unique IP address to which the service will listen for requests. For example:
Listen 192.
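The Listen line above is truncated in this copy and is not reconstructed here. Purely as an illustration of what the two customizations look like together, with 192.168.1.100 standing in as a hypothetical floating service address:

DocumentRoot "/mnt/httpdservice/html"
Listen 192.168.1.100:80

The address given to Listen is meant to match the floating IP address that the cluster service manages (added as an IP resource when the service is created), not a node's permanent address.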
Before the service is added to the cluster configuration, ensure that the Apache HTTP Server directories are not mounted. Then, on one node, invoke the Cluster Configuration Tool to add the service, as follows. This example assumes a failover domain named httpd-domain was created for this service.
1. Add the init script for the Apache HTTP Server service.
• Select the Resources tab and click Create a Resource.
or leave it as None.
• Click the Add a Shared Resource to this service button. From the available list, choose each resource that you created in the previous steps. Repeat this step until all resources have been added.
• Click OK.
6. Choose File => Save to save your changes.
Appendix B. Fence Device Parameters
This appendix provides tables with parameter descriptions of fence devices.
Note
Certain fence devices have an optional Password Script parameter. The Password Script parameter allows specifying that a fence-device password is supplied from a script rather than from the Password parameter. Using the Password Script parameter supersedes the Password parameter, allowing passwords to not be visible in the cluster configuration file (/etc/cluster/cluster.conf).
IP Address — The IP address assigned to the PAP console.
Login — The login name used to access the PAP console.
Password — The password used to authenticate the connection to the PAP console.
Password Script (optional) — The script that supplies a password for access to the fence device. Using this supersedes the Password parameter.
Domain — Domain of the Bull PAP system to power cycle
Table B.3. Bull PAP
Table B.6. GNBD (Global Network Block Device)

Name — A name for the server with HP iLO support.
Hostname — The hostname assigned to the device.
Login — The login name used to access the device.
Password — The password used to authenticate the connection to the device.
Password Script (optional) — The script that supplies a password for access to the fence device. Using this supersedes the Password parameter.
Table B.7. HP iLO
Login — The login name of a user capable of issuing power on/off commands to the given IPMI port.
Password — The password used to authenticate the connection to the IPMI port.
Password Script (optional) — The script that supplies a password for access to the fence device. Using this supersedes the Password parameter.
Authentication Type — none, password, md2, or md5
Use Lanplus — True or 1. If blank, then value is False.
Table B.10. IPMI LAN
Port — The switch outlet number.
Table B.13. RPS-10 Power Switch (two-node clusters only)

Name — A name for the SANBox2 device connected to the cluster.
IP Address — The IP address assigned to the device.
Login — The login name used to access the device.
Password — The password used to authenticate the connection to the device.
Password Script (optional) — The script that supplies a password for access to the fence device. Using this supersedes the Password parameter.
Name — A name for the WTI power switch connected to the cluster.
IP Address — The IP address assigned to the device.
Password — The password used to authenticate the connection to the device.
Password Script (optional) — The script that supplies a password for access to the fence device. Using this supersedes the Password parameter.
Table B.18. WTI Power Switch
Appendix C. HA Resource Parameters
This appendix provides descriptions of HA resource parameters. You can configure the parameters with Luci, system-config-cluster, or by editing /etc/cluster/cluster.conf. Table C.1, “HA Resource Summary” lists the resources, their corresponding resource agents, and references to other tables containing parameter descriptions. To understand resource agents in more detail you can view them in /usr/share/cluster of any cluster node.
Table C.1. HA Resource Summary

Name — The name of the Apache Service.
Server Root — The default value is /etc/httpd.
Config File — Specifies the Apache configuration file. The default value is /etc/httpd/conf.
httpd Options — Other command line options for httpd.
Shutdown Wait (seconds) — Specifies the number of seconds to wait for correct end of service shutdown.
Table C.2.
Force Unmount — Force Unmount kills all processes using the mount point to free up the mount when it tries to unmount.
Reboot host node if unmount fails — If enabled, reboots the node if unmounting this file system fails. The default setting is disabled.
Check file system before mounting — If enabled, causes fsck to be run on the file system before mounting it. The default setting is disabled.
Table C.3. File System

Name — The name of the file system resource.
IP Address — The IP address for the resource. This is a virtual IP address. IPv4 and IPv6 addresses are supported, as is NIC link monitoring for each IP address.
Monitor Link — Enabling this causes the status check to fail if the link on the NIC to which this IP address is bound is not present.
Table C.5. IP Address

Name — A unique name for this LVM resource.
Volume Group Name — A descriptive name of the volume group being managed.
For more information, refer to the exports (5) man page, General Options.
Table C.8. NFS Client

Name — Descriptive name of the resource. The NFS Export resource ensures that NFS daemons are running. It is fully reusable; typically, only one NFS Export resource is needed.
Tip
Name the NFS Export resource so it is clearly distinguished from other NFS resources.
Table C.9. NFS Export

Name — Symbolic name for the NFS mount.
Name — Specifies a service name for logging and other purposes.
Config File — Specifies an absolute path to a configuration file. The default value is /etc/openldap/slapd.conf.
URL List — The default value is ldap:///.
slapd Options — Other command line options for slapd.
Shutdown Wait (seconds) — Specifies the number of seconds to wait for correct end of service shutdown.
Table C.11. Open LDAP

Instance — Instance name.
Shutdown Wait (seconds) — Specifies the number of seconds to wait for correct end of service shutdown.
Table C.13. PostgreSQL 8

SAP Database Name — Specifies a unique SAP system identifier. For example, P01.
SAP executable directory — Specifies the fully qualified path to sapstartsrv and sapcontrol.
Database type — Specifies one of the following database types: Oracle, DB6, or ADA.
Oracle TNS listener name — Specifies Oracle TNS listener name.
Name of the SAP START profile — Specifies name of the SAP START profile.
Table C.15. SAP® Instance

Note
Regarding Table C.16, “Samba Service”, when creating or editing a cluster service, connect a Samba-service resource directly to the service, not to a resource within a service.

Name — Specifies the name of the Samba server.
Workgroup — Specifies a Windows workgroup name or Windows NT domain of the Samba service.
Table C.16. Samba Service
running on it. If no nodes are available for a service to run exclusively, the service is not restarted after a failure. Additionally, other services do not automatically relocate to a node running this service as Run exclusive. You can override this option by manual start or relocate operations.
Failover Domain — Defines lists of cluster members to try in the event that a service fails.
Recovery policy
Name — Specifies a service name for logging and other purposes.
Config File — Specifies the absolute path to the configuration file. The default value is /etc/tomcat5/tomcat5.conf.
Tomcat User — User who runs the Tomcat server. The default value is tomcat.
Catalina Options — Other command line options for Catalina.
Catalina Base — Catalina base directory (differs for each service). The default value is /usr/share/tomcat5.
Recovery policy — Recovery policy provides the following options:
• Disable — Disables the virtual machine if it fails.
• Relocate — Tries to restart the virtual machine in another node; that is, it does not try to restart in the current node.
• Restart (default) — Tries to restart the virtual machine locally (in the current node) before trying to relocate the virtual machine to another node.
Migration type — Specifies a migration type of live or pause. The default setting is live.
Appendix D. HA Resource Behavior
This appendix describes common behavior of HA resources. It is meant to provide ancillary information that may be helpful in configuring HA services. You can configure the parameters with Luci, system-config-cluster, or by editing /etc/cluster/cluster.conf. For descriptions of HA resource parameters, refer to Appendix C, HA Resource Parameters. To understand resource agents in more detail you can view them in /usr/share/cluster of any cluster node.
• Section 5, “Debugging and Testing Services and Resource Ordering”
Note
The sections that follow present examples from the cluster configuration file, /etc/cluster/cluster.conf, for illustration purposes only.
1. Parent, Child, and Sibling Relationships Among Resources
A cluster service is an integrated entity that runs under the control of rgmanager. All resources in a service run on the same node.
• Children must all stop cleanly before a parent may be stopped.
• For a resource to be considered in good health, all its children must be in good health.
2. Sibling Start Ordering and Resource Child Ordering
Resource — Child Type — Start-order Value — Stop-order Value
LVM — lvm — 1 — 9
File System — fs — 2 — 8
GFS File System — clusterfs — 3 — 7
NFS Mount — netfs — 4 — 6
NFS Export — nfsexport — 5 — 5
NFS Client — nfsclient — 6 — 4
IP Address — ip — 7 — 2
Samba — smb — 8 — 3
Script — script — 9 — 1
Table D.1.
Example D.3. Ordering Within a Resource Type

Typed Child Resource Starting Order
In Example D.3, “Ordering Within a Resource Type”, the resources are started in the following order:
1. lvm:1 — This is an LVM resource. All LVM resources are started first. lvm:1 is the first LVM resource started among LVM resources because it is the first LVM resource listed in the Service foo portion of /etc/cluster/cluster.conf.
2. lvm:2 — This is an LVM resource.
stopped in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
5. lvm:1 — This is an LVM resource. All LVM resources are stopped last. lvm:1 is stopped after lvm:2; resources within a group of a resource type are stopped in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
2.2. Non-typed Child Resource Start and Stop Ordering
3. fs:1 — This is a File System resource. If there were other File System resources in Service foo, they would start in the order listed in the Service foo portion of /etc/cluster/cluster.conf.
4. ip:10.1.1.1 — This is an IP Address resource. If there were other IP Address resources in Service foo, they would start in the order listed in the Service foo portion of /etc/cluster/cluster.conf.
5. script:1 — This is a Script resource.
4. ip:10.1.1.1 — This is an IP Address resource. If there were other IP Address resources in Service foo, they would stop in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
5. fs:1 — This is a File System resource. If there were other File System resources in Service foo, they would stop in the reverse order listed in the Service foo portion of /etc/cluster/cluster.conf.
6. lvm:2 — This is an LVM resource.
Example D.5. NFS Service Set Up for Resource Reuse and Inheritance
If the service were flat (that is, with no parent/child relationships), it would need to be configured as follows:
• The service would need four nfsclient resources — one per file system (a total of two for file systems), and one per target machine (a total of two for target machines).
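The cluster.conf listing for Example D.5 is not reproduced on this page. Purely as an illustration of the reuse-and-inheritance arrangement the example describes, the sketch below nests nfsclient references under an nfsexport, which in turn sits under each file system; every device path, mount point, address, and hostname here is a hypothetical placeholder rather than the guide's own example:

<resources>
        <nfsclient name="client-a" target="client-a.example.com" options="rw"/>
        <nfsclient name="client-b" target="client-b.example.com" options="rw"/>
        <nfsexport name="exports"/>
</resources>
<service name="foo">
        <fs name="fs1" device="/dev/sdb1" mountpoint="/mnt/fs1" fstype="ext3">
                <nfsexport ref="exports">
                        <nfsclient ref="client-a"/>
                        <nfsclient ref="client-b"/>
                </nfsexport>
        </fs>
        <fs name="fs2" device="/dev/sdb2" mountpoint="/mnt/fs2" fstype="ext3">
                <nfsexport ref="exports">
                        <nfsclient ref="client-a"/>
                        <nfsclient ref="client-b"/>
                </nfsexport>
        </fs>
        <ip address="10.1.1.2"/>
</service>

Because the two nfsclient resources are defined once in the resources block and only referenced inside each file system, they are reused rather than duplicated, and each reference can inherit attributes (such as the export path) from its parent fs and nfsexport resources.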
Example D.6. Service foo Normal Failure Recovery
Example D.7.
Display the start and stop ordering of a service.
    Display start order: rg_test noop /etc/cluster/cluster.conf start service servicename
    Display stop order: rg_test noop /etc/cluster/cluster.conf stop service servicename
Explicitly start or stop a service.
    Start a service: rg_test test /etc/cluster/cluster.conf start service servicename
    Stop a service: rg_test test /etc/cluster/cluster.conf stop service servicename
Appendix E. Upgrading A Red Hat Cluster from RHEL 4 to RHEL 5 This appendix provides a procedure for upgrading a Red Hat cluster from RHEL 4 to RHEL 5. The procedure includes changes required for Red Hat GFS and CLVM, also. For more information about Red Hat GFS, refer to Global File System: Configuration and Administration. For more information about LVM for clusters, refer to LVM Administrator's Guide: Configuration and Administration.
f. Run service ccsd stop.
3. Disable cluster software from starting during reboot. At each node, run /sbin/chkconfig as follows:
# chkconfig --level 2345 rgmanager off
# chkconfig --level 2345 gfs off
# chkconfig --level 2345 clvmd off
# chkconfig --level 2345 fenced off
# chkconfig --level 2345 cman off
# chkconfig --level 2345 ccsd off
4. Edit the cluster configuration file as follows:
a. At a cluster node, open /etc/cluster/cluster.conf
You shouldn't change any of these values if the filesystem is mounted.
Are you sure? [y/n] y

current lock protocol name = "lock_gulm"
new lock protocol name = "lock_dlm"
Done

6. Update the software in the cluster nodes to RHEL 5 and Red Hat Cluster Suite for RHEL 5. You can acquire and update software through Red Hat Network channels for RHEL 5 and Red Hat Cluster Suite for RHEL 5.
7. Run lvmconf --enable-cluster.
8. Enable cluster software to start upon reboot.
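The chkconfig commands for step 8 are not shown on this page. A sketch that mirrors step 3 above, re-enabling the services that exist under RHEL 5 (the assumption here is that ccsd and fenced are started by the cman service in RHEL 5 and so are not enabled separately), would be:

# chkconfig --level 2345 cman on
# chkconfig --level 2345 clvmd on
# chkconfig --level 2345 gfs on
# chkconfig --level 2345 rgmanager on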