LifeKeeper® for Linux v4.5.
The product described in this book is a licensed product of SteelEye® Technology, Inc. SteelEye, SteelEye Technology, and LifeKeeper are registered trademarks of SteelEye Technology, Inc. Linux is a registered trademark of Linus Torvalds. Sendmail is a trademark of Sendmail, Inc. Other brand and product names used herein are for identification purposes only and may be trademarks of their respective companies. It is the policy of SteelEye Technology, Inc.
Table of Contents
Introduction
  Document Contents
  LifeKeeper Documentation
  Reference Documents
SAMS Recovery Kit Administration Guide

Introduction

The Sendmail Advanced Message Server (SAMS) is a suite of commercial messaging applications. These applications provide various services to clients, such as POP and IMAP, as well as storing and transferring e-mail messages. The LifeKeeper® for Linux SAMS Recovery Kit provides a mechanism to recover SAMS from a failed primary server onto a backup server in a LifeKeeper environment.
• LifeKeeper Configuration Tasks. A description of the tasks for creating and managing your SAMS resource hierarchies using the LifeKeeper GUI.
• Troubleshooting. This section provides a list of informational and error messages with recommended solutions.

LifeKeeper Documentation

The following is a list of LifeKeeper related information available from SteelEye Technology, Inc.
Reference Documents

The following is a list of reference documents associated with the SAMS product and the LifeKeeper SAMS Recovery Kit:
• Sendmail Advanced Message Server Reference Guide
• Sendmail Advanced Message Server Installation Guide
• Sendmail Advanced Message Server User’s Guide
• Sendmail Switch Installation Guide
• Sendmail Switch User’s Guide
• Sendmail Manual Page
• Sendmail, 2nd Edition by Eric Allman & Bryan Costales.
Requirements

• Servers. The Recovery Kit requires two or more supported computers configured in accordance with LifeKeeper requirements described in the LifeKeeper Release Notes, which are shipped with the LifeKeeper product media.
• LifeKeeper software. You must install the same version of LifeKeeper software and any patches on each server. Please refer to the LifeKeeper Release Notes for specific LifeKeeper requirements.
• LifeKeeper IP Recovery Kit.
Configuring SAMS with LifeKeeper

This section contains information you should consider before you start to configure SAMS, along with examples of typical LifeKeeper SAMS configurations. Please refer to your LifeKeeper Online Product Manual for instructions on configuring your LifeKeeper Core resource hierarchies.

Currently, LifeKeeper supports only an active/standby SAMS configuration (SAMS does not allow multiple instances of itself).
Figure 1 illustrates how SAMS works in a LifeKeeper environment. An active/standby configuration means that only one instance of SAMS can run at a time within the LifeKeeper-protected pair. In an active/standby configuration, one server acts as the primary mailhub or mailserver, while the other server acts as a backup mailhub or mailserver.
server may fail when a switchover to the backup server occurs. Note that the SAMS Recovery Kit does not require the switchable IP address to have an MX record on the DNS server.

Protected Files, Directories and Services

The SAMS Recovery Kit protects the following configuration and data directories:
• /etc/mail
• /etc/md
• /var/md/store

If not located on a shared file system, the MTA and MSP queue directories (e.g.
Sendmail Configuration File

The following are a few more important points to note about the Sendmail configuration file (/etc/mail/sendmail.cf).

Masquerading

Masquerading translates an email address with a given hostname into the address of the domain or that of another mailhub/mailserver. Masquerading can be done either at the domain level or at the host level of the mailhub/mailserver itself.
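The effect of masquerading on a single address can be illustrated with a short sketch. This is not how Sendmail implements the feature (Sendmail applies its own rewriting rules from the configuration file); the function name masquerade and the example addresses are hypothetical and serve only to show the input and output of the translation.

```python
def masquerade(address: str, masquerade_as: str) -> str:
    """Illustrative only: replace the host part of an address.

    `masquerade_as` is the domain (or mailhub host name) that should
    appear in place of the original host name. Both names are
    hypothetical and are not taken from any Sendmail configuration.
    """
    local_part, separator, _original_host = address.rpartition("@")
    if not separator:
        # No host part present, so there is nothing to translate.
        return address
    return f"{local_part}@{masquerade_as}"


if __name__ == "__main__":
    # Domain-level masquerading: hide the individual server name.
    print(masquerade("alice@server1.example.com", "example.com"))
```

In a LifeKeeper-protected pair, the practical consequence is that outgoing addresses do not reveal which of the two servers sent the message.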
DNS, Sendmail and LifeKeeper

DNS offers a mechanism (MX records) for specifying backup or alternate hosts for mail delivery. This mechanism also allows hosts to assume mail-handling responsibilities for other hosts that are not configured to accept mail, such as a null client. MX records also provide a mechanism for forcing all mail to go to the hub machine or mail server. MX records specify a mail exchanger for a domain name (i.e.
Active/Standby Scenario

The configuration scenario in this section describes the file movement and symbolic linking that take place in a LifeKeeper-protected SAMS environment, from the creation of the resource hierarchy, to the extension of that hierarchy to a backup server, to what occurs when the backup server takes over after a switchover or failover.
to the switchable IP address when asked for the “Host Name” in the Sendmail Switch installer program.
5. SAMS is tested to ensure that it works properly on both servers, using equivalent configuration options on each.
6. The MTA and MSP spool directories (e.g., /var/spool/mqueue and /var/spool/clientmqueue) or their subdirectories, if multiple mail queues are being used, must be manually symbolically linked to a directory on a shared file system.
The black arrows represent active symbolic links (i.e., the files on Server 1 are actively linked to the shared storage device after the resource is created).

Configuration Notes

During the creation of the SAMS resource instance on the primary server (i.e., Server 1), the Recovery Kit moves the /etc/md, /etc/mail, and /var/md/store directories to the shared file system. It then creates symbolic links on the local server (Server 1) that point to the moved directories on the shared device.
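The move-and-link pattern described in these notes can be sketched as follows. This is a minimal illustration of the general technique, not the Recovery Kit's own code; the function name relocate_and_link, the shared mount point, and the layout under it are assumptions made for the example.

```python
import os
import shutil


def relocate_and_link(local_dir: str, shared_root: str) -> str:
    """Move `local_dir` under `shared_root` and leave a symlink behind.

    Hypothetical helper: the Recovery Kit performs this work itself,
    and its actual directory layout on the shared file system is not
    documented here.
    """
    local_dir = local_dir.rstrip(os.sep)
    if os.path.islink(local_dir):
        # Already linked; report where the link points and do nothing.
        return os.readlink(local_dir)

    target = os.path.join(shared_root, os.path.basename(local_dir))
    if os.path.exists(target):
        # Refuse to overwrite data that is already on shared storage.
        raise FileExistsError(target)

    shutil.move(local_dir, target)   # works across file systems
    os.symlink(target, local_dir)    # old path now resolves to shared copy
    return target


if __name__ == "__main__":
    import tempfile

    # Demonstration in a scratch directory, not on real system paths.
    with tempfile.TemporaryDirectory() as scratch:
        local = os.path.join(scratch, "mail")
        shared = os.path.join(scratch, "shared")
        os.makedirs(local)
        os.makedirs(shared)
        print(relocate_and_link(local, shared))
```

Because the original path still resolves through the link, applications that open files under it need no configuration change after the move.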
The gray arrows represent dangling links: the files on Server 2 are linked to the shared storage device, but the shared device is not mounted on Server 2, so the links on Server 2 are not active.

Configuration Notes

During the extension of the SAMS resource instance to the backup server (i.e., Server 2), the Recovery Kit symbolically links the SAMS configuration and data directories /etc/md, /etc/mail, and /var/md/store to the versions on the shared file system.
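When verifying a server by hand, it can help to distinguish an active link from a dangling one. The sketch below shows one way to do this with the Python standard library; the function name link_state and the three labels it returns are hypothetical and are not LifeKeeper terminology beyond the words "active" and "dangling" used above.

```python
import os


def link_state(path: str) -> str:
    """Classify `path` as 'active', 'dangling', or 'not-a-link'.

    os.path.islink() inspects the link itself, while os.path.exists()
    follows it, so a link whose target is missing (for example because
    the shared file system is not mounted) is reported as dangling.
    """
    if not os.path.islink(path):
        return "not-a-link"
    return "active" if os.path.exists(path) else "dangling"


if __name__ == "__main__":
    # The directories protected by the SAMS Recovery Kit.
    for directory in ("/etc/mail", "/etc/md", "/var/md/store"):
        print(directory, link_state(directory))
```

On the backup server the expected result is "dangling" for all three directories while the primary server holds the shared file system.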
The gray arrows represent dangling links: the files on Server 1 are linked to the shared storage device, but the shared device is not mounted on Server 1, so the links on Server 1 are not active.

Configuration Notes

When Server 2 becomes the active or primary server, the following occurs:
1. LifeKeeper unmounts the shared file system from Server 1 and mounts it on Server 2. The dangling links on Server 2 now point to actual files.
2.
LifeKeeper Configuration Tasks

You can perform the following configuration tasks from the LifeKeeper GUI. The following four tasks are described in this guide, as they are unique to a SAMS resource instance, and different for each Recovery Kit.
• Create a Resource Hierarchy. Creates an application resource hierarchy in your LifeKeeper cluster.
• Delete a Resource Hierarchy. Deletes a resource hierarchy from all servers in your LifeKeeper cluster.
same drop down menu choices as the Edit menu. This, of course, is only an option when a hierarchy already exists. You can also right-click on a resource instance in the Resource Hierarchy Table (right-hand pane) of the status display window to perform all the configuration tasks except creating a resource hierarchy, depending on the state of the server and the particular resource.
To create a resource instance from the primary server, you should complete the following steps:
1. From the LifeKeeper GUI menu, select Edit, then Resource. From the drop down menu, select Create Resource Hierarchy.
IMPORTANT: The switchable IP address should be under LifeKeeper protection before creating the SAMS resource instance.
A dialog box will appear with a drop down list box with all recognized Recovery Kits installed within the cluster.
2. Select the Switchback Type. This dictates how the SAMS instance will be switched back to this server when it comes back into service after a failover to the backup server. You can choose either intelligent or automatic. Intelligent switchback requires administrative intervention to switch the instance back to the primary/original server.
5. Select the IP Tag. This is a tag name given to the IP Resource hierarchy that the SAMS resource will be dependent upon. The list will show only those IP addresses that are in-service on this server.
Important: Verify that the priority of the IP Tag on the primary server is higher than the priority of the IP Tag on the backup server.
Click on the Next button to proceed to the next dialog box.
6. Select or enter the Mail Tag. This is a tag name given to the SAMS hierarchy.
Creating mail/sams resource…
BEGIN creation of resource “sams” on server “smokey” at Fri Oct 12 10:31:34 EDT 2001
Creating Resource instance “sams with id “sams” on server “smokey”
devicehier: Using /opt/LifeKeeper/lkadm/subsys/scsi/device/bin/devicehier to construct the hierarchy
. . .
END successful creation of resource “sams” on server “tigger” at Fri Oct 12 10:32:19 EDT 2001

Click on the Next button to proceed to the next dialog box.
9.
10. Click the Done button to exit the Create Resource Hierarchy menu selection.

Deleting a Resource Hierarchy

To delete a resource hierarchy from all the servers in your LifeKeeper environment, complete the following steps:
1. From the LifeKeeper GUI menu, select Edit, then Resource. From the drop down menu, select Delete Resource Hierarchy.
2. Select the name of the Target Server where you will be deleting your SAMS resource hierarchy.
Click on the Next button to proceed to the next dialog box.
4. An information box appears confirming your selection of the target server and the hierarchy you have selected to delete.
Click on the Delete button to proceed to the next dialog box.
5. Another information box appears confirming that the SAMS resource was deleted successfully.
Important: Be careful when deleting the SAMS resource hierarchy. The SAMS Recovery Kit keeps the backup server’s copy of the configuration and data directories on the backup server itself. These directories are moved to /etc and renamed with the extension .LK. When the hierarchy is deleted, these .LK directories are renamed to their original names on the backup server.
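The rename step described in this note can be pictured with the following sketch. It is an illustration under stated assumptions, not the Recovery Kit's procedure: the function name restore_lk_copies is hypothetical, and the handling of an existing symbolic link at the original name (removing it before the rename) is an assumption that this guide does not confirm.

```python
import os
from typing import List


def restore_lk_copies(parent: str, suffix: str = ".LK") -> List[str]:
    """Rename every '<name>.LK' entry in `parent` back to '<name>'.

    Assumptions (not taken from the product): a symbolic link found at
    the original name is removed first, and a real file or directory at
    the original name is never overwritten.
    """
    restored = []
    for entry in sorted(os.listdir(parent)):
        if not entry.endswith(suffix) or len(entry) == len(suffix):
            continue
        saved = os.path.join(parent, entry)
        original = saved[: -len(suffix)]
        if os.path.islink(original):
            os.unlink(original)          # drop the link to shared storage
        elif os.path.exists(original):
            continue                     # leave real data untouched
        os.rename(saved, original)
        restored.append(original)
    return restored
```

A function like this would normally be tried on a scratch copy first, because after the rename the backup server runs with its own pre-LifeKeeper configuration rather than the shared one.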
dialog box will not appear, since the wizard has already identified the template server in the create stage. This is also the case when you right-click either the SAMS resource icon in the left-hand pane or the SAMS resource box in the right-hand pane of the GUI window and choose Extend Resource Hierarchy.
Click on the Next button to proceed to the next dialog box.
5. Select the Switchback Type. This dictates how the SAMS instance will be switched back to this server when it comes back into service after a failover to the backup server. You can choose either intelligent or automatic. Intelligent switchback requires administrative intervention to switch the instance back to the primary/original server.
7. An information box will appear explaining that LifeKeeper has successfully checked your environment and that all the requirements for extending this SAMS resource have been met. If any requirement has not been met, LifeKeeper does not allow you to select the Next button, and the Back button is enabled.
WARNING: SAMS is running on server “blueridge1”. Please stop SAMS since the configuration files will be overwritten.
appear if you are extending this SAMS resource immediately following its creation.
Click on the Next button.
10. This dialog box is for information purposes only. You cannot change the Netmask that appears in the box.
Note: This dialog box will not appear if you are extending this SAMS resource immediately following its creation.
Click on the Next button.
11. Select or enter the Network Interface.
Click on the Next button to proceed to the next dialog box.
13. Select or enter the IP Resource Tag.
Click on the Next button.
14.
15. Click the Done button in the last dialog box to exit from the Extend Resource Hierarchy menu selection.
Note: Be sure to test the functionality of the new instance on both servers.

Unextending Your Hierarchy

1. From the LifeKeeper GUI menu, select Edit, then Resource. From the drop down menu, select Unextend Resource Hierarchy.
2. Select the Target Server where you want to unextend the SAMS resource. It cannot be the server where SAMS is currently in service.
4. An information box appears confirming the target server and the SAMS resource hierarchy you have chosen to unextend.
Click the Unextend button.
5. Another information box appears confirming that the SAMS resource was unextended successfully.
Performing a Manual Switchover from the GUI

You can initiate a manual switchover from the LifeKeeper GUI by selecting Edit, then Resource, and finally In Service from the drop down menu. For example, an in-service request executed on a backup server causes the application hierarchy to be placed in service on the backup server and taken out of service on the primary server.
Troubleshooting

This section lists messages that you may encounter while creating, extending, removing, and restoring a LifeKeeper SAMS hierarchy and, where appropriate, explains the cause of each error and the action needed to resolve it. Messages from other LifeKeeper scripts and utilities are also possible. In these cases, please refer to the documentation for the specific script or utility.
ERROR: Must specify Sendmail configuration file name
The name of the Sendmail configuration file must be specified. Enter the correct name for the configuration file (/etc/mail/sendmail.cf).

Unknown error in script mailhier, err=$ERR
An unknown error has occurred in the script. See the LifeKeeper error log for additional troubleshooting information.

ERROR: sendmail configuration file “$CONFIG” not found
The Sendmail configuration file that was specified was not found.
ERROR: Message store directory setting invalid or not found in mail store configuration file “$MS_CONF”
The ms-path variable is missing or contains an incorrect value in the /etc/md/store/ms.conf file.

ERROR: Failed to move “$DIR” to “$SHARED_FS”
LifeKeeper was unable to move a directory and its contents to the shared file system.

ERROR: Failed to create sams resource instance
LifeKeeper was unable to create the SAMS resource.
ERROR: Failed to create LifeKeeper Application “mail” on server “$SERVER”
LifeKeeper was unable to create the LifeKeeper application type “mail” on the specified server. See the LifeKeeper error log for additional troubleshooting information.

ERROR: Failed to create LifeKeeper “mail” Resource Type “sams” on server “$SERVER”
LifeKeeper was unable to create the LifeKeeper resource type “sams” on the specified server. See the LifeKeeper error log for additional troubleshooting information.
Error - canextend () - The “$DIR” directory does not exist on server “$TARGET_SYSTEM”
The specified SAMS directory does not exist on the target system. Create the directory on the target system and attempt to extend the SAMS resource hierarchy again.

Error - canextend () - Failed to copy “$CONFIG_FILE_NAME” on server “$TEMPLATE_SYSTEM” to “$NEW_CONFIG” on server “$TARGET_SYSTEM”
LifeKeeper tried to copy the configuration file from the template system to the target system.
Error - extend () - LifeKeeper Internal ID ($ID) is already being used by another resource type on “$SERVER”
LifeKeeper uses an Internal Resource Identifier that must be unique for all servers in a cluster. There is already a resource that has the same ID as SAMS. Review all the resources that are LifeKeeper-protected on the specified server.

Error - extend () - Failed to create resource instance on $SERVER
LifeKeeper creates a resource instance to represent the SAMS application.
restore: sams: ERROR: sendmail configuration file “$FILE” is empty
The Sendmail configuration file is empty or does not exist.

restore: sams: SAMS is already running on $IP:$PORTLIST
This message is for informational purposes only. It indicates that the SAMS daemons that are to be brought in-service are already running on the specified IP address and ports.

restore: sams: ERROR: Unable to start the SAMS daemons
restore: sams: ERROR: Restore of sams resource “$TAG” failed.
LifeKeeper was unable to stop the SAMS processes. The actual error messages from subprocesses are displayed within this message. See the LifeKeeper error log for additional troubleshooting information.

SAMS Resource Health Monitoring Error Messages

daemon is not responding on $IP:$PORT
quickCheck: sams: attempting local recovery of resource “$TAG”
These two messages indicate that one or more SAMS daemons are not functioning properly and must be restarted by LifeKeeper.
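
When one of these messages appears, it can be useful to confirm by hand whether a daemon accepts connections on the reported address and port. The sketch below shows a simple TCP connection test; it is not the check that the LifeKeeper quickCheck script performs (that logic is not described in this guide), and the function name daemon_responding, the example address, and the port numbers are assumptions chosen for illustration.

```python
import socket


def daemon_responding(ip: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to ip:port succeeds.

    This only proves that something is listening; it does not verify
    that the service behind the port answers protocol requests.
    """
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Hypothetical switchable IP address; 25, 110 and 143 are the
    # standard SMTP, POP3 and IMAP ports.
    for port in (25, 110, 143):
        print(port, daemon_responding("192.0.2.10", port, timeout=2.0))
```

A False result for a port that should be served is consistent with the "daemon is not responding" message and is a reasonable point at which to consult the LifeKeeper error log.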