Insight Control for Linux 7.0 User Guide

Abstract

This document describes how to set up and use Insight Control for Linux to monitor and manage HP ProLiant servers that were licensed with Insight Control for Linux. This document builds on the information from the HP Insight Control for Linux Installation Guide, which you used to install and configure HP Systems Insight Manager (HP SIM) and Insight Control for Linux on the Central Management Server (CMS).
© Copyright 2008, 2010, 2012 Hewlett-Packard Development Company, L.P. Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license. The information contained herein is subject to change without notice.
Contents
I Introduction
1 Using Insight Control for Linux
1.1 Overview
1.2 Integration with Systems Insight Manager
1.
5 Managing the Insight Control for Linux repository
5.1 Introduction to the Insight Control for Linux repository
5.1.1 Configuring a remote repository
5.1.2 Repository contents
5.1.3 Repository item naming conventions
8.3.1 Opening network ports on managed systems
8.3.2 Resolving host names on the CMS
8.3.3 Installing additional components of the PSP
8.3.4 Configuring agents and HP SIM SSH keys
8.3.5 Configuring console access and logging
11.4 Obtaining virtual guest and virtual host associations
11.5 Establishing monitoring for virtual hosts and virtual guests
11.6 Virtual guest operations
12 Using Insight Control for Linux to update HP ProLiant firmware
12.1 Overview of updating HP ProLiant firmware
19.4 Installing Insight Control for Linux management agents
19.5 Verifying successful configuration of the monitoring services
19.5.1 Ensuring that Nagios is reporting status
19.5.2 Summarizing service status
23 Miscellaneous topics
23.1 Changing management processor credentials
23.2 Changing the default port for the repository web server
23.3 Increasing the number of servers that can be discovered concurrently
23.4 Changing the IP address of the CMS
25.15.4 Deploying Linux images
25.16 Troubleshooting PSP installation failures
25.17 Troubleshooting PXE Boot problems
25.18 Troubleshooting the run script and run SSH command tools
25.
Part I Introduction
1 Using Insight Control for Linux This chapter addresses the following topics: • “Overview” (page 11) • “Integration with Systems Insight Manager” (page 12) • “Insight Control for Linux extensions to HP SIM” (page 12) • “Insight Control for Linux toolboxes” (page 15) • “Insight Control for Linux command environment” (page 16) • “Internal task queuing and management” (page 16) • “Synchronized system clocks” (page 17) • “Insight Control for Linux RAM disk environment” (page 17) • “Network con
• Configuring network parameters • Installing an operating system, PSP, and agents • Configuring monitoring services After a server becomes a managed system, you can monitor it and manage it. 1.2 Integration with Systems Insight Manager Insight Control for Linux is a suite of software and tools that combine to provide a powerful mechanism for discovering, installing, monitoring, and managing HP ProLiant servers.
Table 1 Insight Control for Linux extensions to the HP Insight Control user interface; (continued) Menu item Description Documented in used by the Network Configuration Editor tool. The network definitions are used by the OS installation tools to implement booting using the virtual media mechanism. IMPORTANT: The network definitions must be created before initiating bare-metal discovery through virtual media.
Table 1 Insight Control for Linux extensions to the HP Insight Control user interface (continued)
Menu item: Deploy→Operating System→Red Hat Interactive
Description: Starts an interactive Red Hat Enterprise Linux (RHEL) installation on one or more target managed systems.
Documented in: Section 9.4.2 (page 93)
Menu item: Deploy→Operating System→Red Hat (Kickstart)
Description: Uses a default or user-supplied configuration file to start an unattended RHEL installation on one or more target managed systems.
Documented in: Section 9.4.
Menu item: Tools→Server Controls→Power Off Server...
Description: Makes a remote call to the management processor to set power status to off abruptly.
Documented in: Section 15.1 (page 140)
Menu item: Tools→Server Controls→Power On Server...
Description: Makes a remote call to the management processor to set power status to on.
Documented in: Section 15.2 (page 140)
Menu item: Tools→Server Controls→Reboot Server...
For information on creating administrator accounts, that is, non-root accounts with the privileges required to access and use HP SIM, see the HP Insight Control for Linux Installation Guide. 1.5 Insight Control for Linux command environment Table 2 lists the Insight Control for Linux commands that you can run from the command line on the CMS or on any management hub, with the exception of the pdsh command.
1.7 Synchronized system clocks
When using Insight Control for Linux, and especially when using Insight Control for Linux tools to install operating systems on managed systems or to capture and deploy Linux images, HP recommends that you keep system clocks up to date and synchronized. Synchronization is required for the Console Maintenance Facility to access a managed system using SSH.
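A quick way to spot unsynchronized clocks is to compare epoch timestamps taken on the CMS and on a managed system. The following is a minimal sketch, not part of the product: the function names, the MAX_SKEW threshold, and the host name node2 in the usage comment are assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch: compare two epoch timestamps and decide whether they are in sync.
# MAX_SKEW and the host name in the usage comment are illustrative assumptions.

MAX_SKEW=${MAX_SKEW:-5}   # tolerated offset in seconds (assumed value)

# Print the absolute difference between two epoch timestamps.
clock_skew() {
    if [ "$1" -ge "$2" ]; then
        echo $(( $1 - $2 ))
    else
        echo $(( $2 - $1 ))
    fi
}

# Return 0 if the two timestamps are within MAX_SKEW seconds of each other.
clocks_in_sync() {
    [ "$(clock_skew "$1" "$2")" -le "$MAX_SKEW" ]
}

# Usage (requires SSH access to the managed system):
#   remote=$(ssh node2 date +%s); local_now=$(date +%s)
#   clocks_in_sync "$local_now" "$remote" || echo "node2 clock is off"
```

A check like this only reports drift; keeping the clocks aligned is normally the job of a time service such as NTP.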
When the system is powered on, the bootable image is loaded from the CMS by way of the management processor. Virtual media does not use DHCP. The system boots a custom RAM disk that includes the predefined network configuration information (for example, the IP address, net mask, gateway, and so on). Insight Control for Linux provides tools that let you define the network information parameters, edit those network parameters, and initiate bare-metal discovery. 1.
When you run the Options→IC-Linux→Configure Management Services task, it determines if this file exists: • If the file does not exist, it creates the file and assigns numbers based on the managed systems and the current numbering scheme. The Central Management Server is always node number 1. • If the file already exists, the configuration task reads the nodenumbers file and assigns the node numbers according to the file contents.
1.11.2 Viewing managed system names After the Configure Management Services task is run, you can list the managed systems with their associated names; use the shownode info command as described in Section 21.2.2 (page 181). 1.12 Connecting to HP SIM To log in and connect to HP SIM, follow these steps: 1. Open a browser window. 2.
You also must back up HP SIM configuration files to restore your configuration. For more information on these HP SIM configuration files, see the following white paper: Backing up and restoring HP SIM or greater data files in an HP-UX and Linux environment 1.
2 Security 2.1 Integrated security features This section describes features that are integrated into HP SIM and Insight Control for Linux to make them secure. Security features are also discussed in context of the associated topic throughout this document. • Browser Connections HP SIM enforces a secure connection to the web browser.
The SSH service also enables file transfer with the scp or sftp commands over the same port as SSH.
• pdsh Keys
The pdsh command uses public host keys to authenticate remote hosts and supports public key authentication to authenticate users.
• cmfd Keys
The console command uses SSL keys to connect to the console management facility daemon (cmfd) for console access.
• Secure Boot Mechanism
Virtual media support is provided as the secure boot mechanism.
to individual servers. There is no mechanism for verifying the identity of the server providing the image; neither method protects against a man-in-the-middle attack. Standard Linux deployment, which uses SSH to push an image to the target systems, is a less scalable but more secure method than large-scale deployment. HP recommends the use of a dedicated management LAN for large-scale Linux deployments. For more information on scalable deployment, see Section 10.
An alternate method is to automate this procedure by using a script to extract the iLO's certificate and add it to the HP SIM trusted certificate list. The following is an example of a script that accepts a series of iLO certificates and adds them to the HP SIM trust store.
#!/bin/sh
#
# Get certificate for each iLO passed in as an argument
# and add it to the HP SIM trust store.
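The certificate-retrieval half of such a script can be sketched with the standard openssl command. This is a hedged illustration rather than the guide's own script: the function names extract_pem and fetch_ilo_cert are hypothetical, the iLO is assumed to serve HTTPS on port 443, and the step that imports the certificate into the HP SIM trust store is deliberately left out because it depends on HP SIM tooling not shown here.

```shell
#!/bin/sh
# Sketch: retrieve the certificate an iLO presents, as PEM text.
# Function names are hypothetical; port 443 is an assumption.

# Print only the PEM certificate block(s) found on standard input.
extract_pem() {
    sed -n '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/p'
}

# fetch_ilo_cert HOST: print the certificate presented by HOST on port 443.
# Requires the openssl command and network access to the iLO.
fetch_ilo_cert() {
    openssl s_client -connect "$1:443" </dev/null 2>/dev/null | extract_pem
}

# Usage (one file per iLO passed on the command line):
#   for ilo in "$@"; do fetch_ilo_cert "$ilo" > "/tmp/$ilo.pem"; done
```

Because nothing here verifies the certificate before saving it, a script like this is best run over a trusted management network.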
3 Managing licenses
This chapter describes the following topics:
• “Licensing overview” (page 26)
• “Adding the Insight Control for Linux license key to HP SIM” (page 26)
• “Licensing virtual guests” (page 27)
3.1 Licensing overview
The licenses for Insight Control power management and Insight Control virtual machine management are bundled with the Insight Control for Linux license. The iLO Advanced license remains a separate license.
3.3 Licensing virtual guests When a virtual host (VM host) is licensed for Insight Control for Linux, all guests of that VM host are considered licensed for Insight Control for Linux as well, provided that the virtual guests are properly associated with their virtual host. You can license a virtual machine guest (VM guest) without licensing its host or you can license it in addition to licensing its host, in either case unnecessarily consuming licenses.
4 Understanding tasks and task results This chapter addresses the following topics: • “Task results overview” (page 28) • “Understanding task results” (page 28) • “Task results page” (page 28) • “Common task results” (page 30) • “HP SIM standard task results format” (page 33) • “Scalable task results format” (page 37) 4.1 Task results overview HP SIM and Insight Control for Linux enable you to manage systems by scheduling and running tasks.
Figure 1 Task results page
Table 4 lists the components of the Task Results page.
Table 4 Components of the Task Results page
Component: Task Instance Results
Description: Provides the status of the running task or the task that is selected in the task list log at the top of the page.
Available in HP SIM standard view, scalable view, or common to both views: Common
Component: Use SIM Standard Task Results Format radio button
Description: This option is only offered when you run an Insight Control for Linux task.
Table 4 Components of the Task Results page (continued) Component Description Available in HP SIM standard view, scalable view, or common to both views Selecting this radio button provides an operation oriented format that enables you to view the status of each operation in a task as it completes on each target. This format is particularly useful when you are running an Insight Control for Linux task on many targets, for example, when you are installing a Linux OS on many servers at once.
In Insight Control for Linux, it might not be possible to cancel a task immediately after you select the Stop button because an operation might be at a point on a target where it cannot be interrupted. This can result in a task changing from the Cancelled state to a Complete or Failed state because the cancel operation could not be processed in time. A task End Time is initially set to the time when you select Stop.
◦ All target details, including all information displayed in the operation status table and the log for each operation TIP: If you select All Systems for the report, the target level results are displayed for all targets, each separated by a line. 4.4.2.2 Rerun non-complete targets button The Rerun Non-Complete Targets button is enabled only when the following conditions exist: • At least one target for the task has a Failed or Cancelled status. • All targets for the task are in a Terminal state.
Figure 5 View of the operation details log 4.5 HP SIM standard task results format This section describes the portions of the Task Results page that are specific to the HP SIM Standard Task Results Format, which is the default view. Figure 6 illustrates the HP SIM Standard Task Results format. The figure shows the task results for an instance of a Red Hat Kickstart OS installation task running on three target servers.
Figure 6 HP SIM standard task results format 4.5.1 Summary status and target status area Figure 7 illustrates the Summary status: area and target status area, which provide the overall status of a task on each target. Figure 7 View of the summary status and target status areas Table 5 describes the information displayed in the Summary status: area.
Table 5 Description of target status area Column heading Description Target Name Name of the target managed system on which the task was run. Status The status of a target is computed from the status of its operations. Non-terminal target status Pending: All operations can have the Pending status. Running: At least one operation has the status Running. A percent complete is also displayed.
4.5.1.2 Log button in the target status area When you select the Log button, a new window opens that displays the log for all operations for the task, including the following information: • A summary of the task level information • The information displayed in the target status table for the selected target • A block of information for each operation in the task, including the log The log screen does not auto-refresh.
Table 6 Description of target details table (continued) Column heading Description NOTE: When an operation has a status of Cancelling, the target status is Cancelled, but the end time is empty for both the operation and the target. Terminal operation status Complete: The execution of the operation completed as expected. Failed: The operation was not successful. Cancelled: You pressed the Stop button for the target or task and this operation was the last one run or the next one to run.
Figure 9 Scalable task results format 4.6.1 Operations table Figure 10 illustrates the Operations table, which lists individual operations within a task and provides the status of the entire operation as it starts and completes on each target. The important thing to know is that operation status represents the status of the operation on every target.
Table 7 lists the information displayed in the Operations table. Table 7 Description of the operations table Column heading Description Operation Name The name of the operation that is run as a component of an Insight Control for Linux task. Status Complete: The operation has successfully completed on all targets. Running: The operation has started but it is not yet complete on all targets. Pending: The operation has not yet started.
Table 8 Description of the operation target details table
Column heading: Target Name
Description: The name of the target on which the operation ran or is running.
Column heading: Status
Description:
Complete: The operation has successfully completed on the targets.
Running: The operation has started but it is not yet complete on all targets.
Pending: The operation has not yet started.
Cancelling: You have cancelled the task by selecting the Stop button for the target or for the task.
Part II Deployment
5 Managing the Insight Control for Linux repository This chapter provides an overview of the Insight Control for Linux repository and how to perform activities related to it. The following topics are addressed: • “Introduction to the Insight Control for Linux repository ” (page 42) • “Registering items in the Insight Control for Linux repository” (page 45) • “Copying software to the Insight Control for Linux repository” (page 51) • “Editing and deleting registered items” (page 56) 5.
After an OS is registered with the repository, manually copy the vendor-supplied installation media to the appropriate directories in the repository. The media can be a physical CD or DVD, or it can be an .iso image. You must expand the .iso image into flat files. IMPORTANT: Be aware that repository management tasks do not follow typical authorization models. All HP SIM users can select, add, delete, or modify all Insight Control for Linux repository items regardless of their user authorizations. 5.1.
Figure 14 Remote repository using the CMS as a gateway 5.1.2 Repository contents Table 9 lists the classes of items that are stored in the repository. Table 9 Repository item types Name Description ISO ISO image PSP An OS-specific bundle of ProLiant optimized drivers, utilities, and management agents. Supported OS Vendor-supplied installation files for supported versions of RHEL or SLES.
Table 10 Default repository contents Item type Directory name examples Description PSP Dependency Script example_dependency.sh Provides a sample PSP dependency script that installs RPM dependencies on the managed systems; the PSP installation process requires the RPM dependencies.
simple process: you register the OS in the repository, copy the vendor-supplied installation files to the repository, and copy the appropriate boot files to the associated boot target directory.
To register an OS in the repository, follow these steps:
1. Select the following menu item from the Insight Control user interface: Options→IC-Linux→Manage Repository
2. Select New.
3. From the Item Type drop-down list, select either Supported OS or Custom OS.
Table 11 OS registration information (continued) Supply for supported OS, custom OS, or both Registration information Description the value you enter for the Path via HTTP must include the first directory name, such as http://192.0.2.1/sles10/CD1/ or http://192.0.2.1/sles11sp1/DVD1/. If you do not include the media subdirectory, the installation fails. NOTE: This item does not apply to VMware ESXi.
After OS registration, the next task is to copy the vendor-supplied OS installation files into the repository, which is described in “Copying software to the Insight Control for Linux repository” (page 51). 5.2.3 Registering PSPs A ProLiant Support Pack (PSP) provides the agents and drivers for use on HP servers. Certain agents must be installed on managed systems so that HP SIM and Insight Control for Linux can properly monitor and manage them.
• Associate the PSP with the operating systems it supports. Use the Ctrl-Left Mouse Button key combination to select all the operating systems that the PSP supports. The PSP must be properly associated with all operating systems it supports so that Insight Control for Linux installation tools can determine the PSP file from which to extract the required components during an OS installation.
6. Select Save.
7.
• Optionally, associate the configuration file with a custom OS. It is your responsibility to apply the commands in the installation task to retrieve it.
6. Select Save.
7. View the summary information, especially the path information, which provides the details of the repository registration including the directory and path created in the repository.
Item Name: MyConfig
Installation configuration path on disk: /opt/repository/instconfig/MyConfig/MyConfig.cfg
Installation configuration path via http: .
9. Select OK to return to the Manage Repository screen.
5.2.6 Registering an ISO image
To register an ISO image in the repository, follow these steps:
1. Select the following menu item from the Insight Control user interface: Options→IC-Linux→Manage Repository...
2. Select New.
3. Select the ISO image item type from the drop-down menu.
4. Select Next.
5. Select the radio button to indicate where the ISO image is currently hosted: either Locally on the CMS or Remotely on a remote HTTP server.
6.
7.
5.3.1 Copying RHEL into the local repository on the CMS The OS directory and the boot target directory where you copy the installation files were provided to you during the OS repository registration process described in Section 5.2.2 (page 45). You were instructed to record the paths to these directories.
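When the installation media is an .iso image rather than a physical disc, the image has to be expanded into flat files in the OS directory. The following sketch shows one common way to do that with a loop mount. The helper names, the MNT mount point, and the paths in the usage comment are assumptions for illustration; substitute the OS directory that was reported when you registered the OS.

```shell
#!/bin/sh
# Sketch: expand an .iso image into flat files in a repository OS directory.
# Helper names, the mount point, and the example paths are hypothetical.
# Mounting normally requires root privileges.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# expand_iso ISO_FILE TARGET_DIR
expand_iso() {
    iso=$1
    target=$2
    mnt=${MNT:-/mnt/iso}          # temporary mount point (assumed)
    run mkdir -p "$mnt" "$target" &&
    run mount -o loop,ro "$iso" "$mnt" &&
    run cp -a "$mnt/." "$target/" &&
    run umount "$mnt"
}

# Usage (hypothetical image name and OS directory):
#   DRY_RUN=1 expand_iso /tmp/rhel-dvd.iso /opt/repository/os/my_rhel
```

Setting DRY_RUN=1 prints the commands instead of running them, which is a convenient way to confirm the source and target paths before copying several gigabytes.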
To copy vendor-supplied SLES Version 10 OS installation files into the repository, follow these steps: 1. Count the number of installation media discs (CD or DVD) that were shipped with the SLES Version 10 distribution. 2. For each installation disc, create a sequentially numbered directory under the OS-specific directory in /opt/repository/os/ in the repository.
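Creating the sequentially numbered media directories can be scripted. The sketch below assumes the CD1, CD2, ... naming that appears in the Path via HTTP examples in this guide (for example, sles10/CD1/); the function name and the OS directory in the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch: create one numbered directory per installation disc.
# The function name and the example OS directory are hypothetical.

# make_media_dirs OS_DIR COUNT [PREFIX]
# Creates OS_DIR/PREFIX1 ... OS_DIR/PREFIXCOUNT (PREFIX defaults to CD).
make_media_dirs() {
    os_dir=$1
    count=$2
    prefix=${3:-CD}
    i=1
    while [ "$i" -le "$count" ]; do
        mkdir -p "$os_dir/$prefix$i" || return 1
        i=$((i + 1))
    done
}

# Usage (four CDs, hypothetical OS directory):
#   make_media_dirs /opt/repository/os/sles10 4
```

Pass DVD as the third argument when the distribution ships on DVD media.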
1. Visit the following web address to determine the appropriate link: http://drivers.suse.com/hp/ Choose the appropriate link: • For your server • For your server's architecture • For the version of the SLES operating system Read the install-readme.html file to verify the selection and for installation instructions. 2. Download the KISO image from the SUSE web address: # wget http://drivers.suse.
5.3.4 Copying virtual machine OS into the repository The procedure for copying virtual machine OS into the repository depends on the virtual machine software: • For VMware ESX, the process is identical to copying a RHEL operating system to the repository. For more information, see “Copying software to the Insight Control for Linux repository” (page 51).
2. Copy the compressed tar file (*.tar.gz) to the PSP path on disk directory that was created when you registered the PSP in the repository (for example, /opt/repository/psp/psp-redhatV50). Do not extract this file, because the installation process does it for you.
IMPORTANT: Each /opt/repository/psp/psp-* directory must contain only the PSP tar.gz file.
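The "only the PSP tar.gz file" rule is easy to verify before starting an installation. The following is a small sketch under that rule; the function name check_psp_dir is hypothetical and is not an Insight Control for Linux command.

```shell
#!/bin/sh
# Sketch: confirm that a PSP directory holds exactly one entry, a *.tar.gz file.
# check_psp_dir is a hypothetical helper, not a product command.

# check_psp_dir DIR: return 0 only if DIR contains a single *.tar.gz file
# and nothing else (hidden files are counted too).
check_psp_dir() {
    dir=$1
    total=$(ls -A "$dir" | wc -l | tr -d ' ')
    tarballs=$(ls -A "$dir" | grep -c '\.tar\.gz$' || true)
    [ "$total" -eq 1 ] && [ "$tarballs" -eq 1 ]
}

# Usage:
#   check_psp_dir /opt/repository/psp/psp-redhatV50 || echo "unexpected files"
```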
5.4.2 Deleting registered items from the repository NOTE: Deleting an item from the repository does not delete the corresponding directory in the /opt/repository directory nor does it delete the files that you might have copied to that directory. If you want to delete or move the directory and files, delete or move them manually after you first perform the following procedure. To remove an item from the repository, follow these steps: 1.
6 Configuring network parameters for virtual media Topics include: • “Introduction” (page 58) • “Preparing for virtual media” (page 59) • “Using the Define Networks tool” (page 62) • “Using the Network Configuration Editor” (page 65) • “Next Step” (page 69) 6.1 Introduction Virtual media is a mechanism available only for systems with an iLO-based management processor. Virtual media allows a system to boot an ISO image over the network; it is the alternate boot mechanism to PXE.
Usually, network configuration is performed in two stages: • In the first stage, you define the network configuration parameters and store them under a network name. You can have as many network name definitions as you want. • In the second stage, you use the Network Configuration Editor to apply the predefined network names to the server's management processor. These tools are discussed in “Using the Define Networks tool” (page 62) and “Using the Network Configuration Editor” (page 65), respectively.
3. Select either the Discover a group of systems or Discover a single system button. There is a slight difference in the window for these two choices. The Discover a group of systems choice is in the illustration.
4. Enter a descriptive name in the Name text field. The descriptive name must be either listed in the CMS's hosts file or known to the CMS's name server. Otherwise, enter an IP address.
5. Ensure that the Schedule check box is not checked.
6.
specified when you installed Insight Control for Linux. The iLO supports multiple user accounts; if your iLO is already configured with other user accounts, you can add another user account.
6.2.3 Licensing virtual media on the management processor Your iLO Advanced license key activates iLO Advanced features. For the latest instructions, which may supersede those shown below, see the following website: www.hp.com/go/insightlicense These instructions assume the network client has a network connection to the iLO-based management processor. To install the iLO Advanced license and enable the iLO Advanced functionality using a supported web browser: 1.
Figure 15 Define networks tool The parameters in the Define Networks tool include the following: • Available Networks This is a list of the network definitions. When you create a new network definition, its name is displayed in this list after pressing Save. When a network name in the Available Networks list is selected and you select the Load button, its network parameters are displayed in the appropriate fields; you can select only one network at a time.
You can enter a comma-separated list of ranges, for example: 192.168.10.5-192.168.10.50,192.168.11.100-192.168.11.199 If you want to assign IP addresses manually, leave this field blank. • SNMP Server(s) Optionally enter a list of SNMP servers. These entries are reserved for future use. • Name Server(s) Optionally enter a comma-separated list of DNS Name server IP addresses for this subnet. • NTP Server(s) Optionally enter a comma-separated list of NTP servers.
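Before saving a network definition, it can help to know how many addresses a comma-separated list of ranges actually provides. The sketch below computes that count in plain shell; the function names are hypothetical, and the code assumes well-formed dotted-decimal IPv4 addresses without leading zeros.

```shell
#!/bin/sh
# Sketch: count the addresses in a comma-separated list of IPv4 ranges.
# Function names are hypothetical; input is assumed to be well formed.

# ip_to_int A.B.C.D: print the address as an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# count_addresses "start-end,start-end,...": print the total number of
# addresses, counting both ends of every range.
count_addresses() {
    total=0
    old_ifs=$IFS
    IFS=,
    for range in $1; do
        IFS=$old_ifs
        start=$(ip_to_int "${range%-*}")
        end=$(ip_to_int "${range#*-}")
        total=$(( total + end - start + 1 ))
    done
    IFS=$old_ifs
    echo "$total"
}

# Usage:
#   count_addresses '192.168.10.5-192.168.10.50,192.168.11.100-192.168.11.199'
```

For the example list above, the first range contributes 46 addresses and the second 100, so the network definition can serve 146 systems before addresses must be assigned manually.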
2. Select the Delete button. If no systems have this network applied to them, the network definition is erased and its name is removed from the Available Networks list.
6.4 Using the Network Configuration Editor
Use the Network Configuration Editor to assign networking parameters (that you defined with the Define Networks task) to the servers that will be booted using the virtual media mechanism; this ensures that the server's network is set up properly.
Figure 16 Network Configuration Editor page
4. For each MP, optionally verify it by moving the mouse pointer over the Management Processor Name field, but do not select it. The MP's serial number and IP address are displayed to help you identify it.
5. Each target MP is listed in a table. You have the option of:
• Selecting individual target MPs. Click the checkbox in the left column of the individual target MP.
• Selecting all the target MPs listed.
b. Enter the base name for the server names. For the example, the base name would be sage.
c. Enter Iterator Start Value. For the example, that value would be 01 to ensure a leading zero. The number of digits that you enter for the value for the iterator determines whether the host names generated have leading zeroes. For example, if you entered comp for the base name and 001 for the iterator, the first available host name would be comp001, the next would be comp002, and so on.
d.
7.
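The naming rule described above (base name plus a zero-padded iterator whose width follows the number of digits entered) can be illustrated with a short shell function. This is only a sketch of the rule: the function name is hypothetical, and unlike the Network Configuration Editor it does not skip host names that are already in use.

```shell
#!/bin/sh
# Sketch: generate host names from a base name and an iterator start value.
# gen_host_names is a hypothetical helper; it does not check for names in use.

# gen_host_names BASE START COUNT
# The number of digits in START sets the zero-padded width of the counter.
gen_host_names() {
    base=$1
    start=$2
    count=$3
    width=${#start}
    n=$start
    # Strip leading zeros so the shell does not read the value as octal.
    while [ "${#n}" -gt 1 ] && [ "${n#0}" != "$n" ]; do
        n=${n#0}
    done
    i=0
    while [ "$i" -lt "$count" ]; do
        printf "%s%0${width}d\n" "$base" "$((n + i))"
        i=$((i + 1))
    done
}

# Usage:
#   gen_host_names comp 001 3     # prints comp001, comp002, comp003
```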
Selecting a network from this list assigns that network to the NIC represented by the MAC address selected in the Port/MAC Address column. This automatically assigns the next IP address available in the IP address range of the network and assigns the other network values (that is, the gateway, the name server, the domain, and the net mask) for that network to the NIC. If the IP address range was not specified, this field is blank and you must specify the IP address within the selected network here. 9.
In this dialog box, select Apply to set these values and close the dialog box. Selecting Cancel closes the dialog box without taking action. Save Selecting this button saves the settings for the selected targets to disk. Reload Selecting this button loads the settings for the selected target with the values stored in the disk file. Any changes that you did not save are lost.
7 Discovering systems, switches, and enclosures This chapter addresses the following tasks, which you must complete in the following order when you are configuring and setting up Insight Control for Linux: 1. “Discovering systems” (page 70) 2. “Assigning Insight Control for Linux licenses to discovered systems” (page 73) 3. “Preparing and discovering switches and enclosures” (page 74) 4. “Changing the boot method” (page 75) 7.
The following items are individual methods to discover a bare-metal server to be booted using PXE. Choose the one that applies. • Use the Initiate Bare Metal Discovery tool described in Section 7.1.2 (page 71). Be sure to select the PXE radio button in step 4. If you need information on discovering an iLO, see “Discovering the management processor with HP SIM” (page 59). • Power on the server, watch the console, and press the F12 key when prompted to initiate a one-time PXE boot.
IMPORTANT:
• Ensure that HP SIM has discovered the MP. Use the HP SIM Options→Discovery... menu item. Credentials for the MP are taken from the default MP credentials unless otherwise specified. Passwords for management processors either must be known to Insight Control for Linux or already set to global values.
• You can also use the Initiate Bare-Metal Discovery tool to discover bare-metal systems through PXE or virtual media.
components of the PSP, but at a minimum, you must install the components that HP SIM requires, which are listed in Table 19 (page 135). To download a PSP or obtain the associated HP ProLiant Support Pack User Guide, follow the instructions in Section 26.7.2 (page 250).
2. On the system to be discovered, use the following command to configure SNMP:
# /sbin/hpsnmpconfig
3. Repeat steps 1 and 2 for every installed system to be discovered. When you are finished, proceed to step 4 in this procedure.
4.
7.3 Preparing and discovering switches and enclosures To discover switches and HP BladeSystem enclosures for Insight Control for Linux monitoring, follow these steps. Skip this task if the configuration does not contain enclosures or switches or you do not want to monitor them with Insight Control for Linux. 1. If one or more HP BladeSystem enclosures are present, go to each enclosure and set the Onboard Administrator (OA) user name and password credentials.
f. Scroll to the {collection_name}_Switches subcollection.
g. Select the radio button next to the {collection_name}_Switches subcollection.
h. Select Edit....
i. Select a switch listed in the Available Items column, and use >> to move it to the Selected Members column.
j. Select OK to add the switch to the Insight Control for Linux {collection_name}_Switches subcollection.
k. Repeat the last two steps for every switch you want to monitor.
7.
7.5 Next steps If you are configuring Insight Control for Linux for the first time, proceed to Chapter 8 (page 77) to install and set up your managed systems.
8 Setting up managed systems This chapter provides an overview of setting up managed systems for Insight Control for Linux monitoring. This chapter addresses the following tasks, which you must complete in this order: 1. “Populating the Insight Control for Linux repository” (page 77) 2. “Setting up management hubs” (page 150) 3. “Linux OS installation” (page 77) 4. “Setting up managed systems for monitoring” (page 77) 8.
8.3.1 Opening network ports on managed systems The network ports listed in Table 12 are used for communication between the managed systems and the CMS. These ports must be open to network traffic. If you used Insight Control for Linux to install an OS and you used a configuration derived from a supported template, the firewall is enabled by default and Insight Control for Linux opens the ports listed in Table 12 automatically.
# /bin/hostname
If the node does not report a host name, set one or configure DHCP to assign one. DHCP configuration information is located in the HP Insight Control for Linux Installation Guide.
2. On the CMS, run the mxgethostname command with the host name obtained in the previous step. For this example, the host name is venus:
# mxgethostname -n venus
If the CMS recognizes the host name, command output is similar to the following:
Host name: venus.example.com
DNS Name: venus.example.
Figure 18 Installing providers and agents
3. Select Next>.
4. Review the settings for Configure or Repair Agents, as shown in Figure 19. Insight Control for Linux requires you to make settings in the Configure SNMP and Configure secure shell (SSH) access authentication sections of this screen.
Figure 19 Settings for configure or repair agents 5. Make the following settings to configure SNMP: • Select Set read community string and enter the value for your network configuration. NOTE: To discover or identify a server that becomes a managed system, HP SIM requires that an SNMP read community string be set to public in the global credentials for that server. Other read community strings can be set in addition to public, but public must be specified. 6. 7. 8.
9. Select the Use the following credentials for all systems radio button and supply the managed system credentials, which are typically the root user name and password. 10. Select Run Now. Selecting a protocol that is not supported in your environment causes an error, and the task reports its status as failed. Even if this happens, it is possible that the SNMP and SSH settings required for Insight Control for Linux were configured correctly. Look at the task results to verify this.
2. Save your changes and exit the text editor.
3. Use a text editor to add the following line to the /etc/securetty file (if this line is not already in the file):
ttyS0
4. Save your changes and exit the text editor.
5. Use a text editor to add console=ttyS0 to the default entry in the /boot/grub/menu.lst file (if this entry is not already in the file).
NOTES:
• The /boot/grub/menu.lst file might be a symbolic link to the /boot/grub/grub.conf file.
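The "add the line if it is not already in the file" edits in the steps above can also be scripted. The following is a minimal sketch, not an Insight Control for Linux tool: ensure_line is a hypothetical helper name, and it assumes the target file ends with a newline.

```shell
#!/bin/sh
# Sketch only: append a line to a file unless an identical line already exists.
# ensure_line is a hypothetical helper, not part of Insight Control for Linux.
ensure_line() {
    file=$1
    line=$2
    # -x matches whole lines only, -F treats the text literally, -q prints nothing
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example (run as root on the managed system):
#   ensure_line /etc/securetty ttyS0
```

Because the helper checks for an exact whole-line match first, running it more than once does not create duplicate entries. It does not cover the console=ttyS0 change, which modifies an existing kernel line rather than appending a new one.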
After you make these changes, all system startup and shutdown messages are directed to the serial console you selected and are not directed to the graphics display. Therefore, the system boot appears to be different. This is expected and normal behavior. To view the system console, use the CMF utilities on the CMS. For more information about using CMF, see Chapter 22 (page 186). 8.3.
9 Installing operating systems on managed systems This chapter addresses the following topics: • “Linux OS installation overview” (page 85) • “Using installation configuration files for unattended installations” (page 86) • “Prerequisites to OS installations on managed systems” (page 91) • “Installing RHEL on managed systems” (page 93) • “Installing SLES on managed systems” (page 95) • “Installing VMware ESX and VMware ESXi operating systems” (page 96) • “Installing another variant of Linux on
Table 13 Types of Installation Sessions (continued)
Installation       Interactive                    Unattended
Custom or Other    Custom or Other Interactive    Custom or Other (Unattended)
VMware ESX         VMware ESX Interactive         VMware ESX (Kickstart)
VMware ESXi        VMware ESXi Interactive
For more information about using Kickstart and AutoYaST files for unattended installations, see Section 9.2 (page 86).
• Copy the Kickstart or AutoYaST file or files to the appropriate directory in the repository. IMPORTANT: • HP provides a default set of basic Kickstart and AutoYaST installation configuration files for each supported OS. HP recommends copying and using the default installation configuration files as templates to create customized installation configuration files that are suitable for your environment.
OS version                                          Directory name in /opt/repository/instconfig
RHEL Version 6 Update 1 (for KVM Virtual Hosts)     rh061-virt-host-kvm
RHEL Version 6 Update 1 (for KVM Virtual Guests)    rh061-virt-guest-kvm
VMware ESX Version 4.1                              esx041
The associated installation configuration files are stored in the OS-specific directory under /opt/repository/instconfig and use the same naming convention. For example, rh061.
Table 14 Insight Control for Linux macros for installation configuration files (continued) Macro name Description This macro expands to a shell script that puts the trapsink directive in the SNMP configuration to direct the system to send a cold-start trap to the CMS when the server is rebooted. %%completion%% This macro expands into a shell script that contacts the CMS over HTTP to inform it that the installation is complete. This is the last macro that runs.
9.2.4 Configuring an operating system for console redirection The Insight Control for Linux cmfd daemon, which runs on the CMS and management hubs, captures the console output from managed systems and stores it in a file named /hptc_cluster/adm/logs/cmf.dated/current/console_name.log, where console_name identifies the managed system. For more information, see Section 22.1 (page 186). By default, the operating system directs its output to the graphics console.
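As an illustration of the log file naming described above, the following minimal sketch builds the path to a managed system's console log. The helper name console_log_path is hypothetical, and the sketch assumes that console_name is the same as the managed system's host name, which might not hold in every configuration.

```shell
#!/bin/sh
# Sketch only: build the path of the console log that cmfd writes.
# console_log_path is a hypothetical helper; the directory comes from this guide.
CMF_LOG_DIR=/hptc_cluster/adm/logs/cmf.dated/current

console_log_path() {
    # $1 is the console name (assumed here to be the managed system's host name)
    printf '%s/%s.log\n' "$CMF_LOG_DIR" "$1"
}

# Example (on the CMS or a management hub):
#   tail -n 50 "$(console_log_path venus)"
```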
1 The designation Space-TAB means a space character followed immediately by a tab character. Thus, this line can be interpreted as: /^[ \t]*kernel For managed systems that are virtual hosts: ex /boot/grub/menu.lst <
• You registered the PSP in the repository and associated it with the OS, and you manually copied the associated PSP file to the repository path the registration process created. • If you are installing a 32-bit operating system on a managed system that has more than 64 GB of memory, be sure to specify the mem=60gb kernel option in the kernel append line during installation. Additional prerequisites might apply for specific servers. See the following section, if it applies to your environment. 9.3.
# mount options Image_name /mnt/
# cp /mnt/boot/x86_64/loader/initrd /opt/repository/boot/SLES11SP1-x64Boot/
# cp /mnt/boot/x86_64/loader/linux /opt/repository/boot/SLES11SP1-x64Boot/
• Ensure that you have the correct PSP in the repository. For information on the PSP version, see the HP Insight Control for Linux Support Matrix.
• Specify a Kickstart or AutoYaST file derived from the templates specifically for the server from the Insight Control for Linux Repository when installing the OS.
9.
NOTES:
• During installation, when specifying the HTTP setup, you are prompted for the IP address of the CMS and the path name for the RHEL installation. For example:
http://CMS-IP-addr:CMS-port/path-name
Where:
CMS-IP-addr is the IP address of the CMS.
CMS-port is the port number of the repository web server that you specified when you installed Insight Control for Linux. The factory default value is 60000.
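To make the URL format concrete, the following minimal sketch assembles the string from its parts. The helper name repo_url, the address 192.0.2.10, and the path name rhel6 are hypothetical values for illustration; only the URL layout and the default port of 60000 come from this guide.

```shell
#!/bin/sh
# Sketch only: compose http://CMS-IP-addr:CMS-port/path-name.
# repo_url is a hypothetical helper, not an Insight Control for Linux command.
repo_url() {
    cms_ip=$1
    path_name=$2
    cms_port=${3:-60000}    # factory default port when none is given
    # Strip one leading slash from the path so the URL does not contain "//"
    printf 'http://%s:%s/%s\n' "$cms_ip" "$cms_port" "${path_name#/}"
}

# Example:
#   repo_url 192.0.2.10 rhel6
#   repo_url 192.0.2.10 /rhel6 8080
```

If you chose a different repository web server port when you installed Insight Control for Linux, pass it as the third argument instead of relying on the default.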
9.5 Installing SLES on managed systems This section describes the two methods for installing SLES to one or more managed systems: • “Installing SLES using an unattended method” (page 95) • “Installing SLES interactively” (page 95) NOTE: When you use Insight Control for Linux installation tools to install SLES on a managed system, Insight Control for Linux automatically edits the /etc/ssh/sshd_config file and turns on password authentication in this file.
• BL685c G7
• SL165z G7
• DL385 G7
• DL685 G7
Use the following table to determine the boot parameter to enter in the Kernel append line text field.
For this operating system:    Use this kernel boot parameter:
SLES 10 SP3 (x86)             pci=nomsi
SLES 10 SP4 (x86)             pci=nomsi
SLES 11 SP0 (x86)             apic=bigsmp
SLES 11 SP1 (x86)             Not applicable
9.
5. Select the virtualization OS to install and select Next>. Only the virtualization OS that applies to your installation is available for you to select from the menu. IMPORTANT: The list contains only those virtualization operating systems that are registered in the repository and copied to it. If you select a virtualization OS that was registered, but the installation files were not copied to the repository, a validation error appears. 6. Select the Kickstart file and select Next>.
2. 3. 4. 5. Do one of the following to select and verify that the servers in the target list are the servers you want to install an OS on: • Proceed to the next step if the target list is correct. • Select Add Targets... or Remove Target to modify the list, if the list is incorrect. • If no servers are in the list, do the following: a. Select Collection. b. Select All Servers from the drop down menu. c. Select View Contents to display and select from the list of available servers. d.
NOTE: When performing an ESXi installation using virtual media, to facilitate the installation, Insight Control for Linux does not automatically remove the ISO image that was created. This ISO image contains the RAM Disk and removing the ISO image while RAM disk is loaded causes the installation to fail. HP recommends, if disk space is a concern, that you remove the ISO image manually. The ISO image is named using the server's Globally Unique IDentifier (GUID).
3. Create the following scripts, as needed: Script Description auto_config Required for an unattended installation, this script performs macro substitution so that a working copy of your installation configuration file has the actual values required for your installation. boot_stanza This script constructs a boot stanza that specifies your kernel and RAM disk, which enables your boot loader to boot your custom OS.
3. 4. 5. 6. Do one of the following to select and verify that the servers in the target list are the servers you want to install an OS on: • Proceed to the next step if the target list is correct. • Select Add Targets... or Remove Target to modify the list, if the list is incorrect. • If no servers are in the list, do the following: a. Select Collection. b. Select All Servers from the drop down menu. c. Select View Contents to display and select from the list of available servers. d.
If you want the target system to use the default root password (root), select the Use Default Root Password option. To set a root password other than the default, select the Specify Root Password option, choose the password encryption option, enter the root password, and verify the entry. HP recommends setting a strong root password on all your servers. 10. Do one of the following to start the installation: • Select Run Now to launch the OS installation operation immediately.
10 Capturing and deploying Linux images This chapter addresses the following topics: • “Overview of capturing and deploying Linux images ” (page 103) • “Prerequisites to capturing a Linux image” (page 105) • “Capturing a Linux image from a managed system” (page 108) • “Preparing for scalable deployment” (page 109) • “Deploying a captured Linux image to one or more managed system” (page 112) • “Insight Control for Linux partition wizard overview” (page 115) 10.
NOTE: To account for the time it may take to capture or deploy a very large image over a slow network, a time out of five days is in effect for capturing or deploying a Linux image so that you can determine if an operation hangs. HP recommends that you check your task results to verify the status of any running jobs. 10.1.1 File system types Table 16 lists the supported and unsupported file system types on the source and target managed systems for Linux image capture and deployment tasks.
The script is run in a chroot environment so there is no need to configure paths relative to the Insight Control for Linux environment. For information on how these scripts can be used, see the comments in the example scripts provided with Insight Control for Linux. 10.1.
Table 17 Source and target deployment requirements Item Requirement Server type The hardware models of the source and target managed systems must be the same. For example, if you capture an image from an HP ProLiant BL460 Gen8 server, you can only deploy that image to another BL460 Gen8 server. Memory Differences in the amount of memory on the source and target managed systems are permitted. Number of NICs Differences in the number of NICs on the source and target managed systems are permitted.
• For SLES images, change the hard links to soft links before capturing the image. SLES relies on the use of hard links within its file system, and the tar command that captures the image captures those hard links. If a partitioning scheme is used during deployment that distributes files to multiple file systems (like separate /usr and /var partitions), the tar command does not allow hard links to be established across separate file systems. This generates an error, causing the task to fail.
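Before converting hard links, it helps to know where they are. The following is a minimal sketch that lists regular files with more than one hard link under a directory; the function name list_hard_linked_files is hypothetical, and the sketch only reports candidates, it does not convert anything.

```shell
#!/bin/sh
# Sketch only: report regular files that have more than one hard link.
# list_hard_linked_files is a hypothetical helper name.
list_hard_linked_files() {
    # -xdev      stay on the file system that contains the starting directory
    # -type f    regular files only
    # -links +1  link count greater than one, that is, hard-linked files
    find "$1" -xdev -type f -links +1
}

# Example (on the SLES system to be captured):
#   list_hard_linked_files /usr
```

Every path printed shares its data with at least one other name, so those names are the ones that can fail to relink when /usr or /var lands on a separate file system at deployment time.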
The following example does not include the contents of /scratch in the captured image (because the dump flag is set to 0). During the image deployment operation, the disk is repartitioned and /scratch is an empty file system.
/dev/sdc1 /scratch ext3 defaults 0 0
10.3 Capturing a Linux image from a managed system
IMPORTANT: Remember that captured images are retrieved through a web server interface that allows anonymous access.
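Before you start a capture, it can help to list the file systems whose dump flag in /etc/fstab is 0, because their contents are left out of the captured image. The following is a minimal sketch, not part of Insight Control for Linux: the function name excluded_mount_points is hypothetical, and it assumes the standard fstab layout in which the second field is the mount point and the fifth field is the dump flag (treated as 0 when absent).

```shell
#!/bin/sh
# Sketch only: print mount points whose fstab dump flag (field 5) is 0.
# excluded_mount_points is a hypothetical helper name.
excluded_mount_points() {
    # Skip comment lines and lines with fewer than two fields;
    # a missing fifth field is treated as a dump flag of 0.
    awk '!/^[[:space:]]*#/ && NF >= 2 && (NF < 5 || $5 == 0) { print $2 }' "$1"
}

# Example (on the managed system to be captured):
#   excluded_mount_points /etc/fstab
```

Entries such as swap also carry a dump flag of 0 and therefore appear in the output; review the list rather than treating every line as a problem.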
7. Select a Precapture script, a Postcapture Script, or both. A Precapture script is run on the managed system before the image is captured. A Postcapture script is run on the managed system after the image is captured. The default behavior is to not run either type of script. 8. Do one of the following: • Select Run Now to launch the image capture operation immediately. • Select Schedule to schedule the image capture operation to occur in the future. 9.
Figure 20 Network groups example The concept behind a scalable deployment is to transfer an OS image tar file from the CMS to the group leader in each network group. After the image tar file is completely transferred, the group leader transfers the image to each of the remaining servers in the network group. The advantage to this concept is that all network traffic is kept local to the switch or enclosure.
The Customize Collections window appears.
2. Select New... in the Customize Collections window. A new section titled New Collection appears at the bottom of the Customize Collections window.
3. Select the Choose members individually radio button.
4. Select All Servers from the Choose from: menu. This action populates the Available Items: list with the available servers.
5. Perform the following steps for each switch you have:
a.
f. Select Save As Collection... The Save As Collection portion appears.
g. Enter a name for this network group. The name is used only to associate the managed systems in the network group.
h. Select Existing collection: and choose the Network Groups menu item.
i. Select OK to continue.
j. Generate the netgroups.conf file with the following command:
# /opt/hptc/bin/netgroup --ofile /opt/mx/icle/netgroups.conf
k. Examine the netgroups.conf file to verify the collection entry for the group.
IMPORTANT: • If you are deploying the image to a software RAID array or an LVM volume, HP recommends that you wipe the disk or disks that will receive the image. • Before deploying a 64-bit OS image to an AMD Opteron 6200 server, add the following entry to the /opt/mx/icle/icle.
• Select the Create partition scheme from wizard option if you want to customize the disk partition layout on the target managed system, and the following table appears: Figure 21 Existing disk partition scheme See Section 10.6 (page 115) for a general overview of the Partition Wizard and how to use it to edit disk partitions and volume groups. Select Next> after you have completed customizing the disk partition layout. 9.
10.6 Insight Control for Linux partition wizard overview The Insight Control for Linux Partition Wizard is a generic hybrid of the Red Hat and Novell Partition Wizards. The Partition Wizard does not have logic to examine the managed systems on which it is used, thus you must have prior knowledge of the storage hardware. The Partition Wizard enables you to capture an image with one partition scheme and then to deploy the image to one or more managed systems with a more customized partition scheme.
• If you are capturing and deploying a reiserfs or an ext3 partition type, ensure that the mount points are set, as required. Partition types swap and lvm do not have mount points. The Partition Wizard permits you to proceed without specifying mount points for the reiserfs and ext3 partition types, and it does not detect the missing mount points. This might cause the deployment to fail, and the failure is indicated in the Task Results. • The Partition Wizard does not save entered values for reuse.
The initial Partition Wizard table is divided into two sections: Hard Drives, the top of the table that shows the physical devices, and Volume Groups, the bottom part of the table that shows logical volumes: • The Hard Drives section represents the physical media on the server. You must have prior knowledge about the hardware in order to add the correct number of disks. You can add a maximum of 16 disks to the Hard Drives section along with a maximum of 16 partitions per disk.
11 Installing and setting up virtual machines This 1. 2. 3. 4. 5. 6.
2. Set the Global Sign-In credentials for the virtual host with the Options→Security→Credentials→Global Credentials... menu item.
3. Install the operating system with virtualized configuration on the physical server of your choice. Chapter 9 (page 85) describes the steps for using Insight Control for Linux to install a Linux operating system.
4. Run Options→Identify Systems... to verify the installation.
The next step is to register the virtual host with the virtual machine management.
11.
3. Examine the system page for the virtual host with Tools→System Information→System Page... task to verify that Insight Control virtual machine management is configured correctly. Locate the System Subtype row under Product Description. The description should contain the text Virtual Machine Host. 11.3 Creating and installing virtual guests Generally this section discusses how: • To create the virtual guest: HP suggests that you use the vCenter application for VMware ESX and VMware ESXi.
6. 7. Boot the VM guest, and proceed through an interactive install. Perform a network installation using an installation configuration file from the Insight Control for Linux repository. Be sure to specify any required kernel parameters. The following is an example of the response to the boot prompt. boot: linux ks=http://cms:port/instconfig/os/os.
TIP: Match the machine name to the host name in a virtual machine map. See Section 23.10 (page 194). • Ensure that the localhost (QEMU) is connected. If the localhost entry is missing, select File→Add Connection, then select QEMU/KVM as the hypervisor, specify that the connection is Local, and select Connect. If the localhost entry exists but is not connected, right-click on the localhost entry and select Connect. • Start the procedure by selecting New. • Specify a unique name for the virtual guest.
11.3.2.2 Installing a SLES KVM virtual guest Use the following guidelines for installing a SLES KVM virtual guest: • Verify that the AutoYaST file for the virtual guest resides in the /opt/repository/instconfig/osver-virt-guest-kvm directory on the CMS, where osver indicates the operating system version, for example, sl111. The format of the AutoYaST file name is osver-virt-guest-kvm.cfg • Installing a SLES KVM virtual guest requires an ISO. Download the ISO and copy it to the KVM virtual host.
• Before you select OK to start the installation, be advised that you have 20 to 30 seconds to specify that an Installation is to be performed on subsequent screens. If the timeout elapses, the virtual guest attempts to boot from the hard disk. The following needs to occur within this time: ◦ The virtual guest console should open automatically after you select OK. If it does not, locate the virtual guest's name in the virt-manager utility, right-click on it, and select Open to open the monitoring console.
IMPORTANT: The RHEL Kickstart and SLES AutoYaST configuration template files for virtual guests are delivered with a hard-coded root password, which poses a security issue if used without modification. For secure installations, HP recommends that you install the virtual guest operating systems in a manner that keeps the root password secure, such as an interactive installation, or use a Kickstart or AutoYaST file that is properly protected on the local host.
• Specify the Simple file option for the storage space assignment. • Select an available physical device for the connection to the Host Network, for example, peth0. • To monitor a virtual guest, it must be assigned a well-known IP address. This can be either the static IP address that you entered when you installed the virtual guest or, if you used DHCP, the fixed IP address that maps to the MAC address you establish. For more information, see Section 23.10 (page 194).
11.5 Establishing monitoring for virtual hosts and virtual guests NOTE: Insight Control for Linux does not support monitoring of VMware ESXi virtual hosts or virtual guests running Microsoft Windows. Configuring a virtual host or a virtual guest for monitoring is the same procedure as for real managed systems. In short, the procedure consists of the following Insight Control for Linux menu items: 1. Configure→Configure or Repair Agents... on the virtual guest.
NOTE: For specific commands, see the virsh(1) and virt-manager(1) manual pages that accompany your KVM distribution.
12 Using Insight Control for Linux to update HP ProLiant firmware This chapter addresses the following topics: • “Overview of updating HP ProLiant firmware” (page 129) • “Basic firmware update functionality” (page 129) • “Advanced firmware update functionality” (page 132) 12.1 Overview of updating HP ProLiant firmware Keeping firmware up to date is a challenging but necessary task. Each ProLiant server usually has several devices that require regular firmware updates, which can create a burden.
12.2.1 Initial setup Before you can initiate a firmware update on a server, you must download and prepare the firmware files and tools that do the work. Insight Control for Linux uses the HP Smart Update Firmware DVD for all firmware updates. Downloading and installing these files is a one time setup operation, although when new versions of the HP Smart Update Firmware DVD become available, update the tools on your CMS.
HPSUM_FLAGS=parameters Where parameters indicates the hpsum command's option flags you want to use. For example: HPSUM_FLAGS=--downgrade For more information on the hpsum command's option flags, see Section 12.2.5 (page 131). 12.2.4 Viewing the results of a firmware update The results of a firmware update are captured in the Insight Control for Linux task results.
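The HPSUM_FLAGS setting can also be changed from a script. The following is a minimal sketch under stated assumptions: set_hpsum_flags is a hypothetical helper, and the configuration file path is passed as a parameter because it is not named in this passage. The --downgrade value is the example used above.

```shell
#!/bin/sh
# Sketch only: set HPSUM_FLAGS in a KEY=value style configuration file.
# set_hpsum_flags is a hypothetical helper; pass the real file path yourself.
set_hpsum_flags() {
    conf=$1
    flags=$2
    tmp=$(mktemp) || return 1
    # Keep every line except an existing HPSUM_FLAGS assignment ...
    grep -v '^HPSUM_FLAGS=' "$conf" > "$tmp" 2>/dev/null
    # ... then append the new assignment so exactly one remains.
    printf 'HPSUM_FLAGS=%s\n' "$flags" >> "$tmp"
    # Write back in place so the file keeps its ownership and permissions.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}

# Example (the path shown is a placeholder, not a documented location):
#   set_hpsum_flags /path/to/firmware-config-file --downgrade
```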
# Copy latest firmware from /root
cp /root/CP009403.scexe .
# Download latest iLO firmware direct from hp.com
wget ftp://ftp.hp.com/pub/CP009237.scexe
# Remove system BIOS so it won't get updated
rm CP009139.scexe
NOTE: When downloading new firmware files or removing older files, Insight Control for Linux uses only the files designated as Linux Online Flash Component; these files end with the .scexe extension. Be sure you are manipulating the correct file types.
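To check that a firmware directory contains only Linux Online Flash Component files, you can list anything that does not end in .scexe. The following is a minimal sketch; list_non_scexe is a hypothetical helper name, and the directory to inspect is whatever working directory you used for the commands above.

```shell
#!/bin/sh
# Sketch only: print regular files in a directory that do not end in .scexe.
# list_non_scexe is a hypothetical helper name.
list_non_scexe() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue          # skip subdirectories and unmatched globs
        case $f in
            *.scexe) ;;                  # Linux Online Flash Component: expected
            *) printf '%s\n' "$f" ;;     # anything else is reported
        esac
    done
}

# Example:
#   list_non_scexe .
```

An empty result means every regular file in the directory has the .scexe extension.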
described in Section 12.2 (page 129). If the file is found, it is scanned for a reference to the server being updated. If a reference is found, it is acted upon. The configuration file contains one line for each server requiring customized firmware. Each line has the following format:
system=firmware-filename
Where system is one of the following:
1. A host name
2. An IP address
3. A MAC address
4. The text string default
These system values are given in precedence order, from highest to lowest.
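The precedence order can be illustrated with a small lookup. The following is a minimal sketch of the rule as described here, not the code Insight Control for Linux runs: firmware_for is a hypothetical helper, keys are compared as exact strings (so MAC address case must match), and the file names in the example are invented.

```shell
#!/bin/sh
# Sketch only: resolve system=firmware-filename with the documented precedence
# (host name, then IP address, then MAC address, then the string "default").
# firmware_for is a hypothetical helper name.
firmware_for() {
    conf=$1; host=$2; ip=$3; mac=$4
    for key in "$host" "$ip" "$mac" default; do
        [ -n "$key" ] || continue
        # Print the text after "key=" on the first line whose key matches exactly
        value=$(awk -F= -v k="$key" \
            '$1 == k { print substr($0, length(k) + 2); exit }' "$conf")
        if [ -n "$value" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    done
    return 1    # no matching line and no default line
}

# Example:
#   firmware_for /path/to/config devel-server1 192.0.2.5 08:00:2b:c4:aa:1f
```

A non-zero return corresponds to the case where no line matches and no default line exists, in which the server receives the normal firmware update.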
The MAC address is the MAC address of a new prototype server which always needs the latest revisions as soon as possible, so a separate firmware tar file was created for that system. Lastly, there is no default line, so any servers not specifically listed in this file perform a normal firmware update using the default firmware tar file, production-firmware.tar. Example 2 08:00:2b:c4:aa:1f=firmware-files.tar devel-server1=new-device-fw.
13 Installing PSPs on managed systems This chapter addresses the following topics: • “Overview of the PSP installation tool” (page 135) • “Required PSP components” (page 135) • “Creating a PSP dependency script” (page 136) • “PSP installation procedure” (page 137) 13.1 Overview of the PSP installation tool The Insight Control for Linux PSP installation tool enables you to install any or all PSP components on one or more managed systems.
1 This agent is installed on servers with iLO 4 management processors. While HP SIM requires hp-ams so that it can use embedded features of the iLO 4 management processor, Insight Control for Linux does not use it. If you want to use only the hp-ams agent on your iLO 4–based servers, you must manually remove the other agents. The Agentless Management Service (AMS) will be responsible for sending all host operating system-specific information to the iLO 4 firmware.
Managed systems are rebooted when the PSP installation script is finished, regardless of the outcome of the PSP installation. The reboot is required so that HP SIM can continue to properly manage the managed system, and it ensures that all drivers and agents are properly started. IMPORTANT: If an errata kernel is installed on the managed system, ensure that the PSP package you want to install supports the errata kernel version. 13.
15. Select the following menu item from the Insight Control user interface to view the task results: Tasks & Logs→Task Results If the PSP installation completed successfully on a target managed system for all selected PSP components, the final state of the task on that system is Complete. If any of the selected software components did not install successfully on the target managed system for any reason, including package dependency failures, the final state of the task on that system is Failed.
14 ISO control operations ISO Controls allow you to boot from an ISO image, insert an ISO image, and eject an ISO image on iLO-based managed systems. You can use this functionality to perform interactive OS installations from OS distribution ISOs, including Windows. The ISO image must be registered in the Insight Control for Linux repository before you can perform these operations. For information on registering an ISO image, see “Registering an ISO image” (page 51).
15 Remote server controls The menu items on the Tools→Server Controls menu enable you to remotely manage power control on a physical managed system. IMPORTANT: Be aware that the Insight Control for Linux server controls operate by contacting the management processor of the server directly and executing the requested power function. That means that servers are powered off or cycled abruptly without a graceful shutdown.
16 Using SSH for remote server management Insight Control for Linux provides several ways for you to access a managed system through SSH. This chapter addresses the following topics: • “Setting SSH credentials on managed systems” (page 141) • “Setting SSH credentials for users” (page 141) • “Running a command on multiple managed systems” (page 142) • “Using Insight Control for Linux to run commands and scripts through SSH” (page 143) 16.
Deploy→Operating System→Capture Linux Image On the Task Results screen, the Task Instance Results always shows the user who launched the task. This might not be the credentials used for the task execution. Because different target managed systems can have different users specified in the SSH settings, the same task can run on different targets as different users. 16.
16.4 Using Insight Control for Linux to run commands and scripts through SSH The following menu items enable you to run a script or command through SSH to one or more managed systems: • Tools→Command Line Tools→Run SSH Command... • Tools→Command Line Tools→Run Script... 16.4.1 Running an SSH command The Tools→Command Line Tools→Run SSH Command... menu item runs a command on a target managed system.
The Run Script... task feeds the command lines in the script to an SSH instance on the target system. The script is a series of command lines to be run on the target system using SSH. The Linux script you run must be located in the Insight Control for Linux repository in the /opt/repository/script directory. You must ensure that the Linux script does not leave any open file descriptors upon completion (including scripts you might have called).
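One common way a script leaves file descriptors open is by starting a background process that inherits the session's standard input, output, and error. The following is a minimal sketch of one way to avoid that, offered as an assumption about typical SSH behavior rather than a documented Insight Control for Linux requirement; run_detached is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: start a command in the background without inheriting the
# session's stdin, stdout, or stderr. run_detached is a hypothetical helper.
run_detached() {
    # Point all three standard streams at /dev/null before backgrounding,
    # so nothing keeps the calling session's descriptors open.
    "$@" < /dev/null > /dev/null 2>&1 &
}

# Example inside a script in /opt/repository/script:
#   run_detached /usr/local/bin/long-running-job
```

The command path in the example is a placeholder. If the background job's output matters, redirect it to a log file instead of /dev/null.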
Part III Monitoring
17 Managing Insight Control for Linux collections This chapter addresses the following topics: • “Introduction to collections” (page 146) • “Populating a collection” (page 147) • “Adding servers and switches to an Insight Control for Linux collection” (page 147) • “Removing a managed system or switch from an Insight Control for Linux collection” (page 148) 17.
Table 20 Insight Control for Linux subcollections (continued) Object type Subcollection name Description How populated servers that Insight Control for Linux manages. Switches {collection_name}_Switches Insight Control for Linux monitors Populated manually only. all switches placed in this subcollection. Management Hubs {collection_name}_Management_Hubs This subcollection contains all Populated manually. For the servers that are designated information, see Section 18.2 as management hubs.
◦ Configure SNMP and SSH keys
◦ Configure console access and logging
2. Add the servers or switches to the Insight Control for Linux collection:
a. Select Customize... in the left pane of the HP Insight Control user interface.
b. Scroll down the name column until you see Systems Managed by IC-Linux.
c. Select the plus sign (+) to expand it.
d. Scroll down until you see your Insight Control for Linux collection.
e. Select the plus sign (+) to expand it.
f.
b. Scroll down the name column until you see Systems Managed by IC-Linux.
c. Select the plus sign (+) to expand it.
d. Scroll down until you see your Insight Control for Linux collection.
e. Select the plus sign (+) to expand it.
f. Scroll down until you see the Insight Control for Linux subcollections.
g. Do one of the following:
• If you are removing a managed system from the collection, select the radio button next to the {collection_name}_Servers subcollection.
18 Setting up management hubs 18.1 About management hubs A management hub is an aggregation point for management activities. Insight Control for Linux uses management hubs to distribute the management load across multiple servers. HP recommends creating multiple management hubs if you plan to monitor over 256 managed systems. You have the option of choosing any physical server to act as a management hub; you can elect to use the CMS as a management hub or not.
2. Install the operating system for that server using the appropriate Kickstart or AutoYaST file; this file has the form *-management-hub.cfg to ensure that the required RPMs are installed. For specific information on installing operating systems, see Chapter 9 (page 85).
3. Add the server to the {collection_name}_Management_Hubs collection as follows:
a. Select Customize... in the System and Event Collections panel. This figure shows the location with a red arrow.
There are two text fields, Collection name and Choose from, and two lists, Available items and Selected Members. e. Select All Servers from the Choose from: menu. This action populates the Available Items: list with the available servers. f. Select the server from the Available Items: list. You can use Ctrl-Left Mouse for multiple selections. g. h. i. Use the >> button to move the selected servers from the Available Items: list to the Selected Members: list. Select OK.
19 Configuring monitoring services
This chapter describes how to configure Insight Control for Linux monitoring services. In addition to an overview (Section 19.1, page 153), this chapter addresses the following tasks, which you must complete in this order:
1. “Configuring a self-signed Apache certificate on the CMS” (page 153)
2. “Starting management and monitoring services” (page 153)
3. “Installing Insight Control for Linux management agents” (page 155)
4.
• It also deploys the Insight Control for Linux management agents to all servers in the {collection_name}_Servers subcollection. For information on managing subcollections, see Chapter 17 (page 146).
Insight Control for Linux monitors only the objects in these collections:
• The servers in the {collection_name}_Servers subcollection. Depending on your response to the Auto-populate option, either all licensed servers are added to this subcollection automatically, or it contains only the servers that you placed in it manually.
• Enter no if you want Insight Control for Linux to manage and monitor only the servers that you manually put in the {collection_name}_Servers collection.
TIP: Populate your collections manually before proceeding.
5. Select Run Now. This task can take several minutes to configure services. The Stdout tab shows the scripts that are running, and Done appears when this task is complete.
6.
3. Ensure that the pdsh command can run a command across all the managed systems. For example:
# pdsh -a uptime
pluto: 3:22pm up 0:49, 1 user, load average: 0.47, 0.47, 0.40
charon: 11:02am up 0:49, 1 user, load average: 0.38, 0.36, 0.36
poseidon: 9:46am up 1 day 4:46, 3 users, load average: 1.10, 1.23, 1.34
4. Verify that the nrpe daemon is working on all the managed systems with the following command:
# /opt/hptc/nagios/libexec/gather_all_data --verbose
write 4048, 2, 2, eth1 to db => icelx2 (charon.
If Warnings Are Reported If one or more warnings are reported in the Warning column, use the analyze option to obtain an analysis of the problem. When possible, the command output provides potential corrective action or the reasons for a given state.
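The pdsh check in the verification steps can be scripted so that silent hosts stand out. The following is a minimal sketch, assuming pdsh prints one line per host in the form "host: output" as in the uptime example. The function name missing_hosts and the practice of passing the expected host names as arguments are illustrative assumptions and not part of Insight Control for Linux.

```shell
#!/bin/sh
# Sketch: report which expected hosts did not answer a pdsh command.
# Reads pdsh-style output ("host: text") on stdin; the expected host
# names are passed as arguments. missing_hosts is a hypothetical helper.
missing_hosts() {
    # Collect the host prefixes that actually appear in the output.
    seen=$(sed -n 's/^\([^:]*\):.*/\1/p' | sort -u)
    for host in "$@"; do
        # Print every expected host that is absent from the output.
        printf '%s\n' "$seen" | grep -qxF -- "$host" || printf '%s\n' "$host"
    done
}

# Possible use on the CMS (host names taken from the example output):
#   pdsh -a uptime | missing_hosts pluto charon poseidon
```

An empty result suggests that every listed host responded; any name that is printed is a candidate for the connection checks described in the troubleshooting chapter.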
20 Using graphical tools to monitor managed systems This chapter addresses the following topics • “Insight Control for Linux system monitoring overview” (page 158) • “Nagios overview” (page 159) • “Using Nagios” (page 162) • “Services monitored by Nagios” (page 170) • “Understanding Nagios alert messages” (page 172) • “Understanding system event log monitoring ” (page 173) • “Configuring Nagios email alerts” (page 173) • “Monitoring Metrics in real time” (page 174) 20.
NOTE: Insight Control for Linux does not support monitoring of virtual hosts running VMware ESXi , and does not support servers or virtual guests running Microsoft Windows. 20.1.1 Collecting metrics through a management processor Insight Control for Linux supports management processors using the iLO or IPMI protocols for gathering sensor and system event log information. To access a system’s management processor, you must configure the management processor credentials in HP SIM.
Nagios obtains its sensor and metric data from the Supermon open source monitoring application, which is integrated with Insight Control for Linux. Figure 23 illustrates the interaction of these tools.
Figure 23 System monitoring tools integration
The mond and syslog daemons run on every managed system. The Supermon service manages requests for mond daemons that run on a subset of systems.
20.2.2 Launching Nagios To launch Nagios, you must have a valid certificate for the Apache service. To configure an Apache certificate, see Section 19.2 (page 153). Select the following menu item from the Insight Control user interface to launch Nagios: Tools→Integrated Consoles→Nagios The Nagios main window shown in Figure 24 appears when you launch Nagios. Figure 24 Nagios main window From the Nagios main window, you can choose any of the menu options on the left navigation bar.
Hosts Services Host Groups Summary Grid Service Groups Summary Grid Problems Services (Unhandled) Hosts (Unhandled) Network Outages Reports Availability Trends Alerts History Summary Histogram Notifications Event Log HP Graph System Comments Downtime Process Info Performance Info Scheduling Queue Configuration NOTE: The term Hosts on the Nagios window refers to any object with an IP address, not just managed systems. Keep this in mind when using the Nagios application. 20.
Figure 25 Nagios tactical overview The top of the window provides information about the network. It provides the number of network outages and information on the network health in terms of the Nagios hosts and Nagios services. The next portion of the window contains information about the Nagios hosts. It reports the number of hosts that are down, unreachable, up, and pending. In Figure 25, two hosts are down.
A disabled service is a configuration status, not an error condition. Insight Control for Linux takes advantage of the Nagios passive check feature to optimize and to minimize data collection and reporting across large numbers of managed systems.
NOTE: HP recommends that administrators do not enable these services because they are not meant to run under normal conditions, and enabling them causes Nagios to generate false alerts.
Nagios services are described in the next portion of the window. 20.3.
Figure 27 Nagios service detail view The Status column displays any problems that might be occurring. To display the status of a service, select the link for the service in the Service column to open the Nagios Service Information view shown in Figure 28. 20.
Figure 28 Nagios service information view 20.3.3 Displaying hosts and services that are experiencing problems The Service Problems view, which is accessed by selecting Problems Services (Unhandled) in the Nagios menu, is useful for configurations with hundreds of systems. It identifies the Nagios hosts that are experiencing problems, and it shows only the corresponding Nagios services with status that is not OK, which enables you to monitor only those Nagios hosts that need attention.
Figure 29 Nagios service problems view Select the link that corresponds to a Nagios host to open the Nagios Host Information view for that Nagios host. You can also use the Nagios report generator, nrg, to obtain an analysis of Nagios services: # nrg --mode analyze For more information and examples of its use, see nrg(8). 20.3.
Figure 30 HP Graph default overview display Figure 31 HP Graph detail display of managed systems If you want to display the graphical data for a selected Nagios host (a Nagios host can be a virtual host), select an item in the menu in the upper left-hand side. Figure 32 (page 170) shows the graphs for one managed system, osmone. The following menus and menu items control the information you can display for a managed system: • The Metric menu influences the information shown in the graphs.
cpu iowait
    Reports the percentage of time the system was waiting for I/O to complete or to handle an interrupt.
cpu system
    Shows how much of the CPU time was spent on system-level tasks.
cpu usage
    Reports how much of the managed system's CPU set was spent in the user, system, and nice states. This is the default view.
load average
    Reports the 1, 5, and 15 minute load averages.
mem buffers
    Shows how much of the managed system's memory is allocated to system-wide memory buffers.
Figure 32 HP Graph host display for one managed system 20.3.5 Gathering and displaying system environment data Insight Control for Linux provides plug-ins that monitor the environment data on each managed system such as temperature and fan speed, which can be indicators of possible system failure. To display environment data, select the Service Problems menu item in the left frame of the Nagios main window to open the Service Status for All Hosts window.
Table 21 Nagios monitoring plug-ins running on the CMS
Service name | Plug-in name | Function/Description
Apache HTTPS Server | check_http | Monitors the Web server providing the Nagios Web interface.
Configuration Monitor | check_node_config | Periodically generates and updates configuration information for managed systems.
IP Assignment DHCP | check_procs | Watches the DHCP service on the CMS.
Management Settings Monitor | check_nagios_vars | Watches the /opt/hptc/etc/sysconfig/vars.
Table 22 Services monitored on managed systems (continued) Service name Function/Description The System Event Log is collected through the management processor, either an iLO or an IPMI BMC. System Events are hardware-related alerts such as memory errors, power supply faults, and so on. System Free Space2 Displays the system free space in /root, /tmp, /var, and /hptc_cluster. This data is compared to thresholds defined in the nagios_vars.ini file.
2 Critical
other Unknown
3 The name of the Nagios service description. For more information, see the corresponding /opt/hptc/nagios/etc/templates/*_template.cfg template file.
4 The alert applies to this host name.
5 The IP address of the host.
6 The message text generated from the plug-in.
In the following example, indicates that the Nagios monitor running on iclx47 collected this data.
    host_notification_period        24x7
    service_notification_options    w,u,c,r
    host_notification_options       d,u,r
    service_notification_commands   notify-by-email,notify-by-epager
    host_notification_commands      host-notify-by-email,host-notify-by-epager
    email                           nagios@localhost.localdomain
    pager                           nagios@localhost.localdomain
}
Changing the values for email and pager to reflect the system name enables Nagios to send notification through the sendmail utility. For example, change nagios@localhost.localdomain to nagios@example.com.
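The change to the email and pager values can be made with a text filter instead of a manual edit. The following is a minimal sketch, assuming the contact definition keeps each directive on its own line with the directive name first. The function name set_contact_address is hypothetical, and the sketch deliberately reads from standard input and writes to standard output so that it does not assume where the contact definition file lives.

```shell
#!/bin/sh
# Sketch: rewrite the email and pager values of a Nagios contact definition.
# set_contact_address is a hypothetical helper. It reads the definition on
# stdin and writes the modified text to stdout; other lines pass through.
# The address must not contain "/" or "&", which are special to sed here.
set_contact_address() {
    addr=$1
    # Only lines whose first word is "email" or "pager" are changed, so
    # directives such as host_notification_commands are left untouched.
    sed -e "s/^\([[:space:]]*email[[:space:]][[:space:]]*\).*/\1$addr/" \
        -e "s/^\([[:space:]]*pager[[:space:]][[:space:]]*\).*/\1$addr/"
}
```

Review the filtered output before replacing the original file, and keep a copy of the original so that the change can be reverted.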
• Allows user customized and predefined metrics 20.8.3 Performance Dashboard requirements The servers you want to monitor must fulfill the following requirements for using the Performance Dashboard tool; the servers must be: • Licensed for Insight Control for Linux • Configured to use Insight Control for Linux monitoring services, as described in Chapter 19 (page 153) 20.8.
Figure 34 Monitoring three metrics using Performance Dashboard 20.8.4.1 Ring plot color coding The colors that the Performance Dashboard ring plot segments use represent the following: • Light Gray means that a managed system is actively reporting data. • Pink represents the actual value of the metric. • Dark Gray means that a managed system is not reporting data and might be down.
2. 3. 4. 5. Select target managed systems. You can select individual managed systems or all managed systems in the icelx_servers subcollection. Select Apply to move the selected managed systems to the target list. Verify the target list. Select Run Now to launch the Performance Dashboard tool. 20.8.6 Using the mouse buttons to manipulate the Performance Dashboard tool Table 23 describes how to use the mouse to manipulate the Performance Dashboard tool.
• User Time • System Time • Nice Time • Idle Time • Load Averages (1-Minute, 5-Minute, And 15-Minute Intervals) • Total Processes • Total User Processes • Total Zombie Processes • Network Received MB • Network Received Packets • Network Received Dropped Packets • Network Received Errors • Network Transmitted MB • Network Transmitted Packets • Network Transmitted Dropped Packets • Network Transmitted Errors • Total Swap • Swap In Use • Pages In • Pages Out • Pages Swa
21 Using the command line to view managed system status Insight Control for Linux provides commands that you can run on the CMS to determine the status of managed systems. This chapter addresses the following topics: • “Archiving sensor metrics on an individual basis” (page 179) • “Displaying usage, statistics, and metrics with the shownode command” (page 180) • “Displaying environmental data” (page 184) • “Reporting usage information and host and service status” (page 184) 21.
Example 6 Expanded sensor metrics
# shownode metrics sensors iclx1
Timestamp     |Node_Id |Name           |Value |Description
--------------------------------------------------------------------------
date_and_time |iclx1   |Temp 8 Memory  |54    |Celsius; ok
date_and_time |iclx1   |Temp 5 CPU     |31    |Celsius; ok
date_and_time |iclx1   |Temp 2 CPU     |33    |Celsius; ok
date_and_time |iclx1   |Temp 7 CPU 2   |30    |Celsius; ok
date_and_time |iclx1   |Temp 1 System  |40    |Celsius; ok
date_and_time |iclx1   |Temp 6 CPU 2   |30    |Celsius; ok
date_and_time |ic
Admin: device: gateway: hwaddr: iftype: ifusage: interface_number: ipaddr: ipv6addr: mtu: name: netmask: port: switch: install_disk: is_blade: level: location: memory: n_sockets: node_number: power_setting_dts: power_setting_on: region: server_type: ervices: gather_data: hosts: provider_type: eth2 Admin 192.0.2.3 earth.example.com Unknown (edit /etc/snmp/snmpd.
iclx1 |192.0.2.1 |earth     |earth.example.com   |192.0.2.7  |ILO3
iclx2 |192.0.2.2 |neptune   |neptune.example.com |Unknown    |Unknown
iclx3 |192.0.2.3 |saturn    |saturn.example.com  |192.0.2.8  |ILO3
iclx4 |192.0.2.4 |mercury   |mercury.example.com |192.0.2.9  |ILO3
iclx5 |192.0.2.5 |192.0.2.5 |192.0.2.5           |Unknown    |Unknown
iclx6 |192.0.2.6 |pluto     |pluto.example.com   |192.0.2.11 |dl1v3
The shownode info --admin command displays a list of managed systems and includes the management processor user name and password. 21.
As shown in the following example, invoking the command without specifying a managed system displays the sensor data for all managed systems. The output is truncated horizontally to fit on the page.
# shownode metrics mem
Timestamp     |Node  |Total   |Free    |Buffer |Shared |TotalHigh |TotalFree |Cached
---------------------------------------------------------------------------------------------
date_and_time |iclx3 |4039616 |240360  |135832 |0      |0         |0         |2744104
date_and_time |iclx5 |4148548 |3502700 |157864 |0      |3275096   |2819496   |407160
date_and_time |iclx2 |4048376 |2775708 |57020  |0      |0         |0         |375768
date_and_time |iclx4 |4039616 |231936  |103828 |0      |0         |0         |2264952
date_and_time |iclx6 |2054832 |1317672 |62868  |0      |0         |0
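The pipe-delimited output of shownode metrics mem lends itself to simple post-processing. The following is a minimal sketch, assuming the column order shown in the example (Timestamp, Node, Total, Free) and that Total and Free use the same unit. The function name mem_free_percent is hypothetical and is not an Insight Control for Linux command.

```shell
#!/bin/sh
# Sketch: print each node name with its free memory as a whole percentage.
# Reads "shownode metrics mem"-style output on stdin. mem_free_percent is
# a hypothetical helper name.
mem_free_percent() {
    awk -F'|' '
        # Data rows have a numeric Total in column 3; skip header and rule lines.
        $3 + 0 > 0 {
            node = $2
            gsub(/ /, "", node)
            # Free divided by Total, truncated to an integer percentage.
            printf "%s %d\n", node, int($4 * 100 / $3)
        }'
}

# Possible use on the CMS:
#   shownode metrics mem | mem_free_percent
```

For the iclx3 row in the example, 240360 free out of 4039616 total is just under 6 percent, which the sketch truncates to 5.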
Check the sensor status on the enclosure. Verify the status of the Enclosures Collection Monitor which provides this data. nh The Enclosure Collection Monitor collects sensor information from the blade system enclosures. Enclosure status can be found in the Nagios Enclosure Status service plug-in status.
22 Connecting to a remote console This chapter addresses the following topics: • “Console management facility overview” (page 186) • “How CMF works” (page 186) • “Accessing a remote console” (page 186) • “Serial connections on DL100 series servers” (page 187) • “Enabling telnet access to iLO management processors” (page 187) 22.1 Console management facility overview The Console Management Facility (CMF) daemon, cmfd, collects and stores console output for all managed systems.
# shownode roles --role management_hub
2. Log in to the console with the console command. You can specify either the internal name or the host name. This example uses the internal name icelx16 instead of the host name mercury:
$ console icelx16
Locating server for icelx16
Server for icelx16 is mercury.example.com
Press q to exit
login:
IMPORTANT: The console command may not be able to connect to the system console if the dates on the management hubs are not synchronized.
3.
By default, the cmfd connects to the management processor using the SSH protocol. The iLO management processors support the SSH protocol, but the LO100i management processors for the DL100 G5 series servers require an HP LO100i Advanced Pack License to enable SSH support. Alternatively, if you do not want to purchase this license, you can instruct cmfd to connect to the management processors using the telnet protocol by performing the following steps: 1.
Part IV Other topics
23 Miscellaneous topics This chapter addresses the following topics: • “Changing management processor credentials” (page 190) • “Changing the default port for the repository web server” (page 190) • “Increasing the number of servers that can be discovered concurrently” (page 191) • “Changing the IP address of the CMS ” (page 191) • “Uninstalling Insight Control for Linux” (page 191) • “Determining the installed Insight Control for Linux version” (page 192) • “Event logging overview” (page 192)
23.3 Increasing the number of servers that can be discovered concurrently When performing a bare-metal discovery on a set of servers, the maximum number of nodes that are discovered concurrently is 16. Perform the following steps to increase that number: 1. Edit the /opt/mx/icle/icle.properties file to add the following line: DISCOVERY_MAX_AT_ONCE=servers Where servers is an integer value representing the number of servers discovered concurrently. 2. Restart HP SIM. 23.
# cd /opt/hp/icelx/config/uninstall
# ./uninstall.sh
5. Remove the following Insight Control for Linux monitoring directories. If you have any files in these directories that you want to preserve, make sure you save a copy of the files before you remove them.
# rm -Rf /opt/hptc
# rm -Rf /hptc_cluster
# rm -Rf /var/hptc
# rm -Rf /opt/repository/boot/pxelinux.cfg
NOTE: System configuration files that Insight Control for Linux modifies (for example, /etc/dhcpd.conf, /etc/rsyncd.
Options Defines generic information such as reconnection timeouts, FIFO size limits, and so on. Sources Defines the different sources from which messages are obtained. Filters Defines the rules to segregate messages. For example, messages can be separated by host, severity, facility, and so on. Destinations Contains the devices and files where the messages are sent or saved. Logs Combines the sources, filters, and destination into specific rules to handle the different messages.
You can change the number of the tasks that can be run at the same time with the following steps: 1. Edit the /opt/mx/icle/icle.properties file. The value of the MAX_CONCUR_CHAINS variable in this file helps to determine the number of concurrent tasks. If this variable is not specified, the default value of 64 is used. 2. Restart Insight Control for Linux. 23.
        hardware ethernet 00:16:3E:AB:CD:01;
        fixed-address 192.0.2.150;
        option host-name "vm001";
    }
    host vm002 {
        hardware ethernet 00:16:3E:AB:CD:02;
        fixed-address 192.0.2.151;
        option host-name "vm002";
    }
}
MAC addresses for some Xen virtual machines begin with the octets 00:16:3E; the final three octets are chosen arbitrarily. Likewise, MAC addresses for some KVM virtual machines begin with the octets 52:54:00; the final three octets are also chosen arbitrarily.
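Host entries like the ones above are repetitive, so they can be generated rather than typed. The following is a minimal sketch, assuming each virtual machine needs only the three statements shown in the example. The function names dhcp_host_entry and mac_vendor_hint are hypothetical, and the prefix check reflects only the two prefixes mentioned above, which apply to some but not all Xen and KVM virtual machines.

```shell
#!/bin/sh
# Sketch: print a dhcpd.conf host entry for one virtual machine.
# dhcp_host_entry is a hypothetical helper: name, MAC address, IP address.
dhcp_host_entry() {
    name=$1 mac=$2 ip=$3
    printf 'host %s {\n' "$name"
    printf '    hardware ethernet %s;\n' "$mac"
    printf '    fixed-address %s;\n' "$ip"
    printf '    option host-name "%s";\n' "$name"
    printf '}\n'
}

# Sketch: guess the hypervisor from the first three octets of a MAC address.
# The result is only a hint, because other prefixes are also in use.
mac_vendor_hint() {
    case $(printf '%s' "$1" | tr 'a-f' 'A-F') in
        00:16:3E:*) echo xen ;;
        52:54:00:*) echo kvm ;;
        *)          echo unknown ;;
    esac
}

# Possible use:
#   dhcp_host_entry vm001 00:16:3E:AB:CD:01 192.0.2.150
```

The generated text still has to be placed inside the appropriate group or subnet declaration of the DHCP configuration file by hand.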
24 Advanced topics Topics include: • “Management Processor Credentials” (page 196) • “Deploying WBEM provider components using Configure or Repair Agents task” (page 198) • “Logging RAM disk connections and operations” (page 199) 24.
5. Select OK. 24.1.2.2 Discovering and setting up servers with virtual media deployment If your site uses the virtual media deployment features of Insight Control for Linux, perform these additional steps when you discover the management processors: 1. For the initial part of the process, create an account on the management processor being discovered that matches the default Insight Control for Linux MP credentials. 2. Use the HP SIM discovery tool to discover the management processor. 3.
When a new set of credentials is entered with the Configure→Management Processor→Credentials... task, Insight Control for Linux attempts to find a user with the same user name. If one is found, the password for that user is changed to match the new credentials. If no match is found, the new credentials are placed in slot 15, overwriting any credentials already stored there. For this reason, do not store credentials, other than those for Insight Control for Linux, in slots 15 and 16. 24.
• kernel-source • sblim-indication_helper For SLES 10 SP3, the openwbem package must not be installed. All Xen virtual hosts must have the HP ProLiant Support Pack (PSP) installed. For information on deploying PSP, including dependent packages, see the Minimum requirements for Linux servers section at http://h20000.www2.hp.com/bc/docs/support/SupportManual/c00472061/c00472061.pdf. 24.
Part V Troubleshooting and support resources
25 Troubleshooting This chapter addresses the following topics: • “General troubleshooting topics” (page 201) • “Alternative booting” (page 202) • “Apache service does not start” (page 202) • “Troubleshooting CMF problems” (page 202) • “Troubleshooting configuration problems” (page 205) • “Troubleshooting connection problems” (page 208) • “Troubleshooting DHCP problems” (page 209) • “Troubleshooting discovery problems” (page 211) • “Troubleshooting firmware update problems” (page 215) • “
Problem See: Tool Launch OK? says NO Section 25.25 (page 242) Licensing page is always displayed when running a tool Section 25.12 (page 216) Target managed system is not licensed for this tool Section 25.12 (page 216) SSH credentials missing for a server Section 25.22 (page 240) Unable to create SSH connection: No route to host Section 25.22 (page 240) Unable to get SSH credentials: SSH credentials for the specified server were not set or are missing Section 25.
Cause/Symptom Corrective actions • Examine the /opt/hptc/cmf/logs/cmfd.log for errors. • Verify cmfd -h for the usage and default startup parameters. The CMF retries failed connections periodically Perform the appropriate action: • Verify the BIOS configuration. • Verify that the management processor user name and password were not changed. • Verify the management processor's IP address.
Cause/Symptom Corrective actions Console command cannot connect to console. • Verify that cmfd is running on each of the management hubs; the console command searches for the cmfd daemon that has the connection to the console. • Verify that the dates on the management hubs are synchronized. Console command connects to cmfd but there is no output. Make sure that: • The BIOS on the managed system is configured to redirect the serial port to the management processor. For more information, see Section 8.3.
25.5 Troubleshooting configuration problems The following table describes possible configuration problems and provides actions to correct them. Cause/Symptom Corrective actions Configure Insight Control for Linux management services fails Perform the appropriate action: • Verify that the task has indeed completed. The Task Results window may report completion although the operation might not yet be complete. Monitor the console to determine the result.
Cause/Symptom Corrective actions • The CMS has multiple NICs and HP SIM has identified these as separate entities. If you experience similar issues, follow these troubleshooting recommendations: • Verify that the /etc/hosts file is correct. For example, make sure the real host name is not equated to localhost and make sure there is only one real and valid entry for the host name and IP address. • Verify that the DNS configuration is correct.
Cause/Symptom:
Enclosures collection monitor will report a CRITICAL status if the OA credentials have not been configured properly
Corrective actions:
Locate the value for the command[encchk_all] command definition in the /opt/hptc/nagios/etc/nrpe_local.cfg file. Run the command associated with the command definition. For example:
# /opt/hptc/supermon/bin/sensors --cp=enclosures --domain icelx[1-5]:enclosures
1206387637 The user could not be authenticated.
Cause/Symptom:
Incorrect or no information returned for Insight Control for Linux
The shownode config command returns no data, as shown here:
# shownode config
all:
The shownode info returns an error message, like the one shown here:
# shownode info
NO CACHE FILE! RERUN create_nodenames. Failure at /opt/hptc/perl/lib/sim/hptc_node.
Corrective actions:
Perform the appropriate action:
• Reconfigure by running the Options→IC-Linux→Configure Management Services task.
Cause/Symptom Corrective Actions • Verify that the /etc/opt/mx/config/ RootTrustList.txt file contains the address of the CMS and the management hubs. • Ensure that the /opt/hptc/database/etc/ssl file on the CMS and management hubs contains the following: certfile.pem keyfile.pem • Ensure that Trusted Certificates from HP SIM have at least one certificate for Insight Control for Linux. Ensure that the output of the Options→Security→Credentials→Trusted Systems… task matches the values in certfile.pem.
Cause/Symptom Corrective actions • Examine the /var/log/messages system log file for error messages, and take any corrective action required. DHCP Process Will Not Start Perform the appropriate action: Any attempt to start the DHCP process fails with errors. • Verify that the /etc/dhcpd.conf service configuration file is valid. Verify it against the output of the examples in dhcpd.conf(5). • Verify that DHCP is configured to serve IP addresses on the correct network interface.
Corrective actions (continued):
more than 80% of the time; budget approximately 20% additional IP addresses.
Cause/Symptom:
Managed Servers will not PXE Boot
When a console is connected to a managed system, either directly or through the managed system's iLO remote console, the boot process reaches the PXE boot stage, but
Corrective actions:
Verify that your DHCP service configuration is properly configured to provide a Boot Server Hostname or next-server value, instructing the PXE boot process to load a network boot loader.
Cause/Symptom Corrective actions
• Determine whether the CMS is managing 500 or more nodes (where a node represents a server, a management processor, an onboard administrator, a switch, and so on) using a postgres database. If so, Initial Data Collection might be failing because of database connectivity issues with postgres. HP recommends using a supported Oracle database when the CMS manages 500 or more nodes.
Cause/Symptom Corrective actions • Run the Data Collection Report on the system, which is accessible from the Tools & Links page for the system, and verify that there is a Network Interface section containing one or more MAC address(es). The Reset Server operation failed. Manually reboot the server. Previously discovered system does not bare-metal discover. Manually delete all the files in the /opt/repository/ A managed system that was previously discovered in Insight boot/pxelinux.
Cause/Symptom Corrective actions Password modification failed. Unable to add or update user account on management processor. Configure→Management Processor→Credentials to configure a new global password of at least 8 characters. Possible bare metal discovery issues with LO100i servers Use a browser to verify access to the LO100i management processor with the following command to access its web page.
25.9 Troubleshooting firmware update problems The following table provides the actions to correct a firmware update task failure. Cause/Symptom Corrective Actions Firmware Update Task Failed Perform the appropriate action: If the task fails, the system is left up in the Insight Control for Linux RAM disk, so that you can examine the hpsum logs and enter commands as necessary.
25.11 Troubleshooting large scale deployment problems The following table provides the actions to correct a large scale deployment failure. Cause/Symptom Corrective Actions Large Scale Deployment Failed Perform the appropriate action: • Examine the log in Operation Details section of the Task Results window for errors or other information.
Cause/Symptom Corrective Actions Some HP ProLiant DL100 series servers temperature and WARNING Alerts for the Nodeinfo service in “Nagios fan sensor metrics are not individually reported by default. Troubleshooting” (page 220). Instead, they are tallied in the "Sensor Count" metric. The temperature and fan sensor metrics are monitored correctly and are individually reported if they exceed the warning or critical thresholds.
Cause/Symptom:
The wget command fails:
• If there is a proxy server in your environment that is not configured properly
• If the appropriate network ports on the managed system are not open. For more information, see “Opening network ports on managed systems” (page 78)
Corrective Actions:
Take the corrective action based on the wget failure.
Cause/Symptom:
Configure Management Services task fails
Metrics are not collected
Corrective Actions:
Verify that a proxy is not used to communicate between the CMS and the managed system.
Cause/Symptom Corrective Actions 2. Restart the web server. # /etc/init.d/httpd restart For SLES operating systems: 1. Create the symbolic link: # ln -sf /opt/hptc/hpcgraph/hpcgraph.conf /etc/apache2/conf.d/hpcgraph.conf 2. Restart the web server. For RHEL operating systems: # /etc/init.d/apache2 stop # /etc/init.d/apache2 startssl For SLES operating systems: # /etc/init.d/apache2 restart Alternatively, remove the /etc/httpd/conf.d/ colplot-apache.conf file and restart the web server.
Cause/Symptom Corrective Actions • Run the following command to determine if the infrastructure is gathering data: /opt/hptc/cmu/bin/cmumon –actionfile \ /opt/hptc/cmu/etc/sysconfig/ActionAndAlertsFile.txt –cmustatdir \ /opt/hptc/cmu/tmp/GUI --foreground --lm NOTE: This command is continued over three lines for clarity. Performance Dashboard stops presenting data. The broken Examine the two Performance Dashboard log files for GIF symbol is displayed. errors: /var/opt/mx/logs/mxdomainmgr.0.
25.14.3 Running Nagios plug-ins manually The Nagios plug-ins are located in the /opt/hptc/nagios/libexec directory. You can run them from the command line if needed. To run the Nagios check_sel plug-in from the command line, follow these steps: 1. Log in as the nagios user. 2. Change to the following directory: $ cd /opt/hptc/nagios/libexec 3. Locate the Nagios plug-in you want to run, for example: $ ls *_sel 4. Optionally, invoke the Nagios plug-in with the --help option: $ .
Figure 35 Sample Nagios messages Messages are categorized in the Status column as OK, Unknown, Pending, Warning, and Critical and are color-coded. The messages described in this section are indexed by the Service and Status Information columns. The messages in this section are arranged alphabetically by the Service column entry. If there is a warning or critical message, find the information for that service in the Status Information column and apply it to the specified Nagios host.
A warning or critical message indicates that thresholds for the specific managed system were exceeded. Thresholds can be set on a per-managed system, per-class or per-system basis in the nagios_vars.ini file. These values are specific to the site and depend on site load. If thresholds are reasonable, monitor for excessive activity on the managed system.
Status Information: Node / and /var free space This entry typically displays the status of the /, /var, and /hptc_cluster file systems on the system. A warning or critical message indicates that the thresholds for the specific managed system were exceeded. Clean up disk space. 25.14.
Cause/Symptom Corrective Actions # /etc/init.d/apache2 stop # /etc/init.d/apache2 startssl Nagios startup error: The browser displays a directory list Verify that the php RPM and its required dependencies are of Nagios files instead of the Nagios main window. on your CMS. To verify on a RHEL CMS: # rpm -qa | grep php To verify on a SLES CMS: # rpm -qa | grep php5 Use the rpm command to install the php RPM and its required dependencies.
Cause/Symptom Corrective Actions The Nagios default threshold values for total nagios_vars.ini file. HP recommends saving a copy processes, user processes, and zombie of the original file before making any updates. processes might be too small for certain system 1. Save a copy of the original file: configurations, particularly those with virtualization # cp /opt/hptc/nagios/etc/nagios_vars.ini operating systems. If so, you will encounter CRITICAL or /opt/hptc/nagios/etc/nagios_vars.ini.
Cause/Symptom Corrective Actions
timestamp system mcelog: Cannot mmap SMBIOS } tables at dffff000

Cause/Symptom: The nrg --mode analyze command fails on SLES11 SP1 with can't locate ioctl.ph message
Running the nrg command with the --mode analyze option to generate an analysis of Nagios plug-ins by managed system fails and generates output resembling the following:
# nrg --mode analyze
Can't locate linux/ioctl.
Cause/Symptom: The selected OS has been deleted from the Insight Control for Linux repository.
Corrective actions: Verify that the OS exists in the Insight Control for Linux repository.

Cause/Symptom: The proper files were not copied from the installation media into the appropriate /opt/repository/os subdirectories.
Corrective actions: Perform the following actions:
• Copy the proper OS files into the appropriate /opt/repository subdirectories.
Cause/Symptom:
• OS installations fail, cannot connect to managed system
• OS installations hang

Corrective actions: Verify that a proxy is not used to communicate between the CMS and the managed system. Insight Control for Linux does not have proxy server support; the Insight Control for Linux features do not communicate through proxy servers, and require direct network connectivity between the CMS and the managed systems.
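One quick, partial check is to look for proxy settings in the environment of the shell on the CMS. The sketch below assumes that a proxy, if present, is configured through the conventional http_proxy-style environment variables; a transparent or network-level proxy would not be detected this way. The function name is illustrative.

```shell
# Sketch: report proxy-related environment variables in the current shell.
# Assumption: a proxy, if any, is configured through these conventional
# variables. Returns 0 when none is set, 1 otherwise.
no_proxy_configured() {
    found=0
    for name in http_proxy https_proxy ftp_proxy HTTP_PROXY HTTPS_PROXY FTP_PROXY; do
        # Indirectly read the variable whose name is in $name.
        eval "value=\${$name:-}"
        if [ -n "$value" ]; then
            echo "$name=$value"
            found=1
        fi
    done
    return "$found"
}
```

If the function prints any variables, clear them in that shell before retrying the installation.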
25.15.2 Custom OS installations

Cause/Symptom: The selected OS is incompatible with the server hardware.
Corrective actions: Select an OS version that is compatible with the server hardware architecture.

Cause/Symptom: The selected OS does not support network based installations.
Corrective actions: Correct the “kernel append” line and rerun the tool.

Cause/Symptom: The kernel or initrd (RAM disk) names are not correct.
Corrective actions: Verify that the kernel and initrd file names as used in the repository.
25.15.4 Deploying Linux images

Cause/Symptom: The target server is a different type of hardware than the image was captured from.
Corrective actions: Verify that the platform the image was captured from matches the platform it is deployed to. Insight Control for Linux only supports like-to-like hardware deployment.

Cause/Symptom: The captured OS is not fully supported.
Corrective actions: You might be able to log into the system console and complete configuring the system manually.
Cause/Symptom: this happens, you can manually increase the timeout value for this operation.

Corrective actions: For example, a value of 3600 specifies a timeout of one hour. You can specify whatever value you believe is appropriate, but keep in mind that this increase should only be needed in unusual circumstances.
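Timeout values of this kind are expressed in seconds. The following minimal sketch shows only the conversion arithmetic; the function name is an illustrative assumption, and it does not modify any Insight Control for Linux setting.

```shell
# Sketch: convert a timeout given in whole hours (and optional minutes)
# to the number of seconds. timeout_seconds is an illustrative helper.
timeout_seconds() {
    hours=$1
    minutes=${2:-0}
    echo $(( hours * 3600 + minutes * 60 ))
}
```

For example, timeout_seconds 1 prints 3600, the one-hour value mentioned above, and timeout_seconds 2 prints 7200.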
Cause/Symptom: appear asking you to press the Enter key to bring up a console shell.

Cause/Symptom: Deployment of SLES 10 SP3 fails when logical volumes span disks on multiple controllers
Deploying an image on a managed system running SLES 10 SP3 with multiple disks (each on an individual controller) fails because the order of the disks observed by the RAM disk does not agree with SLES 10 SP3. As a consequence, the image is deployed to the wrong disks.
Corrective actions: Place both disks on the same controller.

25.
Corrective Actions (continued):
3. Select the appropriate port.
4. Change the Boot Support setting to Enable.
5. Select Save.
6. Select Save again and exit the utility.

25.18 Troubleshooting the run script and run SSH command tools

The following table describes possible causes of problems with running scripts and commands and provides actions to correct them.

Cause/Symptom: Run Script Fails
Corrective actions:
• Review the task log for the Run Script tool; verify that there is no error message.
Cause/Symptom: This is usually caused when a server is booting and a power control command is sent to its management processor. Most commonly the system is in a BIOS boot and the management processor cannot determine the power status.

Corrective actions: If this is unsuccessful, reset the management processor by one of the following methods:
• Upgrade the firmware on the server and management processor to the most recent version.
• Power cycle the server.
Cause/Symptom: Intermittent power on, power off, and reset server errors
Intermittent power on, power off, or reset server errors might occur during an Insight Control for Linux operation (for example, during an OS installation, image capture, or image deployment).
Checking to see if power is on.
Failed: Error retrieving BMC for server.
Root cause:PANIC: BMC Manager not configured for device of this type.

Corrective actions: Rerun Options→Identify systems... to fully discover the iLO.
25.20.3.1 Repairing the association of a booted managed system running an OS

If a managed system is booted and running a supported OS, follow these steps to repair a lost association.
• If the managed system is already running the proper agents and is properly configured, instruct HP SIM to re-query the managed system to get the proper association data:
1. Open HP SIM and select All Systems in the left pane.
2.
6.
• If the BIOS data is valid and the iLO XML call is still reporting errors, a hardware problem might be the cause. In that case, telephone HP Customer Service.

If the association problem is still not resolved after completing the suggested corrective actions described here, something more unusual is wrong. Check firewall ports on the CMS and the managed system and make sure SNMP is not being blocked. Look for anything that might be blocking the proper flow of the association data.

25.20.3.
9. If the server does have an OS installed, immediately install the ProLiant Support Pack (see the HP Insight Control for Linux Support Matrix for the current supported version), either manually through the remote console or by selecting the following: Deploy→Deploy Drivers, Firmware, and Agents→Install ProLiant Support Pack (PSP)... When this procedure is complete, the server is present in HP SIM and the association with the management processor is restored. 25.20.3.
1. Run Options→Identify Systems... on the unassociated iLO or iLOs to force HP SIM to make the association. Repeat this process until all iLOs are associated with their servers.
2. Select the following menu item from the Insight Control user interface to turn off power to the server or servers:
Tools→Server Controls→Power Off Server...

25.21 Troubleshooting SNMP problems

This section applies only to systems with iLO-based management processors.
Cause/Symptom: The user name, password, or both for the SSH credentials of a target system are incorrect, causing SSH to fail.
Corrective actions: credentials as appropriate. For more information, see “Setting SSH credentials on managed systems” (page 141) and the HP Systems Insight Manager online help.

Cause/Symptom: SSH delays on SLES managed systems on networks without name resolution
Corrective actions: The following actions fix this issue:
• Configure a DNS resolver on the network in question.
Corrective actions (continued): If not, restart it:
# service supermon restart
• Ensure that the mond daemon is running on all the managed systems:
# pdsh -a -x `headnode` /etc/init.d/mond status

Cause/Symptom: Supermon and mond are running, but there is no activity
Supermon listens on port 2710. The mond daemon listens on port 2709.

Corrective actions: Use the telnet command to connect on the appropriate port, port 2710 for Supermon and port 2709 for the mond daemon. Enter the S command after connection to see metrics data output.
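The port numbers are easy to transpose, so a small helper that builds the telnet command line from the daemon name can be useful. This is a minimal sketch: the function names are illustrative assumptions, and only the port numbers (2710 for Supermon, 2709 for mond) come from this section.

```shell
# Sketch: map a daemon name to the port stated in this section.
metrics_port() {
    case "$1" in
        supermon) echo 2710 ;;
        mond)     echo 2709 ;;
        *)        echo "unknown daemon: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the telnet command for a given host and daemon.
telnet_command() {
    port=$(metrics_port "$2") || return 1
    echo "telnet $1 $port"
}
```

Run the printed command, then enter S at the prompt to request the metrics data, as described above.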
Cause/Symptom: completely stopped before Insight Control for Linux can remove the RPMs.
The output of the uninstall.sh script resembles the following:
# uninstall.sh
...
Uninstalling HP Systems Insight Manager ...
Stopping HP SIM
Stopping hpsmdb

Corrective action (continued):
root 29156 17502 0 08:17 pts/1 00:00:00 grep mxinitconfig
2. Remove that process with the kill command.
# kill -9 8626
Removing the process allows the uninstall.sh script to continue.

25.
Corrective action (continued): Ensure that the sblim-cmpi-base, libvirt-cim, and libcmpiutil packages are installed. For the SLES 10 operating systems, these packages are installed by running the Configure→Configure or Repair Agents... task on the VM host and selecting the option to install Insight Control virtual machine management. For other supported operating systems, install these packages from the RHEL or SLES distribution media.
• Verify the network name on the system page for the CMS.
Cause/Symptom: ESXi 5.0 installation fails with fatal error: 6 (Buffer too small)
The default timeout value of 5400 seconds in the WaitforOSRamDisk operation might be inadequate for the installation.

Corrective action: Perform the appropriate action:
1. Open the /opt/repository/taskchain/ESXiInstallation.xml file in a text editor.
2. Locate the WaitforOsRamDisk operation and change the value from its default of 5400 to a larger value to allow time for all the ESXi modules to load.
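To find the operation quickly in the file, a case-insensitive search helps, because this guide spells the operation name with two different capitalizations. The following is a minimal sketch; the function name is an illustrative assumption, and the layout of the XML file is not assumed.

```shell
# Sketch: print the numbered lines of a file that mention an operation,
# ignoring case, because the name appears in this guide both as
# WaitforOSRamDisk and as WaitforOsRamDisk. find_operation is an
# illustrative helper; it only reads the file.
find_operation() {
    file=$1
    name=${2:-WaitforOSRamDisk}
    grep -n -i -- "$name" "$file"
}
```

For example, find_operation /opt/repository/taskchain/ESXiInstallation.xml lists the matching line numbers so that you can go straight to them in the editor.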
25.28 Troubleshooting virtual media problems

Cause/Symptom: Server attempts to PXE boot or boot from local disk instead of booting using virtual media.

Corrective action: Perform the following actions:
• Verify that port 60002 is open on the CMS.
• Run the Insight Control for Linux Configure→IC-Linux→Configure Boot Method task. Be sure to select Virtual Media for the boot method.
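One way to approach the port check is to confirm that something on the CMS is listening on port 60002. The sketch below filters a listing of listening sockets read from standard input; the function name is an illustrative assumption, and it expects output in the style of netstat -ltn, where the local address ends in :PORT. A listening socket does not prove that a firewall allows the traffic, so treat this as a first step only.

```shell
# Sketch: succeed if any whitespace-separated field on standard input
# ends with ":PORT" (for example 0.0.0.0:60002 in netstat -ltn output).
# port_listed is an illustrative helper, not part of the product.
port_listed() {
    port=$1
    awk -v p=":$port" '
        {
            for (i = 1; i <= NF; i++) {
                n = length($i); m = length(p)
                if (n >= m && substr($i, n - m + 1) == p) found = 1
            }
        }
        END { exit !found }'
}
```

A typical invocation on the CMS would be netstat -ltn | port_listed 60002 && echo listening.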
26 Support and other resources

26.1 Information to collect before contacting HP

Be sure to have the following information available before you contact HP:
• Software product name
• Hardware product model number
• Operating system type and version
• Applicable error message
• Third-party hardware or software
• Technical support registration number (if applicable)

26.2 How to contact HP

Use the following methods to contact HP technical support:
• See the Contact HP worldwide website: http://www.
26.3.2 Warranty information

HP will replace defective delivery media for a period of 90 days from the date of purchase. This warranty applies to all Insight Management products.

26.4 HP authorized resellers

For the name of the nearest HP authorized reseller, see the following sources:
• In the United States, see the HP U.S. service locator website: http://www.hp.com/service_locator
• In other locations, see the Contact HP worldwide website: http://www.hp.com/go/assistance

26.
(Unattended). They replace the previous Deploy→Operating System→Custom or Other task. Likewise, the procedures for deploying a custom OS have changed. For information on deploying a custom OS, see the white paper titled Installing a Custom Operating System with HP Insight Control for Linux.
◦ The download web addresses in the table in “Additional prerequisites for certain ProLiant servers” (page 92) were updated.
◦ The section on Partition wizard requirements and guidelines was expanded.
Download from the Insight Control for Linux product website

The Insight Control for Linux product website contains links to the Insight Control for Linux documentation set and white papers, a link to the Insight Control for Linux QuickSpecs, license information, product registration information, and many other related topics. To view or download documentation from the Insight Control for Linux product website, follow these steps:
1. Open a web browser to the following web address: http://www.hp.
6. Scroll down the page until you see the table labeled Software - Support Pack.
7. Select the PSP link in the Description column.
8. To download the PSP, select the Download>> button associated with the *.tar.gz (gzipped) file. To view or download the associated HP ProLiant Support Pack User Guide, select the Release Notes tab.

• Linux vendors
The following are links to Linux vendor websites. Linux vendors are not limited to the vendors shown in this list.
◦ http://www.balabit.com/products/syslog_ng
Home page for syslog-ng, a tool that is used for consolidated logging.
◦ http://www.virt-manager.org
Home page for the virt-manager tool.
◦ http://www.vmware.com/products/esx/index.html
Home page for VMware ESX.
◦ http://www.vmware.com/products/esxi/
Home page for VMware ESXi.
◦ http://www.linux-kvm.org
Home page for KVM.
◦ http://www.xen.org
Home page for Xen.

26.7.
NOTE An alert that contains additional or supplementary information. TIP An alert that provides helpful information. 26.
A Customizing Nagios The Nagios configuration is designed so that you can customize it as needed. Complete documentation for customizing Nagios is available on the following Nagios website: www.nagios.
# NRPE GROUP
# This determines the effective group that the NRPE daemon should run as.
# You can either supply a group name or a GID.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd
nrpe_group=new_nagios_group

Where new_nagios_group is the group name of the new Nagios user's account.
Save the file.
5. Edit the /opt/hptc/nagios/etc/nagios.
# NAGIOS GROUP
# This determines the effective group that Nagios should run as.
# You can either supply a group name or a GID.
nagios_group=new_nagios_group

Save the file.
9. Run the Options→IC-Linux→Configure Management Services task.
NOTE: The Task Results window may report completion although the operation might not yet be complete. Monitor the console to determine the result.
10. If your system has multiple management hubs, log into each management hub and repeat steps 2 through 8.
11.
To avoid these alerts, use the commands listed in the following table to shut down Nagios before performing any maintenance operations and tasks, and to start or restart Nagios afterward.

Purpose: To shut down Nagios on the CMS immediately before performing maintenance operations and tasks
Command line: # /etc/init.d/nagios stop

Purpose: To start Nagios after a maintenance operation
Command line: # /etc/init.d/nagios start

Purpose: To restart Nagios after changing its configuration
Command line: # /etc/init.d/nagios restart

A.2.
thresholds and generates alerts when a threshold is reached. Depending on your specific site configuration and use, some default thresholds might not be appropriate for your system. The platform-dependent default thresholds serve as a baseline, but they might not be optimal for your site. Determine the threshold values appropriate for your site and customize the Nagios configuration accordingly. The /opt/hptc/nagios/etc/nagios_vars.
Table 24 Supermon metrics collection intervals (continued)

Metric name    Collection interval
btime          default*
processes      default*
netinfo        default*
meminfo        default*
swapinfo       default*
time           default*
switch         default*
cputotal       default*
avenrun        %LOADAVECOLLECTIONPERIOD% **
mdadm          %MDADMCOLLECTIONPERIOD% **

* The default is 5 minutes.
** This value is specified in the /opt/hptc/nagios/etc/nagios_vars.ini file.

A.2.5.
Actively Launched on Managed System?
Indicates whether or not Nagios periodically runs this service check at the specified normal check interval.

Maximum Check Attempts
Indicates the number of times Nagios examines the service before reporting a failure.

Normal Check Interval
Indicates the frequency of the check interval.

Retry Check Interval
Indicates the amount of time Nagios waits before retrying after a failure.
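These settings together determine how long a problem can exist before Nagios reports it. As a rough sketch, assuming that the first failed check counts as attempt 1 and that each further attempt follows one retry interval later, the delay between the first failed check and the reported failure is (maximum check attempts - 1) multiplied by the retry interval. The function name is illustrative, and the result is in whatever time unit the intervals use.

```shell
# Sketch: estimate the delay between the first failed check and the point
# at which the failure is reported.
# Assumption: the first failure counts as attempt 1, and every further
# attempt happens one retry interval after the previous one.
failure_report_delay() {
    attempts=$1
    retry=$2
    echo $(( (attempts - 1) * retry ))
}
```

For example, with 3 maximum check attempts and a retry interval of 1, failure_report_delay 3 1 prints 2.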
Nagios creates alerts for power, memory, voltage, and Automatic System Recovery (ASR) messages. The rules for alerts are defined in the /opt/hptc/nagios/etc/selRules file. You can modify these rules by editing this file as follows: • Add rules to this file for new alerts. • Change alerts by modifying the corresponding rule in this file. • Remove a rule to delete the corresponding alert.
Glossary

A

AutoYaST file
A configuration file used to effect an unattended SLES operating system installation.

B

bare-metal
Describes a server that is not booted with a running operating system. This could be a brand new server with no OS installed on it, or it could be a server with an OS that is not booted.

C

central management server
See CMS.

certificate
An electronic document that contains a subject's public key and identifying information about the subject.
HTTPS
An extension to the HTTP protocol that supports sending data securely over the web.

hypervisor
Computer software, specific to a hardware platform, that allows you to run multiple operating systems on a single host at the same time.

I

iLO
Integrated Lights Out. A self-contained hardware technology available on various hardware models that enables remote management of any node within a system. Subsequent generations of this technology are iLO 2, iLO 3, and iLO 4.
PSP
ProLiant Support Pack. HP software components that are bundled together and verified to work with a particular operating system. An HP ProLiant Support Pack contains driver components, agent components, and application and utility components. All these are verified to install together.

PSP dependency script
An optional user-provided script that runs during a PSP deployment to a managed system.

PXE
Preboot Execution Environment.