HP Insight Control for Linux 7.1 User Guide

Abstract

This document describes how to set up and use Insight Control for Linux to monitor and manage HP ProLiant servers that were licensed with Insight Control for Linux. This document builds on the information from the HP Insight Control for Linux Installation Guide, which you used to install and configure HP Systems Insight Manager (HP SIM) and Insight Control for Linux on the Central Management Server (CMS).
© Copyright 2008, 2010, 2012 Hewlett-Packard Development Company, L.P. Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license. The information contained herein is subject to change without notice.
Contents
I Introduction (page 11)
1 Using Insight Control for Linux (page 12)
1.1 Overview (page 12)
1.2 Integration with Systems Insight Manager (page 13)
1.
5 Managing the Insight Control for Linux repository (page 43)
5.1 Introduction to the Insight Control for Linux repository (page 43)
5.1.1 Configuring a remote repository (page 44)
5.1.2 Repository contents (page 45)
5.1.3 Repository item naming conventions
8.2 Linux OS installation (page 78)
8.3 Setting up managed systems for monitoring (page 78)
8.3.1 Opening network ports on managed systems (page 79)
8.3.2 Resolving host names on the CMS (page 79)
8.3.
11.3.2.2 Installing a SLES KVM virtual guest (page 124)
11.3.3 Guidelines for configuring a Xen virtual guest (page 125)
11.4 Obtaining virtual guest and virtual host associations (page 127)
11.5 Establishing monitoring for virtual hosts and virtual guests (page 128)
11.6 Virtual guest operations
19.2 Configuring a self-signed Apache certificate on the CMS (page 154)
19.3 Starting management and monitoring services (page 154)
19.4 Installing Insight Control for Linux management agents (page 156)
19.5 Verifying successful configuration of the monitoring services (page 156)
19.5.1 Ensuring that Nagios is reporting status
22.5 Enabling telnet access to iLO management processors (page 188)
IV Other topics (page 190)
23 Miscellaneous topics (page 191)
23.1 Changing management processor credentials (page 191)
23.2 Changing the default port for the repository web server
25.15.1 RHEL and SLES installations (page 229)
25.15.2 Custom OS installations (page 231)
25.15.3 Capturing Linux images (page 231)
25.15.4 Deploying Linux images (page 232)
25.
A.6 Modifying the Nagios password file (page 262)
Glossary (page 263)
Index
Part I Introduction
1 Using Insight Control for Linux This chapter addresses the following topics: • “Overview” (page 12) • “Integration with Systems Insight Manager” (page 13) • “Insight Control for Linux extensions to HP SIM” (page 13) • “Insight Control for Linux toolboxes” (page 16) • “Insight Control for Linux command environment” (page 17) • “Internal task queuing and management” (page 17) • “Synchronized system clocks” (page 18) • “Insight Control for Linux RAM disk environment” (page 18) • “Network con
• Configuring network parameters • Installing an operating system and agents from the SPP or PSP • Configuring monitoring services After a server becomes a managed system, you can monitor it and manage it. 1.2 Integration with Systems Insight Manager Insight Control for Linux is a suite of software and tools that combine to provide a powerful mechanism for discovering, installing, monitoring, and managing HP ProLiant servers.
Table 1 Insight Control for Linux extensions to the HP Insight Control user interface; (continued) Menu item Description Documented in used by the Network Configuration Editor tool. The network definitions are used by the OS installation tools to implement booting using the virtual media mechanism. IMPORTANT: The network definitions must be created before initiating bare-metal discovery through virtual media.
Table 1 Insight Control for Linux extensions to the HP Insight Control user interface (continued)
Menu item | Description | Documented in
Deploy→Operating System→Red Hat Interactive | Starts an interactive Red Hat Enterprise Linux (RHEL) installation on one or more target managed systems. | Section 9.4.2 (page 95)
Deploy→Operating System→Red Hat (Kickstart) | Uses a default or user-supplied configuration file to start an unattended RHEL installation on one or more target managed systems. | Section 9.4.
Table 1 Insight Control for Linux extensions to the HP Insight Control user interface (continued)
Menu item | Description | Documented in
Tools→Server Controls→Power Off Server... | Makes a remote call to the management processor to set power status to off abruptly. | Section 15.1 (page 141)
Tools→Server Controls→Power On Server... | Makes a remote call to the management processor to set power status to on. | Section 15.2 (page 141)
Tools→Server Controls→Reboot Server...
For information on creating administrator accounts, that is, non-root accounts with the privileges required to access and use HP SIM, see the HP Insight Control for Linux Installation Guide. 1.5 Insight Control for Linux command environment Table 2 lists the Insight Control for Linux commands that you can run from the command line on the CMS or on any management hub, with the exception of the pdsh command.
1.7 Synchronized system clocks When using Insight Control for Linux, and especially when using its tools to install operating systems on managed systems or to capture and deploy Linux images, HP recommends that you keep system clocks up to date and synchronized. Synchronization is required for the Console Management Facility to access a managed system using SSH.
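As a rough illustration of what "synchronized" can mean in practice, the following sketch compares two clock readings taken as epoch seconds. The function name, the host name node2, and the 5-second tolerance are assumptions made for this example; they are not values defined by Insight Control for Linux.

```shell
#!/bin/sh
# Hypothetical helper: succeed when two clocks agree within a tolerance.
# Arguments: local epoch seconds, remote epoch seconds, maximum allowed
# difference in seconds (all as printed by `date +%s`).
clock_skew_ok() {
    local_epoch=$1
    remote_epoch=$2
    max_skew=$3
    skew=$((local_epoch - remote_epoch))
    # Take the absolute value of the difference.
    if [ "$skew" -lt 0 ]; then
        skew=$((0 - skew))
    fi
    [ "$skew" -le "$max_skew" ]
}

# Example (not run here; node2 is a placeholder host name):
#   remote=$(ssh node2 date +%s)
#   clock_skew_ok "$(date +%s)" "$remote" 5 || echo "clocks differ by more than 5 seconds"
```

A check like this only reports drift. Keeping clocks aligned is normally left to a time service such as NTP.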
When the system is powered on, the bootable image is loaded from the CMS by way of the management processor. Virtual media does not use DHCP. The system boots a custom RAM disk that includes the predefined network configuration information (for example, the IP address, Net Mask, Gateway, and so on). Insight Control for Linux provides tools that let you define the network information parameters, edit those network parameters, and initiate bare-metal discovery. 1.
When you run the Options→IC-Linux→Configure Management Services task, it determines if this file exists: • If the file does not exist, it creates the file and assigns numbers based on the managed systems and the current numbering scheme. The Central Management Server is always node number 1. • If the file already exists, the configuration task reads the nodenumbers file and assigns the node numbers according to the file contents.
1.11.2 Viewing managed system names After the Configure Management Services task is run, you can list the managed systems with their associated names; use the shownode info command as described in Section 21.2.2 (page 182). 1.12 Connecting to HP SIM To log in and connect to HP SIM, follow these steps: 1. Open a browser window. 2.
You also must back up HP SIM configuration files to restore your configuration.
2 Security 2.1 Integrated security features This section describes features that are integrated into HP SIM and Insight Control for Linux to make them secure. Security features are also discussed in context of the associated topic throughout this document. • Browser Connections HP SIM enforces a secure connection to the web browser.
The SSH service also enables file transfer with the scp or sftp commands over the same port as SSH. • pdsh Keys The pdsh command uses public host keys to authenticate remote hosts and supports public key authentication to authenticate users. • cmfd Keys The console command uses SSL keys to connect to the console management facility daemon (cmfd) for console access. • Secure Boot Mechanism Virtual media support is provided as the secure boot mechanism.
• Issues relating to scalable deployment The scalable deployment feature of Insight Control for Linux uses HTTP to transfer a Linux image from the CMS to a group leader and FTP to transfer that image from the group leader to individual servers. There is no mechanism for verifying the identity of the server providing the image; neither method protects against a man-in-the-middle attack.
You must repeat this procedure for every iLO whose certificate you want to add to the HP SIM trust store. Alternatively, you can automate this procedure by using a script to extract the iLO's certificate and add it to the HP SIM trusted certificate list. The following is an example of a script that accepts a series of iLO certificates and adds them to the HP SIM trust store.

#!/bin/sh
#
# Get certificate for each iLO passed in as an argument
# and add it to the HP SIM trust store.
3 Managing licenses This chapter describes the following topics: • “Licensing overview” (page 27) • “Adding the Insight Control for Linux license key to HP SIM” (page 27) • “Licensing virtual guests” (page 28) 3.1 Licensing overview The licenses for Insight Control power management and Insight Control virtual machine management are bundled with the Insight Control for Linux license. The iLO Advanced license remains a separate license.
3.3 Licensing virtual guests When a virtual host (VM host) is licensed for Insight Control for Linux, all guests of that VM host are considered licensed for Insight Control for Linux as well, provided that the virtual guests are properly associated with their virtual host. You can license a virtual machine guest (VM guest) without licensing its host or you can license it in addition to licensing its host, in either case unnecessarily consuming licenses.
4 Understanding tasks and task results This chapter addresses the following topics: • “Task results overview” (page 29) • “Understanding task results” (page 29) • “Task results page” (page 29) • “Common task results” (page 31) • “HP SIM standard task results format” (page 34) • “Scalable task results format” (page 38) 4.1 Task results overview HP SIM and Insight Control for Linux enable you to manage systems by scheduling and running tasks.
Figure 1 Task results page
Table 4 lists the components of the Task Results page.
Table 4 Components of the Task Results page
Component | Description | Available in HP SIM standard view, scalable view, or common to both views
Task Instance Results | Provides the status of the running task or the task that is selected in the task list log at the top of the page. | Common
Use SIM Standard Task Results Format radio button | This option is only offered when you run an Insight Control for Linux task.
Table 4 Components of the Task Results page (continued) Component Available in HP SIM standard view, scalable view, or common to both views Description Selecting this radio button provides an operation oriented format that enables you to view the status of each operation in a task as it completes on each target. This format is particularly useful when you are running an Insight Control for Linux task on many targets, for example, when you are installing a Linux OS on many servers at once.
In Insight Control for Linux, it might not be possible to cancel a task immediately after you select the Stop button because an operation might be at a point on a target where it cannot be interrupted. This can result in a task changing from the Cancelled state to a Complete or Failed state because the cancel operation could not be processed in time. A task End Time is initially set to the time when you select Stop.
◦ All target details, including all information displayed in the operation status table and the log for each operation TIP: If you select All Systems for the report, the target level results are displayed for all targets, each separated by a line. 4.4.2.2 Rerun non-complete targets button The Rerun Non-Complete Targets button is enabled only when the following conditions exist: • At least one target for the task has a Failed or Cancelled status. • All targets for the task are in a Terminal state.
Figure 5 View of the operation details log 4.5 HP SIM standard task results format This section describes the portions of the Task Results page that are specific to the HP SIM Standard Task Results Format, which is the default view. Figure 6 illustrates the HP SIM Standard Task Results format. The figure shows the task results for an instance of a Red Hat Kickstart OS installation task running on three target servers.
Figure 6 HP SIM standard task results format 4.5.1 Summary status and target status area Figure 7 illustrates the Summary status: area and target status area, which provide the overall status of a task on each target. Figure 7 View of the summary status and target status areas Table 5 describes the information displayed in the Summary status: area. 4.
Table 5 Description of target status area Column heading Description Target Name Name of the target managed system on which the task was run. Status The status of a target is computed from the status of its operations. Non-terminal target status Pending: All operations can have the Pending status. Running: At least one operation has the status Running. A percent complete is also displayed.
4.5.1.2 Log button in the target status area When you select the Log button, a new window opens that displays the log for all operations for the task, including the following information: • A summary of the task level information • The information displayed in the target status table for the selected target • A block of information for each operation in the task, including the log The log screen does not auto-refresh.
Table 6 Description of target details table (continued) Column heading Description NOTE: When an operation has a status of Cancelling, the target status is Cancelled, but the end time is empty for both the operation and the target. Terminal operation status Complete: The execution of the operation completed as expected. Failed: The operation was not successful. Cancelled: You pressed the Stop button for the target or task and this operation was the last one run or the next one to run.
Figure 9 Scalable task results format 4.6.1 Operations table Figure 10 illustrates the Operations table, which lists individual operations within a task and provides the status of the entire operation as it starts and completes on each target. Note that the operation status represents the status of the operation across all targets, not on any single target.
Table 7 lists the information displayed in the Operations table. Table 7 Description of the operations table Column heading Description Operation Name The name of the operation that is run as a component of an Insight Control for Linux task. Status Complete: The operation has successfully completed on all targets. Running: The operation has started but it is not yet complete on all targets. Pending: The operation has not yet started.
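The way Table 7 rolls per-target progress up into a single operation status can be sketched as follows. This is an illustration only: it covers just the Complete, Running, and Pending states described above, the function name is hypothetical, and the product's actual logic (including any failed or cancelled states) is not reproduced here.

```shell
#!/bin/sh
# Sketch: derive an operation-level status from the per-target statuses
# passed as arguments. Only Complete, Running, and Pending are handled.
operation_status() {
    total=0
    complete=0
    pending=0
    for s in "$@"; do
        total=$((total + 1))
        case $s in
            Complete) complete=$((complete + 1)) ;;
            Pending)  pending=$((pending + 1)) ;;
        esac
    done
    if [ "$total" -gt 0 ] && [ "$complete" -eq "$total" ]; then
        echo Complete      # finished on every target
    elif [ "$pending" -eq "$total" ]; then
        echo Pending       # not started on any target
    else
        echo Running       # started, but not yet complete on all targets
    fi
}

# Example: operation_status Complete Running Pending
```

An operation that has completed on some targets but has not started on others is reported as Running, which matches the "not yet complete on all targets" wording.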
Table 8 Description of the operation target details table Column heading Description Target Name The name of the target on which the operation was run or is running. Status Complete: The operation has successfully completed on the targets. Running: The operation has started but it is not yet complete on all targets. Pending: The operation has not yet started. Cancelling: You have cancelled the task by selecting the Stop button for the target or for the task.
Part II Deployment
5 Managing the Insight Control for Linux repository This chapter provides an overview of the Insight Control for Linux repository and how to perform activities related to it. The following topics are addressed: • “Introduction to the Insight Control for Linux repository ” (page 43) • “Registering items in the Insight Control for Linux repository” (page 46) • “Copying software to the Insight Control for Linux repository” (page 52) • “Editing and deleting registered items” (page 57) 5.
After an OS is registered with the repository, manually copy the vendor-supplied installation media to the appropriate directories in the repository. The media can be a physical CD or DVD, or it can be an .iso image. You must expand the .iso image into flat files. IMPORTANT: Be aware that repository management tasks do not follow typical authorization models. All HP SIM users can select, add, delete, or modify all Insight Control for Linux repository items regardless of their user authorizations. 5.1.
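One possible way to expand installation media into flat files is to loop-mount the .iso image and copy its contents. The sketch below assumes root access, a free mount point, and the /opt/repository/os/SLES10SP4-x64 directory used as an example elsewhere in this guide; the function name, the .iso path, and the mount point are placeholders.

```shell
#!/bin/sh
# copy_media SRC_DIR DEST_DIR
# Copy the flat files from a mounted CD/DVD or loop-mounted .iso image
# (SRC_DIR) into a repository directory (DEST_DIR), preserving attributes.
copy_media() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # "src/." copies the directory contents, including hidden files.
    cp -a "$src"/. "$dest"/
}

# Example (as root; the .iso path and mount point are placeholders):
#   mount -o loop,ro /tmp/os-dvd1.iso /mnt/iso
#   copy_media /mnt/iso /opt/repository/os/SLES10SP4-x64
#   umount /mnt/iso
```

Copying with the trailing "/." keeps hidden files at the top of the media, which some installers rely on.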
Figure 14 Remote repository using the CMS as a gateway
5.1.2 Repository contents
Table 9 lists the classes of items that are stored in the repository.
Table 9 Repository item types
Name | Description
ISO | ISO image
PSP | An OS-specific bundle of ProLiant optimized drivers, utilities, and management agents.
SPP | An OS-specific bundle of ProLiant optimized drivers, utilities, firmware, and management agents.
Supported OS | Vendor-supplied installation files for supported versions of RHEL or SLES.
The items listed in Table 10 are preregistered and reside in the repository after you install Insight Control for Linux. The default contents include sample RHEL Kickstart and SLES AutoYaST installation configuration files and an example PSP or SPP dependency script. Table 10 Default repository contents Item type Directory name examples Description SPP and PSP Dependency Scripts example_dependency.
5.2.2 Registering operating systems Registering a supported version of RHEL or SLES, a supported virtualization OS, or a variant of a Linux OS to make the operating systems available for automated or interactive installations is a simple process: you register the OS in the repository, copy the vendor-supplied installation files to the repository, and copy the appropriate boot files to the associated boot target directory. To register an OS in the repository, follow these steps: 1.
Table 11 OS registration information (continued) Registration information Description Supply for supported OS, custom OS, or both Enter the full web address (using the IP address) to the OS installation media, such as http://192.0.2.1/redhat/some_version/. For repository entries for SLES 10 and SLES 10 SP1, you must verify if the remote installation media use CD1,CD2,… directories; otherwise use a directory named DVD1.
10. Select OK to return to the Manage Repository screen. Two new items appear in the table. One item is of the type Supported OS and the other is of the type Boot image. The Boot image item type is added for you automatically. Its name is the same as the supported OS with the word Boot appended. The option to add a Boot image item type is never available because this item type is always associated with a Supported OS item type, and thus, it is created automatically for you.
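The relationship between a Supported OS item and its automatically created Boot image item can be expressed as a small naming helper. This sketch assumes the /opt/repository layout with os and boot subdirectories that the SLES examples in this guide show; the function names are hypothetical.

```shell
#!/bin/sh
# Assumed repository root, taken from the directory examples in this guide.
REPO_ROOT=/opt/repository

# The Boot image item is named after the OS item with "Boot" appended.
boot_item_name() {
    printf '%sBoot\n' "$1"
}

# Directory holding the vendor-supplied installation files.
os_dir() {
    printf '%s/os/%s\n' "$REPO_ROOT" "$1"
}

# Directory holding the boot files for the same OS item.
boot_dir() {
    printf '%s/boot/%s\n' "$REPO_ROOT" "$(boot_item_name "$1")"
}

# Example: boot_dir SLES10SP4-x64
```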
◦ Hyphens (-), periods (.), and underscores (_) Enter a descriptive name but do not use the SPP or PSP tar.gzip file name, which can be quite long. • Provide the version number of the SPP or PSP that you copy to the repository. For the supported versions, see the HP Insight Control for Linux Support Matrix. For PSPs, the version number must be in the form of N.NN, for example, 9.0x. For SPPs, the version number must be in the form of NNNN.NN.N, for example, 2012.05.0.
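Before registering an SPP, you might want to check that the version string matches the NNNN.NN.N form. The following sketch does that with a shell pattern; the function name is hypothetical, and the check covers only the SPP form, not the PSP form.

```shell
#!/bin/sh
# is_spp_version VERSION
# Succeed when VERSION has the form NNNN.NN.N (for example, 2012.05.0).
is_spp_version() {
    case $1 in
        [0-9][0-9][0-9][0-9].[0-9][0-9].[0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   is_spp_version 2012.05.0 && echo "version format looks valid"
```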
◦ Numbers 0 (zero) through 9 ◦ Hyphens (-), periods (.), and underscores (_) Do not append .cfg to the file name. • Description of the file. • From the drop down list, select the registered operating systems to which the configuration file is applied during an unattended OS installation. Use the Ctrl-Left Mouse Button key combination to select multiple operating systems. • 6. 7. Optionally, associate the configuration file with a custom OS.
• Hyphens (-), periods (.), and underscores (_) The name of the file you copy to the repository must be the same as the name of the item registered. For information about the importance of choosing unique names for items you are registering in the repository, see Section 5.1.3 (page 46). 6. 7. 8. 9. Select Save. View the summary information, which includes the directory and path where you upload the script.
• “Copying RHEL into a remote repository” (page 53) • “Copying SLES into the repository” (page 53) • “Copying virtual machine OS into the repository” (page 56) • “Copying a custom OS into the repository” (page 56) • “Downloading SPPs and PSPs into the repository” (page 56) 5.3.
/opt/repository/os/SLES10SP4-x64 The boot target directory name is similar to this: /opt/repository/boot/SLES10SP4-x64Boot There are three DVDs that comprise SLES Version 11. Only the first DVD must be copied to the repository. DVD2 contains source files; DVD3 contains the documentation. Each service pack release for SLES Version 10 has already applied all patches to the installation media. To copy vendor-supplied SLES Version 10 OS installation files into the repository, follow these steps: 1.
1. Visit the following web address to determine the appropriate link: http://drivers.suse.com/hp/ Choose the appropriate link: • For your server • For your server's architecture • For the version of the SLES operating system Read the install-readme.html file to verify the selection and for installation instructions. 2. Download the KISO image from the SUSE web address: # wget http://drivers.suse.
5.3.4 Copying virtual machine OS into the repository The procedure for copying a virtual machine OS into the repository depends on the virtual machine software: • For VMware ESX, the process is identical to copying a RHEL operating system to the repository. For more information, see “Copying software to the Insight Control for Linux repository” (page 52).
5.3.7.1 Downloading and copying a PSP
1. Open a browser to the HP Support Center website: http://www.hp.com/go/hpsc
2. Select the Support & Drivers tab near the top of the page.
3. Select Drivers & Software.
4. Enter your server model (for example, DL360 G7) in the Enter a product name/number text box, then click Search.
NOTE: If more than one server model matches the value you entered in the For product text box, select the appropriate server model from the search results.
5. 6.
5.4.1 Editing registered items in the repository You can edit selected information for repository items after the registration process is complete. Editing the name of a repository item does not change the associated file or directory names; it changes only the name that the user interface displays. You can change the path to a remotely hosted repository item. To edit an item in the repository, follow these steps: 1.
6 Configuring network parameters for virtual media Topics include: • “Introduction” (page 59) • “Preparing for virtual media” (page 60) • “Using the Define Networks tool” (page 63) • “Using the Network Configuration Editor” (page 66) • “Next Step” (page 70) 6.1 Introduction Virtual media is a mechanism available only for systems with an iLO-based management processor. Virtual media allows a system to boot an ISO image over the network; it is the alternate boot mechanism to PXE.
Usually, network configuration is performed in two stages: • In the first stage, you define the network configuration parameters and store them under a network name. You can have as many network name definitions as you want. • In the second stage, you use the Network Configuration Editor to apply the predefined network names to the server's management processor. These tools are discussed in “Using the Define Networks tool” (page 63) and “Using the Network Configuration Editor” (page 66), respectively.
3. Select either the Discover a group of systems or Discover a single system button. There is a slight difference in the window for these two choices. The Discover a group of systems choice is in the illustration. 4. Enter a descriptive name in the Name text field. The descriptive name must be either listed in the CMS's hosts file or known to the CMS's name server. Otherwise, enter an IP address. 5. 6. Ensure that the Schedule check box is not checked.
specified when you installed Insight Control for Linux. The iLO is capable of supporting multiple user accounts; if your iLO was already configured with other user accounts you can just add another user account.
6.2.3 Licensing virtual media on the management processor Your iLO Advanced license key activates iLO Advanced features. For the latest instructions, which may supersede those shown below, see the following website: www.hp.com/go/insightlicense These instructions assume the network client has a network connection to the iLO-based management processor. To install the iLO Advanced license and enable the iLO Advanced functionality using a supported web browser: 1.
Figure 15 Define networks tool The parameters in the Define Networks tool include the following: • Available Networks This is a list of the network definitions. When you create a new network definition, its name is displayed in this list after you select Save. When a network name in the Available Networks list is selected and you select the Load button, its network parameters are displayed in the appropriate fields; you can select only one network at a time.
You can enter a comma-separated list of ranges, for example: 192.168.10.5-192.168.10.50,192.168.11.100-192.168.11.199 If you want to assign IP addresses manually, leave this field blank. • SNMP Server(s) Optionally enter a list of SNMP servers. These entries are reserved for future use. • Name Server(s) Optionally enter a comma-separated list of DNS Name server IP addresses for this subnet. • NTP Server(s) Optionally enter a comma-separated list of NTP servers.
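To judge whether a range list such as the one in the example above is large enough for the servers you plan to configure, you can count the addresses it covers. The sketch below is an illustration with hypothetical function names; it assumes well-formed IPv4 addresses without leading zeros and does no input validation.

```shell
#!/bin/sh
# ip_to_int A.B.C.D
# Print the dotted-quad address as a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# count_range_addresses "start-end,start-end,..."
# Print the total number of addresses in a comma-separated range list.
count_range_addresses() {
    total=0
    old_ifs=$IFS
    IFS=,
    set -- $1          # split the list into individual ranges
    IFS=$old_ifs
    for range in "$@"; do
        start=$(ip_to_int "${range%-*}")
        end=$(ip_to_int "${range#*-}")
        total=$((total + end - start + 1))
    done
    echo "$total"
}

# Example:
#   count_range_addresses "192.168.10.5-192.168.10.50,192.168.11.100-192.168.11.199"
```

For the example ranges, the first contributes 46 addresses and the second 100, so the list covers 146 addresses in total.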
2. Select the Delete button. If no systems have this network applied to them, the network definition is erased and its name is removed from the Available Networks list. 6.4 Using the Network Configuration Editor Use the Network Configuration Editor to assign networking parameters (that you defined with the Define Networks task) to the servers that will be booted using the virtual media mechanism; this ensures that the server's network is set up properly.
Figure 16 Network Configuration Editor page
4. For each MP, optionally verify it by moving the mouse pointer over the Management Processor Name field, but do not select it. The MP's serial number and IP address are displayed to help you identify it.
5. Each target MP is listed in a table. You have the option of:
• Selecting individual target MPs. Click the checkbox in the left column of the individual target MP.
• Selecting all the target MPs listed.
b. Enter the base name for the server names. For the example, the base name would be sage.
c. Enter Iterator Start Value. For the example, that value would be 01 to ensure a leading zero. The number of digits that you enter for the value for the iterator determines whether the host names generated have leading zeroes. For example, if you entered comp for the base name and 001 for the iterator, the first available host name would be comp001, the next would be comp002, and so on.
d. 7.
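The naming scheme described in this step, a base name followed by a counter padded to the width of the iterator start value, can be sketched as follows. The function name is hypothetical, and the sketch only mimics the documented pattern; it does not check which names are already in use.

```shell
#!/bin/sh
# gen_host_names BASE START COUNT
# Print COUNT host names made of BASE followed by a zero-padded counter.
# The number of digits in START (for example, 001 has 3) sets the padding.
gen_host_names() {
    base=$1
    start=$2
    count=$3
    width=${#start}
    # Strip leading zeros so the shell does not read the value as octal.
    first=${start#"${start%%[!0]*}"}
    first=${first:-0}
    i=0
    while [ "$i" -lt "$count" ]; do
        printf "%s%0${width}d\n" "$base" "$((first + i))"
        i=$((i + 1))
    done
}

# Example: gen_host_names comp 001 3
```

With comp and 001, the first names produced are comp001, comp002, and comp003, matching the example in the text.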
Selecting a network from this list assigns that network to the NIC represented by the MAC address selected in the Port/MAC Address column. This automatically assigns the next IP address available in the IP address range of the network and assigns the other network values (that is, the gateway, the name server, the domain, and the net mask) for that network to the NIC. If the IP address range was not specified, this field is blank and you must specify the IP address within the selected network here. 9.
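The idea of taking the next available address from a range can be illustrated with a short sketch. How the Network Configuration Editor actually tracks assigned addresses is not described here, so the list of used addresses passed as arguments, and the function names, are assumptions made for this example.

```shell
#!/bin/sh
# ip_to_int A.B.C.D: print the address as a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# int_to_ip N: print the integer N as a dotted-quad address.
int_to_ip() {
    n=$1
    echo "$(( (n >> 24) & 255 )).$(( (n >> 16) & 255 )).$(( (n >> 8) & 255 )).$(( n & 255 ))"
}

# next_free_ip START END [USED...]
# Print the first address between START and END that is not in USED.
# Return 1 when every address in the range is already taken.
next_free_ip() {
    cur=$(ip_to_int "$1")
    last=$(ip_to_int "$2")
    shift 2
    while [ "$cur" -le "$last" ]; do
        candidate=$(int_to_ip "$cur")
        taken=no
        for used in "$@"; do
            if [ "$used" = "$candidate" ]; then
                taken=yes
            fi
        done
        if [ "$taken" = no ]; then
            echo "$candidate"
            return 0
        fi
        cur=$((cur + 1))
    done
    return 1
}

# Example: next_free_ip 192.168.10.5 192.168.10.50 192.168.10.5 192.168.10.6
```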
In this dialog box, select Apply to set these values and close the dialog box. Selecting Cancel closes the dialog box without taking action. Save Selecting this button saves the settings for the selected targets to disk. Reload Selecting this button loads the settings for the selected target with the values stored in the disk file. Any changes that you did not save are lost.
7 Discovering systems, switches, and enclosures This chapter addresses the following tasks, which you must complete in the following order when you are configuring and setting up Insight Control for Linux: 1. “Discovering systems” (page 71) 2. “Assigning Insight Control for Linux licenses to discovered systems” (page 74) 3. “Preparing and discovering switches and enclosures” (page 75) 4. “Changing the boot method” (page 76) 7.
The following items are individual methods to discover a bare-metal server to be booted using PXE. Choose the one that applies. • Use the Initiate Bare Metal Discovery tool described in Section 7.1.2 (page 72). Be sure to select the PXE radio button in step 4. If you need information on discovering an iLO, see “Discovering the management processor with HP SIM” (page 60). • Power on the server, watch the console, and press the F12 key when prompted to initiate a one-time PXE boot.
IMPORTANT: • Ensure that HP SIM has discovered the MP. Use the HP SIM Options→Discovery... menu item. Credentials for the MP are taken from the default MP credentials unless otherwise specified. Passwords for management processors must either be known to Insight Control for Linux or already be set to global values. • You can also use the Initiate Bare-Metal Discovery tool to discover bare-metal systems through PXE or virtual media.
components of the SPP or PSP, but at a minimum, you must install the components, listed in Table 19 (page 136), which HP SIM requires. To download an SPP or a PSP or obtain the associated HP ProLiant Support Pack User Guide, follow the instructions in Section 26.7.2 (page 251). 2. On the system to be discovered, use the following command to configure SNMP: # /sbin/hpsnmpconfig 3. 4. Repeat steps 1 and 2 for every installed system to be discovered.
7.3 Preparing and discovering switches and enclosures To discover switches and HP BladeSystem enclosures for Insight Control for Linux monitoring, follow these steps. Skip this task if the configuration does not contain enclosures or switches or you do not want to monitor them with Insight Control for Linux. 1. If one or more HP BladeSystem enclosures are present, go to each enclosure and set the Onboard Administrator (OA) user name and password credentials.
f. Scroll to the {collection_name}_Switches subcollection.
g. Select the radio button next to the {collection_name}_Switches subcollection.
h. Select Edit....
i. Select a switch listed in the Available Items column, and use >> to move it to the Selected Members column.
j. Select OK to add the switch to the Insight Control for Linux {collection_name}_Switches subcollection.
k. Repeat the last two steps for every switch you want to monitor.
7.
7.5 Next steps If you are configuring Insight Control for Linux for the first time, proceed to Chapter 8 (page 78) to install and set up your managed systems. 7.
8 Setting up managed systems
This chapter provides an overview of setting up managed systems for Insight Control for Linux monitoring. It addresses the following tasks, which you must complete in this order:
1. “Populating the Insight Control for Linux repository” (page 78)
2. “Setting up management hubs” (page 151)
3. “Linux OS installation” (page 78)
4. “Setting up managed systems for monitoring” (page 78)
8.
8.3.1 Opening network ports on managed systems The network ports listed in Table 12 are used for communication between the managed systems and the CMS. These ports must be open to network traffic. If you used Insight Control for Linux to install an OS and you used a configuration derived from a supported template, the firewall is enabled by default and Insight Control for Linux opens the ports listed in Table 12 automatically.
# /bin/hostname
If the node does not report a host name, set one or configure DHCP to assign one. DHCP configuration information is located in the HP Insight Control for Linux Installation Guide.
2. On the CMS, run the mxgethostname command with the host name obtained in the previous step. For this example, the host name is venus:
# mxgethostname -n venus
If the CMS recognizes the host name, command output is similar to the following:
Host name: venus.example.com
DNS Name: venus.example.
Figure 18 Installing providers and agents
3. Select Next>.
4. Review the settings for Configure or Repair Agents, as shown in Figure 19. Insight Control for Linux requires you to make settings in the Configure SNMP and Configure secure shell (SSH) access authentication sections of this screen.
8.
Figure 19 Settings for configure or repair agents
5. Make the following settings to configure SNMP:
• Select Set read community string and enter the value for your network configuration.
NOTE: To discover or identify a server that becomes a managed system, HP SIM requires that an SNMP read community string be set to public in the global credentials for that server. Other read community strings can be set in addition to public, but public must be specified.
6. 7. 8.
9. Select the Use the following credentials for all systems radio button and supply the managed system credentials, which are typically the root user name and password.
10. Select Run Now.
Selecting a protocol that is not supported in your environment causes an error, and the task reports its status as failed. Even if this happens, it is possible that the SNMP and SSH settings required for Insight Control for Linux were configured correctly. Look at the task results to verify this.
If you are installing Xen on a managed system or if you do not specify the console assignment, you must modify several files on each managed system for serial console monitoring to function, provided that the BIOS is configured as described in step 1 above. These files are:
• /boot/grub/menu.lst
• /etc/inittab
• /etc/securetty
NOTE: The following procedure assumes that COM1 is assigned to the virtual serial port in the BIOS, so ttyS0 is used.
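Before editing the three files listed above, it can help to see which of them already mention the serial device. The following is a minimal sketch, not part of Insight Control for Linux: the function name check_serial_console and its optional root-prefix argument are hypothetical, and the check is deliberately coarse (it only looks for the string ttyS0 anywhere in each file, including commented lines).

```shell
# Report which serial-console files already mention the serial device.
# Assumption: the device is ttyS0 (COM1), as stated in the NOTE above.
check_serial_console() {
    root=${1:-}        # optional path prefix, for example a mounted image
    dev=${2:-ttyS0}    # serial device name to look for
    for f in /boot/grub/menu.lst /etc/inittab /etc/securetty; do
        if grep -q "$dev" "$root$f" 2>/dev/null; then
            echo "ok      $f"
        else
            echo "missing $f"
        fi
    done
}
```

Run check_serial_console with no arguments on a managed system. A file reported as missing still needs the changes that the procedure describes; a file reported as ok should still be reviewed by hand.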
1 The backslash (\) in this example indicates line continuation. Do not enter a backslash character in your file.
2 Add console=ttyS0 here. Make sure you enter the number zero, not the letter O.
For managed systems that are virtual hosts: Look for the default= attribute, add com1=115200,8n1 console=com1 to the kernel entry for HP BladeSystems, and add console=ttyS0 to the module entry. For example:
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.
Example 2 Excerpt from sample /etc/ssh/sshd_config file
# To disable tunneled clear text passwords, change to no here!
#PasswordAuthentication yes
#PermitEmptyPasswords no
PasswordAuthentication yes
8.4 Next steps
If you are installing and configuring Insight Control for Linux for the first time, proceed to Chapter 19 (page 154) to configure Insight Control for Linux monitoring services.
9 Installing operating systems on managed systems This chapter addresses the following topics: • “Linux OS installation overview” (page 87) • “Using installation configuration files for unattended installations” (page 88) • “Prerequisites to OS installations on managed systems” (page 92) • “Installing RHEL on managed systems” (page 94) • “Installing SLES on managed systems” (page 95) • “Installing VMware ESX and VMware ESXi operating systems” (page 96) • “Installing another variant of Linux on
Table 13 Types of Installation Sessions (continued)
Installation | Interactive | Unattended
Custom or Other | Custom or Other Interactive | Custom or Other (Unattended)
VMware ESX | VMware ESX Interactive | VMware ESX (Kickstart)
VMware ESXi | VMware ESXi Interactive |
For more information about using Kickstart and AutoYaST files for unattended installations, see Section 9.2 (page 88).
NOTES:
◦ The template file is written for 64-bit installations. If you want to install a 32-bit OS, you must modify the installation configuration file accordingly.
◦ The AutoYaST templates that HP supplies use DHCP for hostname/DNS configuration. If there are multiple interfaces configured through DHCP, then the operating system can take the host name from any of them during the installation.
The following table provides a few examples:
OS version | Directory name in /opt/repository/instconfig
SLES Version 10 Service Pack 4 | sl104
SLES Version 11 Service Pack 1 (for Management Hubs) | sl111-management-hub
SLES Version 11 Service Pack 1 (for Xen Virtual Hosts) | sl111-virt-host-xen
SLES Version 11 Service Pack 1 (for Xen Virtual Guests) | sl111-virt-guest-xen
RHEL Version 5 Update 7 | rh057
RHEL Version 5 Update 7 (for Management Hubs) | rh057-management-hub
• RHEL Version 6 Update 1 (for KVM Vir
Table 14 Insight Control for Linux macros for installation configuration files Macro name Description %%agentinstall%% This macro is unique to Insight Control for Linux. During installation, it expands into a shell script that downloads the SPP or PSP components from the CMS and installs only the packages that HP SIM and Insight Control for Linux need to be able to monitor the managed system properly.
9.2.3 Installation configuration files for custom operating systems You can upload installation configuration files for unsupported operating systems into the Insight Control for Linux repository. However, the OS installation process does not have a built-in mechanism for linking the installation configuration files to a given installation.
9.3.1 Additional prerequisites for certain ProLiant servers Some server/operating system combinations require an updated boot RAM disk (initrd). These servers are identified in the HP Insight Control for Linux Support Matrix. To perform a RHEL Kickstart or SLES AutoYaST installation on these servers, you must replace the initrd supplied with the standard Red Hat Linux or SUSE Linux distribution with a customized initrd provided by HP.
Table 15 Download web address for customized initrd files and Driver Kit images (continued) Operating system Architecture Download address Then download and install this customized initrd file for the RAID driver: ftp://ftp.hp.
9.4.2 Installing RHEL interactively An interactive installation method requires interaction with the RHEL installation user interface. Other than PXE booting from the selected OS release, update, and architecture, Insight Control for Linux provides no other automated configuration service with this interactive method. You must interact with the OS installer through the selected console type.
9.5.2 Installing SLES interactively An interactive installation requires interaction with the SLES installation process. Other than booting from the selected OS release, service pack, and architecture, Insight Control for Linux provides no other automated configuration services with this interactive method. You must interact with the OS installer through the selected console type.
IMPORTANT: Installing a virtualization OS on a system erases data on that system. Before you begin, be sure that you have captured or backed up any data you want to retain. Preserving user data on volumes other than the principal target volume is not guaranteed. Presume that data on primary and secondary volumes is erased.
The tasks for installing the virtualization OS are launched from the following HP SIM menu:
Deploy→Operating System
9.6.
If you want the target system to use the default root password (root), select the Use Default Root Password option. To set a root password other than the default, select the Specify Root Password option, enter the root password, and verify the entry. HP recommends setting a strong root password on all your servers.
9. Do one of the following to start the installation:
• Select Run Now to launch the installation operation immediately.
• Select Schedule to schedule the installation to occur in the future.
IMPORTANT: The list contains only those virtualization operating systems that are registered in the repository and copied to it. If you select a virtualization OS that was registered, but the installation files were not copied to the repository, a validation error appears. 6. 7. Specify the kernel append line to add additional kernel command line parameters. The kernel append line is added to the end of the installation RAM disk kernel line; however, you do not need to provide any information.
The general use of the Custom or Other installation tool is not officially supported because you, and not Insight Control for Linux, must manage most of the boot and installation process. Also the agents that HP SIM requires for long term monitoring and manageability of your managed systems are not provided in SPPs or PSPs and thus are not installed automatically.
TIP: When you create a script, make sure that it is executable by the root user.
4. Copy the scripts to the /opt/repository/custom/MyOS directory.
5. For unattended installations, register the installation configuration script with the Insight Control for Linux Repository. For information on registering an installation configuration file, see “Registering operating systems” (page 47).
6. Select the Deploy→Operating System→Custom or Other menu item to begin the installation.
7. If you are performing an unattended installation, select the installation configuration file (either a Kickstart or AutoYaST file) for the OS type and version you are installing, and select Next>. Otherwise, skip this step. When you register an installation configuration file, you identify which operating systems it applies to. This association feeds into the list of configuration files that are available for a particular installation operation.
• Select Schedule to schedule the OS installation to occur in the future. 11. Examine the Task Results window to follow the progress of the installation operation and the related task states. 9.
10 Capturing and deploying Linux images
This chapter addresses the following topics:
• “Overview of capturing and deploying Linux images” (page 104)
• “Prerequisites to capturing a Linux image” (page 106)
• “Capturing a Linux image from a managed system” (page 109)
• “Preparing for scalable deployment” (page 110)
• “Deploying a captured Linux image to one or more managed system” (page 113)
• “Insight Control for Linux partition wizard overview” (page 116)
10.
NOTE: Because capturing or deploying a very large image over a slow network can take a long time, Linux image capture and deployment tasks have a timeout of five days. The timeout lets you determine whether an operation has hung. HP recommends that you check your task results to verify the status of any running jobs.
10.1.1 File system types
Table 16 lists the supported and unsupported file system types on the source and target managed systems for Linux image capture and deployment tasks.
The script is run in a chroot environment so there is no need to configure paths relative to the Insight Control for Linux environment. For information on how these scripts can be used, see the comments in the example scripts provided with Insight Control for Linux. 10.1.
Table 17 Source and target deployment requirements
Item | Requirement
Server type | The hardware models of the source and target managed systems must be the same. For example, if you capture an image from an HP ProLiant BL460 Gen8 server, you can only deploy that image to another BL460 Gen8 server.
Memory | Differences in the amount of memory on the source and target managed systems are permitted.
Number of NICs | Differences in the number of NICs on the source and target managed systems are permitted.
If you must use static addresses and host names, create a Postdeployment script capable of setting these values. • For SLES images, change the hard links to soft links before capturing the image. SLES relies on the use of hard links within its file system, and the tar command that captures the image captures those hard links.
Field position | Description | Example
5 | Dump option | 1
6 | fsck option | 0
The following example does not include the contents of /scratch in the captured image (because the dump flag is set to 0). During the image deployment operation, the disk is repartitioned and /scratch is an empty file system.
/dev/sdc1 /scratch ext3 defaults 0 0
10.3 Capturing a Linux image from a managed system
IMPORTANT: Remember that captured images are retrieved through a web server interface that allows anonymous access.
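Because the /etc/fstab dump field decides whether a file system's contents are captured, it can be useful to list the affected mount points before starting a capture. The following is a minimal sketch using standard awk; the function name excluded_mounts is hypothetical, and it only considers lines that carry an explicit dump field.

```shell
# List mount points whose dump field (field 5) is 0 in an fstab-style file.
# Per the text above, the contents of these file systems are not captured.
excluded_mounts() {
    # Skip comment lines and lines without an explicit dump field,
    # then print the mount point (field 2) when the dump flag is 0.
    awk '$1 !~ /^#/ && NF >= 5 && $5 == 0 { print $2 }' "${1:-/etc/fstab}"
}
```

Run excluded_mounts with no argument to inspect /etc/fstab on the source managed system. Entries such as swap, which normally have a dump flag of 0, also appear in the output.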
TIP: For information about the importance of choosing unique names for items in the repository, see Section 5.1.3 (page 46).
A unique identifier and the date and time when the task occurred are appended to the name you supply. The image is stored as a gzipped tar file in the /opt/repository/image/{prefix_date_and_time} directory on the CMS.
7. Select a Precapture script, a Postcapture script, or both. A Precapture script is run on the managed system before the image is captured.
Figure 20 Network groups example The concept behind a scalable deployment is to transfer an OS image tar file from the CMS to the group leader in each network group. After the image tar file is completely transferred, the group leader transfers the image to each of the remaining servers in the network group. The advantage to this concept is that all network traffic is kept local to the switch or enclosure.
The Customize Collections window appears.
2. Select New... in the Customize Collections window. A new section titled New Collection appears at the bottom of the Custom Collections window.
3. Select the Choose members individually radio button.
4. Select All Servers from the Choose from: menu. This action populates the Available Items: list with the available servers.
5. Perform the following steps for each switch you have:
a.
f. Select Save As Collection... The Save As Collection portion appears.
g. Enter a name for this network group. The name is used only to associate the managed systems in the network group.
h. Select Existing collection: and choose the Network Groups menu item.
i. Select OK to continue.
j. Generate the netgroups.conf file with the following command:
# /opt/hptc/bin/netgroup --ofile /opt/mx/icle/netgroups.conf
k. Examine the netgroups.conf file to verify the collection entry for the group.
IMPORTANT:
• If you are deploying the image to a software RAID array or an LVM volume, HP recommends that you wipe the disk or disks that will receive the image.
• Before deploying a 64-bit OS image to an AMD Opteron 6200 server, add the following entry to the /opt/mx/icle/icle.
• Select the Create partition scheme from wizard option if you want to customize the disk partition layout on the target managed system, and the following table appears: Figure 21 Existing disk partition scheme See Section 10.6 (page 116) for a general overview of the Partition Wizard and how to use it to edit disk partitions and volume groups. Select Next> after you have completed customizing the disk partition layout. 9.
10.6 Insight Control for Linux partition wizard overview The Insight Control for Linux Partition Wizard is a generic hybrid of the Red Hat and Novell Partition Wizards. The Partition Wizard does not have logic to examine the managed systems on which it is used, thus you must have prior knowledge of the storage hardware. The Partition Wizard enables you to capture an image with one partition scheme and then to deploy the image to one or more managed systems with a more customized partition scheme.
• If you are capturing and deploying a reiserfs or an ext3 partition type, ensure that the mount points are set, as required. Partition types swap and lvm do not have mount points. The Partition Wizard permits you to proceed without specifying mount points for the reiserfs and ext3 partition types, and it does not detect the missing mount points. This might cause the deployment to fail, and the failure is indicated in the Task Results. • The Partition Wizard does not save entered values for reuse.
The initial Partition Wizard table is divided into two sections: Hard Drives, the top of the table that shows the physical devices, and Volume Groups, the bottom part of the table that shows logical volumes:
• The Hard Drives section represents the physical media on the server. You must have prior knowledge about the hardware in order to add the correct number of disks. You can add a maximum of 16 disks to the Hard Drives section along with a maximum of 16 partitions per disk.
11 Installing and setting up virtual machines This 1. 2. 3. 4. 5. 6.
2. Set the Global Sign-In credentials for the virtual host with the Options→Security→Credentials→Global Credentials... menu item.
3. Install the operating system with virtualized configuration on the physical server of your choice. Chapter 9 (page 87) describes the steps for using Insight Control for Linux to install a Linux operating system.
4. Run Options→Identify Systems... to verify the installation.
The next step is to register the virtual host with the virtual machine management.
11.
3. Examine the system page for the virtual host with the Tools→System Information→System Page... task to verify that Insight Control virtual machine management is configured correctly. Locate the System Subtype row under Product Description. The description should contain the text Virtual Machine Host.
11.3 Creating and installing virtual guests
Generally this section discusses how:
• To create the virtual guest: HP suggests that you use the vCenter application for VMware ESX and VMware ESXi.
6. 7. Boot the VM guest, and proceed through an interactive install. Perform a network installation using an installation configuration file from the Insight Control for Linux repository. Be sure to specify any required kernel parameters. The following is an example of the response to the boot prompt. boot: linux ks=http://cms:port/instconfig/os/os.
TIP: Match the machine name to the host name in a virtual machine map. See Section 23.10 (page 196). • Ensure that the localhost (QEMU) is connected. If the localhost entry is missing, select File→Add Connection, then select QEMU/KVM as the hypervisor, specify that the connection is Local, and select Connect. If the localhost entry exists but is not connected, right-click on the localhost entry and select Connect. • Start the procedure by selecting New. • Specify a unique name for the virtual guest.
11.3.2.2 Installing a SLES KVM virtual guest Use the following guidelines for installing a SLES KVM virtual guest: • Verify that the AutoYaST file for the virtual guest resides in the /opt/repository/instconfig/osver-virt-guest-kvm directory on the CMS, where osver indicates the operating system version, for example, sl111. The format of the AutoYaST file name is osver-virt-guest-kvm.cfg • Installing a SLES KVM virtual guest requires an ISO. Download the ISO and copy it to the KVM virtual host.
◦ Accept the default values for the Power Off, Reboot, and Crash options.
• Before you select OK to start the installation, be advised that you have 20 to 30 seconds to specify that an Installation is to be performed on subsequent screens. If the timeout elapses, the virtual guest attempts to boot from the hard disk. The following needs to occur within this time:
◦ The virtual guest console should open automatically after you select OK.
For information on licensing virtual guests, see Section 3.3 (page 28). IMPORTANT: The RHEL Kickstart and SLES AutoYaST configuration template files for virtual guests are delivered with a hard-coded root password, which poses a security issue if used without modification.
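One quick way to review the root password directive in a Kickstart file before using it is to print every rootpw line. This is a minimal sketch, not an HP-supplied tool: the function name show_rootpw is hypothetical, it covers only the Kickstart rootpw directive, and AutoYaST XML files need a separate check.

```shell
# Print the root password directives found in a Kickstart file,
# prefixed with their line numbers, so they can be reviewed and changed.
show_rootpw() {
    grep -nE '^[[:space:]]*rootpw([[:space:]]|$)' "$1"
}
```

Pass the path of the Kickstart file as the only argument. A line without the --iscrypted option holds the password in clear text.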
The SLES AutoYaST file is located in /tmp/osver-virt-guest.cfg. • Specify the Simple file option for the storage space assignment. • Select an available physical device for the connection to the Host Network, for example, peth0. • To monitor a virtual guest, it must be assigned a well-known IP address. This can be either the static IP address that you entered when you installed the virtual guest or, if you used DHCP, the fixed IP address that maps to the MAC address you establish.
11.5 Establishing monitoring for virtual hosts and virtual guests
NOTE: Insight Control for Linux does not support monitoring of VMware ESXi virtual hosts or virtual guests running Microsoft Windows.
Configuring a virtual host or a virtual guest for monitoring follows the same procedure as for physical managed systems. In short, the procedure consists of the following Insight Control for Linux menu items:
1. Configure→Configure or Repair Agents... on the virtual guest.
NOTE: For specific commands, see the virsh(1) and virt-manager(1) manual pages that accompany your KVM distribution. 11.
12 Using Insight Control for Linux to update HP ProLiant firmware This chapter addresses the following topics: • “Overview of updating HP ProLiant firmware” (page 130) • “Basic firmware update functionality” (page 130) • “Advanced firmware update functionality” (page 134) 12.1 Overview of updating HP ProLiant firmware Keeping firmware up to date is a challenging but necessary task. Each ProLiant server usually has several devices that require regular firmware updates, which can create a burden.
12.2.1 Initial setup Before you can initiate a firmware update on a server, you must download and prepare the firmware files and tools that do the work. Insight Control for Linux uses the HP Smart Update Firmware DVD or Service Pack for ProLiant (SPP) for all firmware updates. Downloading and installing these files is a one time setup operation, although when new versions of the HP Smart Update Firmware DVD or SPP become available, update the tools on your CMS.
You are asked to verify the targets and license any unlicensed nodes. On the next screen, you can optionally enter option flags for the hpsum command. For normal operation, do not enter anything in this screen. If you want to specify any option flags for the hpsum command, enter them in the text field provided. For more information on the hpsum command's option flags, see Section 12.2.5 (page 132). Select Run Now to start the update process or Schedule to schedule it for a later time.
12.2.6 Adding or removing firmware files from the firmware tar file
HP continuously releases new firmware for devices, and these new releases are usually in the next revision of the Smart Update Firmware DVD. However, there might be times when you want to use this new firmware before the next DVD is released. There also might be times when you do not want to update the firmware on a specific device, and so you do not want that device’s firmware file in the firmware tar file.
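As a generic illustration of dropping one file from an uncompressed tar archive, the following sketch unpacks the archive into a temporary directory, deletes the unwanted member, and repacks the archive in place using only standard tar options. The function name remove_from_tar and the file names are hypothetical, and this is not necessarily the procedure that Insight Control for Linux expects for its firmware tar file, so work on a copy of the archive.

```shell
# Rebuild an uncompressed tar archive without one member.
# The body runs in a subshell so its variables do not leak to the caller.
remove_from_tar() (
    archive=$1
    member=$2
    workdir=$(mktemp -d) || exit 1
    # Unpack, delete the unwanted file, and repack over the original archive.
    tar -xf "$archive" -C "$workdir" &&
        rm -f "$workdir/$member" &&
        tar -cf "$archive" -C "$workdir" .
    status=$?
    rm -rf "$workdir"
    exit "$status"
)
```

For example, remove_from_tar firmware-copy.tar unwanted-component.scexe rewrites firmware-copy.tar without that file. After repacking, member names carry a leading ./ prefix; list the result with tar -tf to confirm the contents.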
FW_UPG_WAIT_TIMEOUT=900 Determine the value (in seconds) that is appropriate for your installation and assign it to the FW_UPG_WAIT_TIMEOUT value. 12.3 Advanced firmware update functionality Insight Control for Linux incorporates advanced firmware update options if you require more control over exactly which versions of firmware should be installed and on which systems.
IMPORTANT: Ensure that system values are unique in the file. For example, there should not be two identical MAC addresses in the same configuration file. Wildcards are not supported in the configuration file. MAC addresses are case insensitive and must be separated by colons (:).
12.3.2 Example firmware configuration files
The following are examples of configuration files:
Example 1
prod-server-1=production-firmware.tar
prod-server-2=production-firmware.tar
172.31.64.
13 Installing SPPs and PSPs on managed systems This chapter addresses the following topics: • “Overview of the SPP and PSP installation tool” (page 136) • “Required SPP and PSP components” (page 136) • “Creating a SPP or PSP dependency script” (page 137) • “SPP or PSP installation procedure” (page 138) 13.1 Overview of the SPP and PSP installation tool The Insight Control for Linux SPP and PSP installation tool enables you to install any or all SPP or PSP components on one or more managed systems.
1 This agent is installed on servers with iLO 4 management processors. While HP SIM requires hp-ams so that it can use embedded features of the iLO 4 management processor, Insight Control for Linux does not use it. If you want to use only the hp-ams agent on your iLO 4–based servers, you must manually remove the other agents.
2 The Agentless Management Service (AMS) is responsible for sending all host operating system-specific information to the iLO 4 firmware.
http://172.0.0.4:60000/os/RHEL6ESU2-x64/RedHat/RPMS/kernel-devel-2.6.9-67.EL.x86_64.rpm
# Exit with 0 status - a non zero status will generate an error
exit 0
SPP dependency scripts and PSP dependency scripts have the same form and function. Managed systems are rebooted when the SPP or PSP installation script is finished, regardless of the outcome of the SPP or PSP installation.
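The dependency script fragment above installs a package from the repository on the CMS and then exits with status 0. The following is a minimal sketch of how such a script might build the package URL and skip packages that are already installed. The variable names and the rpm_url and install_dependency functions are hypothetical; the address, port, and directory layout are copied from the fragment above and must be replaced with the values for your CMS.

```shell
# Hypothetical values: CMS address, repository port, and OS directory.
CMS_HOST=172.0.0.4
CMS_PORT=60000
OS_DIR=RHEL6ESU2-x64

# Build the repository URL for a package file name.
rpm_url() {
    printf 'http://%s:%s/os/%s/RedHat/RPMS/%s\n' \
        "$CMS_HOST" "$CMS_PORT" "$OS_DIR" "$1"
}

# Install a package from the CMS only if it is not already installed.
# $1 is the package name, $2 is the RPM file name in the repository.
install_dependency() {
    rpm -q "$1" >/dev/null 2>&1 || rpm -ivh "$(rpm_url "$2")"
}

# Example call, left as a comment so nothing is installed by accident:
# install_dependency kernel-devel kernel-devel-2.6.9-67.EL.x86_64.rpm
```

As in the original fragment, end the dependency script with exit 0, because a nonzero status is reported as an error.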
NOTE: For a list of PSP configuration parameters and their descriptions, see the HP ProLiant Support Pack User Guide. For instructions on how to obtain this document, see Section 26.7.2 (page 251). You must ensure that any changes made to the configuration parameter set are valid. No verification of the configuration parameters is performed. The configuration values you specify here are not saved. To preserve them, use a text editor to cut and paste them from this window into a file. 9.
14 ISO control operations ISO Controls allow you to boot from an ISO image, insert an ISO image, and eject an ISO image on iLO-based managed systems. You can use this functionality to perform interactive OS installations from OS distribution ISOs, including Windows. The ISO image must be registered in the Insight Control for Linux repository before you can perform these operations. For information on registering an ISO image, see “Registering an ISO image” (page 52).
15 Remote server controls The menu items on the Tools→Server Controls menu enable you to remotely manage power control on a physical managed system. IMPORTANT: Be aware that the Insight Control for Linux server controls operate by contacting the management processor of the server directly and executing the requested power function. That means that servers are powered off or cycled abruptly without a graceful shutdown.
16 Using SSH for remote server management Insight Control for Linux provides several ways for you to access a managed system through SSH. This chapter addresses the following topics: • “Setting SSH credentials on managed systems” (page 142) • “Setting SSH credentials for users” (page 142) • “Running a command on multiple managed systems” (page 143) • “Using Insight Control for Linux to run commands and scripts through SSH” (page 144) 16.
Deploy→Operating System→Capture Linux Image
On the Task Results screen, the Task Instance Results always shows the user who launched the task. That user's credentials are not necessarily the ones used to execute the task: because different target managed systems can have different users specified in the SSH settings, the same task can run on different targets as different users.
16.
16.4 Using Insight Control for Linux to run commands and scripts through SSH
The following menu items enable you to run a script or command through SSH to one or more managed systems:
• Tools→Command Line Tools→Run SSH Command...
• Tools→Command Line Tools→Run Script...
16.4.1 Running an SSH command
The Tools→Command Line Tools→Run SSH Command... menu item runs a command on a target managed system.
The Run Script... task feeds the command lines in the script to an SSH instance on the target system. The script is a series of command lines to be run on the target system using SSH. The Linux script you run must be located in the Insight Control for Linux repository in the /opt/repository/script directory. You must ensure that the Linux script does not leave any open file descriptors upon completion (including scripts you might have called).
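A common way for a script to leave file descriptors open is to start a background job that inherits the session's standard input, output, and error. The following is a minimal sketch of one way to avoid that, using the standard nohup utility and redirections; the function name run_detached is hypothetical.

```shell
# Start a long-running command in the background without holding the
# SSH session's descriptors open.
run_detached() {
    # stdin, stdout, and stderr all point at /dev/null, so the session
    # can close as soon as the calling script returns.
    nohup "$@" </dev/null >/dev/null 2>&1 &
}
```

In a script, call it as run_detached followed by the command and its arguments. If the job's output is needed, redirect it to a log file instead of /dev/null.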
Part III Monitoring
17 Managing Insight Control for Linux collections This chapter addresses the following topics: • “Introduction to collections” (page 147) • “Populating a collection” (page 148) • “Adding servers and switches to an Insight Control for Linux collection” (page 148) • “Removing a managed system or switch from an Insight Control for Linux collection” (page 149) 17.
Table 20 Insight Control for Linux subcollections (continued)
Object type | Subcollection name | Description | How populated
servers that Insight Control for Linux manages.
Switches | {collection_name}_Switches | Insight Control for Linux monitors all switches placed in this subcollection. | Populated manually only.
Management Hubs | {collection_name}_Management_Hubs | This subcollection contains all the servers that are designated as management hubs. | Populated manually. For information, see Section 18.2
◦ Install the complete ProLiant Support Pack (optional)
◦ Configure SNMP and SSH keys
◦ Configure console access and logging
2. Add the servers or switches to the Insight Control for Linux collection:
a. Select Customize... in the left pane of the HP Insight Control user interface.
b. Scroll down the name column until you see Systems Managed by IC-Linux.
c. Select the plus sign (+) to expand it.
d. Scroll down until you see your Insight Control for Linux collection.
e.
2. Remove the managed systems or switches from the Insight Control for Linux collection:
a. Select Customize... in the left pane of the HP Insight Control user interface.
b. Scroll down the name column until you see Systems Managed by IC-Linux.
c. Select the plus sign (+) to expand it.
d. Scroll down until you see your Insight Control for Linux collection.
e. Select the plus sign (+) to expand it.
f. Scroll down until you see the Insight Control for Linux subcollections.
g.
18 Setting up management hubs 18.1 About management hubs A management hub is an aggregation point for management activities. Insight Control for Linux uses management hubs to distribute the management load across multiple servers. HP recommends creating multiple management hubs if you plan to monitor over 256 managed systems. You have the option of choosing any physical server to act as a management hub; you can elect to use the CMS as a management hub or not.
2. Install the operating system for that server using the appropriate Kickstart or AutoYaST file; this file has the form *-management-hub.cfg to ensure that the required RPMs are installed. For specific information on installing operating systems, see Chapter 9 (page 87).
3. Add the server to the {collection_name}_Management_Hubs collection as follows:
a. Select Customize... in the System and Event Collections panel. This figure shows the location with a red arrow.
There are two text fields, Collection name and Choose from, and two lists, Available items and Selected Members.
e. Select All Servers from the Choose from: menu. This action populates the Available Items: list with the available servers.
f. Select the server from the Available Items: list. You can use Ctrl-Left Mouse for multiple selections.
g. Use the >> button to move the selected servers from the Available Items: list to the Selected Members: list.
h. Select OK.
i.
19 Configuring monitoring services
This chapter describes how to configure Insight Control for Linux monitoring services. In addition to Section 19.1 (page 154), this chapter addresses the following tasks, which you must complete in this order:
1. “Configuring a self-signed Apache certificate on the CMS” (page 154)
2. “Starting management and monitoring services” (page 154)
3. “Installing Insight Control for Linux management agents” (page 156)
4.
• It also deploys the Insight Control for Linux management agents to all servers in the {collection_name}_Servers subcollection. For information on managing subcollections, see Chapter 17 (page 147).
Insight Control for Linux monitors only the objects in these collections:
• The servers in the {collection_name}_Servers subcollection. Depending on your response to the Auto-populate option, this subcollection contains either all licensed servers, which are added automatically, or only the servers that you place in it manually.
• Enter no if you want Insight Control for Linux to manage and monitor only the servers that you manually put in the {collection_name}_Servers collection.
TIP: Populate your collections manually before proceeding.
5. Select Run Now. This task can take several minutes to configure services. The Stdout tab shows the scripts that are running, and Done appears when this task is complete.
6.
3. Ensure that the pdsh command can run a command across all the managed systems. For example:
# pdsh -a uptime
pluto: 3:22pm up 0:49, 1 user, load average: 0.47, 0.47, 0.40
charon: 11:02am up 0:49, 1 user, load average: 0.38, 0.36, 0.36
poseidon: 9:46am up 1 day 4:46, 3 users, load average: 1.10, 1.23, 1.34
4. Verify that the nrpe daemon is working on all the managed systems with the following command:
# /opt/hptc/nagios/libexec/gather_all_data --verbose
write 4048, 2, 2, eth1 to db => icelx2 (charon.
If Warnings Are Reported If one or more warnings are reported in the Warning column, use the analyze option to obtain an analysis of the problem. When possible, the command output provides potential corrective action or the reasons for a given state.
20 Using graphical tools to monitor managed systems
This chapter addresses the following topics:
• “Insight Control for Linux system monitoring overview” (page 159)
• “Nagios overview” (page 160)
• “Using Nagios” (page 163)
• “Services monitored by Nagios” (page 171)
• “Understanding Nagios alert messages” (page 173)
• “Understanding system event log monitoring” (page 174)
• “Configuring Nagios email alerts” (page 174)
• “Monitoring Metrics in real time” (page 175)
20.
NOTE: Insight Control for Linux does not support monitoring of virtual hosts running VMware ESXi , and does not support servers or virtual guests running Microsoft Windows. 20.1.1 Collecting metrics through a management processor Insight Control for Linux supports management processors using the iLO or IPMI protocols for gathering sensor and system event log information. To access a system’s management processor, you must configure the management processor credentials in HP SIM.
Nagios obtains its sensor and metric data from the Supermon open source monitoring application, which is integrated with the Insight Control for Linux. Figure 23 illustrates the interaction of these tools. Figure 23 System monitoring tools integration The mond and syslog daemons run on every managed system. The Supermon service manages requests for mond daemons that run on a subset of systems.
20.2.2 Launching Nagios To launch Nagios, you must have a valid certificate for the Apache service. To configure an Apache certificate, see Section 19.2 (page 154). Select the following menu item from the Insight Control user interface to launch Nagios: Tools→Integrated Consoles→Nagios The Nagios main window shown in Figure 24 appears when you launch Nagios. Figure 24 Nagios main window From the Nagios main window, you can choose any of the menu options on the left navigation bar.
Hosts Services Host Groups Summary Grid Service Groups Summary Grid Problems Services (Unhandled) Hosts (Unhandled) Network Outages Reports Availability Trends Alerts History Summary Histogram Notifications Event Log HP Graph System Comments Downtime Process Info Performance Info Scheduling Queue Configuration NOTE: The term Hosts on the Nagios window refers to any object with an IP address, not just managed systems. Keep this in mind when using the Nagios application. 20.
Figure 25 Nagios tactical overview The top of the window provides information about the network. It provides the number of network outages and information on the network health in terms of the Nagios hosts and Nagios services. The next portion of the window contains information about the Nagios hosts. It reports the number of hosts that are down, unreachable, up, and pending. In Figure 25, two hosts are down.
A disabled service is a configuration status, not an error condition. Insight Control for Linux takes advantage of the Nagios passive check feature to optimize and to minimize data collection and reporting across large numbers of managed systems. NOTE: HP recommends that administrators do not enable these services because they are not meant to run under normal conditions and cause Nagios to generate false alerts. Nagios services are described in the next portion of the window. 20.3.
Figure 27 Nagios service detail view The Status column displays any problems that might be occurring. To display the status of a service, select the link for the service in the Service column to open the Nagios Service Information view shown in Figure 28.
Figure 28 Nagios service information view 20.3.3 Displaying hosts and services that are experiencing problems The Service Problems view, which is accessed by selecting Problems Services (Unhandled) in the Nagios menu, is useful for configurations with hundreds of systems. It identifies the Nagios hosts that are experiencing problems, and it shows only the corresponding Nagios services with status that is not OK, which enables you to monitor only those Nagios hosts that need attention.
Figure 29 Nagios service problems view Select the link that corresponds to a Nagios host to open the Nagios Host Information view for that Nagios host. You can also use the Nagios report generator, nrg, to obtain an analysis of Nagios services: # nrg --mode analyze For more information and examples of its use, see nrg(8). 20.3.
Figure 30 HP Graph default overview display Figure 31 HP Graph detail display of managed systems If you want to display the graphical data for a selected Nagios host (a Nagios host can be a virtual host), select an item in the menu in the upper left-hand side. Figure 32 (page 171) shows the graphs for one managed system, osmone. The following menus and menu items control the information you can display for a managed system: • The Metric menu influences the information shown in the graphs.
• • cpu iowait Reports the percentage of time the system was waiting for I/O to complete or to handle an interrupt. cpu system Shows how much of the CPU time was spent on system-level tasks. cpu usage Reports how much of the managed system's CPU set was spent in the user, system, and nice states. This is the default view. load average Reports the 1, 5, and 15 minute load averages. mem buffers Shows how much of the managed system's memory is allocated to system-wide memory buffers.
Figure 32 HP Graph host display for one managed system 20.3.5 Gathering and displaying system environment data Insight Control for Linux provides plug-ins that monitor the environment data on each managed system such as temperature and fan speed, which can be indicators of possible system failure. To display environment data, select the Service Problems menu item in the left frame of the Nagios main window to open the Service Status for All Hosts window.
Table 21 Nagios monitoring plug-ins running on the CMS Service name Plug-in name Function/Description Apache HTTPS Server check_http Monitors the Web server providing the Nagios Web interface. Configuration Monitor check_node_config Periodically generates and updates configuration information for managed systems. IP Assignment DHCP check_procs Watches the DHCP service on the CMS. Management Settings Monitor check_nagios_vars Watches the /opt/hptc/etc/sysconfig/vars.
Table 22 Services monitored on managed systems (continued) Service name Function/Description The System Event Log is collected through the management processor, either an iLO or an IPMI BMC. System Events are hardware-related alerts such as memory errors, power supply faults, and so on. System Free Space2 Displays the system free space in /root, /tmp, /var, and /hptc_cluster. This data is compared to thresholds defined in the nagios_vars.ini file.
2 Critical
other Unknown
3 The name of the Nagios service description. For more information, see the corresponding /opt/hptc/nagios/etc/templates/*_template.cfg template file.
4 The alert applies to this host name.
5 The IP address of the host.
6 The message text generated from the plug-in.
In the following example, indicates that the Nagios monitor running on iclx47 collected this data.
host_notification_period        24x7
service_notification_options    w,u,c,r
host_notification_options       d,u,r
service_notification_commands   notify-by-email,notify-by-epager
host_notification_commands      host-notify-by-email,host-notify-by-epager
email                           nagios@localhost.localdomain
pager                           nagios@localhost.localdomain
}
Changing the values for email and pager to reflect the system name enables Nagios to send notifications through the sendmail utility. For example, change nagios@localhost.localdomain to nagios@example.com.
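The address change described above can also be scripted. The following Python sketch is only an illustration, not part of Insight Control for Linux: the function name set_contact_addresses and the sample text are assumptions, and the sketch works on a string so that you decide which contact file to read and write.

```python
import re

# Matches an "email" or "pager" directive line; group 1 keeps the directive
# name and its original indentation and spacing.
_DIRECTIVE = re.compile(r"^([ \t]*(?:email|pager)[ \t]+)\S+[ \t]*$", re.MULTILINE)


def set_contact_addresses(contact_text, address):
    """Return contact_text with the email and pager values set to address.

    Hypothetical helper for illustration; it does not touch any file.
    """
    return _DIRECTIVE.sub(lambda m: m.group(1) + address, contact_text)


# Illustrative sample only; the real contact definition has more directives.
SAMPLE = """\
define contact{
        host_notification_period        24x7
        email                           nagios@localhost.localdomain
        pager                           nagios@localhost.localdomain
        }
"""

updated = set_contact_addresses(SAMPLE, "nagios@example.com")
print(updated)
```

Keep a backup of the original contact file before writing the updated text back, and review the result before restarting Nagios.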
• Allows user customized and predefined metrics 20.8.3 Performance Dashboard requirements The servers you want to monitor must fulfill the following requirements for using the Performance Dashboard tool; the servers must be: • Licensed for Insight Control for Linux • Configured to use Insight Control for Linux monitoring services, as described in Chapter 19 (page 154) 20.8.
Figure 34 Monitoring three metrics using Performance Dashboard 20.8.4.1 Ring plot color coding The colors that the Performance Dashboard ring plot segments use represent the following: • Light Gray means that a managed system is actively reporting data. • Pink represents the actual value of the metric. • Dark Gray means that a managed system is not reporting data and might be down.
2. Select target managed systems. You can select individual managed systems or all managed systems in the icelx_servers subcollection.
3. Select Apply to move the selected managed systems to the target list.
4. Verify the target list.
5. Select Run Now to launch the Performance Dashboard tool.
20.8.6 Using the mouse buttons to manipulate the Performance Dashboard tool
Table 23 describes how to use the mouse to manipulate the Performance Dashboard tool.
• User Time • System Time • Nice Time • Idle Time • Load Averages (1-Minute, 5-Minute, And 15-Minute Intervals) • Total Processes • Total User Processes • Total Zombie Processes • Network Received MB • Network Received Packets • Network Received Dropped Packets • Network Received Errors • Network Transmitted MB • Network Transmitted Packets • Network Transmitted Dropped Packets • Network Transmitted Errors • Total Swap • Swap In Use • Pages In • Pages Out • Pages Swa
21 Using the command line to view managed system status Insight Control for Linux provides commands that you can run on the CMS to determine the status of managed systems. This chapter addresses the following topics: • “Archiving sensor metrics on an individual basis” (page 180) • “Displaying usage, statistics, and metrics with the shownode command” (page 181) • “Displaying environmental data” (page 185) • “Reporting usage information and host and service status” (page 185) 21.
Example 6 Expanded sensor metrics
# shownode metrics sensors iclx1
Timestamp |Node_Id |Name |Value |Description
--------------------------------------------------------------------------
date_and_time |iclx1 |Temp 8 Memory |54 |Celsius; ok
date_and_time |iclx1 |Temp 5 CPU |31 |Celsius; ok
date_and_time |iclx1 |Temp 2 CPU |33 |Celsius; ok
date_and_time |iclx1 |Temp 7 CPU 2 |30 |Celsius; ok
date_and_time |iclx1 |Temp 1 System |40 |Celsius; ok
date_and_time |iclx1 |Temp 6 CPU 2 |30 |Celsius; ok
date_and_time |ic
Admin: device: gateway: hwaddr: iftype: ifusage: interface_number: ipaddr: ipv6addr: mtu: name: netmask: port: switch: install_disk: is_blade: level: location: memory: n_sockets: node_number: power_setting_dts: power_setting_on: region: server_type: ervices: gather_data: hosts: provider_type: eth2 Admin 192.0.2.3 earth.example.com Unknown (edit /etc/snmp/snmpd.
iclx1 |192.0.2.1 |earth |earth.example.com |192.0.2.7 |ILO3
iclx2 |192.0.2.2 |neptune |neptune.example.com |Unknown |Unknown
iclx3 |192.0.2.3 |saturn |saturn.example.com |192.0.2.8 |ILO3
iclx4 |192.0.2.4 |mercury |mercury.example.com |192.0.2.9 |ILO3
iclx5 |192.0.2.5 |192.0.2.5 |192.0.2.5 |Unknown |Unknown
iclx6 |192.0.2.6 |pluto |pluto.example.com |192.0.2.11 |dl1v3
The shownode info --admin command displays a list of managed systems and includes the management processor user name and password.
21.
As shown in the following example, invoking the command without specifying a managed system displays the sensor data for all managed systems. The output is truncated horizontally to fit on the page.
# shownode metrics mem
Timestamp |Node |Total |Free |Buffer |Shared |TotalHigh |TotalFree |Cached
---------------------------------------------------------------------------------------------
date_and_time |iclx3 |4039616 |240360 |135832 |0 |0 |0 |2744104
date_and_time |iclx5 |4148548 |3502700 |157864 |0 |3275096 |2819496 |407160
date_and_time |iclx2 |4048376 |2775708 |57020 |0 |0 |0 |375768
date_and_time |iclx4 |4039616 |231936 |103828 |0 |0 |0 |2264952
date_and_time |iclx6 |2054832 |1317672 |62868 |0 |0 |0
Check the sensor status on the enclosure. Verify the status of the Enclosures Collection Monitor which provides this data. nh The Enclosure Collection Monitor collects sensor information from the blade system enclosures. Enclosure status can be found in the Nagios Enclosure Status service plug-in status.
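The pipe-delimited tables that the shownode metrics command prints earlier in this chapter are easy to post-process. The following Python sketch is an illustration under stated assumptions: the helper names parse_shownode_table and nodes_low_on_free are hypothetical, the sample is copied from the memory metrics output shown in this chapter, and the 10 percent threshold is arbitrary.

```python
def parse_shownode_table(text):
    """Parse pipe-delimited shownode output into a list of dictionaries.

    Hypothetical helper: the first non-blank, non-prompt line is taken as the
    header; dashed separator lines and rows with a different column count
    (for example, rows cut off at the page edge) are skipped.
    """
    header = None
    rows = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or set(line) == {"-"}:
            continue
        cells = [cell.strip() for cell in line.split("|")]
        if header is None:
            header = cells
        elif len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
    return rows


def nodes_low_on_free(rows, threshold_pct):
    """Return node names whose Free column is below threshold_pct of Total."""
    return [
        row["Node"]
        for row in rows
        if 100.0 * int(row["Free"]) / int(row["Total"]) < threshold_pct
    ]


SAMPLE = """\
# shownode metrics mem
Timestamp |Node |Total |Free |Buffer |Shared |TotalHigh |TotalFree |Cached
--------------------------------------------------------------------------
date_and_time |iclx3 |4039616 |240360 |135832 |0 |0 |0 |2744104
date_and_time |iclx5 |4148548 |3502700 |157864 |0 |3275096 |2819496 |407160
date_and_time |iclx2 |4048376 |2775708 |57020 |0 |0 |0 |375768
date_and_time |iclx4 |4039616 |231936 |103828 |0 |0 |0 |2264952
"""

rows = parse_shownode_table(SAMPLE)
print(nodes_low_on_free(rows, 10.0))
```

A low Free value alone does not indicate a problem, because Linux uses otherwise idle memory for buffers and cache; consider the Buffer and Cached columns as well.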
22 Connecting to a remote console This chapter addresses the following topics: • “Console management facility overview” (page 187) • “How CMF works” (page 187) • “Accessing a remote console” (page 187) • “Serial connections on DL100 series servers” (page 188) • “Enabling telnet access to iLO management processors” (page 188) 22.1 Console management facility overview The Console Management Facility (CMF) daemon, cmfd, collects and stores console output for all managed systems.
# shownode roles --role management_hub 2. Log in to the console with the console command. You can specify either the internal name or the host name. This example uses the internal name icelx16 instead of the host name mercury: $ console icelx16 Locating server for icelx16 Server for icelx16 is mercury.example.com Press q to exit login: IMPORTANT: The console command may not be able to connect to the system console if the dates on the management hubs are not synchronized. 3.
By default, the cmfd connects to the management processor using the SSH protocol. The iLO management processors support the SSH protocol, but the LO100i management processors for the DL100 G5 series servers require an HP LO100i Advanced Pack License to enable SSH support. Alternatively, if you do not want to purchase this license, you can instruct cmfd to connect to the management processors using the telnet protocol by performing the following steps: 1.
Part IV Other topics
23 Miscellaneous topics This chapter addresses the following topics: • “Changing management processor credentials” (page 191) • “Changing the default port for the repository web server” (page 191) • “Increasing the number of servers that can be discovered concurrently” (page 192) • “Changing the IP address of the CMS ” (page 192) • “Uninstalling Insight Control for Linux” (page 192) • “Determining the installed Insight Control for Linux version” (page 193) • “Event logging overview” (page 193)
2. Change the value of the REPOSITORY_HTTP_PORT attribute to an unused port on which the repository web server is to run.
3. Save your changes and exit the text editor.
4. If you have changed the port number for the repository web server, verify that the new port is open on the CMS. One method is to use the nmap command.
5. Restart HP SIM by running the following commands:
# /opt/mx/bin/mxstop
# /opt/mx/bin/mxstart
Wait for two to three minutes for HP SIM to restart completely.
23.
1. If you have installed the Insight Control for Linux management agents, select the following menu item from the Insight Control user interface to remove the management agents from all managed systems: Deploy→Deploy Drivers, Firmware and Agents→IC-Linux→Uninstall Agents...
2. Run the following script to unconfigure syslog-ng on the CMS:
# /opt/hptc/etc/cconfig.d/C30syslogng_forward cunconfigure
3. Remove the Insight Control for Linux SSH keys for pdsh, which are located in the /root/.ssh/ directory.
23.7.1 Understanding the event logging structure Insight Control for Linux uses syslog-ng to log events. Each managed system is configured to forward its syslog events to syslog-ng running on the CMS. Each managed system runs the syslogd daemon and passes events of priority warning or higher to the CMS. The CMS runs the syslogng_forward service and writes the events it receives from its managed systems to the /hptc_cluster/adm/logs/consolidated.log file. 23.7.2 The syslog-ng.conf rules file The syslog-ng.
23.8 Changing the number of concurrent tasks
The number of concurrent tasks that Insight Control for Linux can run depends on the following:
• The type of task being run. Each type of task is assigned a weight value. The /opt/mx/icle/icelx.execution.xml file lists the weight values for each type of task.
• The value of the MAX_CONCUR_CHAINS variable in the /opt/mx/icle/icle.properties file. The default value is 64.
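As a rough way to reason about these two settings, the Python sketch below assumes that tasks run concurrently as long as the sum of their weights does not exceed MAX_CONCUR_CHAINS. That admission rule is an assumption made for illustration and is not confirmed by this guide; the weight values used are hypothetical, so take the real ones from icelx.execution.xml.

```python
def max_concurrent(task_weight, max_concur_chains=64):
    """Estimate how many tasks of one type fit at the same time.

    ASSUMPTION: tasks are admitted while the sum of their weights stays
    within max_concur_chains. The guide does not state this rule.
    """
    if task_weight <= 0:
        raise ValueError("task_weight must be a positive integer")
    return max_concur_chains // task_weight


def fits(running_weights, new_weight, max_concur_chains=64):
    """Return True if a task of new_weight fits beside the running tasks."""
    return sum(running_weights) + new_weight <= max_concur_chains


# Hypothetical weights, for illustration only.
print(max_concurrent(8))          # tasks of weight 8 under the default limit
print(fits([16, 16, 16], 8))      # 48 already in use, 8 more requested
print(fits([16, 16, 16, 16], 8))  # 64 already in use, 8 more requested
```

Under this assumed rule, raising MAX_CONCUR_CHAINS raises the estimate proportionally, while heavier task types reduce it.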
2. Select Deploy→Deploy Drivers, Firmware and Agents→IC-Linux→Configure SNMP on DL1xx Servers....
3. Specify the target server.
4. Verify the target server.
5. Select Run Now.
6. Monitor the Task Results window to follow the progress of the operation and the related task states.
23.10 Setting up the DHCP server for virtual guests
HP recommends that you configure the DHCP server to map MAC addresses to IP addresses, particularly if you want to monitor the virtual machine guests (virtual guests).
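One way to keep such a mapping consistent is to generate the DHCP host declarations from a single table. The Python sketch below is an illustration only: it assumes an ISC dhcpd server that reads host declarations with hardware ethernet and fixed-address statements, and the guest names, MAC addresses, and IP addresses are hypothetical.

```python
import ipaddress
import re

_MAC = re.compile(r"^[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}$")


def dhcp_host_entries(guests):
    """Build ISC dhcpd host declarations from {name: (mac, ip)}.

    Hypothetical helper: it validates each MAC and IP address and returns
    text that can be reviewed before it is added to the DHCP configuration.
    """
    lines = []
    for name, (mac, ip) in sorted(guests.items()):
        if not _MAC.match(mac):
            raise ValueError("invalid MAC address for %s: %s" % (name, mac))
        ipaddress.ip_address(ip)  # raises ValueError if the address is invalid
        lines.append("host %s {" % name)
        lines.append("    hardware ethernet %s;" % mac.lower())
        lines.append("    fixed-address %s;" % ip)
        lines.append("}")
    return "\n".join(lines)


# Hypothetical virtual guests, for illustration only.
GUESTS = {
    "guest2": ("00:16:3E:00:00:02", "192.0.2.22"),
    "guest1": ("00:16:3E:00:00:01", "192.0.2.21"),
}

entries = dhcp_host_entries(GUESTS)
print(entries)
```

Review the generated declarations and merge them into the DHCP configuration by hand; the sketch deliberately does not write to any file or restart any service.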
# service xinetd restart 23.
24 Advanced topics Topics include: • “Management Processor Credentials” (page 198) • “Deploying WBEM provider components using Configure or Repair Agents task” (page 200) • “Logging RAM disk connections and operations” (page 201) 24.
5. Select OK. 24.1.2.2 Discovering and setting up servers with virtual media deployment If your site uses the virtual media deployment features of Insight Control for Linux, perform these additional steps when you discover the management processors: 1. For the initial part of the process, create an account on the management processor being discovered that matches the default Insight Control for Linux MP credentials. 2. Use the HP SIM discovery tool to discover the management processor. 3.
When a new set of credentials is entered with the Configure →Management Processor→Credentials... task, Insight Control for Linux attempts to find a user with the same user name. If one is found, the user password is changed to match the new credential. If no match is found, then the new credentials are placed in slot 15, overwriting the credentials. For this reason, do not store credentials, other than those for Insight Control for Linux, in slots 15 and 16. 24.
• kernel-source
• sblim-indication_helper
For SLES 10 SP3, the openwbem package must not be installed. All Xen virtual hosts must have a corresponding HP Service Pack for ProLiant (SPP) or an HP ProLiant Support Pack (PSP) installed.
24.3 Logging RAM disk connections and operations
By enabling logging with the following procedure, you can see which systems connect to the Insight Control for Linux RAM disk, and you can watch the progress of the RAM disk operations:
1.
Part V Troubleshooting and support resources
25 Troubleshooting This chapter addresses the following topics: • “General troubleshooting topics” (page 203) • “Alternative booting” (page 204) • “Apache service does not start” (page 204) • “Troubleshooting CMF problems” (page 204) • “Troubleshooting configuration problems” (page 207) • “Troubleshooting connection problems” (page 210) • “Troubleshooting DHCP problems” (page 211) • “Troubleshooting discovery problems” (page 213) • “Troubleshooting firmware update problems” (page 217) • “
Problem See: Tool Launch OK? says NO Section 25.25 (page 243) Licensing page is always displayed when running a tool Section 25.12 (page 218) Target managed system is not licensed for this tool Section 25.12 (page 218) SSH credentials missing for a server Section 25.22 (page 241) Unable to create SSH connection: No route to host Section 25.22 (page 241) Unable to get SSH credentials: SSH credentials for the specified server were not set or are missing Section 25.
Cause/Symptom Corrective actions • Examine the /opt/hptc/cmf/logs/cmfd.log for errors. • Verify cmfd -h for the usage and default startup parameters. The CMF retries failed connections periodically Perform the appropriate action: • Verify the BIOS configuration. • Verify that the management processor user name and password were not changed. • Verify the management processor's IP address.
Cause/Symptom Corrective actions Console command cannot connect to console. • Verify that cmfd is running on each of the management hubs; the console command searches for the cmfd daemon that has the connection to the console. • Verify that the dates on the management hubs are synchronized. Console command connects to cmfd but there is no output. Make sure that: • The BIOS on the managed system is configured to redirect the serial port to the management processor. For more information, see Section 8.3.
25.5 Troubleshooting configuration problems The following table describes possible configuration problems and provides actions to correct them. Cause/Symptom Corrective actions Configure Insight Control for Linux management services fails Perform the appropriate action: • Verify that the task has indeed completed. The Task Results window may report completion although the operation might not yet be complete. Monitor the console to determine the result.
Cause/Symptom Corrective actions • The CMS has multiple NICs and HP SIM has identified these as separate entities. If you experience similar issues, follow these troubleshooting recommendations: • Verify that the /etc/hosts file is correct. For example, make sure the real host name is not equated to localhost and make sure there is only one real and valid entry for the host name and IP address. • Verify that the DNS configuration is correct.
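The /etc/hosts check described above can be automated. The Python sketch below is an illustration under stated assumptions: the function name check_hosts and the sample entries are hypothetical, and the sketch inspects text that you pass in rather than reading any file itself.

```python
import ipaddress


def check_hosts(hosts_text, hostname):
    """Return a list of problems found for hostname in hosts-file text.

    Hypothetical helper: it flags a host name mapped to a loopback address
    and anything other than exactly one non-loopback entry.
    """
    problems = []
    real_addresses = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        if hostname not in names:
            continue
        try:
            parsed = ipaddress.ip_address(address)
        except ValueError:
            problems.append("invalid address: %s" % address)
            continue
        if parsed.is_loopback:
            problems.append("%s is mapped to loopback address %s" % (hostname, address))
        else:
            real_addresses.append(address)
    if len(real_addresses) != 1:
        problems.append(
            "expected exactly one non-loopback entry, found %d" % len(real_addresses)
        )
    return problems


# Hypothetical file contents, for illustration only.
BAD = """\
127.0.0.1   localhost earth.example.com
192.0.2.1   earth.example.com earth
192.0.2.50  earth.example.com
"""
GOOD = """\
127.0.0.1   localhost
192.0.2.1   earth.example.com earth   # CMS
"""

print(check_hosts(BAD, "earth.example.com"))
print(check_hosts(GOOD, "earth.example.com"))
```

To check a live system, pass the contents of /etc/hosts and the real host name of the CMS; an empty list means that the sketch found neither of the two problems named above.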
Cause/Symptom Corrective actions
Enclosures collection monitor will report a CRITICAL status if the OA credentials have not been configured properly
Locate the value for the command[encchk_all] command definition in the /opt/hptc/nagios/etc/nrpe_local.cfg file. Run the command associated with the command definition. For example:
# /opt/hptc/supermon/bin/sensors --cp=enclosures --domain icelx[1-5]:enclosures
1206387637 The user could not be authenticated.
Cause/Symptom Corrective actions Incorrect or no information returned for Insight Control for Perform the appropriate action: Linux • Reconfigure by running the The shownode config command returns no data, as Options→IC-Linux→Configure Management Services shown here: task. # shownode config all: The shownode info returns an error message, like the one shown here: # shownode info NO CACHE FILE! RERUN create_nodenames. Failure at /opt/hptc/perl/lib/sim/hptc_node.
Cause/Symptom Corrective Actions
• Verify that the /etc/opt/mx/config/RootTrustList.txt file contains the address of the CMS and the management hubs.
• Ensure that the /opt/hptc/database/etc/ssl directory on the CMS and management hubs contains the following: certfile.pem keyfile.pem
• Ensure that Trusted Certificates from HP SIM have at least one certificate for Insight Control for Linux. Ensure that the output of the Options→Security→Credentials→Trusted Systems… task matches the values in certfile.pem.
Cause/Symptom Corrective actions • Examine the /var/log/messages system log file for error messages, and take any corrective action required. DHCP Process Will Not Start Perform the appropriate action: Any attempt to start the DHCP process fails with errors. • Verify that the /etc/dhcpd.conf service configuration file is valid. Verify it against the output of the examples in dhcpd.conf(5). • Verify that DHCP is configured to serve IP addresses on the correct network interface.
Cause/Symptom Corrective actions more than 80% of the time; budget approximately 20% additional IP addresses. Managed Servers will not PXE Boot Verify that your DHCP service configuration is properly When a console is connected to a managed system, either configured to provide a Boot Server Hostname or next-server value, instructing the PXE boot process to directly or through the managed system's iLO remote console, the boot process reaches the PXE boot stage, but load a network boot loader.
Cause/Symptom Corrective actions • Determine if CMS is managing 500 or more nodes (where a node represents a server, a management processor, an onboard administrator, a switch, and so on) using a postgres database. If so, it is possible that Initial Data Collection is failing because of database connectivity issues with postgres. If your CMS is managing over 500 nodes, HP recommends using a supported Oracle database for managing 500 or more nodes.
Cause/Symptom Corrective actions • Run the Data Collection Report on the system, which is accessible from the Tools & Links page for the system, and verify that there is a Network Interface section containing one or more MAC address(es). The Reset Server operation failed. Manually reboot the server. Previously discovered system does not bare-metal discover. Manually delete all the files in the /opt/repository/ A managed system that was previously discovered in Insight boot/pxelinux.
Cause/Symptom Corrective actions Password modification failed. Unable to add or update user account on management processor. Configure→Management Processor→Credentials to configure a new global password of at least 8 characters. Possible bare metal discovery issues with LO100i servers Use a browser to verify access to the LO100i management processor with the following command to access its web page.
25.9 Troubleshooting firmware update problems The following table provides the actions to correct a firmware update task failure. Cause/Symptom Corrective Actions Firmware Update Task Failed Perform the appropriate action: If the task fails, the system is left up in the Insight Control for Linux RAM disk, so that you can examine the hpsum logs and enter commands as necessary.
25.11 Troubleshooting large scale deployment problems The following table provides the actions to correct a large scale deployment failure. Cause/Symptom Corrective Actions Large Scale Deployment Failed Perform the appropriate action: • Examine the log in Operation Details section of the Task Results window for errors or other information.
Cause/Symptom Corrective Actions Some HP ProLiant DL100 series servers temperature and WARNING Alerts for the Nodeinfo service in “Nagios fan sensor metrics are not individually reported by default. Troubleshooting” (page 222). Instead, they are tallied in the "Sensor Count" metric. The temperature and fan sensor metrics are monitored correctly and are individually reported if they exceed the warning or critical thresholds.
Cause/Symptom Corrective Actions
The wget command fails:
• If there is a proxy server in your environment that is not configured properly
• If the appropriate network ports on the managed system are not open. For more information, see “Opening network ports on managed systems” (page 79)
Take the corrective action based on the wget failure.
Configure Management Services task fails
Metrics are not collected
Verify that a proxy is not used to communicate between the CMS and the managed system.
Cause/Symptom Corrective Actions Alternatively, remove the /etc/httpd/conf.d/ colplot-apache.conf file and restart the web server. Blank page from HP Graph with Internet Explorer version Relaunch the Nagios web interface or restart the browser. 6.0 If you are using Internet Explorer Version 6.0, the initial page with graphs is displayed when you select HP Graph.
Cause/Symptom Corrective Actions Incorrect Performance Dashboard context menu is displayed Determine that JavaScript has permission to replace context for a right-click of the mouse. menus in the browser. Performance Dashboard select metric menu might stop working Restart the Performance Dashboard tool: Tools→Integrated Consoles→Performance Dashboard... On rare occasions, the Performance Dashboard select metric menu stops working. The menu appears, but selecting or unselecting a metric has no effect.
$ ./check_sel --help check_sel <--help> -H hostname <-t timeout> --help -H --cache file Host to check Persistent cache to remember where we last read/processed default /hptc_cluster/adm/logs/sel/cache/selcache-$nodename.
a warning or critical message, find the information for that service in the Status Information column and apply it to the specified Nagios host. NOTE: The following messages are based on the default values. The messages differ if the messages in the /opt/hptc/nagios/etc/misccommands.cfg file were changed. Service: Apache HTTPS Server Status Information: HTTPS performance information Displays the status of the Apache HTTPS web server on the CMS.
These messages can occur if metrics collection cannot be completed in a reasonable time; examine the /opt/hptc/nagios/etc/nagios.cfg file for the value of the service_check_timeout parameter. The default works best for configurations with fewer than 256 managed systems. Increase the value of the service_check_timeout parameter to solve the problem for configurations with more managed systems. Also, run the following command to verify that the supermond service is running on the CMS: # /etc/init.
Corrective Actions: • If the check_nrpe error is reported for the CMS, use the following commands to verify that the nrpe service is running on the CMS: # ps auxww | grep nrpe If the nrpe service is not running, use the following commands to start it and to rerun the gather_all_data script: # /etc/init.d/nagios start_nrpe # /opt/hptc/nagios/libexec/gather_all_data --verbose • If the output reports that vars.
Cause/Symptom Corrective Actions Nagios “Management Settings Monitor” service reports vars warning Run the following commands to resynchronize the vars.ini file across all managed systems: # cd /opt/hptc/nagios/libexec # ./check_nagios_vars --update Nagios services report a non-OK status Remove the nagios_vars.db file: Under very rare circumstances, the Nagios cache might become unsynchronized. If this occurs, it is possible that some Nagios services do not operate correctly.
Cause/Symptom Corrective Actions user_procs_critical = 300 zombie_procs_warning = 1 zombie_procs_critical = 5 [service_nodes] total_procs_warning = 500 total_procs_critical = 600 user_procs_warning = 300 user_procs_critical = 400 zombie_procs_warning = 1 zombie_procs_critical = 5 3. Rebuild the /opt/hptc/etc/sysconfig/vars.ini file and push out the new vars.
/usr/lib/perl5/vendor_perl/5.10.0/i586-linux-thread-multi/asm/ioctls.ph line 5.
Compilation failed in require at /usr/lib/perl5/vendor_perl/5.10.0/i586-linux-thread-multi/bits/ioctls.ph line 8.
Compilation failed in require at /usr/lib/perl5/vendor_perl/5.10.0/i586-linux-thread-multi/sys/ioctl.ph line 8.
Compilation failed in require at /opt/hptc/sbin/nrg line 132.
25.
For information, see the HP Insight Control for Linux Installation Guide.
Cause/Symptom: Kickstart / AutoYaST install completes but task in HP SIM UI still shows that it is running
Corrective actions: Ensure that you removed the following two files:
• autoInstallComplete_jsp.class
• autoInstallComplete_jsp.
data contained on the disk prevents the OS's installer from properly configuring the disk. Although the symptoms can vary, they include error messages and instances of installations hanging on reboot.
Corrective actions:
• Try installing a Linux OS from a different vendor, only to reformat the disk drive.
• Boot the system to the Insight Control for Linux RAM disk with the Diagnose→Boot to Linux Rescue Mode...
• Turn off the dump flag in the /etc/fstab file on the target servers to capture fewer partitions.
Cause/Symptom: The target server has lost association with its management processor.
Corrective actions: For the corrective action, see Section 25.20 (page 237).
Cause/Symptom: More than one root partition was found when trying to capture the image.
Corrective actions: Modify the configuration of the target server so that it finds only one root partition.
Cause/Symptom: The target managed system OS uses an unsupported file system type.
the partition wizard, the deployment fails because the partitions are not sized as they should be. In this instance, capture the multi-partition image, deploy it to a single partition, capture that image, then redeploy the image using the number of required partitions.
Cause/Symptom: Captured Image could fail to deploy because of duplicate labels or existing partition information
Corrective actions: Perform the appropriate action:
1.
If not, check your DHCP server and network configuration.
2. Insight Control for Linux boot menu. If the boot menu does not appear, check your network configuration.
3. A ten-second delay.
4. Loading of the Insight Control for Linux RAM disk. If the server tries to boot a local disk instead of the Insight Control for Linux RAM disk, examine the /opt/repository/boot/pxelinux.cfg directory for stale MAC address files.
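When looking for a stale MAC address file, it helps to know the file name that corresponds to a given server. The following is a minimal sketch, assuming the files follow the standard PXELINUX naming convention (the ARP hardware type 01 for Ethernet, followed by the MAC address in lowercase hexadecimal with hyphens). The function name mac_to_pxe_name and the sample MAC address are hypothetical and are not part of Insight Control for Linux.

```shell
#!/bin/sh
# Hypothetical helper: convert a MAC address (aa:bb:cc:dd:ee:ff) to the
# file name PXELINUX looks for: "01-" plus the lowercase MAC with hyphens.
mac_to_pxe_name() {
    printf '01-%s\n' "$(printf '%s' "$1" | tr 'A-F:' 'a-f-')"
}

# Example with a made-up MAC address; compare the printed name with the
# files found under the pxelinux.cfg location named above.
mac_to_pxe_name 00:1A:4B:AA:BB:CC
```

A file whose name matches a server that should no longer boot with a dedicated configuration is a candidate for review. Move it aside rather than deleting it until the boot behavior is confirmed.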
Cause/Symptom: The PSP file is corrupted.
Corrective actions: Recopy the SPP or PSP to the appropriate /opt/repository/psp subdirectory.
Cause/Symptom: The PSP installation fails on an ESX or ESXi system.
Corrective actions: The Deploy→Deploy Drivers, Firmware and Agents→IC-Linux→Install SPP or PSP... task is not supported on managed systems running ESX or ESXi. If you need to install ESX agents, run Configure→Configure or Repair Agents…
25.
25.19 Troubleshooting server power control problems
The following table describes possible causes of problems with powering up or down a managed system and provides actions to correct them.
Cause/Symptom: Error retrieving BMC for server. Root cause: Could not determine the BMC associated with the server (x.x.x.x) in the database
Corrective actions: Perform the appropriate action:
• Ensure that SNMP is configured correctly and that HP SIM has access to SNMP on the target system.
Cause/Symptom: Intermittent power on, power off, and reset server errors. Intermittent power on, power off, or reset server errors might occur during an Insight Control for Linux operation (for example, during an OS installation, image capture, or image deployment).
Corrective actions: Rerun Options→Identify systems... to fully discover the iLO.
Cause/Symptom: Checking to see if power is on. Failed: Error retrieving BMC for server. Root cause:PANIC: BMC Manager not configured for device of this type.
25.20.3.1 Repairing the association of a booted managed system running an OS If a managed system is booted and running a supported OS, follow these steps to repair a lost association. • If the managed system is already running the proper agents and is properly configured, instruct HP SIM to re-query the managed system to get the proper association data: 1. Open HP SIM and select All Systems in the left pane. 2.
6. • If the BIOS data is valid and the iLO XML call is still reporting errors, a hardware problem might be the cause. In that case, telephone HP Customer Service. If the association problem is still not resolved after completing the suggested corrective actions described here, something more unusual is wrong. Check firewall ports on the CMS and the managed system and make sure SNMP is not being blocked. Look for anything that might be blocking the proper flow of the association data. 25.20.3.
9. If the server does have an OS installed, immediately install the ProLiant Support Pack (see the HP Insight Control for Linux Support Matrix for the current supported version), either manually through the remote console or by selecting the following: Deploy→Deploy Drivers, Firmware, and Agents→Install SPP or PSP... When this procedure is complete, the server is present in HP SIM and the association with the management processor is restored. 25.20.3.
1. Run Options→Identify Systems... on the unassociated iLO or iLOs to force HP SIM to make the association. Repeat this process until all iLOs are associated with their servers.
2. Select the following menu item from the Insight Control user interface to turn off power to the server or servers: Tools→Server Controls→Power Off Server...
25.21 Troubleshooting SNMP problems
This section applies only to systems with iLO-based management processors.
Cause/Symptom: The user name, password, or both for the SSH credentials of a target system are incorrect, causing SSH to fail.
Corrective actions: credentials as appropriate. For more information, see “Setting SSH credentials on managed systems” (page 142) and the HP Systems Insight Manager online help.
Cause/Symptom: SSH delays on SLES managed systems on networks without name resolution
Corrective actions: The following actions fix this issue:
• Configure a DNS resolver on the network in question.
If not, restart it:
# service supermon restart
• Ensure that the mond daemon is running on all the managed systems:
# pdsh -a -x `headnode` /etc/init.d/mond status
Cause/Symptom: Supermon and mond are running, but there is no activity. Supermon listens on port 2710. The mond daemon listens on port 2709.
Corrective actions: Use the telnet command to connect on the appropriate port, port 2710 for Supermon and port 2709 for the mond daemon. Enter the S command after connection to see metrics data output.
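Where the telnet client is not installed, the same check can be scripted. The following is a minimal sketch, assuming bash is available (it relies on the /dev/tcp redirection that bash provides) and that the daemon answers the S command with line-oriented text. The function name query_metrics and the 5-second read timeout are hypothetical choices, not part of the product.

```shell
#!/bin/bash
# Hypothetical helper: connect to HOST PORT, send the S command, and print
# the reply. The body runs in a subshell, so file descriptor 3 is closed
# automatically when the function returns. Errors are silenced; a nonzero
# return status means the connection could not be opened.
query_metrics() (
    # Open a read/write TCP connection on file descriptor 3.
    exec 3<>"/dev/tcp/$1/$2" || exit 1
    printf 'S\n' >&3
    # Print reply lines until the peer is quiet for 5 seconds or closes.
    while IFS= read -r -t 5 line <&3; do
        printf '%s\n' "$line"
    done
    exit 0
) 2>/dev/null

# Example, run on the CMS (ports taken from the table above):
# query_metrics localhost 2710 || echo "cannot connect to Supermon"
# query_metrics localhost 2709 || echo "cannot connect to mond"
```

A successful connection that prints nothing suggests that the daemon is running but has no metrics to report. A failed connection points to the daemon or to a blocked port.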
completely stopped before Insight Control for Linux can remove the RPMs. The output of the uninstall.sh script resembles the following:
# uninstall.sh
...
Uninstalling HP Systems Insight Manager ...
Stopping HP SIM
Stopping hpsmdb
Corrective action:
root 29156 17502 0 08:17 pts/1 00:00:00 grep mxinitconfig
2. Remove that process with the kill command.
# kill -9 8626
Removing the process allows the uninstall.sh script to continue.
25.
# cimprovider -l
Ensure that the sblim-cmpi-base, libvirt-cim, and libcmpiutil packages are installed. For the SLES 10 operating systems, these packages are installed by running the Configure→Configure or Repair Agents... task on the VM host and selecting Insight Control virtual machine management for installation. For other supported operating systems, install these packages from the RHEL or SLES distribution media.
• Verify the network name on the system page for the CMS.
Cause/Symptom: ESXi 5.0 installation fails with fatal error: 6 (Buffer too small). The default timeout value of 5400 seconds in the WaitforOSRamDisk operation might be inadequate for the installation.
Corrective action: Perform the appropriate action:
1. Open the /opt/repository/taskchain/ESXiInstallation.xml file in a text editor.
2. Locate the WaitforOsRamDisk operation and change the value from its default of 5400 to a larger value to allow time for all the ESXi modules to load.
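Before editing the task chain file, it is prudent to keep a copy and to find the lines to change. The following is a minimal sketch of that preparation; the function name locate_operation and the .bak suffix are hypothetical. It does not change the timeout itself, because the layout of the XML file is not described in this guide.

```shell
#!/bin/sh
# Hypothetical helper: back up FILE to FILE.bak, then print the numbered
# lines that mention NAME. The search is case-insensitive because the
# operation name appears above as both WaitforOSRamDisk and WaitforOsRamDisk.
locate_operation() {
    file=$1
    name=$2
    cp -p "$file" "$file.bak" || return 1
    grep -n -i -- "$name" "$file"
}

# Example (path taken from step 1 above):
# locate_operation /opt/repository/taskchain/ESXiInstallation.xml WaitforOSRamDisk
```

The printed line numbers show where to look in the text editor. If the change does not help, copy the .bak file back over the edited file to restore the original.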
25.28 Troubleshooting virtual media problems
Cause/Symptom: Server attempts to PXE boot or boot from local disk instead of booting using virtual media.
Corrective action: Perform the following actions:
• Verify that port 60002 is open on the CMS.
• Run the Insight Control for Linux Configure→IC-Linux→Configure Boot Method task. Be sure to select Virtual Media for the boot method.
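One part of verifying the port is to confirm that a service on the CMS is listening on it. The following is a minimal sketch, assuming the ss utility (or netstat with the -ltn options) is installed on the CMS; the function name is_listening is hypothetical. It does not test the firewall, which must also allow the port.

```shell
#!/bin/sh
# Hypothetical helper: read a socket listing from standard input and
# succeed if any LISTEN row has an address field ending in ":PORT".
is_listening() {
    awk -v suffix=":$1" '
        /LISTEN/ {
            for (i = 1; i <= NF; i++)
                if (substr($i, length($i) - length(suffix) + 1) == suffix)
                    found = 1
        }
        END { exit !found }
    '
}

# Example, run on the CMS:
# ss -ltn | is_listening 60002 && echo "a service is listening on port 60002"
```

If nothing is listening, the firewall is not the cause, and the service that provides virtual media should be checked first.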
26 Support and other resources
26.1 Information to collect before contacting HP
Be sure to have the following information available before you contact HP:
• Software product name
• Hardware product model number
• Operating system type and version
• Applicable error message
• Third-party hardware or software
• Technical support registration number (if applicable)
26.2 How to contact HP
Use the following methods to contact HP technical support:
• See the Contact HP worldwide website: http://www.
26.3.2 Warranty information HP will replace defective delivery media for a period of 90 days from the date of purchase. This warranty applies to all Insight Management products. 26.4 HP authorized resellers For the name of the nearest HP authorized reseller, see the following sources: • In the United States, see the HP U.S. service locator website: http://www.hp.com/service_locator • In other locations, see the Contact HP worldwide website: http://www.hp.com/go/assistance 26.
(Unattended). They replace the previous Deploy→Operating System→Custom or Other task. Likewise, the procedures for deploying a custom OS have changed. For information on deploying a custom OS, see the white paper titled Installing a Custom Operating System with HP Insight Control for Linux. ◦ The download web addresses in the table in “Additional prerequisites for certain ProLiant servers” (page 93) were updated. ◦ The section on Partition wizard requirements and guidelines was expanded.
Download from the Insight Control for Linux product website The Insight Control for Linux product website contains links to the Insight Control for Linux documentation set and white papers, a link to the Insight Control for Linux QuickSpecs, license information, product registration information, and many other related topics. To view or download documentation from the Insight Control for Linux product website, follow these steps: 1. Open a web browser to the following web address: http://www.hp.
• Linux vendors The following are links to Linux vendor websites. Linux vendors are not limited to the vendors shown in this list. The address of each website or link to a particular topic is subject to change without notice by the website provider. ◦ http://www.redhat.com Home page for Red Hat, distributors of Red Hat Enterprise Linux (RHEL). ◦ http://www.novell.com/linux Home page for Novell, distributors of SUSE Linux Enterprise Server (SLES). ◦ http://www.linux.org/docs/index.
◦ http://www.virt-manager.org Home page for the virt-manager tool. ◦ http://www.vmware.com/products/esx/index.html Home page for VMware ESX. ◦ http://www.vmware.com/products/esxi/ Home page for VMware ESXi. ◦ http://www.linux-kvm.org Home page for KVM. ◦ http://www.xen.org Home page for Xen. 26.7.3 Troubleshooting resources The HP Insight Control for Linux Installation Guide and HP Insight Control for Linux User Guide each contain a chapter that describes troubleshooting hints and techniques. 26.
TIP An alert that provides helpful information.
A Customizing Nagios The Nagios configuration is designed so that you can customize it as needed. Complete documentation for customizing Nagios is available on the following Nagios website: www.nagios.
# NRPE GROUP
# This determines the effective group that the NRPE daemon should run as.
# You can either supply a group name or a GID.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd
nrpe_group=new_nagios_group
Where new_nagios_group is the group name of the new Nagios user's account. Save the file.
5. Edit the /opt/hptc/nagios/etc/nagios.
# NAGIOS GROUP
# This determines the effective group that Nagios should run as.
# You can either supply a group name or a GID.
nagios_group=new_nagios_group
Save the file.
9. Run the Options→IC-Linux→Configure Management Services task.
NOTE: The Task Results window may report completion although the operation might not yet be complete. Monitor the console to determine the result.
10. If your system has multiple management hubs, log into each management hub and repeat steps 2 through 8.
11.
To avoid these alerts, use the commands listed in the following table to shut down Nagios before performing any maintenance operations and tasks, and to start or restart Nagios afterward.
Purpose: To shut down Nagios on the CMS immediately before performing maintenance operations and tasks
Command line: # /etc/init.d/nagios stop
Purpose: To start Nagios after a maintenance operation
Command line: # /etc/init.d/nagios start
Purpose: To restart Nagios after changing its configuration
Command line: # /etc/init.d/nagios restart
A.2.
thresholds and generates alerts when a threshold is reached. Depending on your specific site configuration and use, some default thresholds might not be appropriate for your system. The platform-dependent default thresholds serve as a baseline, but they might not be optimal for your site. Determine the threshold values appropriate for your site and customize the Nagios configuration accordingly. The /opt/hptc/nagios/etc/nagios_vars.
Table 24 Supermon metrics collection intervals (continued)
Metric name    Collection interval
btime          default*
processes      default*
netinfo        default*
meminfo        default*
swapinfo       default*
time           default*
switch         default*
cputotal       default*
avenrun        %LOADAVECOLLECTIONPERIOD% **
mdadm          %MDADMCOLLECTIONPERIOD% **
* The default is 5 minutes.
** This value is specified in the /opt/hptc/nagios/etc/nagios_vars.ini file.
A.2.5.
Actively Launched on Managed System? Maximum Check Attempts Indicates whether or not Nagios periodically runs this service check at the specified normal check interval. Indicates the number of times Nagios examines the service before reporting a failure. Indicates the frequency of the check interval. Indicates the amount of time Nagios waits before retrying after a failure.
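The maximum check attempts and the retry interval together determine how long a failing service goes unreported. The following is a minimal sketch of that arithmetic, assuming standard Nagios behavior: the first failed check counts as attempt 1, and each further attempt follows one retry interval later. It also assumes that intervals are expressed in minutes (the stock Nagios interval_length of 60 seconds). The function name minutes_until_reported and the sample values are hypothetical and are not the product defaults.

```shell
#!/bin/sh
# Hypothetical helper: minutes between the first failed check and the
# point at which Nagios reports the failure.
#   $1 = maximum check attempts
#   $2 = retry interval in minutes
minutes_until_reported() {
    echo $(( ($1 - 1) * $2 ))
}

# Example with illustrative values: 3 attempts, retried every 1 minute.
minutes_until_reported 3 1
```

With these sample values the failure is reported 2 minutes after the first failed check. Raising either value delays alerts but reduces noise from brief outages.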
Nagios creates alerts for power, memory, voltage, and Automatic System Recovery (ASR) messages. The rules for alerts are defined in the /opt/hptc/nagios/etc/selRules file. You can modify these rules by editing this file as follows: • Add rules to this file for new alerts. • Change alerts by modifying the corresponding rule in this file. • Remove a rule to delete the corresponding alert.
Glossary
A
AutoYaST file
A configuration file used to effect an unattended SLES operating system installation.
B
bare-metal
Describes a server that is not booted with a running operating system. This could be a brand new server with no OS installed on it, or it could be a server with an OS that is not booted.
C
central management server
See CMS.
certificate
An electronic document that contains a subject's public key and identifying information about the subject.
HTTPS
An extension to the HTTP protocol that supports sending data securely over the web.
hypervisor
Computer software, specific to a hardware platform, that allows you to run multiple operating systems on a single host at the same time.
I
iLO
Integrated Lights Out. A self-contained hardware technology available on various hardware models that enables remote management of any node within a system. Subsequent generations of this technology are iLO 2, iLO 3, and iLO 4.
PSP
ProLiant Support Pack. HP software components that are bundled together and verified to work with a particular operating system. An HP ProLiant Support Pack contains driver components, agent components, and application and utility components. All these are verified to install together.
PSP dependency script
An optional user-provided script that runs during a PSP deployment to a managed system.
PXE
Preboot Execution Environment.