HP Insight Management Agents 8.70
Managing ProLiant Servers with Linux HOWTO

Abstract
This HOWTO provides instructions to help system administrators install, upgrade, and remove Version 8.4.
© Copyright 2011 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein. Confidential computer software.
Contents
1 Software architecture
  System Health application and Command Line utilities (hp-health)
    Health monitor
      System temperature monitoring
      System fan monitoring
  Documentation feedback
  Typographic conventions
A Error messages
B Troubleshooting
1 Software architecture
This section describes the features and architecture of the following HP Linux management software:
• HP System Health Application and Command Line Utilities (hp-health)
• Insight Management SNMP Agents for HP ProLiant Systems (hp-snmp-agents)
• HP System Management Homepage (hpsmh)
• Descriptions for HP management consoles for Linux

System Health application and Command Line utilities (hp-health)
The System Health Application and Command Line Utilities (hp-health) package col
Table 1 hp-health applications (continued)
Application Details
Description: The hpasmxld application automatically loads on ProLiant servers that have the HP Integrated Lights-Out 2 (iLO 2) management controller and the necessary IPMI driver support. The iLO 2 management controller contains an Intelligent Platform Management Interface (IPMI) Version 2.0 Baseboard Management Controller (BMC) that replaces the operating system-based software management functionality provided by the legacy hpasmd application.
Table 2 Controller, health-daemon, and kernel driver combinations (continued)
Lights-Out Controller | Kernel Version (uname -r) | Red Hat Enterprise Linux | hp-OpenIPMI installed? | hpilo module available? | daemon | dev file | kernel driver
iLO 3 | NA | NA | NA | No | hpasmlited | /dev/ipmi0 | distro IPMI
iLO 3 | NA | NA | NA | Yes | hpasmlited | /dev/hpilo | hpilo

Additional information is available in the following manpages provided with the hp-health package:
• hp-health
• hpasmcli
• hpuid
• hplog
• hpbootcfg
• hp_mg
Additionally, on some servers, the fans gradually increase to full speed in an attempt to cool the server as the external environment temperature increases. If the server exceeds the normal operating range and does not cool down within 60 seconds, the operating system is, in most cases, shut down to close the file systems. TIP: On servers that do not have variable speed fans, the server is shut down unless the ROM-Based Setup Utility (RBSU) Thermal Shutdown feature is disabled.
Automatic server recovery
Automatic Server Recovery (ASR) is configured using RBSU, which is available during the initial boot of the server by pressing the F9 key when prompted. This feature is implemented using a "heartbeat" timer that continually counts down. The Health Monitor frequently reloads the counter to prevent it from counting down to zero. If the ASR timer counts down to zero, the operating system is assumed to have locked up, and the system automatically attempts to reboot.
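The countdown-and-reload behavior described above can be illustrated with a small simulation. This is a conceptual sketch only: the real ASR timer is implemented by the server hardware and the Health Monitor, and the function name asr_simulate, the tick unit, and the event names are hypothetical and are not part of hp-health.

```shell
#!/bin/bash
# Conceptual simulation of a heartbeat (watchdog) timer.
# Hypothetical helper; not part of the hp-health package.
#
# asr_simulate TIMEOUT EVENT...
#   TIMEOUT  starting value of the countdown, in ticks (assumed unit)
#   EVENT    "reload" = the Health Monitor resets the counter
#            "tick"   = one unit of time passes without a reload
asr_simulate() {
    local timeout=$1
    local counter=$timeout
    shift
    local event
    for event in "$@"; do
        case "$event" in
            reload) counter=$timeout ;;
            tick)   counter=$((counter - 1)) ;;
            *)      echo "unknown event: $event" >&2; return 2 ;;
        esac
        # Reaching zero models the "operating system has locked up" case.
        if [ "$counter" -le 0 ]; then
            echo "expired: reboot attempted"
            return 1
        fi
    done
    echo "healthy: counter=$counter"
}
```

For example, asr_simulate 3 tick tick reload tick ends with the counter above zero because the reload arrives in time, while asr_simulate 2 tick tick reaches zero and reports an expiration.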
Table 3 hplog options (continued)
Command | Description
hplog -p | Shows the status of all power supplies
hplog -v | Shows the IML entries on the standard output

For more information about these components, see the online documentation by entering:
$ man hplog

HP Unique Identifier Utility (hpuid)
The HP Unique Identifier Utility (hpuid) allows local manipulation of the ProLiant Unique Identifier (UID) blue light on selected ProLiant servers.
Insight Management SNMP Agents for HP ProLiant (hp-snmp-agents) The ProLiant Insight Management Agents provide proactive notification of server events through the HP Systems Insight Manager console. Alternatively, the ProLiant Insight Management Agents allow the status of the server to be monitored or checked using a standard Web browser.
Table 6 Sub-agents of the Server Agent (continued) Sub-agent Description System Health Agent (cmahealthd) The System Health Agent gathers data for the Health MIB. The data collected includes critical (NMI) errors, correctable memory (ECC) errors, system hang/panic detection, temperature conditions, and fan failures. The System Health Agent then retrieves these errors from the Health Monitor. The System Health Agent executable is /opt/hp/hp-snmp-agents/server/bin/cmahealthd.
Table 7 Sub-agents of the Storage Agent

IDA Agent (cmaidad)
The IDA Agent gathers data for the IDA MIB. The data includes:
• IDA controller information
• IDA accelerator information
• IDA logical drive information
• IDA physical drive information
The IDA Agent is located in /opt/hp/hp-snmp-agents/storage/bin/cmaidad. The suggested poll_time is 15 seconds (default). The minimum recommended poll_time is 5 seconds.

IDE Agent (cmaided)
The IDE Agent gathers data for the IDE MIB.
NIC agent (cmanic) The NIC Agent collects information from network interface controllers at periodic intervals, makes the collected data available to the SNMP agent, and provides SNMP alerts. The NIC Agent gathers data for the NIC MIB from supported NIC device drivers. The data includes: • Physical mapping and configuration data for each network interface • Network statistics for Ethernet interfaces. Information is provided for HP controllers.
Integrated Lights-Out User Guide located at http://h18013.www1.hp.com/manage/ilo-description.html.

Usage: cpqblru [-eql?] [-a address1,address2,...] [-c chassis1,chassis2,...]

See Table 9: “ProLiant BL Rack Upgrade Utility parameters” for parameters and descriptions.

Table 9 ProLiant BL Rack Upgrade Utility parameters
Parameter | Description
-a address1,address2,... | This optional parameter considers only enclosures with address1, address2, and so on.
HP ProLiant Channel Interface Device Driver for iLO/iLO 2/iLO 3 (hp-ilo) The HP ProLiant Channel Interface Device Driver for iLO/iLO 2/iLO 3 (hp-ilo) enables iLO data collection and integration with the ProLiant Management Agents and the rack infrastructure interface service. The driver enables communication routing of SNMP traffic from the ProLiant Management Agents through the dedicated iLO management NIC.
For installation information, see the SIM Linux Installation and Configuration Guide, available in the Install and configure section of the HP Systems Insight Manager Information Library at http://h18013.www1.hp.com/products/servers/management/hpsim/infolibrary.html.
2 Manual installation
This section describes how to install, upgrade, and remove the HP System Health Application and Command Line Utilities (hp-health) and Insight Management SNMP Agents for HP ProLiant Systems (hp-snmp-agents) packages. The latest versions of this software can be downloaded from http://hp.com/go/proliantlinux.

Prerequisite: Installing package dependencies
The software described in this HOWTO is distributed in standard package formats that provide prerequisite information internally.
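Because the prerequisite information is stored inside the package, it can be listed before installation with a standard rpm query. The following is a minimal sketch, assuming an RPM-based distribution; the function name list_requirements and the DRY_RUN variable are hypothetical helpers introduced here for illustration, and the package file path is supplied by you.

```shell
#!/bin/bash
# Hypothetical helper: list the requirements recorded in an RPM file.
# Uses the standard rpm query options -q (query), -p (package file),
# and --requires (list capabilities the package depends on).
#
# list_requirements RPM_FILE
#   Set DRY_RUN=1 to print the command instead of running it.
list_requirements() {
    local rpm_file=$1
    if [ -z "$rpm_file" ]; then
        echo "usage: list_requirements RPM_FILE" >&2
        return 2
    fi
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "rpm -qp --requires $rpm_file"
    else
        rpm -qp --requires "$rpm_file"
    fi
}
```

Running the function with DRY_RUN=1 first shows the exact command without touching the system, which can be useful when preparing an installation procedure on a different machine.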
$ man hp-health NOTE: The version number for the RPM file varies depending on the supported systems and functionality. The distribution refers to the Linux distribution supported by the RPM. The platform refers to the processor architecture the RPM was built to support. The RPM file has a binary compiled for the supported distribution with the default kernel. After the installation process, the health service is configured to automatically start each time your system boots.
Table 10 Uninstall drivers and agents commands
Command | Description
# rpm -e hp-snmp-agents | Removes the hp-snmp-agents package from your system
# rpm -e hp-ilo | Removes the hp-ilo package from your system
# rpm -e hp-health | Removes the hp-health package from your system
# rpm -e hp-OpenIPMI | Removes the hp-OpenIPMI package from your system

CAUTION: If a service is running when the corresponding package is removed, it is automatically shut down during the removal process.
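The removal commands in Table 10 can be combined into one loop that skips packages that are not installed. This is a sketch under stated assumptions: the order simply follows the table (this HOWTO does not state that the order is required), and the function name remove_hp_packages and the DRY_RUN variable are hypothetical. Run it as root only after reviewing the dry-run output.

```shell
#!/bin/bash
# Package names taken from Table 10, in the order listed there.
HP_PACKAGES="hp-snmp-agents hp-ilo hp-health hp-OpenIPMI"

# Hypothetical helper: remove each listed package if it is installed.
# Set DRY_RUN=1 to print the rpm commands instead of running them.
remove_hp_packages() {
    local pkg
    for pkg in $HP_PACKAGES; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "rpm -e $pkg"
        elif rpm -q "$pkg" >/dev/null 2>&1; then
            # rpm -q succeeds only when the package is installed.
            rpm -e "$pkg"
        else
            echo "skipping $pkg (not installed)"
        fi
    done
}
```

With DRY_RUN=1 the function only prints the four rpm -e commands, so the list can be checked before any service is shut down by a real removal.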
RPM provides the -U option to upgrade a package. For example, to upgrade hp-health to a newer version, you could use the command:
# rpm -Uvh hp-health-...
3 Customization This section includes advanced topics on data center customization. Configuration files The ProLiant Management Agents Configuration file /opt/hp/hp-snmp-agents/cma.conf is shared by all HP ProLiant Management Agents. Currently, exclude directives, taint directives, trap interface, trap email notification configuration, and base socket number (used by cmaX) are supported. The agents are capable of sending email notifications in addition to SNMP traps.
# service hp-snmp-agents restart

You can also edit the /opt/hp/hp-snmp-agents/cma.conf file, which can contain one or more exclude directives. Any string after the exclude keyword is interpreted as the name of an agent that should not be started. Examples include:
exclude cmahealthd
exclude cmastdeqd
These two lines exclude two agents from the startup: the Health Agent (cmahealthd) and the Standard Equipment Agent (cmastdeqd).
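To check which agents are currently excluded, the exclude directives can be extracted from cma.conf. The following is a minimal sketch, assuming one agent name per exclude line as in the examples above; the function name list_excluded_agents is hypothetical and is not shipped with hp-snmp-agents.

```shell
#!/bin/bash
# Hypothetical helper: print the agent names that follow an "exclude"
# keyword in the configuration file. Other directives are ignored.
#
# list_excluded_agents [CONF_FILE]
#   CONF_FILE defaults to the path given in this HOWTO.
list_excluded_agents() {
    local conf=${1:-/opt/hp/hp-snmp-agents/cma.conf}
    # Assumes a single whitespace-free agent name per exclude line.
    awk '$1 == "exclude" && NF >= 2 { print $2 }' "$conf"
}
```

After changing the exclude directives, restart the agents with the service hp-snmp-agents restart command shown above so that the new list takes effect.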
Traps are configured using the standard SNMP configuration file (snmpd.conf). See the snmpd.conf manual page for the most current configuration information. When the snmpd.conf or snmpd.local.conf configuration files are changed or when the SNMPCONFPATH environment variable is changed, the cmanic daemon must be restarted. If your operating system has an active firewall configuration, external SNMP requests might be rejected by the system, which prevents remote management operation.
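Before restarting the daemon, it can help to confirm that snmpd.conf actually defines a trap destination. The sketch below assumes the SNMP agent is Net-SNMP, whose snmpd.conf uses the trapsink, trap2sink, and informsink directives; the default path /etc/snmp/snmpd.conf and the function name list_trap_destinations are assumptions for illustration. This HOWTO does not give the restart command for the cmanic daemon, so it is not shown here.

```shell
#!/bin/bash
# Hypothetical helper: list trap destinations defined in snmpd.conf.
# Assumes Net-SNMP directive names (trapsink, trap2sink, informsink).
#
# list_trap_destinations [CONF_FILE]
#   CONF_FILE defaults to /etc/snmp/snmpd.conf (assumed location).
list_trap_destinations() {
    local conf=${1:-/etc/snmp/snmpd.conf}
    # Print the directive and the destination host; commented-out
    # lines do not match because their first field starts with "#".
    awk '$1 ~ /^(trapsink|trap2sink|informsink)$/ { print $1, $2 }' "$conf"
}
```

An empty result suggests that no trap destination is configured in that file; check snmpd.local.conf and any directory named by SNMPCONFPATH as well, and see the snmpd.conf manual page for the authoritative directive list.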
4 Support and other resources Information to collect before contacting HP Be sure to have the following information available before you contact HP: • Software product name • Hardware product model number • Operating system type and version • Applicable error message • Third-party hardware or software • Technical support registration number (if applicable) How to contact HP Use the following methods to contact HP technical support: • In the United States, see the Customer Service / Contact HP U
HP authorized resellers For the name of the nearest HP authorized reseller, see the following sources: • In the United States, see the HP U.S. service locator web site: http://www.hp.com/service_locator • In other locations, see the Contact HP worldwide web site: http://welcome.hp.com/country/us/en/wwcontact.html Documentation feedback HP welcomes your feedback. To make comments and suggestions about product documentation, send a message to: docsfeedback@hp.
A Error messages Messages logged if an ASR event occurs are listed in Table 14 (page 27). Table 14 Error messages Message Number Details Message 1 Message 2 Message 3 Message 4 NMI-Automatic Server Recovery timer expiration – Hour %d-%d/%d/%d Description This message indicates that the Health Monitor detected an ASR timeout and is attempting to gracefully shut down the Operating System.
stopped, dependent applications like the Rack Firmware Upgrade Utility terminate as well. Table 15 (page 28) lists possible issues. Table 15 cpqriisd messages Message Number Details Message 1 Could not setup server semaphores Could not destroy server semaphores Up sem: Ioctl Failure! Down sem: Ioctl Failure! Get sem: Ioctl Failure! Set sem: Ioctl Failure! Message 2 Description These messages indicate that the synchronization objects called “semaphores” cannot be set up correctly.
Table 15 cpqriisd messages (continued) Message Number Details Recommended action Message 7 This message does not indicate a problem with the Rack Infrastructure Interface Service. However, there might be a problem with the HP ProLiant Rack Daemon (cmarackd). Restart cmarackd. If the problem persists, contact your HP field service engineer.
Table 15 cpqriisd messages (continued) Message Number Details Description These messages indicate a problem that occurred during initialization of the service.
B Troubleshooting
This section describes common problems that might occur during installation and operation of the HP ProLiant Management Software for Linux. Table 16 (page 31) describes issues and workarounds for the hp-health and hp-snmp-agents packages. Any problems reported to HP should include the following files:
• /var/log/messages
• /var/log/boot.log (for Red Hat Linux distributions)
• /var/log/warn (for SuSE LINUX distributions)
• /var/log/hp-snmp-agents/cma.
Table 16 Issues and workarounds for the hp-health and hp-snmp-agents packages (continued)
Issue Number Details
# Don't log private authentication messages!
*.info;mail.none;news.none;authpriv.
Table 17 Known issues with agents Issue Number Details Issue 1 Cannot manage server from Systems Insight Manager, grayed-out utilization button, or missing file system space used information in the mass storage window Workaround To work around this issue, complete the following steps: 1. Check if the network is reachable by pinging the server from the system running Systems Insight Manager 2.
Table 17 Known issues with agents (continued) Issue Number Details Workaround Information about the configuration of the device indicates that a SCSI controller is installed, but no further information is available.
Table 17 Known issues with agents (continued) Issue Number Details Workaround To work around this issue, complete the following steps: 1. Be sure that the SNMP agent, the Peer agent and the agent processing the set are all running 2. Check the agent command line arguments in the agent start script files 3. Verify that either the argument “-s OK” is present or that the default set_state is “OK” for the agent. This process enables SNMP sets for this agent only 4.
Table 17 Known issues with agents (continued) Issue Number Details Workaround Stop the agent. Change the agent command line argument trap switch to “-t NOT_OK” in the /opt/hp/hp-snmp-agents//etc/ file. This disables SNMP traps for this agent only. Restart the stopped agent for the changes to take effect.
C hp-snmp-agents command lines and arguments Table 18 (page 37) lists the command lines and Table 19 (page 38) lists the command arguments for hp-snmp-agents.
NOTE:
• All agents support -p, -s, and -t as startup parameters.
• Each agent has an associated run level script, which is located in /opt/hp/hp-snmp-agents//etc/. All important settings, such as poll time arguments, are contained in these individual scripts.

Table 19 Command line arguments for hp-snmp-agents
Command line argument | Description
-p poll_time | Specifies the number of seconds to wait between data collection intervals.