HP Servicecontrol Manager 3.0 Troubleshooting Guide Edition 2 Manufacturing Part Number: 5187-4198 July 2003 United States © Copyright 2002-2003 Hewlett-Packard Development Company L.P.
Legal Notices The information in this document is subject to change without notice. Hewlett-Packard makes no warranty of any kind with regard to this manual, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. Hewlett-Packard shall not be held liable for errors contained herein or direct, indirect, special, incidental or consequential damages in connection with the furnishing, performance, or use of this material.
Publication History
The manual publication date and part number indicate its current edition. The publication date will change when a new edition is released. The manual part number will change when extensive changes are made.
• Part Number: 5187-4198, July 2003, Edition 2.0. This edition includes new Servicecontrol Manager issues to support the release of HP-UX 11i version 2.
• Part Number: 5187-1883, January 2003, Edition 1.1. This edition includes new Servicecontrol Manager issues.
Typographic Conventions
We use the following typographical conventions.
mxtool(4)    HP-UX or Linux manual page. mxtool is the name and 4 is the section. From the command line, you can enter “man mxtool” or “man 4 mxtool” to view the man page. See man(1).
Book Title   Title of a book. On the Web and on the Instant Information CD, it may be a hot link to the book itself.
Command      Command name or qualified command phrase.
ComputerOut  Text displayed by the computer.
Emphasis     Text that is emphasized.
Contents
1. Troubleshooting SCM
2. Operating System and Networking Issues
    Agent Unable To nslookup The CMS Hostname
    Authentication Failures Due To Time Synchronization Or Communication Time Limits
    CMS Unable To nslookup Node Hostname
    File Related Errors
    X Window Tool Does Not Run For mxexec
Index
1 Troubleshooting SCM
This chapter provides general information on diagnosing issues associated with Servicecontrol Manager (SCM). Complete steps 1-5 on the central management server (CMS) to diagnose the specific problems you are experiencing. Follow step 6 to verify and resolve the specific problems.
To Diagnose Issues with SCM
Step 1. Check for errors in the following log files.
• /var/opt/mx/logs/mx.log
• /var/opt/mx/logs/MySQLInstall.log
• /opt/hpwebadmin/logs/catalina.
Step 6. Identify the applicable issue in one of the following chapters to troubleshoot and resolve your problem. Use the data collected in the previous steps to determine which issues may apply.
• Operating System and Networking Issues
  See Chapter 2, “Operating System and Networking Issues,” on page 11.
— “Removing A CMS From A Managed Node Fails” on page 30
— “Repository Configuration Failure During Install” on page 31
— “Repository Server Failure” on page 32
— “SCM 2.5 Daemons Not Running During SCM 3.
2 Operating System and Networking Issues
The operating system and networking can impact SCM’s functionality. Failure of the operating system or the network can sometimes produce conditions that appear to be SCM related.
Agent Unable To nslookup The CMS Hostname
Symptom
The command mxagentconfig -a cms_name -p password fails with:
    Unknown host name: ‘cms_name’
Verification
Use nslookup cms_name to test the network name resolution on the managed node.
Fix
Set up network name resolution properly on the managed node by adding CMS information to the managed node’s /etc/hosts file. See “Problems With resolv.
File Related Errors
Symptom
For files that you know exist and that you have permission to access, you intermittently get error messages such as:
    An internal error occurred. The following file is unreadable by the user: /etc/opt/mx/config/mx.properties (File table overflow). If this file exists, verify that the user has permission to read the file.
    An internal error occurred. /opt/mx/j2re/lib/PA_RISC2.0/libnet.
Hardware Problems (Such As LAN Card Failures)
Symptom
You experience intermittent networking or communication errors with a system on the network.
Verification
This assumes you are experiencing the errors between the CMS and a single managed node. If you think the CMS is the system having the problem, perform these steps with the CMS and managed node reversed.
1. Log on to the CMS.
2.
Incorrect Network Configuration
If nslookup(1) fails, Servicecontrol Manager cannot resolve hostnames and the mxnode command will fail.
Symptom
The command mxnode -a hostname fails with the error:
    Unknown host: ‘node_name’. Node ignored.
Verification
Verify that this command works:
    nslookup $(hostname)
The output must have a line beginning with Name:, followed by a line that starts with either Address: or Addresses:.
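The Name: and Address: check described above can be scripted. The following is a minimal sketch and is not part of SCM: the function name has_name_and_address is hypothetical, and it assumes only what the verification step states, namely that a line beginning with Name: is later followed by a line beginning with Address: or Addresses:.

```shell
#!/bin/sh
# Hypothetical helper (not an SCM command): reads nslookup output on stdin
# and succeeds only if a line beginning with "Name:" is later followed by a
# line beginning with "Address:" or "Addresses:". The "Address:" line that
# nslookup prints for the name server itself comes before "Name:" and is
# therefore ignored.
has_name_and_address() {
    awk '
        /^Name:/                      { seen_name = 1; next }
        seen_name && /^Address(es)?:/ { found = 1 }
        END                           { exit (found ? 0 : 1) }
    '
}

# Usage, on the system being checked:
#   nslookup "$(hostname)" | has_name_and_address && echo "name resolution OK"
```

Because the helper reads standard input, it can also be run against saved nslookup output from another system.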
Insufficient File System Space In /var
Symptom
One or more of the following issues occur:
• You receive Log Manager failures.
• You are unable to set up Servicecontrol Manager initially (mxinitconfig fails).
• You are unable to add any new objects due to failures in writing the repository.
Verification
Use df(1) to check for free space on /var.
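The df check can be turned into a simple threshold test. This is a sketch under stated assumptions: the helper names available_kb and has_free_kb are hypothetical, the df on your system is assumed to accept the POSIX -k and -P options, and the 100 MB threshold in the usage line is only an illustration because this guide does not state a minimum for /var.

```shell
#!/bin/sh
# Hypothetical helper: reads "df -kP <dir>" output on stdin and prints the
# available space in kilobytes (fourth column of the single data line that
# the POSIX output format guarantees).
available_kb() {
    awk 'NR == 2 { print $4 }'
}

# Hypothetical helper: reads the same df output on stdin and succeeds if the
# available space is at least the number of kilobytes given as $1.
has_free_kb() {
    avail=$(available_kb)
    [ -n "$avail" ] && [ "$avail" -ge "$1" ]
}

# Usage (the threshold is an assumption, not an SCM requirement):
#   df -kP /var | has_free_kb 102400 || echo "/var has less than 100 MB free"
```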
Insufficient Swap Space
Symptom
One or more of the following issues happen intermittently, depending on system load.
On HP-UX:
• You receive “Out of memory” exceptions when running mxcommands.
• Other HP-UX commands result in memory errors.
On Linux:
• The system response becomes very slow.
Problems With getpwnam()/uid()
Symptom
Your login user name is a valid SCM user, but you are no longer considered a valid user. Execution of any SCM command fails with the error:
    The name ‘?’ does not represent a user in this system.
Verification
Verify that you are still an HP-UX or Linux user.
• Execute the id(1) command.
Problems With resolv.conf, hosts, nsswitch
Symptom
The command nslookup node_name fails.
Verification
Check these files and commands for similar content:
-- /etc/resolv.conf --
domain mycorp.com
nameserver 15.0.10.123
search mycorp.com
-- /etc/hosts --
aa.a.aa.aaa hostname.domain.name hostname
127.0.0.1 localhost loopback
-- /etc/nsswitch.
SuSE CMS Reporting An IP Address Instead Of A Hostname
Symptom
A SuSE CMS displays IP addresses instead of fully qualified DNS names in the Name field of the SCM nodes list. This is known to occur only on SuSE Linux, but it could occur on other systems using IPv6.
Web Server Not Running
Symptom
SCM commands run successfully on the CMS, but Web access is not available.
Verification
Determine if the Tomcat Web server is running on the CMS using the command:
    ps -fp $(cat /opt/hpwebadmin/logs/.tomcat.pid)
Fix
Start the Tomcat Web server:
• For HP-UX: /sbin/init.d/mxtomcat start
• For Linux: /etc/init.
3 Servicecontrol Manager Issues
Servicecontrol Manager (SCM) depends on a number of daemon processes executing on the Central Management Server (CMS) and the managed nodes. Failure of the daemons or failure of communication between the daemons can produce conditions that are not easily understood. This chapter covers the following issues:
• “catalina.
catalina.out Log Fills Up With A Repeated Error
Symptom
The catalina.out log file fills up with the following error:
    can’t find tag.dat
Verification
This message is generated each time you launch the View Properties tool from SCM.
Fix
To suppress the message, change the logging level from 0 to 4 in the file at:
    /opt/hpwebadmin/webapps/mxpropertypages/jsp/PropertyPages.jsp
Then delete the /opt/hpwebadmin/logs/catalina.
Management Home Page Is Already Open
Symptom
When you launch the View Management Home Page tool in the SCM GUI, you receive the message:
    The management home page for CMS is HP Servicecontrol Manager, which is already open.
where CMS is the hostname of the CMS.
Verification
Verify that the CMS is the selected node.
Fix
SCM is the management home page for the CMS, and the SCM GUI cannot be launched from within itself.
mxserver.bin Install Error
Symptom
When you execute mxserver.bin, you get the following error:
    Installing mxagent-B.03.00.00.i386-1.rpm...
    error: failed dependancies:
    libstdc++-libc6.1-1.so.2 is needed by mxagent-B.03.00.00
Verification
Executing the command:
    rpm -q compat-libstdc++
returns the message:
    package compat-libstdc++ is not installed
Fix
For Red Hat 7.2 and 7.3, install the package compat-libstdc++-6.2-2.9.0.
Obsolete Tools Migrated from SCM 2.5
SCM Management
These SCM agent management tools no longer apply to SCM 3.0. To remove these tools:
    cd /var/opt/mx/tools/tools
    /opt/mx/lbin/def2xml -t scmmgmt.tdef scmmgmt.xml
    mxtool -rf scmmgmt.xml
X Window Tools With Incorrect Tool Type
If you have created tools that launch X clients, you need to change the tool type from launch or stdout to x-windows. SCM 2.5 only supported launch and nolaunch (stdout), while SCM 3.
Optional Tools Migrated from SCM 2.5
Symptom
SCM 3.0 ships with several tool definitions that improve on the tool definitions initially shipped with other integrated HP applications. Some of these tool definitions may not apply to your environment and can be removed. The tools can be added to SCM again at a later date.
NOTE Partition Management tools do not apply to servers running HP-UX 11.00.
Problems With mxexec On A Managed Node (#1)
Symptom
The mxexec command fails on a managed node with the following message:
    Authentication failed.
Verification
On the managed node, list the CMS systems that the agent is configured for:
    mxagentconfig -l
The hostname of the CMS should appear in the Servers list.
Fix
If the CMS doesn’t appear in the Servers list, configure the agent to respond to the CMS.
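The verification step above can be scripted as a simple text match. This is a hedged sketch: the helper name cms_listed is hypothetical, and because the exact layout of the Servers list is not reproduced in this guide, the helper only checks whether the CMS hostname appears anywhere in the output.

```shell
#!/bin/sh
# Hypothetical helper (not an SCM command): reads "mxagentconfig -l" output
# on stdin and succeeds if the CMS hostname given as $1 appears anywhere in
# it. This is a fixed-string substring match, so a name such as "cms1"
# would also match a line containing "cms10".
cms_listed() {
    grep -F -- "$1" >/dev/null
}

# Usage, on the managed node (cms_name is a placeholder):
#   mxagentconfig -l | cms_listed cms_name || echo "agent is not configured for cms_name"
```

Use the fully qualified CMS hostname where possible to reduce the chance of a partial match.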
If the verification steps are successful and the host is accessible from the CMS, verify that the mxagent daemon is running on the managed node by logging on to the managed node and executing:
    ps -ef | grep mxagent
Fix
• If you were unable to establish communication, verify that the managed node is powered on, booted, and has network connectivity. Then repeat the verification steps.
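When the ps check is scripted, the grep process itself appears in the ps output and can make the daemon look as if it is running. The following sketch filters that line out; the helper name daemon_listed is hypothetical and the approach is a common shell idiom rather than anything specific to SCM.

```shell
#!/bin/sh
# Hypothetical helper: reads "ps -ef" output on stdin and succeeds if at
# least one line mentions the name given as $1. Lines that contain "grep"
# are dropped first, so a "grep mxagent" process is not mistaken for the
# daemon. The final "grep -q ." succeeds only if a non-empty line remains.
daemon_listed() {
    grep -F -- "$1" | grep -v grep | grep -q .
}

# Usage, on the managed node:
#   ps -ef | daemon_listed mxagent || echo "mxagent is not running"
```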
Repository Configuration Failure During Install
This issue applies to HP-UX only.
Symptom
The SCM install fails when you run the command:
    mxinitconfig -a server
You receive the following error messages:
    Performing server setup.
    Configuring the repository...FAIL
    Connection to the repository server failed.
    Performing server unsetup.
    Stopping the server daemons...
Repository Server Failure
Symptom
You get the following error message:
    Connection to the repository server failed.
Verification
• Check if the MySQL daemons (mysqld and safe-mysqld) are running:
    ps -ef | grep mysql
Fix
• If the MySQL daemons are not running, start them:
  For HP-UX: /sbin/init.d/mysqld start
  For Linux: /etc/rc.d/init.
SCM 2.5 Daemons Not Running During SCM 3.0 Upgrade
Symptom
Data from SCM 2.5 is not available in SCM 3.0 after an upgrade is performed.
Verification
The command mxnode -ld shows only the local hostname.
Fix
This fix requires that the backup files you created during the upgrade procedure are available. See the HP Servicecontrol Manager 3.0 User’s Guide for details.
1.
SCM Commands And Manual Pages Are Inaccessible
When SCM is installed, the default system command shell path and man page path are updated. If the user has customized their login shell environment, the default paths may become invalid.
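A quick way to see whether a customized login environment dropped an SCM directory is to test the path variables directly. The following is a minimal sketch: the helper name path_contains is hypothetical, and /opt/mx/bin is used only because this guide shows mxgethostname installed there. The man page directory is not named in this section, so it is left out.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if directory $1 is an element of the
# colon-separated list $2 (for example "$PATH" or "$MANPATH"). Wrapping both
# sides in colons makes the match exact, so /opt/mx/bin does not match
# /opt/mx/binary.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Usage (appends the directory only if it is missing):
#   path_contains /opt/mx/bin "$PATH" || PATH="$PATH:/opt/mx/bin"
```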
SCM Fails On Startup With No Error Messages
— If the file was moved to another directory and you can locate the correct file, move it back to the appropriate directory. Alternatively, if there is a backup from which the original file can be retrieved, restore it.
— If the file was deleted, reconfigure the CMS:
    mxinitconfig -a all
  Reconfiguring the CMS may cause data loss; make sure you have a backup.
SCM Generates A Certificate Error
Symptom
The Tomcat certificate for SCM is by default issued for three months. After three months, you will receive a message that says the certificate has expired.
Fix
On the CMS, replace the certificate using the keytool command and set the validity of the new certificate for your environment.
Step 1. Retrieve the hostname:
    hostname=$(/opt/mx/bin/mxgethostname)
Step 2.
SCM GUI Isn’t Available
Symptom
When you navigate to http://cms_hostname:50000, the SCM GUI fails to load with error messages such as:
    The page cannot be displayed
    The connection was refused
Verification
Verify that your CMS is available on the network by logging on remotely through the command line. If your CMS is available on the network, Tomcat may not have been installed during the SCM installation.
Syntax Problems In The Definition Files
Symptom
Using one of the following commands yields a syntax error for the associated file.xml.
    mxnode -a -f file.xml
    mxngroup -a -f file.xml
    mxuser -a -f file.xml
    mxrole -a -f file.xml
    mxtool -a -f file.xml
    mxauth -a -f file.xml
Fix
View the syntax for the known good XML entries by using the optional -lf command format with an mxcommand.
Task Execution From The SCM GUI Fails With No Error
Symptom
A tool executed from the SCM graphical user interface encounters a fatal error. In the tool results screen, the task Status is Failed and the Stderr and Stdout tabs are blank.
Fix
1. Identify the TaskID in the tool results screen.
2.
Servicecontrol Manager Issues Tools Only Available On HP-UX 11i v2 Managed Nodes mxtool -lf -t "Tool Name" > toolname.xml where Tool Name is the name of the non-trusted user's tool. Step 2. Edit the tool definition file to remove the owner element, and commit the changes using the mxtool modify option from the command line: mxtool -m -f toolname.
WBEM Not Displayed As A Managed Protocol
This issue applies to HP-UX only.
Symptom
WBEM does not show up as a managed protocol for an SCM node even though WBEM is installed on that node.
Verification
If the WBEM cimserver is not running on the managed node, SCM won’t list WBEM as a managed protocol.
WLM and PRM Are Not Available Within SCM
This issue applies to HP-UX only.
Symptom
Workload Manager (WLM) and Process Resource Manager (PRM) appear to be added to SCM, but when you try to run either tool within SCM, the tool doesn’t exist.
Fix
Step 1. Verify that WLM or PRM is installed on a managed node or the CMS.
Step 2.