System Fault Management C.07.00.06.
© Copyright 2010 Hewlett-Packard Development Company, L.P.

Legal Notices

Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor’s standard commercial license.
Table of Contents

1 Introduction (page 9)
Overview (page 9)
Features and Benefits (page 9)
Components of SFM

Viewing Information About System Summary (page 49)
Viewing Information About Cooling Devices (page 49)
Viewing Power Supply Instances (page 49)
Viewing Temperature Status and Events

Overview (page 67)
Searching Low Level Logs Using Simple Search (page 67)
Searching Low Level Logs Using Advanced Search (page 68)
Searching Low Level Logs Using CLI

A Appendix A (page 105)
B Interpretation of HP SMH Instances (page 109)
Processor Instances (page 109)
Memory Instances
List of Figures

1-1 Block Diagram of SFM (page 16)
4-1 HP SIM Home Page (page 36)
4-2 Global Protocol Settings
List of Tables

1-1 Instance Providers (page 12)
1-2 Indication Providers
1 Introduction

System Fault Management (SFM) supports HP Integrity BL860c i2, BL870c i2, and BL890c i2 Server Blades. All the features supported on systems running the HP-UX 11i v3 operating system are available for HP Integrity Server Blades. This chapter introduces the SFM software and the tools that it includes.
NOTE: SFM is the future replacement of EMS Hardware Monitors.
Components of SFM

This section discusses the following topics:
• EVWEB
• Error Management Technology (EMT)
• CIMUtil
• IPMI Event Viewer
• Providers

EVWEB

EVWEB is a component of SFM that enables you to administer and view WBEM indications generated on the local system on which SFM is installed. For more information on EVWEB, see “Evweb Overview” (page 53).

EMT

EMT is a component of SFM that contains most of the errors that can occur on an HP-UX 11i v3 system.
Table 1-1 Instance Providers

Instance Provider: Blade
Description: The Blade Instance Provider retrieves the following information:
• Blade ID
• Blade Physical location
• Blade Hardware path
• Blade Serial number
• Blade Part number
• Blade Status
NOTE: The Blade Instance Provider is available on HP BL860c / BL870c servers.

Instance Provider: CPU
Description: The CPU Instance Provider gathers the following types of information:
• Logical processor information, such as:
  - Current clock speed
  - Processor family
  - Processor status informat
Table 1-1 Instance Providers (continued)

Instance Provider: DAS Provider
Description: Starting with the HP-UX 11i v3 March 2009 release, SFM does not monitor storage disks or disk enclosures. However, the DAS Provider collects inventory details of the direct attached storage disks and monitors the systems for errors. For more information on the DAS provider, see the HP-UX WBEM Direct Attached Storage (DAS) Provider Data Sheet and Release Notes at: http://docs.hp.com/en/netsys.
Table 1-2 Indication Providers

Indication Provider: EMS Wrapper Provider
Description: The EMS Wrapper Provider does the following:
1. Converts hardware events generated by the EMS Hardware Monitors into WBEM indications.
2. Reports the WBEM indications to the CIMOM.
Using a WBEM-based management application, such as HP SIM, you can subscribe to and receive Event Monitoring Service (EMS) events generated on a remote system.
Table 1-2 Indication Providers (continued) Indication Provider Description SFMIndicationProvider The SFMIndicationProvider generates indications that are compliant with the WBEM standards.
Cron job

SFM includes the following features from the HP-UX 11i v3 February 2007 release:
• Cron job: Every 15 minutes, the cron job checks whether the SFMProviderModule is running. If the SFMProviderModule is not running, the cron job starts it, which in turn starts all the providers.
• Vacuum cron job: Configured to run once an hour, the vacuum cron job frees unused space in the SFM PostgreSQL database.
1. The CIMOM receives requests from the CMS for information about devices.
2. The CIMOM directs the requests to the appropriate SFM provider, for example, the CPU Instance Provider.
3. The SFM provider queries the associated hardware device for property information.
4. The SFM provider returns the query information to the CIMOM.
5. The CIMOM conveys the responses from the provider to the CMS.

You can view the information using HP SIM on the remote system and HP SMH on the local system.
1. EVWEB and CMS subscriptions are created.
2. The EVM CIM Provider receives events posted by the posting clients through the EVM Daemon.
3. The provider converts these events into WBEM indications and reports these indications to the CIMOM.
4. The CIMOM directs these indications to EMT and to the CMS that has created subscriptions for indications.
2 Installing the SFM Software

The System Fault Management (SFM) software is installed by default with the HP-UX 11i v3 Operating Environment (OE) media. However, at some point you may need to install the SFM software separately. This chapter describes how to install the SFM software as a standalone component on the HP-UX 11i v3 operating system.
NOTE: WBEM Services, Online Diagnostics, SysMgmtWeb, and HP SIM are available on the Operating Environment (OE) media and can be selected for install during the SFM installation. HP System Management Homepage (SMH) – bundled in SysMgmtWeb – is an optional install. However, without it you cannot access the EvWEB GUI (Event Viewer, Subscription Management and Log Viewer interface). The command line interface for EVWEB will still be accessible. HP Systems Insight Manager (HP SIM) is an optional install.
Selecting these options automatically installs all the dependencies.
NOTE: The system selects some options by default. However, you must select the two options mentioned in step 5 to automatically install the prerequisites.
7. Click OK in the Note window to confirm the selection of dependencies.
8. In the SD Install - Software Selection window, select Actions->Install to begin installation, as shown in the following figure:
NOTE: SFM is automatically configured after it is installed.
After the SFM software is installed, the Install window indicates that the installation succeeded, as shown in the following figure:
9. Unmount the CD. To unmount, enter the following command at the HP-UX prompt:
# umount /tmp/cdrom
10.
2. Mount the CD to a location of your choice, as in the following example:
# mount /dev/dsk/c1t2d0 /tmp/cdrom
3. To install the SFM software and all the dependencies, enter the following command at the HP-UX prompt:
# swinstall -x autoselect_dependencies=true -x enforce_dependencies=true -s /tmp/cdrom SysFaultMgmt
4. Unmount the CD. To unmount, enter the following command at the HP-UX prompt:
# umount /tmp/cdrom
5.
Verifying the Installation Using the TUI

To verify the SFM software installation, complete the following steps:
1. Log in to the system as a superuser.
2. Click Logfile in the Install window, as shown in the following figure:
The Logfile, which includes details about the installation, is displayed. If there are no errors in the Logfile, the SFM software is installed properly. If the SFM software is not installed properly, you must repeat the installation procedure.
3.
Verifying the Installation Using the CLI

To verify your installation using the CLI, complete the following steps:
1. Log in to the system as a superuser.
2. Enter the following command at the HP-UX prompt:
# swjob
If the output contains no errors, the SFM software is installed properly. Otherwise, you must install the SFM software again. A sample output is shown in the following figure:
3.
4. Select Actions->Mark for Remove in the SD Remove window, as shown in the following figure: 5.
6.
7. When the SFM software is removed, the Remove Window is displayed, as shown in the following figure:
8. To verify whether the SFM software is removed properly, enter the following command at the HP-UX prompt:
# swlist | grep SysFaultMgmt
If the SFM software is removed properly, SysFaultMgmt and the version number of the SFM software do not appear in the output. If the SFM software is not removed properly, you must repeat the removal procedure.
3. To verify whether the SFM software is removed properly, enter the following command at the HP-UX prompt: # swlist | grep SysFaultMgmt If the SFM software is removed properly, SysFaultMgmt and the version number of the SFM software do not appear in the output. If the SFM software is not removed properly, you must repeat the removal procedure. For more information, see “Verifying Removal of the SFM Software” (page 29).
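The removal check above can be wrapped in a small script. The following is a minimal sketch, not part of the SFM tooling: sfm_is_removed is a hypothetical helper name, and it reads the swlist output on standard input so that the logic can be tried with sample text.

```shell
# Hypothetical helper (not shipped with SFM): reads `swlist` output on
# standard input and succeeds only when no line mentions SysFaultMgmt.
sfm_is_removed() {
    if grep -q 'SysFaultMgmt'; then
        return 1    # bundle is still listed: removal is incomplete
    fi
    return 0        # bundle is not listed: removal succeeded
}
```

On an HP-UX system the helper would be fed as `swlist | sfm_is_removed && echo "SFM removed"`. A nonzero exit status means that the removal procedure must be repeated.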
3 Configuring Indication Providers This chapter describes how to configure indication filters, error logging, and the SFMIndicationProvider. Configuring Indication Filters You must configure the indication filters to view desired indications. You use the Filter Metadata Provider (FMD) to configure indication filters that deliver important or desired indications, for example, indications with a certain severity.
Filter Query            : Select * from HP_AlertIndication
Filter Query Language   : WQL
Filter Source Namespace : root/cimv2
Filter Description      : Admin Filter
Filter State            : Enabled
Filter Last Operation   : Filter State Add Filter

HP_AlertIndication is derived from CIM_AlertIndication and HP_DeviceIndication is derived from HP_HardwareIndication. Use WBEM severities when you specify a severity in the filter query. For more information on the WBEM severity, see Table 4-1 (page 42).
NOTE: You can also send test events for other devices that the SFMIndicationProvider monitors. For information on the devices monitored by the SFMIndicationProvider, see Table 1-2 (page 14).

To view the list of events, enter the following command at the HP-UX prompt:
# evweb eventviewer -L
A list of events is displayed, along with details such as event number, severity, and event category.
4 Administering Indications and Instances Using HP SIM This chapter describes System Fault Management (SFM) administration on a remote system using HP Systems Insight Manager (HP SIM). NOTE: You can perform similar tasks using other management applications that are compliant with the Common Information Model (CIM) (2.7.2) schema (or later) of the Distributed Management Task Force (DMTF). The terms events and indications are used interchangeably in this document.
2. To create subscriptions, select Options-->Protocol Settings-->Global Protocol Settings in the HP SIM Home page, as shown in Figure 4-1.

Figure 4-1 HP SIM Home Page

The Global Protocol Settings window is displayed, as shown in Figure 4-2.

Figure 4-2 Global Protocol Settings

3. In Figure 4-2, under Default WBEM settings, select Enable WBEM. Click OK to save your settings.
4. Select Configure->Configure or Repair Agents, as shown in Figure 4-3.

Figure 4-3 Configuration

The Configure or Repair Agents window is displayed, as shown in Figure 4-4.

Figure 4-4 Configure or Repair Agents

5. From the Add targets by selecting from: list in Figure 4-4, select All Systems to view and select the systems.
on the selected system. The list of systems is displayed in the Select Target Systems window, as shown in Figure 4-5. Figure 4-5 Select Target Systems 6. To select all the systems in the network, select the Select “All Systems” itself check box, as shown in Figure 4-5. Click Apply. The Verify Target Systems window is displayed, as shown in Figure 4-6.
7. Select the appropriate check box to verify the target systems and click Next, as shown in Figure 4-6. The Enter Credentials window is displayed, as shown in Figure 4-7. Figure 4-7 Enter Credentials 8. Enter your credentials in the given fields, as shown in Figure 4-7. Click Next. The Configure or Repair Settings window is displayed, as shown in Figure 4-8. Figure 4-8 Configure or Repair Settings 9. On the Configure or Repair Settings window, click Run Now.
Figure 4-9 Task Results 10. To obtain a printable report of the indication subscription details, click View Printable Report at the bottom of the window. The report is displayed, as shown in Figure 4-10. Figure 4-10 Printable Report of the Indication Subscription NOTE: For more information, see the HP Systems Insight Manager Installation and User’s Guide at: http://docs.hp.com/en/netsys.
1. Select All Events in the left pane of the HP SIM window. The list of events is displayed, as shown in Figure 4-11. Figure 4-11 Events list 2. To view the details of an event, select the event. The details are displayed at the bottom of the same window, as shown in Figure 4-12.
3. To obtain the printable version of the event details, click View Printable Details at the bottom of the window. The printable report is displayed in a new window, as shown in Figure 4-13.
Table 4-1 EMS, WBEM and Evweb Events Severity values (continued)

EMS Severity    WBEM Severity              Evweb Severity
4 Serious       6 Critical                 7 Critical
5 Critical      7 Fatal/Non-recoverable    7 Critical

NOTE: Perceived Severities in Syslog are the same as WBEM severities.

In the SFM mode, although the SFMIndicationProvider generates the events, the provider name displayed in the event details is one of the names listed under Provider in Table 4-2, depending on the device to which the event is related.
Table 4-4 Command Representation Task In monconfig (Online Diagnostics) In SFM Deleting a monitoring request # /etc/opt/resmon/lbin/monconfig # evweb subscribe -D -f -n Enter D at the main menu selection prompt. Sending a test event # # sfmconfig -t - Changing the status # set_fixed -n of a device to UP.
http://docs.hp.com/en/diag Viewing Instances To view information about processors, memory, cooling devices, power supplies, and disks, complete the following steps: 1. On the System Page of HP SIM, click System Management Homepage, as shown in Figure 4–14. Figure 4-14 System Page The HP SMH home page is displayed. 2. Perform the relevant steps described in “Viewing Instances” (page 47).
5 Administering Indications and Instances Using HP SMH This chapter describes the SFM administration tasks that you can perform using HP SMH on a local system.
1. Select Show All under System on the HP SMH home page. The system page is displayed.

Figure 5-1 System Management Homepage

2. Select Processors under System on the HP SMH home page. Information about the processors is displayed.
3. To return to the HP SMH home page, click Home.
NOTE: Starting with the September 2009 release, you can use “The equivalent command line” option in the HP SMH GUI to view command line information about processors. For more information, see the cprop manpage (man cprop).
NOTE: Starting with the September 2009 release, you can use “The equivalent command line” option in the HP SMH GUI to view command line information about memory. For more information, see the cprop manpage (man cprop).

Viewing Information About System Summary

To obtain system summary information, such as the model, role, UUID, UUID (Logical), Serial number, and Serial number (Logical), among others, complete the following steps:
1. Select System Summary under System on the HP SMH home page.
3. To return to the HP SMH home page, click Home.
NOTE: Starting with the September 2009 release, you can use “The equivalent command line” option in the HP SMH GUI to view command line information about the temperature. For more information, see the cprop manpage (man cprop).

Viewing Voltage Status and Events

To view the voltage status and events related to voltage, complete the following steps:
1. Select Show All under System on the HP SMH home page.
2. Select Voltage on the system page.
NOTE: Starting with the September 2009 release, you can use “The equivalent command line” option in the HP SMH GUI to view command line information about firmware. For more information, see the cprop manpage (man cprop).

Viewing Information About Onboard Administrator

To obtain information about the Onboard Administrator (OA), such as the OA description, OA IP address, and OA MAC address, complete the following steps:
1. Select Enclosure Information under Enclosure on the HP SMH home page.
NOTE: Starting with the September 2009 release, you can use “The equivalent command line” option in the HP SMH GUI to view command line information about partitions. For more information, see the cprop manpage (man cprop).

Viewing Information About Blade

To obtain information about the Blade, such as the Onboard Administrator and HashID, complete the following steps:
1. Select Blade under System on the HP SMH home page. Information about the Blade is displayed.
2.
3. Select the Basic Health Test option to run the basic test configured in the system.
4. Click Run to execute the CPU Health Test. The CPU Health Test results are displayed in the View Health Test Results pane.
NOTE: CPU Health Test Results can also be viewed from the command line interface (CLI).
5. To return to the HP SMH home page, click Home.

For more information, see HP System Management Homepage Online Help. In HP SMH, go to the Help menu.
EVWEB supports new I/O and storage native indication providers and displays additional information for the following providers:
• HPUXSASNativeIndicationProviderModule
• HPUXRAIDSANativeIndicationProviderModule
• HPUXFCNativeIndicationProviderModule
• HPUXStorageNativeProviderModule

Support for these native indication providers is available on the following HP Integrity Server Blades:
• BL860c i2
• BL870c i2
• BL890c i2

Launching Evweb for Administration

You can launch Evweb either through the CLI or through the HP SMH GUI.
• (page 31). You cannot use the evweb list, evweb subscribe, or the evweb eventviewer command to create and delete Admin Defined event subscriptions. Event subscriptions created using the HP SMH GUI (Event Subscription Administration) – You can create these event subscriptions using the GUI or the CLI. You can modify and delete the event subscriptions that are created using Event Subscription Administration.
• • • (-t [archive|e-mail] ) (-s ) -r For more information on creating event subscriptions using CLI, see evweb_subscribe(1). Copying and Creating a New Event Subscription Using the GUI You can reuse the existing subscriptions to create another subscription. To create an event subscription by copying an existing event subscription, complete the following steps: 1. 2. 3. Repeat steps 1-5 from “Launching Evweb for Administration” (page 54).
• Modify a single event subscription. You can modify a single event subscription using the modify feature by selecting the event subscription from the event subscription table. • Modify similar criteria in multiple event subscriptions. You can modify similar criteria in multiple event subscriptions by using the Copy and Modify subscription feature. Modifying an Event Subscription Using the GUI To modify an event subscription, complete the following steps: 1. 2.
To modify an event subscription, you must specify the criteria and the location. You can modify an event subscription in the following ways:
• If you do not specify the -r option and the location, the current location is retained and the subscription criteria are updated.
• If you specify the -r option but not the location, the current location is removed and only the subscription criteria are updated.
• If you specify both subscription criteria and location, both are updated.
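The three rules above can be summarized as a small decision function. This is an illustrative sketch only: modify_outcome is a hypothetical name, it does not call evweb, and it merely prints the outcome the rules describe. The rules do not spell out what happens when -r is combined with a new location, so the sketch assumes that any supplied location is treated as an update.

```shell
# Hypothetical illustration of the modification rules; prints the outcome
# and does not invoke evweb.
#   $1 = "yes" if the -r option is given, any other value otherwise
#   $2 = new location, or empty if no location is given
modify_outcome() {
    use_r=$1
    location=$2
    if [ -n "$location" ]; then
        # Assumption: a supplied location is always applied.
        echo "criteria updated; location updated"
    elif [ "$use_r" = "yes" ]; then
        echo "criteria updated; location removed"
    else
        echo "criteria updated; location retained"
    fi
}
```

For example, `modify_outcome yes ""` reports that the location is removed, which corresponds to specifying -r without a location.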
4. Select Delete on the Delete subscription page. The event subscriptions are deleted and a confirmation message is displayed. 5. Click OK on the confirmation message window. NOTE: You cannot delete HP Advised event subscriptions and Admin Defined event subscriptions. For more information on deleting an event subscription using the HP SMH GUI, select Help on the action pane of the Delete Event Subscription page. NOTE: HP recommends deleting all unwanted event subscriptions.
127.0.0.1 localhost evweb@hp.com 3. Enter the following command at the HP-UX prompt: # /opt/sfm/bin/sfmconfig -c /var/opt/sfm/conf/evweb.conf Viewing Event Subscriptions Using Evweb This section describes how to perform non-administration tasks, such as viewing event subscriptions.
internal is an argument used to display information about HP Known event subscriptions and event subscriptions created using Evweb. A summary of event subscriptions is displayed in a tabular format, as shown in Figure 5-2. Figure 5-2 Summary of Evweb Event Subscriptions Table 5-1 describes the fields in the Event Subscription Summary table. Table 5-1 Evweb Event Subscriptions Field Description Subscription Name Displays the name of an event subscription.
-E is an option used to display details of an event subscription. -n is a switch used to specify the name of an event subscription. A table with detailed information about event subscriptions is displayed, as shown in Figure 5-3. Figure 5-3 Details of an Event Subscription Table 5-2 describes the fields in the Details of an Event Subscription page. Table 5-2 Details of an Event Subscription Field Description Subscription Name Displays the name of an event subscription.
external is an argument used to display information about external event subscriptions. A list of event subscriptions is displayed in a tabular format, as shown in Figure 5-4. Figure 5-4 External Event Subscriptions The field names of the external event subscriptions are different from the ones created using Evweb. However, field names of both external event subscriptions and subscriptions created using Evweb, can be matched. Table 5-3 describes the fields in the View external event subscriptions page.
To launch Evweb using the CLI, enter the following command at the HP-UX prompt:
# evweb

To launch Evweb for viewing WBEM indications using the HP SMH GUI, complete the following steps:
1. Log in to the HP SMH. To log in to HP SMH, enter http://:2301 in the address bar of the Web browser. The HP SMH login screen is displayed.
2. Enter your user name and password in the appropriate text boxes.
3. Click Sign In on the login screen. The HP SMH home page is displayed.
4.
For more information on searching for the WBEM events using the HP SMH GUI, select Help on the action pane of the Advanced Search page. Searching for the Subscribed WBEM Events Using the CLI To search WBEM events using the CLI, enter the following command at the HP-UX prompt: # evweb eventviewer -L Where: -L is an option used to list all the WBEM events. A list of WBEM events is displayed.
Viewing Detailed Information Using GUI

To view detailed information about the WBEM events, complete the following steps:
1. Repeat steps 1-5 from “Launching Evweb for Viewing WBEM Indications” (page 63).
2. Select the desired WBEM event in the List Events table. The Details of the Event page is displayed. This page includes a table that provides detailed information about the WBEM events.
• -i
• -r[is|be|en|co][:]()
• -a((:)(yy|mm|dd|hh)
• -t[eq|le|ge|bw] ()[,]
• -s[asc|desc] ()
• -f
• -n
• -b [history|current]

For information on deleting WBEM events using the CLI, see evweb_eventviewer(1).

Viewing Low Level Logs Using Evweb

This section describes how to perform administration tasks such as searching and viewing low level log information.
6. 7. Provide appropriate information in the fields present in the Log Viewer page. Click Search on the Log Viewer page. Based on the search criteria, the log records are displayed in a tabular format. For information on searching the log database using GUI, select Help on the action pane of the Log Viewer page. Searching Low Level Logs Using Advanced Search To search the log database for low level logs using Log Viewer, complete the following steps: 1. 2.
Viewing List of Low Level Logs

You can view a summary list of low level logs using the Log Viewer. The list of low level logs is tabulated and includes information such as Log ID, Log Index, Device ID, Device Type, Log Version, and Time of occurrence. In this document, this table is referred to as the Logs Summary Table. The Logs Summary Table is a result of the search operation.

Viewing List Of Low Level Logs Using GUI

To view a list of low level logs using Log Viewer, complete the following steps:
1. 2.
Viewing Details of Low Level Logs

You can view details of low level logs using the Log Viewer. The detailed information about the low level logs includes Log ID, Log Index, Log Type, Log Source, Log Version, Device ID, Device Type, Time of occurrence, and a dump of the low level logs in hexadecimal format.

Viewing Details Of Low Level Logs Using GUI

To view details of low level logs using Log Viewer, complete the following steps:
1. 2.
Tracing Evweb This section provides an overview of tracing and information about the various trace levels in Evweb. This section also describes administrative tasks, such as enabling, modifying, and disabling tracing.
Table 5-5 Trace Levels

Trace Level: 1-Critical
Description: The system logs only those situations in Evweb that cause major failures. Example: The database server is not functioning properly or is down.

Trace Level: 2-Error
Description: The system logs those situations that generate an error. Example: There is more than one subscription name. Evweb accepts only one subscription name. Critical situations are also logged at the Error trace level.

Trace Level: 3-Warning
Description: The system logs situations that result in warning messages.
7. Select Enable Tracing. The tracing level is set and a confirmation message is displayed. 8. Click OK on the confirmation message window. For more information on enabling tracing using the HP SMH GUI, select Help on the action pane of either the Event Viewer or the Event Subscription Administration page. Enabling Tracing Using the Evweb CLI To enable tracing using the Evweb CLI, you must export the environment variable, EVWEB_TRACE_LEVEL.
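Table 5-5 states that Critical situations are also logged at the Error trace level. The following sketch assumes that the same cumulative rule holds for every level, so that a message is recorded whenever its level number is less than or equal to the configured trace level. trace_would_log is a hypothetical helper, not part of Evweb, and the numeric encoding (1 = Critical, 2 = Error, 3 = Warning) is taken from the level names in the table.

```shell
# Hypothetical helper, not part of Evweb: succeeds if a message of level $1
# would be recorded when the configured trace level is $2.
# Assumption: levels are cumulative, as Table 5-5 describes for Critical
# messages at the Error level (1 = Critical, 2 = Error, 3 = Warning).
trace_would_log() {
    msg_level=$1
    trace_level=$2
    [ "$msg_level" -le "$trace_level" ]
}
```

Under this assumption, `trace_would_log 1 2` succeeds (a Critical message at the Error trace level), while `trace_would_log 3 2` fails (a Warning message at the Error trace level).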
1. Log in to the System Management Homepage. To log in to HP SMH, enter http://:2301 in the address bar of a Web browser. The HP SMH login screen is displayed.
2. Enter your user name and password in the appropriate text boxes.
3. Click Sign In on the login screen. The HP SMH home page is displayed.
4. Do one of the following:
• Select Tools -> Subscription Administration.
or
• Select Logs -> Event Viewer.
NOTE: The Disable Tracing option is not displayed if tracing is not enabled.
5.
The EMT supports the following user groups:
• Administrator
• Non-administrator

In the CLI, any user with superuser privileges is an administrator. However, in the HP System Management Homepage (HP SMH) GUI, the user groups in EMT are mapped internally to the user groups defined in the HP SMH. The Administrator user group in the HP SMH maps to administrators in EMT. The Operator and the User user groups in the HP SMH map to non-administrators in EMT.
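The group mapping described above can be written as a lookup. This is a sketch for illustration only: emt_group_for_smh_group is a hypothetical name, and the function performs no real authorization. It restates the mapping in the text and rejects any group name that the text does not mention.

```shell
# Hypothetical helper: maps an HP SMH user group name to the EMT user group
# described in the text. It performs no real authorization.
emt_group_for_smh_group() {
    case "$1" in
        Administrator) echo "administrator" ;;
        Operator|User) echo "non-administrator" ;;
        *)             echo "unknown"; return 1 ;;   # group not covered by the text
    esac
}
```

For example, `emt_group_for_smh_group Operator` prints `non-administrator`.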
Querying CER for Events Using the CLI To query CER for events using the CLI, enter the following command at the HP-UX prompt: #emtui -q -w [, -i ] Where: -q is an option that enables you to specify a query string to query the CER for information about errors, cause, and recommended actions. -w is an option that enables you to specify the match type. Following are the match types: • any (default) - Searches for at least one word specified in the query string.
The List Events page, which contains the Error Summary Table, is displayed. For information about viewing summary information of events in CER using the HP SMH GUI, select Help on the action pane of the List Events page. Viewing Summary Information Using the CLI To view summary information about events in CER using the CLI, enter the following command at the HP-UX prompt: # emtui -b Where: -b is an option used to view information about events in brief. A list of events in CER is displayed.
Adding a Custom Solution

If you are a system administrator, you can add your own solution for an error generated on an HP-UX system. The custom solution is permanently stored in CER and is available to all EMT users.

Adding a Custom Solution Using the GUI

To add a custom solution using the HP SMH GUI, complete the following steps:
1. Repeat steps 1-5 from “Launching EMT” (page 75).
2. Search for the event from CER using either the Simple Search or the Advanced Search feature.
3.
3. If there is no cause associated with the event, skip to Step 4. If a cause is associated with the event, select the cause for the event from the Detailed Error Information (Administrative View) pane. You can select multiple causes for an event. 4. Click Modify Selected Solution on the right corner of the Detailed Error Information (Administrative View) pane. The Modify a Custom Solution page is displayed. 5.
Deleting a Custom Solution Using the CLI To delete a custom solution using the CLI, enter the following command at the HP-UX prompt: # emtui -d -u Where: -d is an option used to delete a custom solution present in the CER. -u is an option used to specify the number associated with the custom solution that you want to delete.
Enabling Tracing Using the EMT GUI

To enable tracing using the EMT GUI, complete the following steps:
1. Repeat steps 1-5 from “Launching EMT” (page 75).
NOTE: The Enable Tracing option is not displayed if tracing is already enabled. Instead, the Disable Tracing and the Modify Tracing options are displayed.
2. Select Enable Tracing on the top right corner of the page. The Enable Tracing page is displayed.
3. Set the trace level by selecting the level from the trace level list.
4. Select Enable Tracing.
1. Repeat steps 1-5 from “Launching EMT” (page 75).
NOTE: The Disable Tracing option is not displayed if tracing is not enabled.
2. Select Disable Tracing on the top right corner of the Simple Search page. The tracing is disabled and a confirmation message is displayed.
3. Click OK on the confirmation message window.

For more information about disabling tracing using the HP SMH GUI, select Help on the action pane of the Disable Tracing page.
6 Troubleshooting SFM This chapter describes how to troubleshoot SFM providers and EVWEB. This chapter addresses the following topics: • “Troubleshooting Instance Providers” (page 83) • “Troubleshooting Indication Providers” (page 90) • “Troubleshooting EVWEB” (page 100) NOTE: For information on issues with oserrorlogd, see the Using PSB Components section of the ProviderSvcsBase administrator guide.
Table 6-1 Troubleshooting Instance Providers
Problem: The serial number and part number properties of memory modules or bulk power supplies are not available.
Cause: Either or both of the following can be the cause of this problem:
• These properties are not supported on the system.
• The provider is not configured properly.
Solution: Complete the following steps to check whether the part number and serial number are supported on the given system, and whether the provider is configured properly:
1.
Table 6-1 Troubleshooting Instance Providers (continued)
To register the provider, complete the following steps:
1. Enter the following command at the HP-UX prompt:
# cimprovider -ls | grep SFMProviderModule
2. If the following output is displayed, all the providers are registered properly:
SFMProviderModule OK
3. If the output displayed is different from this output, the provider module is not registered.
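The registration check above can be scripted. The following is a minimal sketch under stated assumptions: sfm_module_registered is a hypothetical helper name, and the function only inspects text supplied on standard input, so it relies on the cimprovider -ls output having the module name in the first field and the status in the second, as in the sample output shown in the steps.

```shell
#!/bin/sh
# Hypothetical helper: read `cimprovider -ls` output on standard input and
# succeed only if a line reports SFMProviderModule with the status OK.
# Any other output is treated as "not registered", as described in the steps.
sfm_module_registered() {
    awk '$1 == "SFMProviderModule" && $2 == "OK" { ok = 1 }
         END { exit !ok }'
}

# Typical use on a live system (not executed here):
#   cimprovider -ls | sfm_module_registered \
#       && echo "SFMProviderModule is registered" \
#       || echo "SFMProviderModule is not registered"
```

Reading from standard input keeps the check separate from the command that produces the listing, which makes it easy to try the helper against saved output.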
Table 6-1 Troubleshooting Instance Providers (continued)
Cause 3: SFMProviderModule is not running.
Solution: To check if SFMProviderModule is running, enter the following command at the HP-UX prompt:
Table 6-3 Troubleshooting Instance Providers
Problem: Requests for instances do not return any value.
Cause 1: The Common Information Model Object Manager (CIMOM) is not running.
Solution: Enter the following command at the HP-UX prompt:
# ps -eaf | grep cimserver
If the name cimserver is displayed in the output, the CIMOM is running properly.
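When this check is scripted, note that the grep process itself can appear in the ps listing and be mistaken for the CIMOM. The following is a minimal sketch, assuming a hypothetical helper named cimserver_listed that reads ps -eaf style output on standard input and ignores lines that belong to grep.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -eaf` style output on standard input and
# succeed if a cimserver process is listed.
# Lines that mention grep (for example "grep cimserver") are filtered out so
# that the search command itself is not counted as the CIMOM.
cimserver_listed() {
    grep 'cimserver' | grep -v 'grep' | grep -q .
}

# Typical use on a live system (not executed here):
#   ps -eaf | cimserver_listed \
#       && echo "CIMOM is running" \
#       || echo "CIMOM is not running"
```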
Table 6-3 Troubleshooting Instance Providers (continued)
To register the provider, complete the following steps:
1. Enter the following command at the HP-UX prompt:
# cimprovider -ls | grep SFMProviderModule
2. If the following output is displayed, all the providers are registered properly:
SFMProviderModule OK
3. If the output displayed is different from this output, the provider module is not registered.
Table 6-3 Troubleshooting Instance Providers (continued)
Cause 3: SFMProviderModule is not running.
Solution: To check if SFMProviderModule is running, enter the following command at the HP-UX prompt:
Table 6-5 Troubleshooting Instance Providers
Problem: Indications fulfilling the conditions defined in the HP-Known HP-Defined filters, are not logged in the Event Archive.
Cause: The HP-Known filters and HP-Known subscriptions have been deleted from the CIMOM.
Solution: Create the following EnumerateInstances.xml file and save it in any location:
Table 6-7 Troubleshooting Indication Providers
Problem: Indications corresponding to events generated by the Event Monitoring Service (EMS) monitors, are not logged in the Events List.
Cause 1: CIMOM is not running.
Solution: Enter the following command at the HP-UX prompt:
# ps -eaf | grep cimserver
If the name cimserver is displayed in the output, the CIMOM is running properly.
Table 6-7 Troubleshooting Indication Providers (continued)
Cause 3: The provider is not registered under the module.
Table 6-7 Troubleshooting Indication Providers (continued)
Cause 4: Subscriptions do not exist.
Table 6-7 Troubleshooting Indication Providers (continued)
Solution: Create the following enumerateInstances_sub.xml file and save it in any location:
CIM_ComputerSystem hpdst348
Cause 5: The indication providers are not loaded properly.
Table 6-7 Troubleshooting Indication Providers (continued)
Cause 6: SFMProviderModule is not running.
Solution: To check if SFMProviderModule is running, enter the following command at the HP-UX prompt:
Table 6-8 Troubleshooting Indication Providers
Problem: Subscription associated with a filter and a handler, created by CIMUtil, does not appear in the output.
Cause 1
# CIMUtil -c -f MyFilter2 "Select * from HP_AlertIndication"
# CIMUtil -c -h MyHandler 'localhost/CIMListener/EMArchiveConsumer'
# CIMUtil -c -s MyFilter2 MyHandler
# cimsub -ls | grep MyFilter2
root/PG_InterOp root/PG_InterOp:MyFilter2 root/PG_InterOp:CIM_IndicationHandlerCIMXML.
Table 6-9 Troubleshooting EVWEB (continued)
Problem: WBEM Indications are not logged in to the Event Archive.
Cause: SFMProviderModule is not running.
Solution: To check if SFMProviderModule is running, enter the following command at the HP-UX prompt:
Table 6-9 Troubleshooting EVWEB (continued)
Problem: WBEM Indications are not mailed to your email address.
Cause: SFMProviderModule is not running.
Solution: To check if SFMProviderModule is running, enter the following command at the HP-UX prompt:
Table 6-9 Troubleshooting EVWEB (continued)
Problem: Why is the message "su: + tty?? root-sfmdb" logged once in syslog.log?
Cause: When the startup script, /sbin/rc2.d/S550psbdb (IA) or /sbin/rc2.d/S550sfmdb (PA), runs at system start-up, the su command is issued in the script to launch a postmaster process as the sfmdb user. The "+" in the message indicates that the su attempt succeeded (a "-" in place of the "+" indicates that the attempt failed).
Solution: You can safely ignore this message.
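The meaning of the "+" and "-" markers can be illustrated with a short sketch. The helper name su_attempt_result is hypothetical, and the function looks only for the "su: + " and "su: - " markers described above; it makes no assumption about the rest of the syslog line.

```shell
#!/bin/sh
# Hypothetical helper: classify a syslog line written by the su command.
# "su: + ..." means the su attempt succeeded; "su: - ..." means it failed.
su_attempt_result() {
    case "$1" in
        *"su: + "*) echo "success" ;;
        *"su: - "*) echo "failure" ;;
        *)          echo "unknown" ;;
    esac
}

# Example call using the message quoted in the table:
su_attempt_result 'su: + tty?? root-sfmdb'
```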
Table 6-9 Troubleshooting EVWEB (continued)
Problem: Both EMS and SFM log the same symptom in the syslog.
Cause: The syslog functionality is available from SFM Version C.06.00.07.01, September 2009 release, to provide a summary of event information of critical and serious events. The default subscription to syslog HP_defaultSyslog is configured.
A Appendix A
Following is a sample EMT Message file:
$ Descriptor Header begins
$ <> DescriptorID=0000023100000010800000AA006D2EA3
$ <> ProductName = myProduct
$ <> ProductID = ID
$ <> ProductEmailAlias=myproduct@abc.com
$ <> OrgName = myorg
$ <> OrgType = ISV
$ <> Subsystem={(Type=EMS, Name= dm_chassis),(Type=WBEM, Name=FileSystemProvider)}
$ <> ProductCategory=Kernel
$ <> MsgCat={ID=1,(Path=./ lvmcommonmessages.cat, Locale= ja_JP.
Table A-1 EMT Message File Description (continued)
Tag: ProductCategory
Description: Specify one or more of the following product categories that best describe your product:
• Hardware
• Network
• IO
• Kernel
• Commands
• Others
Usage: ProductCategory=Kernel,IO,Network
Tag: MsgCat
Description: Specify a list of message catalogs. The MsgCat tag has the following attribute: ID – A unique number used to identify an error message.
Usage: MsgCat={(ID=1, Path=../../../bin/cat/en_US.iso88591/module1.cat,LocaleName= en_US.
Table A-1 EMT Message File Description (continued)
Tag: Cause_Action
Description: Used to associate a cause with an action for a specific error message. The Cause_Action tag is mandatory if there is more than one cause and at least one corrective action for a given error message.
Usage: Cause_Action={(CauseID=1, (CauseID=2, ActionID=3), ActionID={1,2}), (CauseID=3, ActionID={2,3})}
B Interpretation of HP SMH Instances This appendix describes the fields and enables you to interpret the instances in the HP SMH property pages.
Table B-1 Description of the Processors Fields and Values (continued) Fields and Values Description Tag Indicates the physical position of the processor. Location Indicates the location of the processor. Attributes such as Cabinet Number, Cell Slot, and Slot Number help narrow down the location of the processor. Processor Type Indicates the type of the processor. Architecture Revision Indicates the processor architecture revision. Firmware ID Indicates the ID of the processor firmware.
Tables B-2 and B-3 describe the fields and enable you to interpret the values displayed in Figure B-2.
Table B-2 Description of the Memory Slots Fields and Values
Fields and Values Description Status Indicates the status of the memory module. An OK status indicates that all the modules are configured properly. If the status of the memory module indicates an error, click Events to see the details of the errors. Location Indicates the location of the memory.
System Summary Instances
This section describes the system summary instances.
Figure B-3 Sample System Summary property page
Tables B-4 and B-5 describe the fields and enable you to interpret the values displayed in Figure B-3.
Table B-4 Description of the General Information Fields and Values
Fields and Values Description Model Describes the system model. Role Specifies the administrator-defined roles this system is assigned in the managed environment.
Table B-4 Description of the General Information Fields and Values (continued)
Fields and Values Description UUID Universally Unique ID (UUID) indicates the asset number of the system. UUID (Logical) Indicates the UUID of the logical server. A logical server is a software configuration that can be applied to a server blade or a virtual machine. Also, you can move a logical server from one server blade or a virtual machine to another.
Cooling Device Instances
This section describes the cooling device instances.
Figure B-4 Sample Cooling device property page
Table B-7 describes the fields and enables you to interpret the values displayed in Figure B-4.
Table B-7 Description of the Cooling Device Fields and Values
Fields and Values Description Status Indicates the status of the fans. An OK status indicates that all the modules are configured properly.
Power Supply Instances
This section describes the power supply instances.
Figure B-5 Sample Power property page
Table B-8 describes the fields and enables you to interpret the values displayed in Figure B-5.
Table B-8 Description of the Power Supply Fields and Values
Fields and Values Description Status Indicates the status of the power supply. An OK status indicates that the power supplies are configured properly. To view the details of the error and the recommended action, click Events.
Temperature Instances
This section describes the temperature instances.
Figure B-6 Sample Temperature property page
Table B-9 describes the fields and enables you to interpret the values displayed in Figure B-6.
Table B-9 Description of the Temperature Fields and Values
Fields and Values Description Status Indicates whether the sensor temperature in the system is normal or not. However, the status of the sensor temperature does not reflect the status of the cooling devices.
Voltage Instances
This section describes the voltage instances.
Figure B-7 Sample Voltage property page
Table B-10 describes the fields and enables you to interpret the values displayed in Figure B-7.
Table B-10 Description of the Voltage Fields and Values
Fields and Values Description Status Indicates whether the sensor voltage in the system is normal or not. An OK status indicates that the sensor voltage in the system is normal. HashID Identifies an instance of the device.
FRU Information Instances
This section describes the FRU Information instances.
Figure B-8 Sample FRU Information property page
Table B-11 describes the fields and enables you to interpret the values displayed in Figure B-8.
Table B-11 Description of the MP Fields and Values
Fields and Values Description Name Indicates the FRU Name of the Physical Element. Serial Number Indicates the serial number of the FRU. HashID Identifies an instance of the device.
Management Processor Instances
This section describes the Management Processor (MP) instances.
Figure B-9 Sample MP property page
Table B-12 describes the fields and enables you to interpret the values displayed in Figure B-9.
Table B-12 Description of the MP Fields and Values
Fields and Values Description Status Indicates whether the Management Processor (MP) is functioning properly or not. An OK status indicates that the MP is functioning properly.
Firmware Information Instances
This section describes the Firmware Information instances.
Figure B-10 Sample Firmware Information property page
Table B-13 describes the fields and enables you to interpret the values displayed in Figure B-10.
Table B-13 Description of the Firmware Information Fields and Values
Fields and Values Description Name Indicates the name of the entity, such as the system firmware, MP, or the system backplane cell, whose firmware information is displayed.
Enclosure Information Instances
This section describes the Enclosure instances.
Figure B-11 Sample Enclosure property page
Table B-14 describes the fields and enables you to interpret the values displayed in Figure B-11.
Table B-14 Description of the Enclosure Information Fields and Values
Fields and Values Description Status Indicates the status of the enclosure. An OK status indicates that the components of the enclosure are functioning properly.
Complex-wide Info Instances
This section describes the Complex-wide Info instances.
Figure B-12 Sample Complex-wide Info property page
Tables B-15, B-16, and B-17 describe the fields and enable you to interpret the values displayed in Figure B-12.
Table B-15 Description of the Complex Information Fields and Values Fields and Values Description Complex Name Describes user defined name for the complex. Model Defines Model identification string. Serial Number Indicates the serial number of the complex as assigned by the original manufacturer. Revision Displays string for the revision number of the profile, consisting of the major and minor revision numbers concatenated with a period as a separator.
Cell Board Instances
This section describes the Cell Board instances.
Figure B-13 Sample Cell Board property page
Table B-18 describes the fields and enables you to interpret the values displayed in Figure B-13.
Table B-18 Description of the Cabinet Fields and Values
Fields and Values Description Firmware Version Displays string for the firmware revision number, consisting of the major number separated from the minor number by a period. Status Indicates the status of the component.
Table B-18 Description of the Cabinet Fields and Values (continued)
Fields and Values Description Total Empty Processor Slots Indicates the number of all empty processor slots. Processors Per Module Indicates the number of processors per processor module on the cell. Total Installed Processor Modules Indicates the number of all installed processor modules in the cell. Total Configured Processor Modules Indicates the number of all configured processor modules in the cell.
Partition Information Instances
This section describes the Partition Information instances.
Figure B-14 Sample Partition Information property page
Table B-19 describes the fields and enables you to interpret the values displayed in Figure B-14.
Table B-19 Description of the Partition Fields and Values
Fields and Values Description Partition Name Describes user defined name with the numeric label for the Partition. nPartition ID Indicates the ID of the nPartition in the complex.
Table B-19 Description of the Partition Fields and Values (continued) Fields and Values Description Total Deconfigured Processor Modules Indicates the number of all deconfigured processor modules in the partition. Total Installed Memory Displays the total amount of memory installed in the partition, in megabytes. Total Installed Cells Indicates the number of all cells installed in the partition. Total Active Cells Indicates the number of all active cells in the partition.
Glossary A-B Admin-Defined event subscription Subscriptions created by the administrator using the CLI. These subscriptions cannot be deleted. Admin-Defined filters Filters that can be created, deleted, and modified to set the criteria for indications that must be logged. C Central Management Server (CMS) The server monitoring the client systems in the network using SFM. CIM client An entity in WBEM architecture which sends CIM Operation requests and receives CIM Operation responses.
Event Subscription Administration A component of EVWEB used to subscribe to indications. Event Viewer A component of EVWEB used to view indications present in the Event Archive. EVWEB User component that enables administering and viewing WBEM indications generated on the system on which SFM is installed. External subscriptions These are subscriptions created by tools other than EVWEB.
P-R Provider Services Base Software to support WBEM providers delivered in SysFaultMgmt and Providerdefault bundle. ProviderSvcsBase The name of the bundle that includes PSB software. S sfmdb The output of a command that indicates that Event Archive Database service is running properly. subscription Configuring SFM for consumers to receive indications. For example, HP SIM could subscribe to indications generating on hardware devices on a system.
Index A administrator, 53, 75 Autoselect dependency, 20 B benefits SFM, 9 C Central Management Server (see CMS) CER, 74 CIMOM, 17 cimserver, 84, 85, 87, 88 -s option, 84, 87 CMS, 16 command-line interface, 20 Common Information Model Object Manager (see CIMOM) configuration monitor mode, 32 SFM, 21 cooling devices on a system, 49 creation subscription, 35 cron, 16 custom solution adding, 78 deleting, 79 modifying, 78 D delete event subscription, 58 WBEM indication, 66 E Email Consumer configuring, 59 EM
J jobid, 24 L Log Viewer, 67 Archive Log Database, 67 Current Log Database, 67 Logfile, 24 logs /var/opt/sfm/log/sfm.log file, 71, 80 /var/sam/log/samlog.log file, 72 M modify event subscription, 57 module troubleshooting, 93 N non-administrator, 53, 75 O /var/opt/sfm/conf/evweb.
7 Support and other resources About This Document This document describes how to install, administer, and troubleshoot the System Fault Management (SFM) software and its components. Document updates may be issued between editions to correct errors or to document product changes. To ensure that you receive the updated or new editions, subscribe to the appropriate product support service. Contact your local HP sales representative for more information. This document can also be found online at: http://docs.
Chapter 5 Administering Indications and Instances Using HP SMH Describes how to use the HP System Management Homepage (HP SMH) GUI to administer indications and view instances on the local system. Chapter 6 Troubleshooting SFM Describes how to troubleshoot SFM providers and EVWEB. Appendix A Appendix A Describes the EMT message file. Appendix B Appendix B Describes the Descriptor file. Appendix C Appendix C Interpretation of HP SMH Instances.
Related Information
Additional information about SFM is available at: http://docs.hp.com/en/diag
The following documents provide additional information about SFM:
• SFM Release Notes
• SFM Frequently Asked Questions
• SFM Provider Data Sheets
• SFM Tables of Versions
• SFM Patch Descriptions
• SFM Event Descriptions
HP Welcomes Your Comments
HP welcomes your comments concerning this document. We are committed to providing documentation that meets your needs. Send your comments or suggestions to: diag_lp@presskit.rsn.hp.