Reference
SHELF MANAGEMENT SOFTWARE
SOFTWARE VERSION 4.2
Revision history

Version  Date            Description
-0000    December 2006   First edition.
-0001    September 2007  Second edition. New features and editorial corrections. See the Release Notes for new feature descriptions.
-0002    May 2008        Third edition. SNMP access to HPI services. New features related to ATCA software version 3.2.0 and editorial corrections.
-0003    September 2008
-0004    January 2009
-0005    July 2009
-0006    August 2010
-0007    December 2011
-0008    April 2012
Table of Contents
Preface 6
  About this manual 6
  What’s new in this manual 6
  Shelves and modules supported
Chapter 4: Changing Shelf Settings Using HPI 53
  Radisys HPI implementation details 53
  Using the platform-management CLI 56
  Writing HPI applications
Appendix A: Shelf Manager Initialization 137
  Shelf Manager and HPI interface initialization 137
  HPI subagent initialization 139
  Initialization in basic mode
Preface

About this manual
This manual describes the Radisys Shelf Management Software (the “Shelf Manager”), the Hardware Platform Interface (HPI) server framework, the Intelligent Platform Management Interface (IPMI) infrastructure, and the related software interfaces for AdvancedTCA® (ATCA®) shelves. The Shelf Manager is a software component that resides on a supported front module.
Shelves and modules supported
The Shelf Manager described in this manual is available on these Radisys shelves with a supported front module installed.
Figure 1.
Where to get more product information
Visit the Radisys web site at www.radisys.com for product information and other resources. Downloads (manuals, release notes, software, etc.) are available at www.radisys.com/downloads.

Related Radisys manuals
The following software and hardware information is available for Radisys modules:
• Front module installation and initial setup instructions.
IPMI Intelligent Platform Management Interface Specification Second Generation, v2.0, Document Revision 1.0, Intel, Hewlett-Packard, NEC, Dell, February 12, 2004, 2/15/06 Markup.
PICMG 3.0 Revision 3.0 AdvancedTCA Base Specification, March 24, 2008, PICMG. (Throughout this manual, this specification is referred to as the AdvancedTCA Base Specification.)
PICMG 3.1 Revision 1.0 Specification, Ethernet/Fibre Channel for AdvancedTCA Systems, January 22, 2003, PICMG.
PICMG AMC.0 R1.
Chapter 1
Shelf Management Overview

The Shelf Manager is responsible for monitoring conditions of modules and other shelf components and controlling their operation in order to keep them working properly. The Shelf Manager works together with the Intelligent Platform Management Interface (IPMI) infrastructure to monitor the status of the system and correct problems when necessary. The Shelf Manager reports events and anomalies to a system manager and responds to action requests from the system manager.
Event and error management
The Shelf Manager manages normal events and error conditions in these ways:
• Powers devices up and down as they are inserted and removed
• Confirms module compatibility using electronic keying before enabling backplane ports for an inserted module
• Manages power by negotiating the power to allocate for each FRU
• Manages temperatures, adjusting fan speeds as necessary
• Recovers from failures by resetting or power cycling a FRU if warranted
• Respon
Shelf Manager compliance
As of this manual’s publication date, the Radisys Shelf Manager is fully compliant with the PICMG 3.0 Revision 3.0 AdvancedTCA Base Specification. The Radisys hardware platform interface (HPI) is fully compliant with the mandatory requirements of the SAF HPI-to-ATCA Specification, SAI-HPI-B.03.02. The HPI is also fully compliant with the SAF-ATCA Mapping Specification, SAIM-HPI-B.01.01-ATCA.
Figure 3.
Major elements of the shelf management infrastructure
The following sections describe the major elements of the shelf management infrastructure.

Shelf Manager
The Shelf Manager watches over managed devices, reporting anomalous conditions to the system manager and taking whatever corrective actions it can to prevent system failure. The Shelf Manager also manages hot-swap events, detecting the entry, removal, and shutdown of removable devices in the shelf.
Field replaceable units (FRUs)
AdvancedTCA defines two types of FRUs that are visible to and controlled through the IPMI infrastructure:
• Intelligent FRUs physically include an IPM Controller.
• Managed FRUs are either Intelligent FRUs or represented by an Intelligent FRU and visible to the IPMI infrastructure.
Each FRU that directly attaches to IPMB-0 must be an Intelligent FRU.
Management controllers (IPMCs and MMCs)
Each intelligent FRU includes a management controller, either an Intelligent Platform Management Controller (IPMC) or a Module Management Controller (MMC). Most FRUs use IPMCs; however, AMCs and intelligent RTMs use MMCs instead. IPMCs and MMCs manage FRU resources such as sensors and controls. IPMCs are the first shelf components to be powered up, allowing them to control power to the resources they manage.
The shelf contains a non-volatile storage device that stores shelf FRU information, including details of the shelf itself and the shelf configuration. The MultiRecord area of the shelf FRU information contains data such as shelf addressing, power management, and E-Key records. A proxy FRU makes this information available to the Shelf Manager. Redundant copies of the shelf FRU information are stored on the shelf management server (ShMS) file system.
Chapter 2
Software Architecture

Shelf management architecture
Shelf management services for the Radisys platform are partitioned across two processors: the local management processor (LMP) and the Intelligent Platform Management Controller (IPMC).
• High-level functions are contained within the shelf management server (ShMS) and execute on the LMP.
• Low-level functions are contained within the shelf management controller (ShMC) and execute on the IPMC.
Shelf interface
At the shelf interface level, the ShMC firmware runs on the IPMC, performing the low-level shelf management operations as well as the IPMC responsibilities for the Shelf Manager module FRU.
HPI architecture
The primary management interface to the Radisys shelf management services is the Hardware Platform Interface (HPI). The Intelligent Platform Management Interface (IPMI) is the lower-level infrastructure and interface that the HPI relies upon to carry out tasks. The Shelf Manager supports integrated HPI client-server services that are compliant with the SAF HPI Specification. This section assumes familiarity with the HPI model section of the specification.
As shown in Figure 5, the HPI client library (HCL) is a dynamically linked library (DLL) that provides the HPI API as well as an integrated RMCP client for remote HPI client-server communication. The HCL source code is independent of the operating system and management processor, and works with any generic HPI-compliant application. The HCL Linux library is installed on the Shelf Manager module. The HCL source code is available in the Radisys software distribution.
RMCP, a request-response protocol, is a simple packet-based communication mechanism. RMCP packets include a field that indicates the class of message embedded in the packet. The messages between the HPI client and HPI server are encapsulated in RMCP packets using the OEM class. For more information, see the RMCP section of the Intelligent Platform Management Interface Specification v2.0, Document Revision 1.0.
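To make the class field concrete, the following is a minimal sketch of the 4-byte RMCP header as laid out in the IPMI v2.0 specification (version, reserved byte, sequence number, class of message). The class values (ASF 0x06, IPMI 0x07, OEM 0x08) are taken from that specification; the function names are hypothetical, and the layout of the HPI payload that Radisys carries inside the OEM class is not covered here.

```python
import struct

RMCP_VERSION = 0x06       # RMCP version 1.0
RMCP_NO_ACK_SEQ = 0xFF    # sequence number meaning "no RMCP ACK requested"
CLASS_ASF, CLASS_IPMI, CLASS_OEM = 0x06, 0x07, 0x08


def pack_rmcp_header(msg_class, seq=RMCP_NO_ACK_SEQ):
    """Build the 4-byte RMCP header: version, reserved, sequence, class."""
    return struct.pack("BBBB", RMCP_VERSION, 0x00, seq, msg_class & 0x1F)


def parse_rmcp_header(packet):
    """Return (sequence, message class, is_ack) from the first four bytes."""
    version, _reserved, seq, cls = struct.unpack("BBBB", packet[:4])
    if version != RMCP_VERSION:
        raise ValueError("not an RMCP version 1.0 packet")
    # Bit 7 of the class byte marks an ACK; bits 4:0 carry the message class.
    return seq, cls & 0x1F, bool(cls & 0x80)


header = pack_rmcp_header(CLASS_OEM)
print(header.hex())               # 0600ff08
print(parse_rmcp_header(header))  # (255, 8, False)
```

A receiver would inspect the class value first and hand OEM-class packets to the HPI server, leaving IPMI-class packets to the IPMI stack.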
For information on viewing the log messages, see Viewing the system event log on page 101. Events may be cleared from the shelf’s SEL when the Shelf Manager is restarted, depending on the configuration explained in Configuring the system event log (SEL) on page 45.

Domain event log (DEL)
HPI provides a domain event log that contains most of the events that are in the shelf’s SEL.
Basic and enhanced shelf management operation
The Shelf Manager in an ATCA-6002 shelf can operate in two modes: enhanced and basic. By default, the Shelf Manager operates in basic mode. For details on operating in enhanced mode, refer to Changing the Shelf Manager operating mode on page 39.

Note: The ATCA-45xx series and the ATCA-46xx series CPMs only operate in the enhanced mode when they are installed in the ATCA-6002 shelf.
In enhanced mode, the Shelf Manager uses the /etc/shmgr.conf file. The Shelf Manager does not need to use the shmgr_basic.conf file provided for basic mode; it can discover and manage any type of ATCA node module. The rsys-ipmitool utility is also available for performing operations at the command shell and in scripts.

Software redundancy
The following sections apply only to shelves where an SCM is acting as the Shelf Manager. They do not apply to ATCA-6002 shelves.
Figure 7.
Peer communication loss causes failover
If communication between the peer ShMCs breaks down, the first of the previous communication mechanisms (ShMC-to-ShMC over IPMB) fails. The standby Shelf Manager then attempts to ping its peer over the LAN interface to confirm its full high availability (HA) state. If the ping also fails, the standby Shelf Manager initiates a failover, assumes the active role, and sends an event notifying all event receivers of the failover.
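The decision sequence above can be summarized in a few lines. This is only a sketch of the logic as described in this section: the function name and its two boolean inputs are hypothetical stand-ins for the IPMB link check and the LAN ping, and it does not model timeouts, retries, or event generation.

```python
def standby_should_fail_over(ipmb_link_up, lan_ping_ok):
    """Return True when the standby Shelf Manager should assume the active role.

    ipmb_link_up: outcome of the ShMC-to-ShMC check over IPMB.
    lan_ping_ok:  outcome of pinging the peer over the LAN interface.
    """
    if ipmb_link_up:
        return False   # primary mechanism still works; remain standby
    if lan_ping_ok:
        return False   # peer is still reachable over the LAN; no failover
    return True        # both checks failed; take over as active


for ipmb, lan in [(True, True), (False, True), (False, False)]:
    print(ipmb, lan, standby_should_fail_over(ipmb, lan))
```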
• Using the front panel reset button to reset the active SCM either manually or using HPI with the saHpiResourceResetStateSet() function.
When the SCM is acting as the active Shelf Manager, its Active LED glows amber. On the standby Shelf Manager, this LED flashes periodically, about once every 2 seconds.
Shelf FRU information redundancy
During normal shelf operation with redundant Shelf Managers, there are always at least three persistent copies of the shelf FRU information. The shelf FRU information is initially stored in a single shelf FRU device in an ATCA-6000 shelf, and in two shelf FRU devices in an ATCA-6006, ATCA-6014, or ATCA-6016 shelf. In addition, both the active and standby Shelf Managers keep a persistent cache of the shelf FRU information.
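As a quick check of the "at least three" figure, the sketch below adds the number of shelf FRU devices per shelf model to the two Shelf Manager caches (one on the active and one on the standby). The device counts are the ones stated above; the dictionary and function names are illustrative only.

```python
# Shelf FRU devices per shelf model, as stated in this section.
SHELF_FRU_DEVICES = {
    "ATCA-6000": 1,
    "ATCA-6006": 2,
    "ATCA-6014": 2,
    "ATCA-6016": 2,
}
SHELF_MANAGER_CACHES = 2  # one persistent cache each on the active and standby


def persistent_copies(shelf_model):
    """Count persistent copies of the shelf FRU information for a shelf model."""
    return SHELF_FRU_DEVICES[shelf_model] + SHELF_MANAGER_CACHES


for model in SHELF_FRU_DEVICES:
    print(model, persistent_copies(model))
```

The ATCA-6000 yields three copies and the other listed shelves yield four, which matches the stated minimum.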
Shelf FRU invalid alarm
This alarm signals a catastrophic fault condition. The active Shelf Manager keeps all other modules in the M2 (pre-activation) state and waits for the operator to intervene. On the ATCA-6000 shelf, the “Shelf FRU Invalid” text is visible on the shelf display panel. On any Radisys shelf, both SCMs are placed into debug mode.
HPI server redundancy
The Radisys HPI server uses a redundancy model very similar to the active-standby model used by the Shelf Manager. Because the HPI server is combined with the shelf management server (ShMS) daemon, it shares the same redundancy status as the Shelf Manager. The active Shelf Manager hosts the active HPI server while the standby Shelf Manager hosts the standby HPI server.
In addition to the HPI data, automatic operations are synchronized with the peer HPI servers to keep HPI databases in sync. These operations are called by the following functions and are only synchronized with the standby HPI server when the HPI application makes the request to the active HPI server.
• saHpiDiscover() – This function overcomes potential latencies in the hot-swap mechanism and keeps the resource presence table (RPT) updated.
Shelf cooling
To ensure that all modules in the shelf operate efficiently, their operational temperatures must be maintained at optimal levels. All modules affected by on-board temperature fluctuations have sensors that generate events in case their operational temperatures cross pre-set threshold levels. Temperature sensors generate events at three different threshold levels: Minor, Major, and Critical.
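The three-level scheme can be pictured as a simple classification of a reading against ascending upper thresholds. In the sketch below the threshold values are invented for illustration (real values are defined per sensor), and the function name is hypothetical.

```python
# Hypothetical upper thresholds in degrees Celsius, most severe first.
# Real thresholds are defined per sensor and are not given in this manual.
THRESHOLDS = [("Critical", 85.0), ("Major", 75.0), ("Minor", 65.0)]


def crossed_level(reading_c, thresholds=THRESHOLDS):
    """Return the most severe threshold level the reading has crossed, or None."""
    for name, limit in thresholds:
        if reading_c >= limit:
            return name
    return None


for reading in (60.0, 70.0, 80.0, 90.0):
    print(reading, crossed_level(reading))
```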
If both SCMs are removed from the ATCA-6014 or ATCA-6016 shelf, the active RCM adjusts the fans to full speed until the Shelf Manager resumes managing the shelf. When a fan module is deactivated (by toggling its hot-swap switch or performing the software hot-swap procedure through the CLI), its motor is turned off. A fan whose switch is in the wrong position at shelf power-up is powered off as soon as the Shelf Manager initializes.
Shelf Manager cooling override
The Shelf Manager provides autonomous cooling management for the shelf. However, in some situations a system manager might need to override the Shelf Manager's cooling management and control fan speeds itself. HPI applications can change fan speeds incrementally using standard HPI controls in the fan resources. For more details on the HPI controls, see the cooling and fan control section of the SAF Mapping Specification.
Chapter 3
Configuring Shelf Manager and HPI Behavior

This chapter explains how to:
• Disable or enable the Radisys Shelf Manager.
• Change the operating mode of the Shelf Manager in the ATCA-6002 shelf.
• Configure Shelf Manager and HPI behavior by editing the /etc/shmgr.conf file.
• Configure Shelf Manager behavior in basic mode in the ATCA-6002 shelf.
• Configure blade HPI behavior.
Disabling the Shelf Manager
Important: When there is only one SCM or Shelf Manager module in the shelf, disabling its Shelf Manager causes all shelf management functionality to be lost. Additionally, if you power cycle the shelf or the module with the disabled Shelf Manager, all modules in the shelf (including the SCM or Shelf Manager module) will be unable to power up their payloads.
Enabling the Shelf Manager
Important: If you are removing a module from a non-Radisys or Radisys MPCH0001 shelf and reinstalling it in an ATCA-6xxx shelf, you must first enable the module’s Shelf Manager before removing it. Otherwise, you will need to install the module in a shelf (Radisys or non-Radisys) that already has an active Shelf Manager running.
1. Establish a serial connection to the module and log on as root. See the Installation Guide for details.
Changing the Shelf Manager operating mode
The Shelf Manager in an ATCA-6002 shelf can operate in two modes: enhanced and basic. For a description of these operating modes, refer to Basic and enhanced shelf management operation on page 24. By default, the Shelf Manager runs in basic mode for most of the modules. The exceptions are the ATCA-45xx series and the ATCA-46xx series CPMs, which only operate in enhanced mode. To change the operating mode:
1.
Overview of Shelf Manager and HPI configuration
Shelf Manager and HPI configuration settings can be either specific to the module running the Shelf Manager or specific to the shelf.
• Shelf Manager-specific settings affect how an individual module running the Shelf Manager performs its processing. These settings are stored on the module in the /etc/shmgr.conf file. All configuration settings described in this chapter are specific to a Shelf Manager module.
Table 4. Shelf Manager and HPI configuration settings
Shelf Manager configuration settings:
Configuring verbosity and log settings
Disabling the HPI service
Configuring RMCP
Parameter to edit in /etc/shmgr.
Editing the shmgr.conf configuration file
File-based configuration settings are stored in the /etc/shmgr.conf file for each Shelf Manager module and are saved permanently. Any changes you make take effect when the Shelf Manager or the module running the Shelf Manager is restarted. On an ATCA-6002 shelf, the /etc/shmgr.conf file is supported only when the Shelf Manager is operating in enhanced mode.
Configuring Shelf Manager behavior
This section describes the shelf management configuration settings you can change by editing shelf management server parameters (parameters with the prefix SHMS) in the /etc/shmgr.conf file.

Configuring verbosity and log settings
The SHMS_CMDLINE_ARGS parameter in /etc/shmgr.conf sets verbosity and log settings.
Disabling the HPI service
The SHMS_CMDLINE_ARGS parameter in /etc/shmgr.conf lets you disable HPI service. SHMS_CMDLINE_ARGS sets the ShMS-related command-line parameters passed to the Shelf Manager program when it starts.
--nohpi  Disables the HPI service, which is not recommended except in limited debugging situations. Disabling HPI causes the platform-management CLI to be unavailable and prevents the use of HPI applications.
Configuring the system event log (SEL)
The SHMS_SYNC_SEL_TIME parameter in /etc/shmgr.conf sets the system event log (SEL) time on all IPMCs to the time on the active Shelf Manager.
Y  Default. Synchronizes the SEL time with the active Shelf Manager time.
N  Disables this feature.
The SHMS_SEL_AUTOCLEAR parameter in /etc/shmgr.conf controls how the SEL is cleared. The SEL drops new events if the log is full.
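When auditing a module's settings from a script, it can help to read these parameters programmatically. The sketch below assumes that /etc/shmgr.conf uses shell-style NAME=VALUE lines with optional double quotes and # comments; that syntax is an assumption, not something this manual specifies, and the function names are hypothetical. It parses an in-memory sample rather than the real file.

```python
SAMPLE_CONF = """\
# sample fragment; the real file may differ in layout
SHMS_SYNC_SEL_TIME=Y
SHMS_CMDLINE_ARGS="--nohpi"
"""


def parse_conf(text):
    """Parse NAME=VALUE lines, skipping blanks and # comments (assumed syntax)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"')
    return settings


def flag_enabled(settings, name, default="Y"):
    """Interpret a Y/N parameter, falling back to the given default."""
    return settings.get(name, default).upper() == "Y"


conf = parse_conf(SAMPLE_CONF)
print(flag_enabled(conf, "SHMS_SYNC_SEL_TIME"))        # True
print("--nohpi" in conf["SHMS_CMDLINE_ARGS"].split())  # True
```

Remember that edits to the file take effect only after the Shelf Manager or its module is restarted.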
Configuring HPI behavior
This section describes the HPI configuration settings you can change by editing HPI server daemon parameters (parameters with the prefix HSD) in the /etc/shmgr.conf file.

Setting the HPI redundancy mode
The HSD_SHM_NON_REDUNDANT parameter in /etc/shmgr.conf lets you specify whether HPI operates non-redundantly (without a second SCM installed).
Y  Enables non-redundant HPI operation.
• HSD_CHASSIS_FRU_MONITOR_ENABLE enables the monitoring of shelf FRUs, including the alarm module and fans. If monitoring is enabled, the shelf FRUs are scanned for their presence or removal from the shelf and appropriate actions are taken.
  Y  Default. Enables monitoring of the shelf FRUs.
  N  Disables shelf FRU monitoring.
• HSD_FAN_RPM_ALARMS enables fan RPM sensor events to be logged as alarms.
  Y N Default.
Assigning an entity type for a new FRU
The HSD_FRU_ENTITY_TYPE parameter in /etc/shmgr.conf lets you assign an entity type to a new FRU. To assign an entity type, use the following syntax:
HSD_FRU_ENTITY_TYPE=::
You must add a new entry in the configuration file for each FRU whose entity type you want to configure.
Loading saved HPI resource configuration files
The HSD_LOAD_SAVED_CONFIGURATION parameter in /etc/shmgr.conf enables the loading of previously saved HPI resource configuration files during Shelf Manager initialization.
Y  Enables the loading of saved HPI resource configuration files. This may result in increased startup time.
N  Default. Custom configuration files are not loaded at startup.
Configuring the Shelf Manager in basic mode
For a Shelf Manager running on the ATCA-6002 shelf, a configuration file for basic mode is located at /etc/shmgr_basic.conf. When basic mode is enabled, the shelf management init script loads this file, which specifies the supported platform configurations. This file usually needs to be modified only when you install a replacement module. The following options are available in shmgr_basic.conf.
Editing the rsyshsd.conf configuration file
For each front module there are blade HPI configuration settings that are stored in the /etc/rsyshsd.conf file and saved permanently. The default configuration settings work for most of the front modules without any modifications. For CPMs, however, the RSYSHSD_BASE_INTERFACE parameter must be edited to specify the name of the bonded interface. For details, refer to Configuring blade HSD interfaces on page 52.
Configuring log settings
• The RSYSHSD_LOG_FILE parameter specifies the name of the log file. If you do not specify a value, the software chooses a suitable name.
• The RSYSHSD_CMDLINE_ARGS parameter sets verbosity and log settings. You can use the following values:
  --verbosity [level]
  --maxlog [log size]
  The default is 5.
Chapter 4
Changing Shelf Settings Using HPI

This chapter describes the Radisys HPI implementation, including how HPI affects your applications, how to write HPI applications, and how to change various shelf settings. You can change the shelf settings using any of these methods:
• Platform-management CLI commands
• Standard PICMG commands
• Standard HPI controls
• SNMP requests to operate on HPI-B0101-MIB module objects
• hpiapp menu options

Note: The settings described in this chapter are shelf-specific.
Table 5. Radisys-specific resources
Resource type1 Shelf Container resource Root SPM slot Shelf Virtual SPM slot Shelf SPM COM-E Slot SPM slot SCM FRU resource COM-E slot COM-E Module RCM slot RCM Notes: Shelf RCM slot Description Entity (type, location)2 Lifetime3 This logical resource represents the shelf. There is exactly one created by each HPI server. This resource is implemented as described in the mapping specification.
The following is the ResourceId format used by shelf and Shelf Manager resources in the HPI resource presence table (RPT):
0x01  Shelf resource. This represents the entire shelf and provides access to some shelf-related configuration such as the shelf address, Shelf Manager IP address, and so forth. This resource contains an inventory data record that also provides access to the shelf FRU information.
0x02  Shelf Manager resource.
Using the platform-management CLI
You can access several command modes, including platform-management mode, from the Shelf Manager module’s master (top-level) CLI. The platform-management mode provides a CLI you can use to configure the Shelf Manager, control the Shelf Manager service, view and acknowledge all alarms in the shelf, and view information about the FRUs in the shelf. For information about specific CLI commands, see the CLI Reference.
Writing HPI applications
The following are required to create an HPI application:
SaHpi.h  The official HPI B.03.02 header file, which is required to create any HPI B.03.02 user application. The header file is also backward compatible with HPI B.01.01 and HPI B.02.01. You can download this from the Service Availability Forum (SAF) Web site (http://www.saforum.org), and it is also included as part of the Radisys HPI client library and application package.
SaHpiAtca.
To run any HPI application that links with the Radisys HPI client library, libhcl.so, consider the following:
• The SaHpi.h header file and the HCL support HPI B.03.02 and are backward compatible with HPI B.02.01 and HPI B.01.01. Radisys recommends recompiling your HPI applications with the new header file.
• Ensure the run-time environment is properly set up so that applications can find the HPI client library.
Using OpenHPI tools
OpenHPI is an open source project that provides an implementation of SAF’s HPI. The OpenHPI community provides several HPI applications, plug-ins, and utilities that can be used with the Radisys HPI implementation. The Radisys HPI client library is fully compatible with all OpenHPI tools. You can download the OpenHPI software suite from the OpenHPI Web site: http://www.openhpi.
Using SNMP
SNMP access to the shelf management functionality is provided through the HPI-B0101-MIB module, which works seamlessly with the B.02.01 HPI. The HPI subagent on each Shelf Manager module facilitates SNMP management of the ATCA shelf by populating the objects and generating the notifications defined in the HPI-B0101-MIB module. On each Shelf Manager module the HPI subagent communicates with the HPI server using the HPI client library functions.
Each of these tables is indexed by saHpiDomainId, saHpiResourceId, saHpiResourceIsHistorical, and the saHpiCtrl*EntryId object.
The control number is used as the value for each instance of the saHpiCtrl*EntryId object. Thus, the value for the saHpiCtrl*EntryId variable in each table row is the same as the value for the saHpiCtrl*Num variable. For a list of the tables and notifications that the HPI subagent supports, see the Software Guide.
Configuring HPI subagent logging
HPI subagent logging is disabled by default. To enable logging:
1. In the /etc/rc.d/init.d/hpiSubagent startup script, set the desired type of logging:
• To set logging to /var/log/messages via syslog, change the OPTIONS setting from -Ln to -Lsd.
• Alternatively, to set logging to /var/log/hpiSubagent.log, change the OPTIONS setting from -Ln to -Lf /var/log/hpiSubagent.log.
2.
Using the example HPI application (hpiapp)
Radisys provides an example HPI application, hpiapp, that is compliant with the SAF HPI Specification. You can use it to perform tasks and as an example when writing your own HPI applications.
Important:
• Do not call the hpiapp example application programmatically. It is an example only and is not intended for use in high availability applications.
Running hpiapp
The following examples show how to run the hpiapp example application to connect to an HPI server running Shelf Manager at the IP address of 10.100.22.248. From the Linux prompt:
• Use the -h (host identifier) option:
  hpiapp -h 10.100.22.248
or
• Set the default domain identifier to 10.100.22.248:
  export SAHPI_UNSPECIFIED_DOMAIN_ID=10.100.22.
Changing the shelf address
The first 4 bytes of the shelf address should be changed manually to uniquely identify the shelf according to its location.

Note: When changing the shelf address, do not deviate from the address format defined in Radisys shelf address format (below). Deviations from this format can raise the fruinfo fail alarm on the Shelf Manager module, which causes all modules to boot up in debug mode.
Changing the Shelf Manager IP address
The active shelf management server has a dedicated IP address that is derived from the Shelf IP Connection Record in the shelf FRU device. The IP Connection Record stores:
• ShMS IP address (default is 192.168.16.17, or C0 A8 10 11 hexadecimal)
• Gateway address (default is 0.0.0.0)
• Subnet mask (default is 255.255.255.
You can either change one line at a time or change all three lines together using the saHpiControlSet() function. Changing one line at a time lets you update just one of the addresses (IP address, gateway, or subnet mask) while leaving the other two untouched. For details on the Shelf Manager IP address control, refer to the SAF Mapping Specification.
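The hexadecimal form quoted for the default ShMS address (C0 A8 10 11) is simply the four octets of the dotted-decimal address written as bytes. The sketch below shows the conversion in both directions using Python's standard ipaddress module; the helper names are hypothetical, and the exact byte layout of the HPI control lines should be taken from the SAF Mapping Specification rather than from this example.

```python
import ipaddress


def ip_to_hex_bytes(address):
    """Render a dotted-decimal IPv4 address as space-separated hex bytes."""
    packed = ipaddress.IPv4Address(address).packed
    return " ".join(f"{octet:02X}" for octet in packed)


def hex_bytes_to_ip(hex_bytes):
    """Convert space-separated hex bytes back to dotted-decimal notation."""
    return str(ipaddress.IPv4Address(bytes.fromhex(hex_bytes.replace(" ", ""))))


print(ip_to_hex_bytes("192.168.16.17"))  # C0 A8 10 11
print(hex_bytes_to_ip("C0 A8 10 11"))    # 192.168.16.17
```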
Changing the FRU power-on sequence using the CLI
You can use the CLI to:
• View the current power-on sequence
• Change the power-on sequence
• Commit the current power-on sequence (make it permanent)
• View the committed status sensors
FRU power-on sequence controls
The shelf resource (ResourceId 0x01) provides a set of HPI discrete controls, each of which maps to a power descriptor. The discrete control state values map to the slot ResourceIds of each of the sites associated with the power descriptors. Each intelligent FRU slot in the shelf has one power descriptor in the shelf activation and power management record in the shelf FRU information.
FRU power-on sequence commit status sensor
The shelf resource contains an additional sensor called the FRU power-on sequence commit status sensor (0x1300). This sensor indicates whether the configured power-on sequence has been committed. It is an OEM sensor that supports only event states; numeric readings and thresholds are not supported.
Querying the current hot-swap state of a FRU
You can query a FRU’s current hot-swap state:
• Using the CLI commands to get a summary of FRU information, including firmware versions and hot-swap states.
Overriding the default ShMS hot-swap mechanism
The Shelf Manager controlled activation bit in byte 4 of the power descriptor associated with a slot's hardware address determines whether a FRU in that slot is activated by the Shelf Manager or by the system manager. The power descriptors for all permanent slots associated with intelligent FRUs are present in the shelf activation and power management record of the shelf FRU information.
While in manual mode, you can also update the Delay Before Next Power On for that hardware address. This is the analog state value of the control. New values are entered in tenths of a second. To change the delay before the next power on:
1. Briefly set the control in manual mode.
2. Change the analog state value of the control.
3. Reset the control to auto mode.
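The three steps, together with the tenths-of-a-second unit, can be sketched as follows. The PowerOnDelayControl class is a hypothetical in-memory stand-in, not the HPI API; a real application would issue the equivalent control-mode and analog-state changes through the HPI client library. The rule that the analog value may be written only in manual mode is modelled here as an assumption drawn from the text above.

```python
class PowerOnDelayControl:
    """Hypothetical stand-in for a slot's power-on control (not the HPI API)."""

    def __init__(self):
        self.mode = "AUTO"
        self.analog_state = 0  # Delay Before Next Power On, in tenths of a second

    def set_analog(self, value):
        # Assumption: the delay can only be written while in manual mode.
        if self.mode != "MANUAL":
            raise RuntimeError("control must be in manual mode to change the delay")
        self.analog_state = value


def set_delay_seconds(control, seconds):
    """Apply the three-step procedure: manual mode, write delay, back to auto."""
    control.mode = "MANUAL"                       # step 1
    try:
        control.set_analog(round(seconds * 10))   # step 2: seconds -> tenths
    finally:
        control.mode = "AUTO"                     # step 3, even if step 2 fails


control = PowerOnDelayControl()
set_delay_seconds(control, 2.5)
print(control.analog_state, control.mode)  # 25 AUTO
```

Restoring auto mode in a finally clause keeps the control from being left in manual mode if the write fails.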
Overriding the cooling algorithm
System managers can completely override the Shelf Manager’s cooling algorithm and use an external cooling module to control the fans. The analog control for the fans is 0x1400. The possible fan speed values for each shelf are:
• ATCA-6000: 0 to 10
• ATCA-6002: N/A. The fans always run at full speed and are not under the control of the Shelf Manager.
Overriding the cooling algorithm with an HPI application
The following steps illustrate how to override the cooling algorithm using HPI:
1. Open an HPI session using the active Shelf Manager IP address as DomainId (see Writing HPI applications on page 57 for information).
2. Using the fan resources, set all fan controls to the SAHPI_CTRL_MODE_MANUAL mode, keeping their speeds unchanged. This places them in the system manager override mode.
Changing shelf power properties
The shelf resource contains two controls that enable system managers to update shelf power properties in the Shelf Power Distribution record. In general, system managers should never need to change these properties. However, in some cases, a new board with high power requirements can exhaust the power budget of the shelf.
4 Changing Shelf Settings Using HPI HPI parameter control The shelf resources, along with all intelligent FRU resources and slot resources, support the parameter control feature. These resources have the SAHPI_CAPABILITY_CONFIGURATION bit set in the ResourceCapability mask of their resource presence table (RPT) entry. There are three basic saHpiParmControl() actions: 0 – SAHPI_DEFAULT_PARM Loads the default settings for a resource and restores it to factory state.
Changing Shelf Settings Using HPI 4 HPI server parameter control configuration files Each resource has two configuration files: one for default settings and one for custom saved settings. The following are the file name formats: %08X.hcf.deflt – Default configuration file name format %08X.hcf.saved – Saved configuration file name format %08X represents the ResourceId in 4‐byte hexadecimal format. For example, a resource with ID 0x70082 will have configuration files 00070082.hcf.deflt and 00070082.hcf.
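The naming rule can be illustrated with a short Python sketch. The helper name `config_file_names` is hypothetical; only the `%08X.hcf.deflt` and `%08X.hcf.saved` formats come from the text above.

```python
from typing import Tuple


def config_file_names(resource_id: int) -> Tuple[str, str]:
    """Return (default, saved) configuration file names for a ResourceId."""
    stem = f"{resource_id:08X}"  # ResourceId as 8 uppercase hex digits (4 bytes)
    return f"{stem}.hcf.deflt", f"{stem}.hcf.saved"


default_name, saved_name = config_file_names(0x70082)
print(default_name)
print(saved_name)
```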
Changing Shelf Settings Using HPI 4 How configuration files are created for a new resource During HPI initialization (when resources are being created for the shelf, slots, and FRUs), initial default configuration files are automatically created if none are found in /tmp/shmgr/hpi. If no corresponding saved configuration file is found for the resource in /var/lib/shmgr, the initial default configuration file is copied as the initial saved configuration file (with the .hcf.saved suffix).
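The copy-if-missing behavior described above can be sketched in Python. This is not the Shelf Manager's code: the function name `ensure_saved_config` is hypothetical, the file content is a placeholder, and temporary directories stand in for /tmp/shmgr/hpi and /var/lib/shmgr so that the sketch can run anywhere.

```python
import shutil
import tempfile
from pathlib import Path


def ensure_saved_config(resource_id: int, default_dir: Path, saved_dir: Path) -> Path:
    """Copy the default file to the saved file only if no saved file exists yet."""
    stem = f"{resource_id:08X}"
    default_file = default_dir / f"{stem}.hcf.deflt"
    saved_file = saved_dir / f"{stem}.hcf.saved"
    if not saved_file.exists():
        shutil.copyfile(default_file, saved_file)
    return saved_file


with tempfile.TemporaryDirectory() as tmp:
    default_dir = Path(tmp) / "hpi"    # stands in for /tmp/shmgr/hpi
    saved_dir = Path(tmp) / "saved"    # stands in for /var/lib/shmgr
    default_dir.mkdir()
    saved_dir.mkdir()
    # Placeholder content; the real file format is not described here.
    (default_dir / "00070082.hcf.deflt").write_text("placeholder default settings\n")

    saved = ensure_saved_config(0x70082, default_dir, saved_dir)
    saved_name = saved.name
    saved_text = saved.read_text()

    # A second call must leave an existing saved file untouched.
    saved.write_text("custom settings\n")
    ensure_saved_config(0x70082, default_dir, saved_dir)
    text_after_second_call = saved.read_text()

print(saved_name)
```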
Changing Shelf Settings Using HPI 4 Saving a configuration file to store custom settings for a resource After a resource has been configured to a desired operational state, it is a good idea to save the configuration so it can be loaded whenever the FRU is re‐introduced in the shelf in the same slot. If you need to restore the state of the resource later, you can load the saved configuration using the parameter control action SAHPI_RESTORE_PARM.
Changing Shelf Settings Using HPI 4 Error conditions The parameter control operations SAHPI_DEFAULT_PARM or SAHPI_RESTORE_PARM will fail and return the error code SA_ERR_HPI_INVALID_DATA if a configuration file for one resource is loaded with another resource of a different kind. This check is done by matching the Manufacturer ID and Product ID of the FRU.
Chapter 5 Managing Alarms and Events This chapter describes: • The Shelf Manager’s role in managing alarms. • How you can change the alarm reporting by assigning severities to sensor events. • The Shelf Manager interactions with the alarm functions of the shelves. • How you can override the Shelf Manager’s default use of the shelf alarm indicators. For tips on responding to specific alarms, see Chapter 6, Troubleshooting, on page 98.
5 Managing Alarms and Events Domain alarm types and troubleshooting overview Table 8 lists the alarm and sensor types along with sample descriptions and causes for the types of alarm maintained in the DAT. is the sensor number (in hexadecimal) of the sensor generating the alarm. Table 8.
5 Managing Alarms and Events Threshold alarms The first eight alarm types in Table 8 are threshold alarms which indicate that a threshold sensor detected a voltage, current, temperature, or fan speed measurement that was out‐of‐ range.
5 Managing Alarms and Events Upper Major temperature events may indicate that the ambient temperature is too high, in which case you might need to lower the ambient temperature. They may also indicate that air flow into the shelf has become restricted, in which case you need to investigate the cause. Remove, clean, and reinsert the fan filter if it has become clogged. Upper Minor and Upper Major temperature event assertions cause the fan speeds to increase in increments until the condition is cleared.
5 Managing Alarms and Events Resource utilization alarms The resource utilization alarms indicate that the Shelf Manager is using a high percentage of the host module’s resources. If there is another process on the module using all of the resources, the sensor does not generate an alarm. A process monitoring service, rsys_pms, runs as a background process on the module and every 30 seconds checks the utilization percentage for the CPU, RAM, and Shelf Manager and HPI threads.
5 Managing Alarms and Events Domain alarm table entries The alarm strings are based on the associated sensor events. Text within the alarm string indicates the FRU location, sensor number, and severity of the sensor event. The reference manuals for each module contain detailed information on their associated sensors. The domain alarm table (DAT) can hold a maximum of 1024 alarm entries. The alarm entries include system alarms, which are generated automatically, and any user‐defined alarms.
5 Managing Alarms and Events Collecting information for the alarm configuration file For each type of FRU and each sensor for which you will assign severities, you must provide identifying information. Table 10 explains how to obtain the information to identify the FRU type. Table 10.
5 Managing Alarms and Events Table 11 lists the information needed to identify the sensor, the event state, and the new severity to assign. Table 11. Information needed to identify a sensor and event state Sensor information needed Sensor index Method for obtaining it Notes Retrieve a list of the FRU’s sensors from the RPT using an HPI application.
5 Managing Alarms and Events Configuration file structure and contents The /etc/shmgralarm.
5 Managing Alarms and Events Alarm configuration procedure Changes to the alarm configuration file take effect when the Shelf Manager is restarted. This procedure explains how to make the change take effect when the shelf has redundant SCMs. To change the event severities: 1. Modify the /etc/shmgralarm.conf file on the SCM hosting the active Shelf Manager. Tip: To check the role of an SCM, you can use this command from the SCM Linux prompt: shmgr role 2.
5 Managing Alarms and Events ATCA-6000 shelf display and alarm indicators The shelf display panel (SDP) on the ATCA‐6000 shelf consists of the following: • LCD display • Shelf Power LED • Alarm LEDs to indicate Minor, Major, and Critical alarms • Audible alarm with an Alarm Acknowledge button The LCD display on the SDP supports two scrolling 40‐character lines to display the eight most recent alarms in the DAT. Figure 9 on page 93 shows the alarm format used on the LCD display.
Figure 9. Format of alarm messages on the SDP
[Figure: layout of a 40-character alarm line with the fields Alarm ID, Severity, Location, and Alarm Description; column markers 1, 9, 10, 14, 18, and 40.]
Alarm Description: Text description of the alarm.
Location: Set to the location of the resource that has the alarm. The format is NNM for front modules, where NN is the slot number and M is the Managed FRU Id, which is set to 0 when the alarm refers to the front module.
5 Managing Alarms and Events ATCA-6006 shelf alarm indicators The optional shelf alarm panel on the ATCA‐6006 shelf consists of the following: • Shelf Power LED • Alarm LEDs to indicate minor, major, and critical alarms • Audible alarm • Audible alarm reset pinhole • Alarm acknowledge pinhole • Telco alarm connector The telco alarm outputs and shelf alarm indicators are controlled by the Shelf Manager through HPI, but provide the capability for the system manager to override the state of the alarm outputs
5 Managing Alarms and Events ATCA-6014 and ATCA-6016 shelf alarm indicators The ATCA‐6014 and the ATCA‐6016 shelves both have a shelf alarm panel (SAP) and a shelf alarm display (SAD). The SAP is a telco alarm connector located on the front of the ATCA‐6014 shelf and on the rear of the ATCA‐6016 shelf. The SAD consists of alarm LEDs that indicate minor, major, and critical alarms, and is located on the front of both the ATCA‐6014 and the ATCA‐6016.
5 Managing Alarms and Events The controls for the telco alarm contacts (Alarm Critical, Alarm Major, and Alarm Minor) are standard HPI controls. On the ATCA‐6000 shelf, the controls for the audible alarm and LCD panel are also standard HPI controls. The controls for the Alarm LEDs on the ATCA‐6000 shelf are OEM controls, and are defined in detail in the LEDs section of the HPI‐to‐AdvancedTCA Mapping Specification.
5 Managing Alarms and Events Parameter values for setting alarm controls The following list identifies the possible saHpiControlSet() function parameter values: • SessionId = • ResourceId = • CtrlNum = • CtrlMode = SAHPI_CTRL_MODE_MANUAL • CtrlState = For Alarm LED controls: (ATCA‐6000 only)
Chapter 6 Troubleshooting This chapter provides detailed information on the following issues: • Available diagnostic tools for FRU and Shelf Manager problems. • General troubleshooting procedures that apply to alarms, system events, and sensor and temperature values. • IPMI and IPMB topics and troubleshooting procedures. • Issues related to FRU or shelf operation, such as responding to temperature and other sensor alarms or dealing with loss of communication with a FRU.
6 Troubleshooting Using the Shelf Manager diagnostics utility The shmgrdiag utility gathers information on the current state of the Shelf Manager, the Shelf Manager module resource usage, and the key IP interfaces. To run the utility from the Shelf Manager module’s Linux prompt, enter: shmgrdiag The diagnostics display as they run and list the types of diagnostics being run as well as a summary of the results.
6 Troubleshooting General troubleshooting procedures This section explains how to acknowledge and view alarms, view system events, and check sensor and temperature values. Responding to alarms You can use platform‐management CLI commands to perform common procedures in response to alarms. For troubleshooting information on specific alarms, see Responding to specific alarms on page 111.
6 Troubleshooting Viewing the system event log You can view a module’s local SEL or the shelf’s SEL using the open‐source ipmitool utility or the rsys‐ipmitool supplied by Radisys. You have these alternatives: • For a shorter listing that contains fewer details: a. Log in to the module and access the CLI. To view the shelf’s SEL, log in to the active SCM and access the CLI. b.
6 Troubleshooting Investigating sensor values To view the sensor value for a sensor that has generated an alarm: 1. Write down the FRU slot identified by the alarm and the severity of the alarm. 2. In the platform‐management CLI, to access the front mode of an identified slot enter: Syntax: front 3. Display the event log for the FRU using the command show with the events option. Syntax: show { events all | informational | minor | major | critical } 4.
2. List the sensors for the SPM resource: q s b
3. Note the intake temperatures.
4. Repeat steps 1, 2, and 3 for the other SPM.
5. In the platform‐management CLI, access the fan slot mode of one of the fans.
6. Display the sensor using the command show with the sensors option. Syntax: show { sensors { all | sensorId } }
7. Repeat steps 5 and 6 for the other fans.
8. Compare the intake temperatures to the exhaust temperatures.
6 Troubleshooting On the ATCA-6014 and ATCA-6016 shelves: Each fan tray has an exhaust temperature sensor. Check the exhaust temperatures as follows: 1. In the platform‐management CLI, access the fan mode for one of the fans. 2. Display the exhaust sensor using the command show with the sensors option. 3. Repeat steps 1 and 2 for the other fan trays. 4. Average the exhaust temperatures to get an aggregate exhaust reading.
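The averaging in step 4 is a plain arithmetic mean of the per-tray exhaust readings. A minimal sketch in Python, with made-up readings and a hypothetical helper name:

```python
from statistics import mean


def aggregate_exhaust(readings_c):
    """Average the per-fan-tray exhaust temperatures into one aggregate reading."""
    if not readings_c:
        raise ValueError("at least one exhaust reading is required")
    return mean(readings_c)


# Made-up example values, one per fan tray, in degrees Celsius.
sample_readings = [38.0, 40.0, 42.0]
aggregate = aggregate_exhaust(sample_readings)
print(aggregate)
```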
6 Troubleshooting IPMI frame structure and message flow There are no frames in IPMB messages. Figure 10 shows the message flow through an SCM pair for IPMB messages. Figure 10.
Only the switching portion of the SCMs acts in Active‐Active mode. The ShMCs are never in a dual‐active state unless a catastrophic bus fault prevents all communication between the SCMs. The ShMCs arbitrate to become active or standby as soon as they initialize. Because they constantly probe each other's state, if they ever did become dual‐active they would quickly notice the state mismatch and re‐arbitrate.
6 Troubleshooting Each IPMC that is not the active ShMC monitors its “own side” of the bus. In other words, each IPMC is monitoring its own I2C interfaces that are connected to IPMB‐A or B. Periodically it checks for errors on the I2C interfaces using registers in the IPMC. If an error is detected, the IPMC isolates itself from the IPMB, resets itself, and tries to rejoin the bus.
You can take several steps in response to a persistent IPMB isolation issue, in which a shelf‐wide IPMB isolation alarm is always reported by the active SCM. Perform the following steps:
1. Read the IPMB sensor (sensor 0x1100) on the active SCM. If the reading is 0xF8 or 0x8F, the SCM is detecting an external fault on the bus. This means that some other IPMC on the bus is causing the IPMB issue.
6 Troubleshooting d. To return the board to its default state where it is joined on that IPMB you can again use the CLI as follows: platform‐mgmt(front 1)# configure control 0x1101 auto Default Control Details : Control Type : ANALOG Analog : 0 platform‐mgmt(front 1)# 4.
6 Troubleshooting Troubleshooting FRU and shelf operation issues This section explains how to respond to FRU and shelf operation alarms and other issues. The Shelf Manager may be able to resolve some of these issues automatically, but if a problem continues, you may need to intervene. FRU information layout The FRU information storage locations use a common layout to support platform management tasks.
6 Troubleshooting Responding to specific alarms Responding to a communication lost alarm The Shelf Manager is not getting keep‐alive responses from the target FRU, usually because the FRU was extracted without going through the normal hot‐swap extraction procedure. Try these steps: • If the FRU has been removed, the FRU can be removed from the RPT using HPI control 0x101e on the Shelf Manager resource, the Failed Resource Extract control. This also removes the alarm.
6 Troubleshooting There are no hardware presence detect lines in ATCA backplanes. The ShMS thus relies on message based pings to periodically query each intelligent FRU whose presence is known to the ShMS, as follows: 1. To the IPMC of every intelligent FRU in the shelf the ShMS sends a GetDeviceId ping every 3 seconds by default. If a ping times out with a 0xC3 IPMI response code the ShMS schedules a second ping for that IPMC after 2 seconds. 2.
6 Troubleshooting After making any shmgr.conf file update, remember to synchronize it to the standby SCM from the platform‐management CLI before rebooting the SCMs. The procedure from the active SCM is: mcli platform‐mgmt shelf‐mgmt configure sync_config Responding to a ResourceFailed alarm The event SAHPI_RESE_RESOURCE_FAILURE is received both when a FRU is extracted too quickly (that is without the hot‐swap LED powered on) and when there is an IPMC failure.
6 Troubleshooting Responding to an IPMB isolated alarm The module’s IPMC has detected a problem on one or both of its IPMB interfaces, and has isolated the IPMC from the external bus. 1. Verify the module is seated properly. 2. Use the hot‐swap procedure to extract and re‐insert the module. If the condition persists, contact Radisys Technical Support for assistance. Responding to a fruinfo fail alarm The fruinfo startup script failed to retrieve the information it needed. 1.
6 Troubleshooting 5. Disable event generation on that sensor with these commands: 8 The screen displays the final message: SensorEventsEnabled : 0x0 6. Exit hpiapp: q q Responding to a redundancy lost alarm Note: This section applies only to shelves where an SCM is acting as the Shelf Manager. It does not apply to ATCA‐6002 shelves. An alarm indicates that the Shelf Manager has not detected a second, redundant Shelf Manager.
6 Troubleshooting 5. Look for the “inet addr” value. The SCM in the lower‐numbered slot in the chassis should have an eth1 IP address of 10.0.1.1. The SCM in the higher‐numbered slot should have the address 10.0.1.2. If the addresses were changed from these values—even in the template files that set the addresses upon SCM bootup—restore them to the original settings persistently. See the Software Guide for more information on the IP address assignment and template files.
Before you address the possible cause and try to correct it, some background on the INSERTION_PENDING state may be useful. The HPI INSERTION_PENDING state internally has the following two stages:
1. In the first stage, the FRU is functionally INACTIVE. This is the ATCA M2 state (PICMG 3.0 ATCA specification), in which the FRU is waiting for the Shelf Manager or system manager to verify its presence in the shelf and 'push' it to the second stage.
6 Troubleshooting 3. If the FRU is found to be in M2 and AutoInsertion timer is not ‐1 or a large value, check the state of the 'FRU Activation Control' 0x1020 on the Slot Resource associated with the FRU. For example if the FRU is in slot 2 in a 14 slot shelf with ResourceId 0x20096 then its slot Resource will be 0x2FF96. You can determine the slot resource from a Resource's entity path. The FRU Resource's Entity Path will be a superset of its associated Slot Resource's entity path.
6 Troubleshooting c. Check the 'Desired Power' controls on all FRUs installed on that slot (parent and managed FRUs) and check if the total exceeds the maximum power capability of the slot. 6. If the FRU is in M3, the checks suggested in Step 5 above all pass, and the FRU along with its managed FRUs is below the slot's maximum power capability, it is possible that the Shelf's total power capability is exceeded.
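The slot power check in step c amounts to summing the 'Desired Power' values of the parent FRU and its managed FRUs and comparing the total with the slot's maximum. A small Python sketch, assuming all values are in the same unit (watts here) and using made-up numbers and a hypothetical helper name:

```python
def slot_power_check(desired_power_w, slot_max_w):
    """Return (exceeded, total) for the FRUs installed in one slot."""
    total = sum(desired_power_w)
    return total > slot_max_w, total


# Made-up values: parent FRU first, then its managed FRUs.
within_budget = slot_power_check([150, 20, 20], slot_max_w=200)
over_budget = slot_power_check([180, 30], slot_max_w=200)
print(within_budget)
print(over_budget)
```

The same comparison applies one level up in step 6, with the per-slot totals summed against the shelf's total power capability.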
6 Troubleshooting Responding to a fan not running 1. Check that the Shelf Manager can communicate with the fan and that it is in the active state. Use the CLI command show chassis status. 2. If the fan is not in the active state, check that the hot‐swap switch is closed. a. On the ATCA‐6000 shelf, the fan should have its hot‐swap switch in the leftmost position. b. On the ATCA‐6006 shelf, the hot‐swap latch thumbscrew must be secured. c.
6 Troubleshooting Troubleshooting Shelf Manager issues General issues When you encounter problems related to shelf management, check the following items first: Make sure the Shelf Manager module is installed properly Make sure the module running the Shelf Manager software is in the Radisys shelf and seated properly. Make sure a proxy FRU is installed properly Make sure at least one proxy FRU is in the Radisys shelf and seated properly.
Make sure the Shelf Manager software is running
Make sure the Shelf Manager software is running. The Active LED of the SCM (or one SCM of a redundant pair) should be amber and the OOS (Out of Service) LED should be off. For a module running the Shelf Manager in an ATCA‐6002 shelf, the PWR LED should be green. In this shelf, the OOS LED is not updated by the Shelf Manager. Additionally, the Linux command shmgr status should show a status of “started.
6 Troubleshooting Blade and platform management issues The topics in this section include ShMC‐to‐IPMC communications, SEL location, sensor event reporting, and IDR, HPI, and IPMI topics that address typical questions. ShMC-to-IPMC communications The ShMC and IPMC communicate with each other on the following occasions: • Hot swap. At each occurrence according to the sequence of the events. • Events and alarms. Immediately when the event is noted. • Presence detection. Every three seconds.
6 Troubleshooting Typically each IPMC has only its own events in its SEL. The ShMCs are exceptions to this rule because they log events from all IPMCs in the shelf in their SEL. As a consequence, the ShMC SEL is quite large and has nearly 19KB (about 1200 entries) of SEL space. Shelf Manager tables The Shelf Manager uses tables and files to store information and data related to events, alarms, sensors and resources, etc.
Inventory Data Record (IDR) topics
Radisys can provide a tool called fruupdate that updates existing FRU inventory data on IPMCs. Currently, however, adding new records to IPMC FRU inventory data through HPI is not allowed. The fruupdate tool accepts a new FRU data image and a config file as input. You must create the new FRU data image yourself, with your custom FRU data records. The config file specifies the identity of those records.
6 Troubleshooting Specific issues Table 15 lists specific issues that can be encountered while using the shelf management and HPI services on the Shelf Manager module. Table 15.
6 Troubleshooting Table 15. Symptom/response for Shelf Manager problems (continued) Symptom Response (continued from previous page) If the modules are still in debug mode, use these steps: • From each Shelf Manager module, verify that the shelf FRU device is accessible using this Linux command: rsys‐ipmitool ‐t fru print 1 is the module IP address (which is usually 127.0.0.1), and is the IPMB address of the proxy FRU (see IPMB addresses of slots and FRUs on page 134).
6 Troubleshooting Table 15. Symptom/response for Shelf Manager problems (continued) Symptom SCM resets abruptly forcing the LMP to reboot. There are no associated hot-swap events. Response Possible causes and steps to take are as follows: 1. The SCM LMP hung, most likely because a process running on the LMP got into an error state that consumed all CPU cycles. This prevents the Shelf Manager from getting enough CPU cycles to strobe the IPMC Watchdog. The watchdog timer expires and resets the LMP.
6 Troubleshooting Table 15. Symptom/response for Shelf Manager problems (continued) Symptom Sessions cannot be opened on the HPI server. Error code: SA_ERR_HPI_OUT_OF_SPACE Response The HPI server has reached the limit (32) for the number of sessions that can be opened simultaneously. Close a few sessions and try again, or use any of the already opened sessions for the necessary operations.
6 Troubleshooting Table 15. Symptom/response for Shelf Manager problems (continued) Symptom RPT is incomplete; not all FRUs present in the shelf have associated Resources. Response 1. Initiate a discovery operation using the CLI command rediscoverShelf, using the saHpiDiscover() function, or using the SNMP saHpiDiscover.0 variable. This will initiate a fresh discovery of the shelf and create Resources for any FRU that was not already present in the RPT. 2.
6 Troubleshooting Table 15. Symptom/response for Shelf Manager problems (continued) Symptom A user alarm cannot be created. Error code: SA_ERR_HPI_OUT_OF_SPACE Response The DAT has reached the limit (50 by default) for the number of user alarms that can be active simultaneously. Consider these options: • Delete the oldest user alarm from the DAT • Delete the most minor user alarm from the DAT Data cannot be added to the MultiRecord area.
6 Troubleshooting Restoring shelf FRU information The procedure below is useful when the shelf FRU information is bad and needs to be replaced in the shelf FRU device(s). This procedure applies to all Radisys shelves except for the ATCA‐6002 shelf. To write to the shelf FRU device: 1. Log in to any system with LAN connectivity to the SCM. 2. Locate a shelf FRU information image to copy to all needed locations and place it on the same system where you perform this procedure.
6 Troubleshooting Restoring shelf FRU information on ATCA-6002 shelves The ATCA‐6002 shelf uses a file to store shelf FRU information, because there is no physical shelf FRU device. This file is located here on the Shelf Manager module: /var/lib/shmgr/atca‐6002.bin If the shelf FRU information gets corrupted, replace the atca‐6002.bin file with the factory copy on the release CD.
IPMB addresses of slots and FRUs
Table 16 shows the IPMB addresses of slots in the ATCA‐6002 shelf. Table 17 shows the IPMB addresses of slots and FRUs in the remaining Radisys ATCA‐6xxx shelves. The addresses are given in hexadecimal notation.
Table 16. Hex IPMB addresses of slots in the ATCA-6002 shelf
Slot                    IPMB address
Active Shelf Manager    20
Front slot 1            82
Front slot 2            84
Table 17.
6 Troubleshooting Table 17. Hex IPMB addresses of slots and modules in larger Radisys shelves (continued) IPMB address / location ATCA-6000 shelf ATCA-6006 shelf n/a 52 Slot or FRU Shelf alarm panel a ATCA-6014 shelf n/a ATCA-6016 shelf n/a With the ATCA-5010 and ATCA-5014, use the virtual SPM address.
The following are the slot resource types in hex:
F0 Unknown slot
F1 SPM or RCM slot
F2 Shelf display panel slot
F3 Virtual SPM or virtual RCM slot
F5 COM‐E module slot
F6 RTM slot
F7 PMC slot
F8 AMC slot
F9 Alarm device slot
FA Fan filter tray slot
FB Fan tray slot
FC Dedicated ShMC slot
FD Shelf FRU slot
FE PEM slot
FF FRU slot
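The slot resource types listed above lend themselves to a simple lookup table. The Python sketch below is an illustration only. In particular, reading the type code from bits 8 to 15 of a ResourceId is an assumption, inferred solely from the slot resource example 0x2FF96 given earlier in this chapter; it is not a documented encoding. The helper names are hypothetical.

```python
SLOT_RESOURCE_TYPES = {
    0xF0: "Unknown slot",
    0xF1: "SPM or RCM slot",
    0xF2: "Shelf display panel slot",
    0xF3: "Virtual SPM or virtual RCM slot",
    0xF5: "COM-E module slot",
    0xF6: "RTM slot",
    0xF7: "PMC slot",
    0xF8: "AMC slot",
    0xF9: "Alarm device slot",
    0xFA: "Fan filter tray slot",
    0xFB: "Fan tray slot",
    0xFC: "Dedicated ShMC slot",
    0xFD: "Shelf FRU slot",
    0xFE: "PEM slot",
    0xFF: "FRU slot",
}


def slot_type_name(type_code: int) -> str:
    """Look up the description for a slot resource type code."""
    return SLOT_RESOURCE_TYPES.get(type_code, "not a listed slot resource type")


def type_code_from_resource_id(resource_id: int) -> int:
    """ASSUMPTION: the type code occupies bits 8-15 of the ResourceId.

    Inferred only from the single example 0x2FF96; verify against the RPT
    entry's entity path before relying on it.
    """
    return (resource_id >> 8) & 0xFF


example_name = slot_type_name(type_code_from_resource_id(0x2FF96))
print(example_name)
```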
Appendix A Shelf Manager Initialization Shelf Manager and HPI interface initialization The shelf management server (ShMS) is installed as part of the local management processor (LMP) software image. The Shelf Manager Server daemon automatically starts as a background process at boot time using the Linux init script named shmgr. This is one of the first applications the kernel starts. The following is the initialization sequence.
A Shelf Manager Initialization d. Populates the RPT with resources and their associated resource data records (RDRs). Resources are created for each slot present in the chassis as well as all installed FRUs. RDR entries created include sensors (both physical and logical), controls, watchdogs, and inventory data records. e. Checks for the presence of previously saved HPI parameter control configuration files and loads them for each resource in the RPT.
A Shelf Manager Initialization HPI subagent initialization When the HPI subagent is started, initially you will see just the static scalar objects defined in the HPI‐B0101‐MIB module. One of these scalar objects is the saHpiDiscover.0 variable, which initially returns the value true(1). Once the HPI server is running, the HPI subagent opens an HPI session to it and starts to dynamically populate all of the columns and rows in the HPI‐B0101‐MIB module tables.
Appendix B Shelf Management Files This appendix describes files in the shelf management infrastructure, including programs, scripts, and configuration files. Shelf management configuration files The configuration files are listed by directory location as follows: In the /etc directory: • rsyshsd.conf An optional blade HPI configuration file that overrides the defaults specified in usr/share/shmgr/shmgr.defs. • shmgr.
B Shelf Management Files Shelf Manager log file The Shelf Manager log file is /tmp/shmgr/shmgr.log. Backup files in the same directory are created as described in Configuring verbosity and log settings on page 43. Shelf management executable files The paths to the following executable files are included in the PATH environment variable: frurw Utility for rewriting the contents of an IPMI FRU data device.
Appendix C IPMI Commands and Managed Sensors Supported IPMI commands The Shelf Manager supports all IPMI commands listed in the AdvancedTCA Base Specification. Note: • On the ATCA‐6002 shelf, the Get Telco Alarm Location command does not return a valid location because no telco alarm site exists on this shelf. All other IPMI commands are supported on the ATCA‐6002 shelf when the Shelf Manager is running in enhanced mode.
IPMI Commands and Managed Sensors C Table 18. Managed sensors for the Shelf Manager # Name 4097 Shelf Manager (0x1001) Redundancy Status Type Operational Normal reading Explanation of values or event state Redundancy 0x1 State 0x1: Shelf Manager is redundant. Category State 0x4: Shelf Manager redundancy is degraded due to different Shelf Manager software versions. State 0x8: Shelf Manager is non-redundant due to loss of sufficient resources.
IPMI Commands and Managed Sensors Table 19. Managed sensors for each FRU # Name 256 FRU Operational Operational Status (0x100) Type Category Device enabled Normal reading Explanation of values or event state 0x1 State 0x1: FRU is operational without faults. State 0x2: FRU operation is degraded or disabled.
IPMI Commands and Managed Sensors Table 19.
IPMI Commands and Managed Sensors C Table 19. Managed sensors for each FRU (continued) # Name Type Category 258 FRU Thermal Status Temperature Threshold (0x102) Normal reading Explanation of values or event state 0x1 State 0x1: Lower minor. State 0x2: Lower major. State 0x4: Lower critical. State 0x8: Upper minor. State 0x10: Upper major. State 0x20: Upper critical. Possible readings are encoded as a bit mask of 64 values: 0 Minor Temp Alarm 1 Major Temp Alarm 2 Critical Temp Alarm Table 20.