Express5800/ftServer: System Administrator’s Guide for the Linux® Operating System NEC Corporation of America 456-01730-000
Notice The information contained in this document is subject to change without notice. UNLESS EXPRESSLY SET FORTH IN A WRITTEN AGREEMENT SIGNED BY AN AUTHORIZED REPRESENTATIVE OF NEC, NEC MAKES NO WARRANTY OR REPRESENTATION OF ANY KIND WITH RESPECT TO THE INFORMATION CONTAINED HEREIN, INCLUDING WARRANTY OF MERCHANTABILITY AND FITNESS FOR A PURPOSE.
Contents Preface 1.
Contents Pre-Installation Checklist Initial Linux Operating System and Express Builder Installation or Default Reinstallation Booting the Operating System Installing the Operating System Installing Express Builder for Fault Tolerance Avoiding CD-ROM Mount Command Failure After Disabling SELinux Reinstalling Express Builder After a Failed Installation Booting in Linux Rescue Mode Post-Installation Tasks and Considerations Default Configuration Notes Configuring the Network Adding Fault-Tolerant Utilities to
Contents Default Internal Disk Configuration for a Newly Installed System Checking the Current State of the Internal Disk Subsystem Storage Device Definition Setting Up RAID Arrays RAID Array Overview Creating a RAID-1 Array Creating a RAID-0 Array Creating and Mounting a File System Checking the Current State of RAID Removing and Replacing Disks Disk Insertion Administering RAID Arrays To Stop a RAID Array and Move It to Another System Errors and Faulty Mirrors Removing a Faulty Mirror Resynchronization R
Contents 7. Using ftServer Fault-Tolerant Utilities and Software The ftsmaint Command Device Path Enumeration ftServer System Device Path Enumeration ftsmaint Examples Displaying System Status Bringing System Components Down and Up Removing a PCI Adapter From Service and Bringing It Into Service Kernel Memory Dump File Management 8.
Contents ftServer System Operation State Management SNMP Network Management Station Considerations Initial SNMP Testing Initial Testing of ftltrapsubagent Initial Testing of ftlsubagent Removing ftlSNMP OpState:State Definitions OpState:Reason Definitions GET and SET Operations for ftlSNMP MIB Objects SRA-ftLinux-MIB OID Values and Properties Trap Filtering Trap-Filtering Capability Activating and Deactivating Trap Filtering Trap-Filtering Examples
Figures Figure 2-1. SAS (SATA) Drive Arrangement for Installation 2-9 Figure 5-1. CPU-I/O Enclosures: Front Panel with Drive Slots Fully Populated 5-3 Figure 7-1. ftServer Enclosures: Locations of Major Enumerated Devices (Front View) 7-8 Figure 7-2. ftServer Enclosures: Locations of Major Enumerated Devices (Rear View) 7-9 Figure 8-1. AgentX-Enabled Extensions and Subagents 8-17 Figure 8-2.
Tables Table 2-1. Table 5-2. Table 7-1. Table 8-1. Table 8-2. Table 8-3. Table 8-4.
Examples Example 5-1. Checking the Current State of the Internal Storage Subsystem Example 5-2. Checking the Current State of RAID Example 5-3. Resynchronization Example 5-4. Running GRUB Example 5-5. Pairing a Spare Internal Disk with the Running System Disk Example 5-6. Default Configuration of Embedded Ethernet Devices Example 7-1. Displaying System Status with the ftsmaint Command Example 8-1. Traps that Can Occur for I/O Element 11 When Trap Filtering Is Off Example 8-2.
Preface The Express5800/ftServer: System Administrator’s Guide for the Linux Operating System documents tasks and information for system administrators of NECAM systems running a supported Linux distribution and ftControl system software for the Linux Operating System (Express Builder).
Preface ! CAUTION A caution indicates a situation where failure to take or avoid a specified action could damage a hardware device, program, system, or data. NOTE A note provides important information about the operation of an ftServer system. Typographical Conventions The following typographical conventions are used in this document: • The italic font introduces or defines new terms.
Preface • % indicates you are logged in to a user account and are subject to certain access limitations. • # indicates you are logged in to the system administrator account and have superuser access. Users of this account are referred to as root. The # prompt sign used in an example indicates the command can only be issued by root. Syntax Notation This document uses the following format conventions for documenting commands: • Square brackets ([ ]) enclose command argument choices that are optional.
Preface • VTM is not available with Express5800/ftServer for Linux systems. • Express Service Network is not available with Linux systems. • Although this guide may document modem functionality, modems are not available for all systems. Ask your sales representative about modem availability.
Chapter 1 Introduction to ftServer System Administration
This chapter discusses the following topics:
• ‘‘ftServer System Terminology”
• ‘‘System and Network Administration Overview”
• ‘‘Additional Documentation and Resources”
ftServer systems running a supported Linux distribution together with Express5800/ftServer System Software for the Linux Operating System (Express Builder) operate as fault-tolerant servers.
ftServer System Terminology
Each ftServer system houses two CPU-I/O enclosures. Each CPU-I/O enclosure includes a CPU element and an I/O element, as follows:
• CPU element 0 and I/O element 10: The upper enclosure, also referred to as CPU-0, I/O-10.
• CPU element 1 and I/O element 11: The lower enclosure, also referred to as CPU-1, I/O-11.
Configuring Your ftServer System
After installing the Linux operating system and Express Builder, you must configure your system. See Chapter 5 for configuration information.
Managing Data Storage Devices
In addition to the SAS (SATA) disk storage discussed in Chapter 5, your system supports CD-ROM drives and USB storage devices. Chapter 6 provides a discussion of these devices and the information needed to manage them.
Additional Documentation and Resources Red Hat Enterprise Linux Documentation for the Red Hat Linux operating system is available at http://www.redhat.com/docs. Express5800/ftServer Documentation The ExpressBuilder CD-ROM provided with your system contains all of the system documentation for ftServer systems that run the Linux operating system. It is provided in Adobe Acrobat® Portable Document Format (PDF) for viewing and printing.
http://vig.prenhall.com/catalog/academic/product/0,4096,0130084662,00.html This volume is a reference manual for both system and network administration of the Linux operating system. It focuses on available (at time of publication) open source tools but incorporates in-depth knowledge of UNIX administration utilities and network management practices.
• Linux in a Nutshell--A Desktop Quick Reference, 4th Ed.
Chapter 2 Installing the Operating System and Express5800/ftServer System Software
This chapter discusses the following topics:
• ‘‘Installation Overview”
• ‘‘Separately Released and Optional Distribution Components”
• ‘‘Installation Interfaces”
• ‘‘Supported Hardware and Firmware”
• ‘‘Pre-Installation Checklist”
• ‘‘Initial Linux Operating System and Express Builder Installation or Default Reinstallation”
• ‘‘Post-Installation Tasks and Considerations”
• ‘‘Performing an Installation Without a Kickstar
Installation Overview
An installable distribution CD-ROM (CD) set is provided. Table 2-1 lists the CDs included in this distribution.
Table 2-1. CD-ROMs Which May Be Included With ftServer Systems
CD-ROM                         Contents
ExpressBuilder for Linux CD    Express5800/ftServer fault-tolerant system software
ftControl Software Update      Updated ftServer fault-tolerant system software.
Express Builder Debug Info     Includes debuginfo RPMs.
Installation Overview Express Builder, use the versions of firmware and software that are supplied on the ExpressBuilder for Linux (1 of 2) CD. From time to time, NECAM may issue an update to Express Builder. See Chapter 4 for information about updating from an Express Builder update disk. ! CAUTION The procedure described in this chapter is for a full installation or reinstallation of a supported Linux operating system and Express Builder.
Installation Overview NOTE The Linux operating system installer program does not anticipate customer-added and unknown hardware. Any such hardware should be added, and the system configured as required to support it, only after installation procedures have been completed and the system has been determined to function as expected. Linux Version Information You can check the installed version of the Linux operating system on your system using the uname command.
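As a minimal sketch of the version check mentioned above, the following commands print the kernel release and machine architecture reported by uname. The kernel_matches helper is a hypothetical name introduced only for illustration. The release string expected for a given Express Builder version is not stated here and should be taken from the release notes.

```shell
# Report the running kernel release and the machine architecture.
kernel_release=$(uname -r)
machine_arch=$(uname -m)
echo "Kernel release: ${kernel_release}"
echo "Architecture:   ${machine_arch}"

# Hypothetical helper (not part of Express Builder): succeeds only if the
# running kernel release begins with the prefix given as the first argument.
kernel_matches() {
    case "$(uname -r)" in
        "$1"*) return 0 ;;
        *)     return 1 ;;
    esac
}
```

For example, kernel_matches 2.6 would succeed on any 2.6-series kernel. The prefix shown is illustrative only.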
Installation Overview The installation process creates a disk drive RAID array, pairing sda and sdd drives as a mirrored set that holds the entire installed Linux software distribution and Express Builder. On this mirrored drive set, the GRUB bootloader on the master boot record at track 0 makes both drives in the set bootable using GRUB configuration data stored in the /boot partition. Storage is allocated as shown in Table 5-1.
Separately Released and Optional Distribution Components In most cases, attached devices are recognized and addressable on installation (as is a standard USB keyboard, for example), although hot-plugged devices may not be. USB 2.0 interface specifications are supported. After installation, you may need to set serial-port flow control and data-rate characteristics for attaching serial data communications equipment or data terminal equipment, such as an asynchronous terminal, a printer, or attached modem.
Installation Interfaces to check dependencies. Note that rpm does not always reveal specific release-level dependencies. From the Express Builder distribution, Express Service Network and ftlSNMP packages are installed as options and require additional configuration before they can be used. See ‘‘Kernel Memory Dump File Management’’ on page 7-13 and Chapter 8 for information on configuring and using these utilities. Installation Interfaces The installation process has two parts.
Pre-Installation Checklist support restrictions on hardware that apply either to this installation procedure or to the current Express Builder release generally. ❏ The installation CDs ask that you read and accept end user license agreements (EULAs). You should not perform the installation if you cannot accept the EULAs or are not authorized to accept them. Installation terminates without completion if you decline a required EULA.
Initial Linux Operating System and Express Builder Installation or Default Reinstallation
Figure 2-1. SAS (SATA) Drive Arrangement for Installation (drive slots labeled sdc, sdb, sda, sdf, sde, and sdd)
Make sure that the system and monitor power connections are secure and firmly plugged in before beginning an installation procedure. Power cabling should be guarded against inadvertent disconnection during the installation process. The monitor may use a separate power source.
Initial Linux Operating System and Express Builder Installation or Default Reinstallation Avoiding CD-ROM Mount Command Failure After Disabling SELinux During the installation of Express Builder, you choose whether to enable or disable SELinux. If you choose to enable SELinux, and then disable SELinux at a later time, the command to mount the CD-ROM device fails unless you edit the /etc/fstab file to remove a parameter that is added to the file when you disable SELinux.
Initial Linux Operating System and Express Builder Installation or Default Reinstallation To boot in rescue mode 1. Disconnect any floppy disk drive attached to the system’s USB port. NOTE If a floppy drive is connected when you boot in rescue mode, the system will be unable to find the internal storage drives. 2. Insert Red Hat Enterprise Linux CD-ROM #1 into the CD-ROM drive in the upper CPU-I/O enclosure. The system boots from this CD. 3.
Post-Installation Tasks and Considerations
After installing the operating system and Express Builder, consider the following topics.
• ‘‘Default Configuration Notes”
• ‘‘Configuring the Network”
• ‘‘Adding Fault-Tolerant Utilities to PATH”
Default Configuration Notes
After installation, the default installed system should appear as described in ‘‘Default System Setup’’ on page 2-3. The following notes apply to the default system configuration.
NOTES 1.
Adding Fault-Tolerant Utilities to PATH
Express5800/ftServer fault-tolerant utilities, such as ftsmaint and ASNConfig, reside in the /opt/ft/bin and /opt/ft/sbin directories. Consider setting your PATH to include these directories.
Performing an Installation Without a Kickstart File
1. After installation, while the system is booting, the GRUB menu must supply the following at the boot prompt:
linux reboot=warm nmi_watchdog=0 i8042.noaux
2. Manually make the second disk a bootable disk. At the command prompt, type the following lines to make both system disks bootable:
# /sbin/grub
device (hd0) /dev/sda
root (hd0,0)
setup (hd0)
device (hd0) /dev/sdb
root (hd0,0)
setup (hd0)
quit
• Make sure that the system is running the SMP kernel.
• You must install all required software packages. You may have to manually resolve package dependency failures when installing Express Builder.
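Returning to the recommendation in ‘‘Adding Fault-Tolerant Utilities to PATH’’, one possible way to add the two utility directories is sketched below. The add_to_path function is a hypothetical helper, not something supplied by Express Builder. It appends a directory only when PATH does not already contain it, so the lines can be placed in a shell startup file and sourced repeatedly without creating duplicates.

```shell
# Hypothetical helper: append a directory to PATH unless it is already listed.
add_to_path() {
    case ":${PATH}:" in
        *":$1:"*) ;;                 # already present; nothing to do
        *) PATH="${PATH}:$1" ;;      # append at the end of the search path
    esac
}

# Directories named in this guide for the fault-tolerant utilities.
add_to_path /opt/ft/bin
add_to_path /opt/ft/sbin
export PATH
```

Appending rather than prepending keeps the standard system directories ahead of /opt/ft in the search order, which is a design choice made for this sketch and not a requirement stated in this guide.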
Chapter 3 Updating ftServer System Firmware
This chapter discusses the following topics:
• ‘‘Updating the System BIOS”
• ‘‘Updating BMC Firmware”
Consult the Release Notes: Express5800/ftServer for the Linux Operating System for the Express5800/ftServer System Software for the Linux Operating System (Express Builder) version you have (or will upgrade to) to determine what firmware version numbers are required.
Updating the System BIOS The preceding example displays a BIOS version number of 20.0 for the top CPU-I/O enclosure (see Table 7-1 for a list of system device IDs). Express5800/ftServer BIOS updates are image files that you must transfer from removable media or download from a network-accessible archive. Take care when updating firmware. It is a necessary failover characteristic for the CPU-I/O enclosures to be paired in duplexed operation.
Updating the System BIOS 4. Use the ftsmaint command to verify that you are starting from a known, good state. At this point, both CPU-I/O enclosures should be operating duplexed.
6. Perform the BIOS burn by issuing the following commands to one of the CPU-I/O enclosures.
# /opt/ft/bin/ftsmaint bringDown 0
Completed bringDown on the device at path 0.
# /opt/ft/bin/ftsmaint burnProm Path and filename for the BIOS
Updated firmware on the device at path 0.
# /opt/ft/bin/ftsmaint jumpSwitch 0
Transferred processing to the device at path 0.
# /opt/ft/bin/ftsmaint bringUp 1
Completed bringUp on the device at path 1.
Updating BMC Firmware a character file rather than a binary file. You can detect such corruption by computing a checksum with the md5sum command before and after copying. A repeated BIOS burn failure is likely to be caused by a command syntax error or by using a damaged or inappropriate BIOS image file. 10. If it is necessary to update the BMC firmware, follow the procedure described in ‘‘Updating BMC Firmware’’ on page 3-5. 11.
Updating BMC Firmware 4. Obtain the latest BMC image for the Express5800/ftServer and copy it to the ftServer tmp folder. NOTE All ftServers running a supported Linux distribution and Express5800/ftServer system software for the Linux Operating System (Express Builder) use the same BMC firmware. 5. Type the following commands to update the BMC firmware on each I/O element: # /opt/ft/bin/ftsmaint burnProm path and filename for the BMC firmware Updated firmware on the device at path 10/120.
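The BIOS update procedure recommends computing a checksum with the md5sum command before and after copying an image file, and the same check can reasonably be applied to a BMC image copied to the system. The following is a minimal sketch of that check. The verified_copy function name is hypothetical, and the source and destination paths are supplied by the caller.

```shell
# Hypothetical helper: copy a firmware image, then confirm that the copy is
# byte-identical to the original by comparing MD5 digests.
verified_copy() {
    src=$1
    dst=$2
    cp "$src" "$dst" || return 1
    # Read via stdin so md5sum prints only the digest and a "-" placeholder.
    src_sum=$(md5sum < "$src" | awk '{ print $1 }')
    dst_sum=$(md5sum < "$dst" | awk '{ print $1 }')
    if [ "$src_sum" = "$dst_sum" ]; then
        echo "OK: checksums match (${src_sum})"
    else
        echo "MISMATCH: ${src}=${src_sum} ${dst}=${dst_sum}" >&2
        return 1
    fi
}
```

If the function reports a mismatch, copy the image again before attempting the burnProm step.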
Chapter 4 Updating the Operating System and Express5800/ftServer System Software
This chapter documents how to upgrade the Linux operating system and the Express5800/ftServer System Software for the Linux Operating System (Express Builder).
General Upgrade Considerations
When upgrading the Linux operating system or Express Builder, be aware of the following requirements and related considerations.
Upgrade Requirements
First, ensure that the system’s BIOS and BMC firmware levels support the new Express Builder version. You can obtain required versions of firmware from the ExpressBuilder for Linux CD. If necessary, upgrade the firmware (see Chapter 3).
Upgrading or Restoring the Linux Operating System • To preserve your changes, incorporate the updates into the files you have modified. Compare files in the /etc/OPT/ft/network-scripts/ARCHIVE directory that have a .rpmnew extension to your modified files, and copy the updates from the .rpmnew file to your modified file.
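One way to find the files that need this attention is sketched below. The list_rpmnew_changes function is a hypothetical helper. It takes the directory to search as its argument, modifies nothing, and prints each file whose .rpmnew counterpart differs from it.

```shell
# Hypothetical helper: list every file under a directory that has a
# differing *.rpmnew counterpart. Nothing is modified.
list_rpmnew_changes() {
    dir=$1
    find "$dir" -type f -name '*.rpmnew' | sort | while IFS= read -r new; do
        current=${new%.rpmnew}      # name of the file you may have modified
        if [ -f "$current" ] && ! cmp -s "$current" "$new"; then
            echo "$current"
        fi
    done
}
```

For each file reported, a command such as diff -u FILE FILE.rpmnew shows the updates that remain to be merged into your modified copy.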
Upgrading or Restoring the Linux Operating System The following topics apply when upgrading or restoring the Linux operating system. • ‘‘Express5800/ftServer Kernel Modules” • ‘‘Upgrading or Restoring the Linux Operating System” Express5800/ftServer Kernel Modules Whenever the Linux operating system is upgraded, a new Linux kernel is installed. Whenever Express Builder is installed or upgraded, the fault-tolerant Express5800/ftServer kernel modules are automatically rebuilt at the next boot time.
Upgrading or Restoring the Linux Operating System By default, the Update Agent on a system running Express Builder is configured to access the following servers: • The Red Hat RHN server for Linux OS patches To upgrade the Linux operating system 1. Start the Red Hat Update Agent from the graphical desktop, or by running the up2date command on the command line. The Update Agent does the following: • Queries the RHN server for new versions of RPMs that are already installed on your system.
Upgrading or Restoring Express Builder If the operating system reinstallation failed or you want to return the system to the previous operating system version, see ‘‘Recovering from a Failed Software Upgrade’’ on page 4-8. Your system should now have the same version of operating system software installed as it had previously. But since it has no Express Builder software, its fault-tolerant features are not operational, so you must upgrade or restore Express Builder on your system.
Creating a Backup System Disk a. Remove all NECAM packages except eula_display with this command:
rpm -e --nodeps --allmatches `rpm -qa | grep lsb-ft | grep -v eula_display`
b. Remove eula_display with this command:
rpm -e --nodeps lsb-ft-eula_display
4. Follow the instructions in ‘‘Installing Express Builder for Fault Tolerance’’ on page 2-13. Your system now has a new version of Express Builder installed and the upgrade is complete.
Recovering from a Failed Software Upgrade NOTE Backup disks can be new, factory-fresh disks or disks recycled from other systems. However, care must be taken with recycled disks. The partition table and RAID superblocks that exist on a recycled disk can confuse the system. Recovering from a Failed Software Upgrade Use this procedure if an upgrade procedure failed or if you want to go back to the software versions installed before an upgrade procedure was performed.
Chapter 5 Setting Up the ftServer System
This chapter discusses the following topics:
• ‘‘Setting Up Internal Disk Storage”
• ‘‘Setting Up RAID Arrays”
• ‘‘Removing and Replacing Disks”
• ‘‘Administering RAID Arrays”
• ‘‘System Backup and Disaster Recovery”
• ‘‘Ethernet Devices”
• ‘‘Other System Configuration Information”
• ‘‘Additional Documentation and Resources”
At system startup, the operating system autoprobes hardware for legacy devices and attached devices that are not already configured for us
Setting Up Internal Disk Storage
This section discusses the following topics:
• ‘‘Internal Disk Storage Overview”
• ‘‘The Console Log and the /var/log/messages File”
• ‘‘Configuring Internal Disks”
• ‘‘Managing Partitions”
• ‘‘Default Internal Disk Configuration for a Newly Installed System”
• ‘‘Checking the Current State of the Internal Disk Subsystem”
• ‘‘Storage Device Definition”
Internal Disk Storage Overview
ftServer systems support up to six internal Serial Advanced
Setting Up Internal Disk Storage Since some disk-configuration operations produce considerable console output, it can be helpful to log on to another session. Configuring Internal Disks The six internal storage disks are persistently named based on the slot that they occupy. As shown in Figure 5-1, in the upper CPU-I/O enclosure, the disks are /dev/sda, /dev/sdb, and /dev/sdc, from bottom to top. In the lower CPU-I/O enclosure, they are /dev/sdd, /dev/sde, and /dev/sdf, from bottom to top.
Setting Up Internal Disk Storage autodetect). After the installation, you use the fdisk utility to add data disks with the type 0x83 (Linux). Both disks of a RAID pair must have the same geometry, partition table, and type. You can use the fdisk command to manage disk partitions. The following example uses the internal storage enclosure disk sdb. To display the partition table 1. Enter the fdisk command. # fdisk /dev/sdb The number of cylinders for this disk is set to 17849.
Setting Up Internal Disk Storage To create a new partition table and add a partition 1. If fdisk is not already running, enter the fdisk command. # fdisk /dev/sdb The number of cylinders for this disk is set to 17849. There is nothing wrong with that, but this is larger than 1024, and could in certain setups cause problems with: 1) software that runs at boot time (e.g., old versions of LILO) 2) booting and partitioning software from other OSs (e.g., DOS FDISK, OS/2 FDISK) Command (m for help): 2.
Setting Up Internal Disk Storage 5. Enter the partition number you wish to assign (the choices depend on the type specified). Partition number (1-4): 1 First cylinder (1-8924, default 1): 6. Enter the desired starting cylinder number for the partition, or press ENTER to accept the default (this example accepts the default). Using default value 1 Last cylinder or +size or +sizeM or +sizeK (1-8924, default 8924): 7.
Default Internal Disk Configuration for a Newly Installed System
The Linux operating system is installed on the sda/sdd pair of SAS (SATA) disks. See the User Guide (Setup) for the configuration of the internal disks.
NOTE The RAID array is not deterministic.
Checking the Current State of the Internal Disk Subsystem
The /proc/scsi/scsi file displays the current state of the internal disk subsystem.
Example 5-1. Checking the Current State of the Internal Storage Subsystem
# cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA Model: ST380013AS Rev: 3.00
  Type: Direct-Access ANSI SCSI revision: 05
Host: scsi4 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA Model: ST380013AS Rev: 3.00
  Type: Direct-Access ANSI SCSI revision: 05
Storage Device Definition
The Linux operating system automatically creates device nodes for all devices in a system.
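Output in the format of Example 5-1 can also be summarized with standard text tools. The sketch below assumes the layout shown in that example. Both function names are hypothetical, and each takes the file to read as an argument so that it can be tried on a saved copy before it is pointed at /proc/scsi/scsi.

```shell
# Hypothetical helper: count the devices listed (one "Host:" line per device).
count_scsi_devices() {
    grep -c '^Host:' "$1"
}

# Hypothetical helper: print "<host> <model>" for each attached device.
# Only the first word of the model string is printed.
list_scsi_models() {
    awk '/^Host:/ { host = $2 }
         /Model:/ {
             for (i = 1; i < NF; i++)
                 if ($i == "Model:") print host, $(i + 1)
         }' "$1"
}
```

On a system matching Example 5-1, count_scsi_devices /proc/scsi/scsi would report two devices, one for each disk of the system pair.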
Setting Up RAID Arrays Device files are created for the first 128 RAID arrays. Use the mknod command (see mknod(1)) to create additional device files as needed. The number is the minor device number, and it is also used in the name. The smaller numbers are used by the installer, so it is convenient to add new RAID arrays above 10. When RAID arrays are intended to be moved between systems, try to pick numbers that are unique among all of the systems. The /etc/mdadm.
Setting Up RAID Arrays 3. Edit the /etc/mdadm.conf file so that the new RAID array will start each time the system boots. a. Use an existing ARRAY line as a model. Copy it to the bottom of the file. b. Edit the device and the two disks. In this example, the device is changed to md20 and the two disks are changed to /dev/sdb1 and /dev/sde1.
When the command completes, the RAID array is up and running. You can use the mdadm command to see the status of the new RAID array:
# mdadm -Q --detail /dev/md20
/dev/md20:
Version : 00.90.01
Creation Time : Wed Sep 28 15:20:08 2005
Raid Level : raid1
Array Size : 143371968 (136.73 GiB 146.81 GB)
Device Size : 143371968 (136.73 GiB 146.
Setting Up RAID Arrays 2. Edit the /etc/mdadm.conf file so that the new RAID-0 array starts each time the system boots. a. Use an existing ARRAY line as a model. Copy it to the bottom of the file. b. Edit the device and the two disks. In this example, the device is changed to md30 and the two devices take the RAID-1 array names. The result should look like the following: ARRAY /dev/md30 level=0 DEVICES=/dev/md20,/dev/md21 c. Save the file and exit the editor. 3.
To stop the device, use the mdadm command with the -S argument, as follows:
mdadm -S /dev/md30
Creating and Mounting a File System
The RAID arrays created in the preceding examples are raw disk block devices. You can create a file system on a RAID array and then mount it. The following command creates an ext3 journaled file system in the RAID-0 array created above: # mkfs.
Removing and Replacing Disks NOTE The device names displayed in /proc/mdstat are the kernel names for each device. These are different from the user device names displayed by the mdadm command. Example 5-2.
Administering RAID Arrays Disk Insertion When you reinsert a pulled disk, the OSM storage plugin attempts to match it with an existing disk. If it finds a match, it hot-adds the mirror partitions on the inserted disk back into the existing RAID arrays and resynchronizes them (see ‘‘Resynchronization’’ on page 5-18). Similarly, if you replace a failed disk, the OSM plugin automatically adds the replacement disk to a running RAID array.
Administering RAID Arrays You can start a RAID array when it is stopped. Use the following command to start a RAID array that was already configured in /etc/mdadm.conf: # mdadm -A /dev/md30 Errors and Faulty Mirrors When an error is reported on a mirror, the mirror is marked faulty and it is no longer used. The last active mirror is never marked faulty even if errors are reported against it.
Administering RAID Arrays completely remove the disk (the OSM storage plugin automates these tasks). This means that until all mirrors are removed, a replacement disk inserted in the same slot will not spin up. You can use the mdadm command to add a mirror into a running RAID array. The following example shows how to do this. # mdadm /dev/md20 -a /dev/sdb1 In the preceding example, /dev/md20 is the RAID array and /dev/sdb1 is the mirror.
md2 : active raid1 sdb3[1] sda3[0]
      31647936 blocks [2/2] [UU]
md0 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]
unused devices: <none>
#
NOTE The device names displayed in /proc/mdstat are the kernel names for each device. These are different from the user device names displayed by the mdadm command.
As long as there is a missing mirror or a resynchronization in progress, RAID and the CPU-I/O enclosure are simplex for the active mirror.
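A quick way to see whether any array is currently running without a full set of mirrors is to look for an underscore in the status field of /proc/mdstat, where [UU] indicates that both mirrors are active and [U_] or [_U] indicates that one is missing. The sketch below assumes the /proc/mdstat layout shown above. The degraded_arrays function name is hypothetical.

```shell
# Hypothetical helper: print the name of each md array whose status field
# (for example [U_]) shows at least one missing mirror.
degraded_arrays() {
    awk '/^md[0-9]+ :/ { name = $1 }
         /blocks/ && /\[[U_]*_[U_]*\]/ { print name }' "$1"
}
```

Running degraded_arrays /proc/mdstat prints nothing when every array is fully mirrored, so the function can be used in a periodic check that reports only when an array needs attention.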
Administering RAID Arrays NOTE If the running member of the RAID array was a system disk, the bootloader (grub) is added to the newly-inserted disk. To replace a failed disk 1. While the system and RAID array are running, remove the failed disk. 2. Insert a blank disk. The blank disk is automatically added to the array. Configuring Safe Mode By default, the OSM configuration file, /opt/ft/osm/config.xml, configures this automatic pairing of disks in safe-mode.
Administering RAID Arrays Replacing Defective Disks Interactively To replace a defective disk, perform the following procedures: • Remove the defective disk and insert a spare disk. • Run the duplex_blank_disk command (see ‘‘The duplex_blank_disk Command’’ on page 5-25). NOTE Replacement disks can be new, factory-fresh disks or disks recycled from other systems. Care must be taken with recycled disks. The partition table and RAID superblocks that exist on the disk can confuse the system.
Administering RAID Arrays To verify that the spare disk is not in use Type the following commands and check the resulting output: # mdadm --detail --scan # swapon -s # cat /etc/mtab To zero the spare disk Perform one of the following procedures: • Zero the spare disk’s RAID superblocks by typing a command such as the following for each partition on the spare disk (substitute the device node of the partition you wish to zero for sdb1 in this example): # mdadm --zero-superblock /dev/sdb1 NOTE Zeroing the di
Administering RAID Arrays Occasionally, sfdisk returns the following error while writing the saved partition table to the spare disk: Checking that no-one is using this disk right now ... BLKRRPART: Input/output error This error indicates that the disk is currently in use, so you should not repartition it. Perform these steps to correct this error: a. Unmount all file systems. b. Swap off all swap partitions on this disk. c. Use the --no-reread flag to suppress this check. d.
2. Add each partition on the spare disk to the RAID-1 array containing the corresponding partition on the running disk with commands like the following:
# mdadm -a /dev/md0 /dev/sdd1
mdadm: hot added /dev/sdd1
# mdadm -a /dev/md1 /dev/sdd3
mdadm: hot added /dev/sdd3
# mdadm -a /dev/md2 /dev/sdd2
mdadm: hot added /dev/sdd2
Perform the following procedure only if the running disk is the system disk.
Administering RAID Arrays The duplex_blank_disk Command The duplex_blank_disk command prompts you for all of the information required to pair a spare disk with a running disk. You can run it by typing: # /opt/ft/bin/duplex_blank_disk In Example 5-5, the command prompts you for information that is needed to pair a spare internal disk with the running system disk. Example 5-5. Pairing a Spare Internal Disk with the Running System Disk # /opt/ft/bin/duplex_blank_disk Device Path ID of blank disk (e.g.
System Backup and Disaster Recovery
Your ftServer system provides many safeguards against losing data due to hardware failures. However, it cannot cover all contingencies, so it is still important to perform regular backups and enact a good disaster-recovery program.
Ethernet Devices
Network interface naming on ftServer systems running a supported Linux distribution together with Express Builder is different from that on other Linux systems.
Ethernet Devices . Table 5-2.
Ethernet Devices Monitoring and Configuring Channel-Bonding Interfaces By default, the physical Ethernet interfaces listed in Table 5-2 are bound together into two channel-bonding interfaces, called bond0 and bond1. The two channel-bonding interfaces are set to operate in active-backup mode (mode 1) with Dynamic Host Configuration Protocol (DHCP) enabled. In many cases, no additional configuration is necessary.
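To confirm the bonding mode and see which physical interface is currently carrying traffic, the status file that the standard Linux bonding driver publishes for each bond can be read. The sketch below assumes that the driver exposes /proc/net/bonding/bond0 and /proc/net/bonding/bond1 on your system, which this guide does not state. The bond_summary function name is hypothetical.

```shell
# Hypothetical helper: print the bonding mode and the currently active slave
# from a bonding-driver status file such as /proc/net/bonding/bond0
# (the presence of that file on an ftServer system is an assumption).
bond_summary() {
    awk -F': ' '/^Bonding Mode:/           { print "mode=" $2 }
                /^Currently Active Slave:/ { print "active=" $2 }' "$1"
}
```

In active-backup mode only one slave carries traffic at a time, so the active= value is expected to change when the driver fails over to the other physical interface.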
Ethernet Devices eth080010 eth080011 bond0 bond1 DUPLEX DUPLEX UP UP - - In Example 5-6, there are two online channel-bonding interfaces (masters), bond0 and bond1, each composed of two physical interfaces (slaves). The output shows the four physical slave interfaces in the system and also shows their status and the name of the bond to which they belong. Note that three other channel-bonding interfaces are defined by default, but they are not configured and are therefore offline.
Ethernet Devices NOTES 1. There must be at least one alias for an active bond in the /etc/modprobe.d/ft-network.conf file, or bonding cannot occur. 2. The /etc/modprobe.d directory should contain no more than one ft-network.conf file. Determining Interface Device Names When you add a PCI Ethernet adapter to a system, you must determine the device names of the physical interfaces on the adapter before you can configure it.
Other System Configuration Information 2. Determine the interface device name for each physical Ethernet interface on this newly installed adapter. See ‘‘Determining Interface Device Names’’ on page 5-30 for details. You must add this device name to the physical interface’s configuration file (see step 5). 3. Repeat steps 1 and 2 for the second adapter in the corresponding slot in the second paired CPU-I/O enclosure. 4.
Other System Configuration Information You also need to perform the following configuration tasks, using standard Linux procedures: • Configuring the IP address for the bond0 and bond1 interfaces (static or DHCP, and gateway in /etc/sysconfig/network-scripts/ifcfg-bond0 and /etc/sysconfig/network-scripts/ifcfg-bond1) • Configuring DNS resolution for the system (/etc/nsswitch.conf and /etc/resolv.
Configuring the System Video Display
Your ftServer system's video is configured by default. There is normally no need to change the video display settings, and some parameters are strictly limited. For instance, the screen resolution is limited to 1024x768 pixels. However, it is possible, though not advisable, to change the video configuration.
Additional Documentation and Resources
Linux System Administrator’s Guide v0.8, Linux Documentation Project: http://www.ibiblio.org/pub/Linux/docs/linux-doc-project/system-admin-guide/
http://unthought.net/Software-RAID.HOWTO/
Managing RAID on Linux, Derek Vadala, O’Reilly & Associates, 2003: http://www.oreilly.
Chapter 6 Managing Data Storage Devices
This chapter discusses the following topics:
• “CD-ROM Drives”
• “USB Storage Devices”
• “Additional Resources”
Chapter 5 explains basic storage device definition and the configuration and management of the internal disk drives embedded in CPU-I/O enclosures. This chapter briefly discusses other data storage devices that are included with or can be optionally attached to the system.
USB Storage Devices If a device is plugged into a USB hub, the name has two numbers. For example, sd1.3usb is the name of the device attached to port 3 of a hub connected to port 1 of the root USB hub. If you add another hub to the chain, the device name would contain a third number. The udevinfo command translates the internal name into the name assigned by the udev command.
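The naming scheme described above can be decoded mechanically. The following is a minimal Python sketch, assuming that names always take the form sd, dot-separated port numbers, then usb, as in sd1usb and sd1.3usb. The pattern is inferred from those examples, not taken from the udev rules shipped with the system, and the function name usb_port_chain is hypothetical.

```python
import re

# Pattern for names such as "sd1usb" or "sd1.3usb": "sd", dot-separated
# port numbers, then "usb". Inferred from the examples in the text only.
_USB_NAME = re.compile(r"^sd(\d+(?:\.\d+)*)usb$")


def usb_port_chain(name):
    """Return the port numbers encoded in a name, root hub port first."""
    match = _USB_NAME.match(name)
    if match is None:
        raise ValueError(f"not a USB storage device name: {name!r}")
    return [int(part) for part in match.group(1).split(".")]
```

For sd1.3usb the function returns the ports in order from the root hub outward: port 1 on the root hub, then port 3 on the attached hub.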
USB Storage Devices Once you have the name assigned by the udev command (in this case, sd1usb), you can use udevinfo to find the internal name (in this case, sde): # udevinfo -q path -n /dev/sd1usb /block/sde For more information about the udevinfo command, see udevinfo(8). ! CAUTION Before unplugging the device, make sure that it is not being used (the usage count is 0). If a file system is mounted, unmount it (and make sure the umount command completes) before unplugging the device.
Most floppy disks and solid-state devices come with a virtual file allocation table (VFAT) file system. You can also create ext2 or other file systems on the device. You can mount them on a convenient mount point, for example:
# mkdir /mnt/floppy
# mount /dev/sdg1 /mnt/floppy
USB Floppy Drives
The USB floppy drive appears as follows in /proc/scsi/scsi.
The system log provides details about the device, including its size:
scsi5 : SCSI emulation for USB Mass Storage devices
Vendor: LEXAR Model: JUMPDRIVE SECURE Rev: 3000
Type: Direct-Access ANSI SCSI revision: 02
SCSI device sdaz: 506880 512-byte hdwr sectors (260 MB)
Additional Resources
Linux Allocated Devices, LANANA: http://www.lanana.org/docs/device-list/devices.
Chapter 7 Using ftServer Fault-Tolerant Utilities and Software
This chapter discusses the following topics:
• “The ftsmaint Command”
• “Kernel Memory Dump File Management”
The Express5800/ftServer System Software for the Linux Operating System (Express Builder) provides a special command interface, ftsmaint, for managing the fault-tolerant components of your ftServer system.
The ftsmaint Command The task arguments are as follows (See also ftsmaint(8)): • ftsmaint ls path This command displays the status of the hardware specified by the enumerated path. Specifying a path displays a detailed status of the hardware at that path. Omitting the path argument displays a less-detailed table of all fault-tolerant devices on the system. See ‘‘Device Path Enumeration’’ on page 7-5 for more information.
NOTE
The ftsmaint bringDown command does not permit you to bring down a simplex device, because doing so would disable the system.
• ftsmaint bringUp path
This command brings into service the CPU element, I/O element, or CPU-I/O enclosure slot specified by path. No other devices are supported.
• ftsmaint burnProm fw_file path
This command writes the firmware contained in the file fw_file to the EPROM devices on the ftServer device specified by path.
The ftsmaint Command NOTE Do not use this feature to retain a faulty or degraded device in service. It may be useful if the MTBF for a device has been degraded by testing or configuration error. • ftsmaint runDiag path This command starts diagnostics on the CPU element or I/O element specified by path. • ftsmaint setPriority level path This command sets the priority level of the CPU element specified by path to the value in the level argument.
The opstates for the sensors are as follows:
– FATAL: above uf or below lf
– CRITICAL: above uc or below lc
– WARNING: above unc or below lnc
– NORMAL: default
• ftsmaint -version
This command returns the build number of the ftsmaint command on your system. This number matches the build number of Express Builder installed on the system.
Device Path Enumeration
Some subsystems and components of the ftServer system are addressable by device path IDs.
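Device path IDs appear throughout this guide as slash-separated numeric levels (for example 10/40/1 or 1/20/130), and the trap output in Chapter 8 shows the same IDs with spaces instead of slashes. The following Python sketch shows one way to normalize both forms. The helper names are hypothetical and are not part of ftsmaint.

```python
def parse_device_path(path):
    """Split a device path ID such as "10/40/1" into a tuple of integers."""
    # Trap output shows the same IDs separated by spaces ("10 40 1"),
    # so both separators are accepted here.
    parts = path.replace("/", " ").split()
    if not parts or not all(p.isdigit() for p in parts):
        raise ValueError(f"invalid device path ID: {path!r}")
    return tuple(int(p) for p in parts)


def format_device_path(levels):
    """Join integer levels back into the slash form used by ftsmaint."""
    return "/".join(str(level) for level in levels)
```

Working with tuples rather than strings makes it straightforward to compare paths or to test whether one device sits below another in the enumeration.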
Table 7-1. Device Paths of ftServer Devices (Page 2 of 3)
Location                                Device Path
Bottom CPU element
Bottom CPU element                      1
DIMMs (addressed by slot)               1/0—1/7
Processors                              1/20, 1/23
CPU internal temperature sensor         1/20/130, 1/23/130
CPU 12v sensors                         1/20/150, 1/23/150
Ambient air temperature sensor          1/130
Fan sensors                             1/140, 1/141
Voltage sensors                         1/150—1/152
Top I/O element                         10
Mass storage controller
—EIDE controller                        10/0
—05:00.
Table 7-1. Device Paths of ftServer Devices (Page 3 of 3)
Location                                Device Path
Bottom I/O element
Fan speed sensor                        10/140
Voltage sensors                         10/150—10/162
Bottom I/O enclosure                    11
Mass storage controller
—EIDE controller                        11/0
—7c:00.0
SAS (SATA) controller
—SAS (SATA) controller                  11/1
—7c:01.0
USB controllers
—USB host controller                    11/2
—7c:02.0–7c:02.2
VGA controller
—Graphics controller                    11/3
—7c:03.0
Ethernet controller
—Ethernet card                          11/5
—7b:02.0, 7b:02.
The ftsmaint Command Figure 7-1 and Figure 7-2 show the locations of the major enumerated devices. Figure 7-1.
The ftsmaint Command Figure 7-2.
The ftsmaint Command ftsmaint Examples The following sections provide examples of how to use the ftsmaint command: • ‘‘Displaying System Status” • ‘‘Bringing System Components Down and Up” • ‘‘Removing a PCI Adapter From Service and Bringing It Into Service” Displaying System Status To display the status of the fault-tolerant devices and subsystems in your ftServer system, issue the following command: # ftsmaint ls Example 7-1.
The ftsmaint Command AA-G90730 AA-U57500 AA-U57500 AA-D64200 AA-D64300 AA-G90730 - 1/23 1/23/130 1/23/150 1/130 1/140 1/141 1/150 1/151 1/152 10 10/0 05:00.0 10/1 05:01.0 10/2 05:02.0 05:02.1 05:02.2 10/3 05:03.0 10/4 10/5 04:02.0 eth000010 04:02.1 eth000011 10/6 10/7 10/8 03:01.0 10/9 04:01.0 eth000008 04:01.1 eth000009 10/10 10/11 10/40 10/40/1 10/40/2 10/120 10/140 10/150 10/151 10/152 10/153 10/154 10/155 10/156 10/157 10/158 10/159 10/160 10/161 10/162 11 11/0 7c:00.
The ftsmaint Command AA-U57500 AA-U57500 AA-D64200 AA-D64300 - 11/1 7c:01.0 11/2 7c:02.0 7c:02.1 7c:02.2 11/3 7c:03.0 11/4 11/5 7b:02.0 eth080010 7b:02.1 eth080011 11/6 11/7 11/8 7a:01.0 11/9 7b:01.0 eth080008 7b:01.1 eth080009 11/10 11/11 11/40 11/40/1 11/40/2 11/120 11/140 11/150 11/151 11/152 11/153 11/154 11/155 11/156 11/157 11/158 11/159 11/160 11/161 11/162 Mass Storage Ctlr PCI/PCI-X SATA Ctlr Serial Bus Ctlrs USB 1.0 Host Ctlr USB 1.0 Host Ctlr USB 2.
Kernel Memory Dump File Management For example, the first command below brings down the bottom I/O element; the second command brings it back up: # /opt/ft/bin/ftsmaint bringDown 11 # /opt/ft/bin/ftsmaint bringUp 11 NOTE Before removing an essential component, like an I/O element, from service, first verify that its partner is running.
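To act on the note above, it helps to know which device path belongs to the partner of the component you intend to bring down. The following Python sketch assumes the top-level elements are paired as 0 with 1 (CPU elements) and 10 with 11 (I/O elements), and that a device in one element has its counterpart at the same sub-path in the other. Both points are assumptions drawn from the enumeration in this chapter, and partner_path is a hypothetical helper.

```python
# Top-level elements and their partners: CPU elements 0 and 1,
# I/O elements 10 and 11. Treating these as fixed pairs is an
# assumption made for illustration.
_PARTNERS = {"0": "1", "1": "0", "10": "11", "11": "10"}


def partner_path(path):
    """Return the path of the corresponding device in the partner element."""
    top, sep, rest = path.partition("/")
    if top not in _PARTNERS:
        raise ValueError(f"unknown top-level element in path: {path!r}")
    return _PARTNERS[top] + sep + rest
```

Before running ftsmaint bringDown 11, for example, you could pass the result of partner_path("11") to ftsmaint ls and confirm that the partner is online.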
Kernel Memory Dump File Management It is important that you monitor and maintain the size of the /var/crash directory. Back up old crash dump data before deleting it.
Chapter 8 Simple Network Management Using Net-SNMP and ftlSNMP
This chapter discusses the following topics:
• “Installing and Configuring ftlSNMP”
• “SNMP Foundations and Concepts”
• “Installing Remote Network Management Services”
• “Managing SNMP”
• “SNMP and MIBS”
• “SNMP Network Management Station Considerations”
• “Initial SNMP Testing”
• “Trap Filtering”
If you are reading this chapter for the first time, be sure you first read Release Notes: Express5800/ftServer for the Linux Operating S
Installing and Configuring ftlSNMP power supplies (UPS). Net-SNMP is a suite of applications used to implement SNMP v1, SNMP v2c, and SNMP v3 using both IPv4 and IPv6. This suite includes: – Various command-line applications for retrieving, manipulating, converting, and displaying information – A daemon application for receiving SNMP notifications – An extensible agent for responding to SNMP queries for management information – A library for developing new SNMP applications See www.net-snmp.
Installing and Configuring ftlSNMP The ftlSNMP package is preinstalled with the default Express Builder installation, and only requires configuration and deployment. Files in the ftlSNMP package are located in the following directories: • /etc/opt/ft/snmp—Contains the fault-tolerant subagent configuration templates and the Net-SNMP master agent configuration template. • /etc/opt/ft/snmp/scripts—Contains the start, stop, and restart scripts. • /opt/ft/doc/lsb-ft-snmp-4.0—Contains the README file.
Installing and Configuring ftlSNMP Enter the following commands, and optionally, add them to the login user’s profile (for example /etc/.bash_profile). # export MIBDIRS=/usr/share/snmp/mibs:/opt/ft/mibs # export MIBS=ALL This installs or upgrades the MIBs and subagents. ftlSNMP Prerequisites ftlSNMP requires Express Builder and the following Net-SNMP packages to be installed. Note that n.n.n.n represents the current supported Net-SNMP release number. • net-snmp-libs-n.n.n.n • net-snmp-n.n.n.
The snmpd.conf File
! CAUTION
Use SNMPv3 when the manager and master agent are separated by a public network. The following is an example only. Failure to use SNMPv3 when communicating over a public network is a server and network security risk.
SNMPv3 includes true authentication and encryption. The three security levels are noAuthNoPriv, authNoPriv, and authPriv. Note that encryption (priv) requires authentication (auth).
Installing and Configuring ftlSNMP With sraTraceLevel set to brief, data flows to and from ftlSNMP external items are traced. With sraTraceLevel set to verbose, internal items are also traced. You can also change the location of the log file. Agent and subagent startup and shutdown events are separately logged in syslog. With trace levels other than off, logs may grow rapidly (depending on the number of managed objects and their activity).
Run the following commands each time SNMP is restarted (or write a script to manage this task):
# snmpusm -v3 -u admin -n "" -l authNoPriv -a MD5 -A your_passwd localhost create v3user admin
# snmpusm -v3 -u v3user -n "" -l authNoPriv -a MD5 -A your_passwd localhost passwd old_passwd new_passwd
These commands clone an initial (template) SNMPv3 user, admin, as v3user, and then change the password of v3user.
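The text suggests scripting this task. The following Python sketch builds the two snmpusm invocations as argument lists, assuming the usual Net-SNMP ordering (common options, then the agent, then the subcommand) and assuming that a freshly cloned user still carries the template user's password. The function names and parameters are hypothetical.

```python
import subprocess


def snmpusm_commands(template_user, new_user, template_passwd, new_passwd,
                     agent="localhost"):
    """Build the clone and password-change commands as argument lists."""
    def common(user):
        return ["snmpusm", "-v3", "-u", user, "-n", "", "-l", "authNoPriv",
                "-a", "MD5", "-A", template_passwd, agent]

    clone = common(template_user) + ["create", new_user, template_user]
    # Assumption: the clone inherits the template user's password, so that
    # password is both the -A value and the old passphrase here.
    change = common(new_user) + ["passwd", template_passwd, new_passwd]
    return [clone, change]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Calling run_all(snmpusm_commands("admin", "v3user", "your_passwd", "new_passwd")) after each SNMP restart would reproduce the two manual steps. Passing argument lists rather than a shell string avoids quoting problems with the empty context name.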
3. Edit /etc/opt/ft/snmp/snmpd.conf and add the user to the VACM, either using an existing group and view or creating new ones. The following example lines add a new user, paul, to the current view and group in snmpd.conf by inserting the (highlighted) line for paul:
group v3group group v3group group v3group view v3view access v3group v3view v3view usm usm usm included "" usm admin v3user paul .1.3.6.
SNMP Foundations and Concepts Net-SNMP provides a functional network administration package for use on ftServer systems to meet identified customer needs. ftlSNMP is a unique extension of Net-SNMP that provides the SRA-ftLinux-MIB to define manageable systems and components of ftServer Linux-based systems. ftServer subagents and MIB provide SNMP support and services for fault-tolerant operations.
SNMP Foundations and Concepts The Basic Net-SNMP Commands These tools provide a basic set of features for exercising and managing objects using a standard command syntax and core functionality: NOTE Although these commands are documented as user commands (man (1)), you should treat SNMP utilities as the administrative tools they are, and closely limit privileges to execute these commands.
SNMP Foundations and Concepts • snmptranslate—This command converts object ID values into more easily understood forms. See snmptranslate(1). • snmptable—This command repeatedly uses SNMP GETNEXT or GETBULK requests to get information on a network entity, which is specified as, and must be mapped by, a table. See snmptable(1). • snmpset—This command uses the SNMP SET request to control, or set information on, a network entity. See snmpset(1).
SNMP Foundations and Concepts NOTE The Express Builder installation automatically creates the SRA-ftLinux-MIB file in the /opt/ft/mibs directory, while Net-SNMP creates its MIBs in the /usr/share/snmp/mibs directory. MIBs can be stored in a variety of locations, but running SNMP agents must still be directed to the location of a MIB the first time it is to be used, if the MIB is added after the agents have already started.
SNMP Foundations and Concepts For UDP or TCP/IP communications and collection of statistical data about communications and communications channels, MIB-II defines some necessary objects. MIB-II defines these objects for querying: system interfaces at ip icmp tcp udp snmp The Net-SNMP implementation requires basic support of the Host Resources MIB.
SNMPv3 Support
SNMPv3 support includes implementation of IETF RFCs 3410 through 3418. SNMPv3, the third version of the Simple Network Management Protocol, is presented by the IETF as the Internet Standard Management Framework (RFC 3410). It incorporates elements of SNMPv1 and SNMPv2 and shares the same basic modular architecture.
SNMP, applications developed for any SNMP implementation tend to be easily adaptable and useful with other SNMP implementations. Conceptually, every managed object on a network is uniquely identifiable. SNMP uses ISO Abstract Syntax Notation One (ASN.1) to place every SNMP object within the Internet hierarchy of managed objects. Each of these unique managed objects can be managed through the characteristics defined for it in the MIB.
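Because every object has a place in this hierarchy, deciding whether an object belongs to a given MIB reduces to a prefix comparison on its OID. The following Python sketch illustrates the idea using the subtree 1.3.6.1.4.1.458, which the snmpwalk examples in this chapter use for the SRA-ftLinux-MIB. The helper names are hypothetical.

```python
# Enterprise subtree walked in this chapter's examples.
SRA_SUBTREE = "1.3.6.1.4.1.458"


def oid_to_tuple(oid):
    """Convert a dotted OID string to a tuple of integers.

    A leading dot is ignored and a leading "iso" label is read as 1.
    """
    parts = oid.strip(".").split(".")
    if parts[0] == "iso":
        parts[0] = "1"
    return tuple(int(p) for p in parts)


def is_under(oid, subtree=SRA_SUBTREE):
    """Return True if oid lies in the given subtree (or is its root)."""
    node, root = oid_to_tuple(oid), oid_to_tuple(subtree)
    return node[:len(root)] == root
```

Comparing integer tuples rather than raw strings avoids false matches such as 1.3.6.1.4.1.4580, which shares a textual prefix with the subtree but lies outside it.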
Installing Remote Network Management Services for extensible SNMP agents, and then defines master agents and subagents as processing daemons. An AgentX protocol is defined for communication between an AgentX-capable master agent and subagents. RFC2741 also defines elements of procedure for an AgentX daemon to process SNMP protocol messages. Traditional CMU SNMP management utilities are modestly refined and enhanced in Net-SNMP.
Installing Remote Network Management Services Configuring SNMP for Remote Service Management The procedure for configuring Net-SNMP is very similar to ‘‘Configuring SNMP for Service Management’’ on page 8-134, which describes enabling remote services by adding SNMP users and groups. If you are using a network management station, you may have some other procedure provided with your software.
Managing SNMP Verifying Traps You can easily verify traps using snmptrapd on a remote Linux system with Net-SNMP installed. 1. On the remote Linux system, set up Net-SNMP to autostart, and verify it using the chkconfig command, or manually start Net-SNMP. 2. On the ftServer system with the Linux operating system, Express Builder, Net-SNMP, and ftlSNMP installed, configure /etc/opt/ft/snmp/snmpd.
Managing SNMP instance, PCI adapter device names—may differ from what is applicable to your system. Testing Your SNMP Configuration The following are some Net-SNMP commands that you can use to test or exercise MIBs. If you run these remotely, the target name and IP address will differ. To walk the SRA-ftLinux-MIB file: # snmpwalk -v 1 -c public -t 120 localhost 1.3.6.1.4.1.458 # snmpwalk -v 2c -c public localhost 1.3.6.1.4.1.
To use SNMPv3 with snmpwalk:
# snmpwalk -v 3 -l authNoPriv -u v3user -A new_passwd localhost ucdavis
# snmpwalk -v 3 -l authNoPriv -u v3user -A new_passwd localhost system
# snmpwalk -v 3 -l authNoPriv -u v3user -A new_passwd localhost 1.3.6.1.4.1.458
# snmpwalk -v 3 -t 40 -l authNoPriv -u v3user -A new_passwd localhost 1.3.6.1.4.1.458
In these command examples, v3user and new_passwd are the user name and password set up in “Configuring SNMP for Service Management” on page 8-134.
To control firmware burn (FWBURN):
# snmpset -v 3 -t 40 -l authNoPriv -u v3user -A new_passwd localhost 1.3.6.1.4.1.458.107.1.2.1.2.3.1.15.1 s FWBURN
Example: Managing Hardware
In this example, only relevant portions of the ftsmaint command output are shown. The following example illustrates bringing a CPU-I/O enclosure down and then back up.
Managing SNMP To check that the CPU-I/O enclosure status has changed: # /opt/ft/bin/ftsmaint ls 0 H/W Path Description State Op State Reason LED State ... : : : : : : 0 Combined CPU/IO OFFLINE REMOVED_FROM_SERVICE OK_FOR_BRINGUP RED To bring CPU element 0 back up by invoking the ftcCpubdInitiateBringUp command, use the numeric OID (see ‘‘SRA-ftLinux-MIB OID Values and Properties’’ on page 8-160) for that command (1.3.6.1.4.1.458.107.1.2.1.2.3.1.11), again followed by CPU element 0’s index: # .
Testing Ethernet Ports
You can test Ethernet ports for proper traps and changes to OIDs. On an ftServer system running a supported Linux distribution together with Express Builder, Ethernet ports are uniquely identified. When you test cable pulls or bringdowns, the system should generate traps, and the data returned by Express5800/ftServer MIB objects should reflect these changes.
The instances of interest are the following:
Instance                  Instance Name (I/O element / Slot)   Device
ftcEtherInstanceName.3    10/5                                 eth080010
ftcEtherInstanceName.4    10/5                                 eth080011
ftcEtherInstanceName.5    11/5                                 eth000010
ftcEtherInstanceName.6    11/5                                 eth000011
These are the instances to check when pulling cables. Watch for state changes (DUPLEX, SIMPLEX, BROKEN) and for changes in counters such as frames and collisions.
SNMP and MIBS Device Enumeration See Table 7-1 for information on the enumeration of hardware components for ftServer systems running a supported Linux distribution together with Express Builder. ftServer System Operation State Management Figure 8-2 illustrates the operational states and state changes in an ftServer system. Figure 8-2.
SNMP Network Management Station Considerations bringing the device into an Online state for fault-tolerant operations. A partnered device on an ftServer system typically reaches a Simplex state (if its partner is missing or not functioning) or a Duplex state. The interpretation of Duplex depends on the individual device type, as shown in Table 8-1. Table 8-1.
NOTES
1. Net-SNMP and ftlSNMP do not require an SNMP NMS, and the package does not provide one. Choosing, installing, and configuring an SNMP NMS is your responsibility.
2. The SRA-ftLinux-MIB file is useful only for managing ftServer Linux-based systems.
Initial SNMP Testing
On a system with an SNMP-aware NMS, you start the NMS before starting SNMP servers.
Initial SNMP Testing NOTES 1. Do not use this procedure on a deployed network host. 2. Before continuing, read ftsmaint(8) for information on single-digit device path IDs, and ‘‘ftServer System Device Path Enumeration’’ on page 7-5 if you have not already done so. Select an enclosure that can be safely brought down.
Initial SNMP Testing Use the following command to bring the CPU-I/O enclosure up again: # /opt/ft/bin/ftsmaint bringUp 11 # /opt/ft/bin/ftsmaint bringUp 1 Initial Testing of ftlsubagent Use the snmpwalk tool to perform a get next operation on a system where an SNMP master agent is running. See snmpwalk(1). For example, for the ftcPcidevcnf table: # ./snmpwalk -Os -c public -v 1 -t 40 localhost 1.3.6.1.4.1.458.107.1.2.5.2.1 . . . ftcPcidevcnfMasterDataParityError iso.3.6.1.4.1.458.107.1.2.5.2.1.14.0 iso.3.
OpState:State Definitions
Table 8-2 lists operation state (OpState) names, SRA-ftLinux-MIB codes, and definitions for ftServer systems running a supported Linux distribution together with Express Builder.
Table 8-2. Operation State Values, Names, and Definitions (Page 1 of 2)
Value  Operation State (OpState)  Definition
1  UNKNOWN  The state of a component could not be determined.
Table 8-2. Operation State Values, Names, and Definitions (Page 2 of 2)
Value  Operation State (OpState)  Definition
19  ONLINE  The unit can be communicated with.
20  SIMPLEX  A component is online and has no partner; it is not safe to remove this component. Applies to components that can be partnered.
21  DUPLEX  The component is online and has a partner component that is running in lockstep, mirrored, or available for failover (depending on the type of component).
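When a management script reads OpState values over SNMP, a small lookup table makes the numbers readable. The following Python sketch includes only the values visible in this part of Table 8-2 (1, 19, 20, and 21). The remaining states are deliberately left out rather than guessed, and the helper names are hypothetical.

```python
# Only the values visible in this part of Table 8-2 are listed; the
# full table defines more states.
OPSTATE_NAMES = {
    1: "UNKNOWN",
    19: "ONLINE",
    20: "SIMPLEX",
    21: "DUPLEX",
}


def opstate_name(value):
    """Return the OpState name for a numeric value, or a placeholder."""
    return OPSTATE_NAMES.get(value, f"UNLISTED({value})")


def has_running_partner(value):
    """True only for DUPLEX, the state defined as having a partner."""
    return opstate_name(value) == "DUPLEX"
```

A script could call has_running_partner on a component's state before attempting to remove that component from service, because a SIMPLEX component is not safe to remove.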
Table 8-3. Reason Codes, Names, and Definitions (Page 2 of 2)
Code  Reason  Definition
13  FIRMWARE_PROM_ERROR  Could not write to the firmware PROM.
14  AUTOBURN_DISABLED  Cannot match a new enclosure’s BIOS or firmware with that of the existing enclosure.
16  PRIMARY  With duplex devices, this indicates that the specific device is primary in the pair.
17  SECONDARY  With duplex devices, this indicates that the specific device is secondary in the pair.
Trap Filtering Trap Filtering This section discusses the following topics: • ‘‘Trap-Filtering Capability” • ‘‘Activating and Deactivating Trap Filtering” • ‘‘Trap-Filtering Examples” Trap-Filtering Capability ftlSNMP provides the ability to filter out transitional traps. Traps are messages that inform you about network events. Hardware components that go in and out of service trigger a number of traps that are seen at the management client.
Trap Filtering Traps with the following reason codes are also filtered out: • PARENT_EMPTY • PARENT_BROKEN To deactivate the trap-filtering capability, change the above configuration line as follows: sraTrapFiltering off By default, trap filtering is turned off (that is, sraTrapFiltering is set to off in the configuration file). Trap-Filtering Examples Example 8-1 shows some traps that can occur when I/O element 11 is brought down and trap filtering is off.
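The reason-code rule above can be illustrated on the receiving side. The following Python sketch parses the quoted STRING variable bindings from trap text in the format shown in this section and reports whether the detail field carries one of the two filtered reason codes. It is an illustration only: it does not show how ftlSNMP implements sraTrapFiltering, it covers only the reason-code rule, and the function names are hypothetical.

```python
import re

# Reason codes that the text says are filtered out.
FILTERED_REASONS = {"PARENT_EMPTY", "PARENT_BROKEN"}

# Matches one quoted binding such as
#   SRA-ftLinux-MIB::ftcTrapGenName.0 = STRING: "OFFLINE"
_VARBIND = re.compile(r'::(\w+)\.0 = STRING: "([^"]*)"')


def parse_trap(text):
    """Return a dict mapping object names (without MIB prefix) to values."""
    return dict(_VARBIND.findall(text))


def is_filtered(trap):
    """True if the trap's detail field carries a filtered reason code."""
    return trap.get("ftcTrapGenDetailInfo") in FILTERED_REASONS
```

Applied to logged trap text, parse_trap followed by is_filtered gives a rough idea of which traps would disappear from the management client once filtering is turned on.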
RFC1213-MIB::sysUpTime.0 = Timeticks: (12030) 0:02:00.30
SNMPv2-MIB::snmpTrapOID.0 = OID: SRA-ftLinux-MIB::ftcTrapGenericInformationTrap.0
SRA-ftLinux-MIB::ftcTrapDevicePathID.0 = STRING: "10 40 1"
SRA-ftLinux-MIB::ftcTrapAlertType.0 = STRING: "OPSTATE_CHANGE"
SRA-ftLinux-MIB::ftcTrapGenName.0 = STRING: "SIMPLEX"
SRA-ftLinux-MIB::ftcTrapGenDetailInfo.0 = STRING: "NONE"
SRA-ftLinux-MIB::ftcTrapGenAction.0 = STRING: "UNKNOWN"
SRA-ftLinux-MIB::ftcTrapGenEventTimeStampWithOffsetFromUTC.
SRA-ftLinux-MIB::ftcTrapAlertType.0 = STRING: "OPSTATE_CHANGE"
SRA-ftLinux-MIB::ftcTrapGenName.0 = STRING: "OFFLINE"
SRA-ftLinux-MIB::ftcTrapGenDetailInfo.0 = STRING: "PARENT_EMPTY"
SRA-ftLinux-MIB::ftcTrapGenAction.0 = STRING: "UNKNOWN"
SRA-ftLinux-MIB::ftcTrapGenEventTimeStampWithOffsetFromUTC.0 = STRING:

RFC1213-MIB::sysUpTime.0 = Timeticks: (12535) 0:02:05.35
SNMPv2-MIB::snmpTrapOID.0 = OID: SRA-ftLinux-MIB::ftcTrapGenericInformationTrap.0
SRA-ftLinux-MIB::ftcTrapDevicePathID.
SRA-ftLinux-MIB::ftcTrapGenEventTimeStampWithOffsetFromUTC.0 = STRING: "20051206135032.759280-300"

RFC1213-MIB::sysUpTime.0 = Timeticks: (18744) 0:03:07.44
SNMPv2-MIB::snmpTrapOID.0 = OID: SRA-ftLinux-MIB::ftcTrapGenericInformationTrap.0
SRA-ftLinux-MIB::ftcTrapDevicePathID.0 = STRING: "11 6"
SRA-ftLinux-MIB::ftcTrapAlertType.0 = STRING: "OPSTATE_CHANGE"
SRA-ftLinux-MIB::ftcTrapGenName.0 = STRING: "OFFLINE"
SRA-ftLinux-MIB::ftcTrapGenDetailInfo.
RFC1213-MIB::sysUpTime.0 = Timeticks: (26444) 0:04:24.44
SNMPv2-MIB::snmpTrapOID.0 = OID: SRA-ftLinux-MIB::ftcTrapGenericInformationTrap.0
SRA-ftLinux-MIB::ftcTrapDevicePathID.0 = STRING: "11 120"
SRA-ftLinux-MIB::ftcTrapAlertType.0 = STRING: "OPSTATE_CHANGE"
SRA-ftLinux-MIB::ftcTrapGenName.0 = STRING: "OFFLINE"
SRA-ftLinux-MIB::ftcTrapGenDetailInfo.0 = STRING: "NONE"
SRA-ftLinux-MIB::ftcTrapGenAction.0 = STRING: "UNKNOWN"
SRA-ftLinux-MIB::ftcTrapGenEventTimeStampWithOffsetFromUTC.
SRA-ftLinux-MIB::ftcTrapGenAction.0 = STRING: "UNKNOWN"
SRA-ftLinux-MIB::ftcTrapGenEventTimeStampWithOffsetFromUTC.0 = STRING: "20051206141315.508290-300"

RFC1213-MIB::sysUpTime.0 = Timeticks: (5746) 0:00:57.46
SNMPv2-MIB::snmpTrapOID.0 = OID: SRA-ftLinux-MIB::ftcTrapGenericInformationTrap.0
SRA-ftLinux-MIB::ftcTrapDevicePathID.0 = STRING: "10 40 1"
SRA-ftLinux-MIB::ftcTrapAlertType.0 = STRING: "OPSTATE_CHANGE"
SRA-ftLinux-MIB::ftcTrapGenName.
RFC1213-MIB::sysUpTime.0 = Timeticks: (6845) 0:01:08.45
SNMPv2-MIB::snmpTrapOID.0 = OID: SRA-ftLinux-MIB::ftcTrapGenericInformationTrap.0
SRA-ftLinux-MIB::ftcTrapDevicePathID.0 = STRING: "10 120"
SRA-ftLinux-MIB::ftcTrapAlertType.0 = STRING: "OPSTATE_CHANGE"
SRA-ftLinux-MIB::ftcTrapGenName.0 = STRING: "SIMPLEX"
SRA-ftLinux-MIB::ftcTrapGenDetailInfo.0 = STRING: "PRIMARY"
SRA-ftLinux-MIB::ftcTrapGenAction.0 = STRING: "UNKNOWN"
SRA-ftLinux-MIB::ftcTrapGenEventTimeStampWithOffsetFromUTC.
SRA-ftLinux-MIB::ftcTrapGenAction.0 = STRING: "UNKNOWN"
SRA-ftLinux-MIB::ftcTrapGenEventTimeStampWithOffsetFromUTC.0 = STRING: "20051207143640.794126-300"

RFC1213-MIB::sysUpTime.0 = Timeticks: (466169) 1:17:41.69
SNMPv2-MIB::snmpTrapOID.0 = OID: SRA-ftLinux-MIB::ftcTrapGenericInformationTrap.0
SRA-ftLinux-MIB::ftcTrapDevicePathID.0 = STRING: "2"
SRA-ftLinux-MIB::ftcTrapAlertType.0 = STRING: "OPSTATE_CHANGE"
SRA-ftLinux-MIB::ftcTrapGenName.
SRA-ftLinux-MIB::ftcTrapGenName.0 = STRING: "DUPLEX"
SRA-ftLinux-MIB::ftcTrapGenDetailInfo.0 = STRING: "PRIMARY"
SRA-ftLinux-MIB::ftcTrapGenAction.0 = STRING: "UNKNOWN"
SRA-ftLinux-MIB::ftcTrapGenEventTimeStampWithOffsetFromUTC.0 = STRING: "20051207144108.553784-300"

RFC1213-MIB::sysUpTime.0 = Timeticks: (10511) 0:01:45.11
SNMPv2-MIB::snmpTrapOID.0 = OID: SRA-ftLinux-MIB::ftcTrapGenericInformationTrap.0
SRA-ftLinux-MIB::ftcTrapDevicePathID.0 = STRING: "2"
SRA-ftLinux-MIB::ftcTrapAlertType.
Chapter 9 Troubleshooting ftServer Systems
This chapter discusses the following topics:
• “LED and Visual Diagnostics”
• “System Log Messages”
This chapter provides information that will help you use available ftServer system and Linux operating system features to diagnose system problems. In many cases, you will be able to identify the source of the problem.
System Boot Problems Normal Boot Sequence The active CPU-I/O enclosure (that is, the primary enclosure whose power switch is lit green) initiates the boot by starting the BIOS. The BIOS on the booting enclosure scans the list of bootable devices (as configured in the BIOS) looking for a device to boot. When the search finds the disks, they are analyzed from bottom to top in the boot enclosure. When a disk with a boot partition is found, it is booted.
System Boot Problems NOTE If a RAID array fails to start, the boot stops and enters a debug shell. This is almost always because of a configuration error in /etc/fstab or /etc/mdadm.conf. Exiting the shell forces a reboot. Depending on your system’s RAID configuration, you may see one or more error messages similar to the following: md: could not bd_claim sdar1 md: error, md_import_device() returned -16 These messages indicate that md is refusing to start an array that has already been started.
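The value -16 in the second message is a negated Linux error number, and error number 16 is EBUSY (device or resource busy), which fits the explanation that the array has already been started. The following Python sketch extracts the code from such a message and maps it to its symbolic name using the standard errno module. The function name is hypothetical.

```python
import errno
import re

_MD_IMPORT_ERROR = re.compile(r"md_import_device\(\) returned (-?\d+)")


def md_import_errno(line):
    """Return the symbolic errno name from an md_import_device() message,
    or None if the line is not such a message."""
    match = _MD_IMPORT_ERROR.search(line)
    if match is None:
        return None
    # The kernel reports errors as negative errno values.
    code = abs(int(match.group(1)))
    return errno.errorcode.get(code, f"errno {code}")
```

Running this over the boot log helps separate the harmless already-started case from other return codes that may point to a real configuration problem.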
System Boot Problems In the output above, reason is one of the following: • ERROR building Stratus kernel objects -- see logfile • ERROR: missing Stratus kernel objects -- see logfile • ERROR: incorporating Stratus kernel objects -- see logfile In the output above, logfile is the name of a file that contains relevant details. To override the system’s fault-tolerant policy and allow the system to boot to a non-fault-tolerant state, at the console, type NON-FT-BOOT and press ENTER.
System Boot Problems ! CAUTION In particular, specifying the GRUB noapic option can make the operating system unbootable. RAID Problem If a RAID-1 array has one type 0xfd (Linux RAID autodetect) mirror and one 0x83 (Linux) mirror, at boot, the RAID array is started in degraded mode using the type 0xfd mirror, and the type 0x83 mirror is not automatically added. You can add the mirror with mdadm. To fix this problem, just change the partition type with fdisk.
System Log Messages ! CAUTION Boot monitoring is one of the fault-tolerant features of your ftServer system. You must reenable it for full fault tolerance. System Log Messages System log messages contain information on the operation state of the system. The file /var/log/messages contains system log messages. You can find logs that are specific to ftServer systems in the directory /var/opt/ft/log.