AlphaServer DS20 Service Manual Order Number: EK–AS140–SV. A01 This manual is for anyone who services this system. It includes troubleshooting information, configuration rules, and instructions for removal and replacement of field-replaceable units.
Notice The information in this publication is subject to change without notice. COMPAQ COMPUTER CORPORATION SHALL NOT BE LIABLE FOR TECHNICAL OR EDITORIAL ERRORS OR OMISSIONS CONTAINED HEREIN, NOR FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES RESULTING FROM THE FURNISHING, PERFORMANCE, OR USE OF THIS MATERIAL. This publication contains information protected by copyright. No part of this publication may be photocopied or reproduced in any form without prior written consent from Compaq Computer Corporation.
Contents
Preface
Chapter 1 System Overview
1.1 System Enclosure
1.2 Operator Control Panel and Drives
1.3 System Consoles
Chapter 3
3.1 Ibox Status Register - I_STAT
3.2 Memory Management Status Register – MM_STAT
3.3 Dcache Status Register – DC_STAT
3.4 Cbox Read Register
3.5 Miscellaneous Register (MISC)
Appendix B
B.1 RCM Overview
B.2 First-Time Setup
B.2.1 Using RCM Locally or with a Modem on COM1
B.3 RCM Commands
B.4 Using the RCM Switchpack
Figures
1-1 System Enclosure
1-2 Cover Interlock Circuit
1-3 Control Panel Assembly
4-19 Removing StorageWorks UltraSCSI Bus Extender
B-1 RCM Connections
B-2 Location of RCM Switchpack on Server Feature Module
C–1 Starting LFU from the AlphaBIOS Console
Preface Intended Audience This manual is written for the customer service engineer. Document Structure This manual uses a structured documentation design. Topics are organized into small sections for efficient online and printed reference. Each topic begins with an abstract, followed by an illustration or example, and ends with descriptive text. This manual has four chapters and three appendixes, as follows: • Chapter 1, System Overview, introduces the Compaq AlphaServer DS20 system.
Documentation Titles Table 1 lists books in the documentation set.
Table 1 AlphaServer DS20 Documentation
Title                                      Order Number
User and Installation Documentation Kit    QZ–014AA–G8
  User’s Guide                             EK–AS140–UG
  Basic Installation                       EK–AS140–IN
Service Information
  Service Manual                           EK–AS140–SV
Information on the Internet Using a Web browser you can access the AlphaServer InfoCenter at: http://www.digital.com/info/alphaserver/products.
Chapter 1 System Overview The Compaq AlphaServer DS20 system consists of up to two CPUs, up to 4 Gbytes of memory, 6 I/O slots, and up to 7 SCSI storage devices. AlphaServer DS20 systems can be mounted in a standard 19” rack. AlphaServer DS20 systems support OpenVMS, Compaq Tru64 UNIX, Windows NT, and Linux.
1.1 System Enclosure The system has up to two CPU modules and up to 4 Gbytes of memory. A single fast wide UltraSCSI StorageWorks shelf provides up to 128 Gbytes of storage.
The numbered callouts in Figure 1-1 refer to the system components. ➊ System card cage, which holds the system board and the CPU, memory, and system I/O. ➋ PCI/ISA section of the system card cage. ➌ Operator control panel assembly, which includes the control panel, the LCD display, and the floppy drive. ➍ CD-ROM drive. ➎ Cooling section containing two fans and the server feature module. ➏ StorageWorks shelf. Cover Interlock The system has a single cover interlock switch tripped by the top cover.
1.2 Operator Control Panel and Drives The control panel includes the On/Off, Halt, and Reset buttons and an LCD display.
Figure 1-3 Control Panel Assembly (labels: CD-ROM, Floppy, OCP Display; callouts 1 to 3)
OCP display. The OCP display is a 16-character LCD that indicates status during power-up and self-test. While the operating system is running, the LCD displays the system type. Its controller is on the XBUS.
CD-ROM. The CD-ROM drive is used to load software, firmware, and updates.
➊ On/Off button. Powers the system on or off. When the LED to the right of the button is lit, the power is on. The On/Off button is connected to the power supplies through the system interlock and the RCM logic. ➋ Reset button. Initializes the system. ➌ Halt button. The effect of pressing the Halt button depends on the state of the machine. The major function of the Halt button is to stop whatever the machine is doing and return the system to the SRM console.
1.3 System Consoles There are two console programs: the SRM console and the AlphaBIOS console. SRM Console Prompt On systems running the Compaq Tru64 UNIX or OpenVMS operating system, the following console prompt is displayed after system startup messages are displayed, or whenever the SRM console is invoked: P00>>> NOTE: The console prompt displays only after the entire power-up sequence is complete. This can take up to several minutes if the memory is very large.
SRM Console The SRM console is a command-line interface used to boot the Compaq Tru64 UNIX and OpenVMS operating systems. It also provides support for examining and modifying the system state and configuring and testing the system. The SRM console can be run from a serial terminal or a graphics monitor. AlphaBIOS Console The AlphaBIOS console is a menu-based interface that supports the Microsoft Windows NT operating system.
1.4 System Architecture An Alpha microprocessor chip is used in this system. The CPU, memory, and the I/O modules are physically connected to the system board and logically connected through a switch-based interconnect implemented in a cross-bar switch chipset.
Figure 1-4 Block Diagram (labels: command, address, and control lines for each memory array; C-chip; control lines for D-chips; probe/address; CAPbus; two P-chips, each with a 64-bit PCI bus; command/address)
The AlphaServer DS20 is a switch-based interconnect system; it uses a cross-bar switch chipset that allows data to move directly from place to place in the system. The CPU, memory, and I/O devices physically connect to the system board and each has one or two logical connections to the switch. The arrows on the block diagram shown in Figure 1-4 indicate the flow of data, command/address, and control signals.
1.5 CPU Types There is a single CPU variant.
Alpha Chip Composition The Alpha 21264 chip uses 0.35 micron chip technology, has a transistor count of 15.2 million, consumes 50 watts of power, and is air cooled (a fan is on the chip). The default cache system is write-back.
Chip Description
Unit          Description
Instruction   64-Kbyte I-cache
Execution     4-way execution; four integer units, two of which can perform memory address calculations for load and store instructions; dedicated units for floating-point add, multiply, divide, and square root operations.
1.6 Memory Memory consists of up to four memory options, each consisting of four DIMMs. There are four option variants: 128 Mbytes, 256 Mbytes, 512 Mbytes, and 1 Gbyte.
Memory Variants Memory is organized on two 256-bit plus ECC buses. Each bus can hold up to two memory banks (a memory option) made up of four DIMM modules. Memory can be configured from a minimum of 128 Mbytes (1 MS340-BA) to 4 Gbytes (4 MS340-EA). All memory is synchronous.
Option     Size     Module        DRAM Type   Number/option
MS340-BA   128 MB   54-25066-BA   Synch.      4
MS340-CA   256 MB   54-25053-BA   Synch.      4
MS340-DA   512 MB   54-25941-KA   Synch.      4
MS340-EA   1 GB     54-25941-BA   Synch.      4
1.7 Memory Addressing and Data Location Memory addressing is contiguous beginning with memory bank 0. The first address of each bank is one above the ending address of the previous bank. Data is located in DIMMs as described by Figure 1-7.
Memory Addressing The first address of each bank is one above the ending address of the previous bank. Example 1–1 and Figure 1-8 show the starting address of each memory bank using either the SRM console or AlphaBIOS.
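The contiguous addressing rule can be sketched in a few lines of Python. This is an illustration only, not console code: the function name, the example bank sizes, and the assumption that bank 0 starts at address 0 are not taken from the manual.

```python
MB = 1024 * 1024

def bank_start_addresses(bank_sizes_bytes):
    """Return the starting address of each bank, in bank order.

    Each bank starts one above the last address of the previous bank.
    Assumption: bank 0 starts at address 0.
    """
    starts = []
    next_address = 0
    for size in bank_sizes_bytes:
        starts.append(next_address)
        next_address += size
    return starts

# Hypothetical configuration: a 512-Mbyte option followed by a 128-Mbyte option.
for bank, start in enumerate(bank_start_addresses([512 * MB, 128 * MB])):
    print(f"bank {bank}: starts at 0x{start:08X}")
```

On a real system, the starting addresses reported by the SRM console or AlphaBIOS are authoritative; the sketch only shows how the rule composes.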
1.8 System Board The system board contains three major logic sections performing three major system functions.
Three major sections on the system board are: • The cross-bar switch chipset and the system components attached to it (CPU(s), memory, PCI chips, and the TIG bus) • The power connections and voltage regulator • The I/O subsystem
1.8.1 Cross-Bar Switch and System Components The cross-bar switch chipset consists of a single control chip, the C-chip, and eight data chips, the D-chips. Into and out of the D-chips are two system buses to CPUs, two PAD buses to PCI chips, and two memory data buses that connect to up to four memory banks.
Each type of bus in the system is unique: • The two memory data buses operate in 256-bit mode passing a hexword (32 bytes) of data between memory and the D-chips per cycle. The bus operates at 83.3 MHz. • The two CPU data buses operate in “64-bit mode” passing a quadword (8 bytes) of data between CPU and the D-chips per cycle. Though the CPU data bus is narrower than the memory data bus, it operates at four times the speed of the memory data bus at 333 MHz.
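A quick calculation shows why the narrower CPU data bus keeps pace with the wider memory data bus. The sketch below multiplies the bytes moved per cycle by the clock rate quoted above; the function name is illustrative, and the result is a peak figure that ignores protocol overhead.

```python
def peak_bandwidth_mb_per_s(bytes_per_cycle, clock_mhz):
    """Peak transfer rate in millions of bytes per second."""
    return bytes_per_cycle * clock_mhz

# Figures quoted in the text: 32 bytes at 83.3 MHz, 8 bytes at 333 MHz.
memory_bus = peak_bandwidth_mb_per_s(32, 83.3)
cpu_bus = peak_bandwidth_mb_per_s(8, 333.0)

print(f"memory data bus: {memory_bus:.0f} MB/s per bus")
print(f"CPU data bus:    {cpu_bus:.0f} MB/s per bus")
```

The two results are nearly equal: a bus one quarter as wide, clocked four times as fast, moves the same amount of data.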
1.8.2 I/O Subsystem The I/O subsystem consists of two 64-bit PCI buses. One has an embedded ISA bridge, three PCI option slots, and a single ISA slot; the other bus has three PCI option slots.
Table 1-1 PCI Slot Numbering Slot PCI0 PCI1 5 PCI to ISA bridge 6 Adaptec SCSI 7 PCI slot PCI slot 8 PCI slot PCI slot 9 PCI slot PCI slot ISA Shared ISA device logically ISA device physically The logic for two PCI buses is on the system board. • PCI0 is a 64-bit bus with three PCI slots, a Cypress chip, and an Adaptec SCSI controller. The Cypress chip is the PCI to ISA bus bridge and controls the following: the keyboard, mouse, IDE bus, real-time clock, and the USB bus.
1.8.3 System Board Switchpacks There are two switchpacks on the system board. They control the writing of the flash ROM and the speed of the crossbar switch among other things.
Figure 1-12 shows the location of the switchpacks and Table 1-2 and Table 1-3 describe what each switch controls. Table 1-2 Switchpack 2 Switch 1 2 3 4 5 6 7 8 Description Fail safe boot. Off (default) = normal boot. On = boot the fail safe booter Reserved. Must be off. Reserved. Must be off. Reserved. Must be off. Switches 5, 6, and 7 create a field that defines the speed at which the cross bar switch runs. Switches 5 and 6 are on and switch 7 is off.
1.9 Server Feature Module The server feature module provides remote control operation of the system. A four-switch switchpack enables or disables remote control features.
The system allows both local and remote control. The remote control firmware and a set of switches that enable or disable remote control features reside on the server feature module.
1.9.1 Power Control Logic The power control logic is on the server feature module.
The power control logic performs these functions: • Monitors system temperature and powers down the system 30 seconds after it detects that internal temperature of the system is above the value of the environment variable over_temp. Default = 55°C. • Monitors the system and CPU fans and powers down the system 30 seconds after it detects a fan failure. • Provides some visual indication of faults through LEDs.
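The shutdown rules listed above can be modeled as a small decision function. This is a sketch for reasoning about symptoms, not the firmware's implementation; the function and parameter names are hypothetical, and only the 30-second delay, the over_temp variable, and its 55°C default come from the text.

```python
SHUTDOWN_DELAY_S = 30      # delay between fault detection and power-down
DEFAULT_OVER_TEMP_C = 55   # default value of the over_temp environment variable

def shutdown_reason(temp_c, system_fan_ok, cpu_fan_ok,
                    over_temp_c=DEFAULT_OVER_TEMP_C):
    """Return why the power control logic would power down, or None."""
    if temp_c > over_temp_c:
        return "overtemperature"
    if not (system_fan_ok and cpu_fan_ok):
        return "fan failure"
    return None

reason = shutdown_reason(temp_c=58, system_fan_ok=True, cpu_fan_ok=True)
if reason is not None:
    print(f"{reason}: power-down in {SHUTDOWN_DELAY_S} seconds")
```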
1.10 Power Circuit and Cover Interlock Power is distributed throughout the system. The power enable circuit can be broken mechanically by the On/Off switch or the cover interlock, or remotely through the RCM.
Figure 1-15 shows the distribution of the power enable circuit through the system. Opens in the circuit, or the RCM signal RCM_DC_EN_L, or a power supply detected power fault causes interruption to the DC power applied to the system. A failure anywhere in the circuit will result in the removal of DC power. A potential failure is the relay used in the remote control logic to control the RCM_DC_EN_L signal. The cover interlock is located under the top cover between the system card cage and the storage area.
1.11 Power Supply Two power supplies provide system power.
Figure 1-16 Back of Power Supply and Location (labels: Power Supply 1, Power Supply 0, Current share, +5V/Return, +12V/Return, Misc. Signal)
Description A single 675 watt power supply provides power to the system. A second power supply (optional) provides redundant power. Power Supply Features • 88–132 and 176–264 Vrms AC input • 675 watts output. Output voltages are as follows:
Output Voltage   Min. Voltage   Max. Voltage   Max. Current
+5.0             4.85           5.25           100
+3.3             3.18           3.48           100
+12              11.5           12.6           28
–12              –10.9          –13.2          2
+5 Vaux          4.9            5.4            1.5
• Remote sense on +5.0V and +3.3V +5.0V is sensed on the system board. +3.
1.12 Power Up/Down Sequence System power can be controlled manually by the On/Off button on the OCP or remotely through the RCM. The power-up/down sequence flow is shown below.
When AC is applied to the system, Vaux (auxiliary voltage) is asserted and is sensed on the server feature module. If the On-Off Button is On, and RCM OK and Interlock OK are asserted, the OCP asserts DC_ENABLE_L starting the power supplies. If there is a hard fault on power-up, the power supplies shut down immediately; otherwise, the power system powers up and remains up until the system is shut off or the server feature module senses a fault.
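The enable condition described above reduces to a logical AND of four inputs, which is useful to keep in mind when tracing a system that will not power up. The sketch below is an illustrative model with hypothetical names, not a description of the actual circuit.

```python
def dc_enable_asserted(vaux_present, button_on, rcm_ok, interlock_ok):
    """True when the OCP would assert DC_ENABLE_L and start the power supplies."""
    return bool(vaux_present and button_on and rcm_ok and interlock_ok)

# Walk through single-point failures: any one false input blocks power-up.
inputs = ("vaux_present", "button_on", "rcm_ok", "interlock_ok")
for failed in inputs:
    state = {name: name != failed for name in inputs}
    print(f"{failed} false -> DC enabled: {dc_enable_asserted(**state)}")
```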
1.13 TIG Bus The Timing, Interrupt, and General bus (TIG) performs a number of functions: it carries all system interrupts and timing signals, and it provides the path to the diagnostic and console flash ROMs.
Figure 1-18 TIG Bus Block Diagram (labels: CPU data bus, D-chips, CPUs, PAD bus, P-chip, interrupt data lines, CAP bus, C-chip, TIG bus, flash ROM, configuration registers and switchpack, IRQs)
Figure 1-18 is a block diagram of the TIG bus implemented through the TIG chip. Three system functions are carried out on this bus. Flash ROM The flash ROM containing the diagnostics, fail-safe loader, and console firmware sits on the TIG bus. (This is different from the AS 1200 where the flash ROM sat on the I²C bus.) Still a good deal of logic has to function for the diagnostics to run. Configuration Registers Registers on the bus include interrupts, module information, and clock information.
1.14 Maintenance Bus (I²C Bus) There are two I²C buses (referred to as the “I squared C bus”) in this system. The internal maintenance bus is used to monitor system conditions scanned by the power control logic on the server feature module, log error state, and track system configuration information. There is a private I²C bus between memory and the C-chip used to provide memory configuration information to the consoles and operating systems.
Monitor The I²C bus monitors the state of system conditions scanned by the power control logic. There are two registers that the PC logic writes data to: • One records the state of the fans and power supplies and is latched when there is a fault. • The other causes an interrupt on the I²C bus when a CPU or system fan fails, an overtemperature condition exists, or power supplied to the system changes from N + 1 to N or from N to N + 1.
1.15 StorageWorks Drives The system supports up to seven StorageWorks drives.
The StorageWorks drives are to the right of the system cage. Up to seven drives fit into the shelf. The system supports fast wide UltraSCSI disk drives. The RAID controller is also supported. With an optional UltraSCSI Bus Splitter Kit, the StorageWorks shelf can be split into two buses.
Chapter 2 Troubleshooting This chapter describes troubleshooting during power-up and booting. It also describes the console test command and other service related console commands. A Compaq Analyze example is also provided.
2.1 Troubleshooting During Power-Up Power or other problems can occur before the system is up and running. Power Problem List The system will halt/power off for the following reasons: 1. A CPU fan failure 2. A system fan failure 3. An overtemperature condition 4. Power supply failure if the redundant power option is not present 5. Circuit breaker(s) tripped 6. AC problem 7. Interlock switch activation or failure 8.
If the system does not power up • Are the power cords plugged in? • Is the power supply functioning? (The power supply will shut down if it detects any faults. See Section 1.11.
2.2 Control Panel Display and Troubleshooting The control panel display indicates the likely device when testing fails. Figure 2-1 Control Panel and LCD Display AlphaServer DS20 PK1408 • When the On/Off button LED is on, power is applied and the system is running. When it is off, the system is not running, but power may or may not be present. If the power supplies are receiving AC power, Vaux is present on the server feature module regardless of the condition of the On/Off switch.
Table 2-1 Control Panel Display
Content of Display   Progress Indicated in Power-Up Flow
Compaq               CPU functioning, path to the OCP operating. Hardware involved: CPU, C-chip, P-chip 0, PCI to ISA bridge, ISA to XBUS bridge, OCP controller.
Compaq *             B-cache initialized, and both B-cache and memory are being tested. Additional hardware involved: backup cache on the CPU module, D-chips, memory DIMMs.
Compaq Firmware      Firmware loading. Additional hardware involved: TIG bus.
Compaq Error 06      Memory error.
2.3 Power-Up Display and Troubleshooting If the power-up display appears, the following hardware is at least partially functioning: at least one CPU, the C-chip, some D-chips, the P-chips, the TIG bus, the ISA bridge, and the I²C bus. The entire power-up display prints to a serial terminal (if the console environment variable is set to serial); the last several lines print to either a serial terminal or a graphics monitor. Power-up status also is seen on the control panel display.
By the time the power-up display is completed, 1. the CPUs have run their self-tests, 2. the SROM has completed its preliminary tests and loaded the SRM console from flash ROM on the TIG bus into memory, 3. the SROM has passed control to the SRM console, 4. the SRM has polled the system, run its system diagnostics, and has sent the display characters. If the system’s operating system is NT, you will not see any of the power-up display before the line that says “Testing the System.
2.4 Running Diagnostics — Test Command The test command runs diagnostics on the entire system, CPU devices, memory devices, and the PCI I/O subsystem. The test command runs only from the SRM console. Ctrl/C stops the test. The console cannot be secure. Example 2–2 Test Command Syntax P00>>> help test NAME test FUNCTION Test the system. SYNOPSIS test [-lb] [-t
2.5 Testing an Entire System A test command runs all exercisers for subsystems and devices on the system. I/O devices tested are supported boot devices. The test runs for 2 minutes. Example 2–3 Sample Test Command P00>>> test System test, runtime 120 seconds Type ^C if you wish to abort testing once it has started Default zone extended at the expense of memzone.
ID       Program      Device        Pass Hard/Soft Bytes Wrtn Bytes Rd
-------- ------------ ------------- ---- --------- ---------- ----------
00001c12 memtest      memory        2    0 0       1082130432 1082130432
00001c17 memtest      memory        2    0 0       1082130432 1082130432
00001c35 memtest      memory        2    0 0       1073741824 1073741824
00001c80 exer_kid     dkb100.1.0.9  0    0 0       0          20086784
00001c83 exer_kid     dkb200.2.0.9  0    0 0       0          20086784
00001c85 exer_kid     dkb300.3.0.9  0    0 0       0          20086784
00001cc7 exer_kid     dke0.0.0.200  0    0 0       0          16531456
00001cc8 exer_kid     dke200.2.0.
2.6 Other Useful Console Commands Several console commands can be used to diagnose the system. The show power command identifies power, temperature, and fan faults.
2.7 Troubleshooting with LEDs During power-up, reset, initialization, or testing, diagnostics are run on CPUs, memories, P-chips, and the PCI backplane and its embedded options. Although system LEDs are not visible when the side panels are on, they can be viewed when the card cage side of the system is exposed and the top cover is on. There are LEDs on the CPU and server feature modules.
To see LEDs, the card cage side of the system must be exposed; the system top should be on, and the system must be on. CPU LEDs The CPU LEDs are on the under side of the module. Figure 2-2 shows the location of the LEDs when looking up at the module. Normally all CPU LEDs are on except the SROM Clock LED. Replace the CPU if the 5V OK LED is on and any of the following LEDs are off: CPU DC OK, or 2V OK. If the 5V OK LED is off, power is not getting to the CPU.
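The CPU LED rule above can be written as a short decision function. This is a sketch of the stated rule only; the function name and the returned strings are illustrative, and the SROM Clock LED is deliberately left out because it is normally off.

```python
def cpu_led_diagnosis(five_v_ok, cpu_dc_ok, two_v_ok):
    """Apply the CPU LED rule: each argument is True when that LED is on."""
    if not five_v_ok:
        return "power is not getting to the CPU"
    if not (cpu_dc_ok and two_v_ok):
        return "replace the CPU"
    return "CPU power LEDs normal"

print(cpu_led_diagnosis(five_v_ok=True, cpu_dc_ok=False, two_v_ok=True))
```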
2.8 Compaq Analyze Compaq Analyze is the error analysis tool used to analyze errors. An example of its output is shown here. For information on installing, running, and learning about Compaq Analyze, go to http://www.evnrud.cxo.dec.com/desta/kits.htm. 2.8.1 Compaq Analyze Graphics Interface (GUI) Compaq Analyze automatically runs on each of the supported operating systems on the DS20 system.
Figure 2-3 shows an example of what you can expect to see on a system’s console, assuming it is a graphics terminal and Compaq Analyze is installed and running in the background. When an error is detected, it is reported to the console with a series of problem found statements. In this case, “an uncorrectable system fan 0 error detected,” was logged a couple of times in the event log with a time stamp of Friday March 12, 1999.
2.8.2 Description of the Error After “double clicking” the Problem Found: hot spot on the Compaq Analyze screen, a full description of the error is displayed, and a FRU and its location are called out. Example 2–6 shows a Compaq Analyze error report. Example 2–6 Compaq Analyze Error Report Problem Found: An uncorrectable system fan 0 error detected.
Evidence:
Entry Errlog: SMM_1838 SysType_34 OS_Type_1 Entry_Type_682 Entry_Type_Ana Mchk_Error_Cod
Event_Header_Common_Fields_V2_0
Event_Leader: xFFFFFFFE
Header_Length: 176
Event_Length: 312
Header_Rev_Major: 2
Header_Rev_Minor: 0
OS_Type: 1 ! 1 = UNIX, 2 = OVMS, 3 = NT
Hardware_Arch: 4
CEH_Vendor_ID: 3564
Hdwr_Sys_Type: 34
Logging_CPU: 0
CPUs_In_Active_Set: 2
Major_Class: 115
Minor_Class: 2
DSR_Msg_Num: 1838 ! Compaq AlphaServer DS20
CEH_Device: 35
Chip_Type: 8 ! 8 = EV6
CEH_Device_ID_0: x0000FFFF
CEH_De
Systype34_Env_Regs_V1
Frame_Flags: x00000000
Mchk_Error_Code: x00000206
Frame_Rev: 1
SW_Sum_Flags: x0000000000000000
Cchip_DIR: x0001000000000000
Environ_QW_1: x0000000000000009
Environ_QW_2: x000000000000004F
Environ_QW_3: x0000000000000000
Environ_QW_4: x0000000000000000
Environ_QW_5: x0000000000000000
Environ_QW_6: x0000000000000000
Environ_QW_7: x0000000000000000
Environ_QW_8: x0000000000000000
Environ_QW_9: x0000000000000000
Subpacket_Support
Subpacket_Header_Support
Trailer_Frame_Support
Compaq Analyz
Of particular interest in the error report is the Full Description of the error. If Compaq Analyze is able to determine what failed on the machine, it gives a full description of the failing FRU and its location. In this case the upper system fan is identified as the failing part and its location is given. Evidence provided depends upon the type of error detected. The types of errors detected are given in Table 2-2.
2.9 Releasing Secure Mode Most SRM console commands run only when the console is not in secure mode. If the console is not secure, user mode console commands can be entered. See the system manager if the system is secure and you do not know the password. Example 2–7 Releasing/Reestablishing Secure Mode P00>>> login Please enter password: xxxx P00>>> [User mode SRM console commands are now available.] P00>>> set secure The console command login clears secure.
Chapter 3 Error Registers This chapter describes the following registers used to hold error information: • Ibox Status Register - I_STAT • Memory Management Status Register – MM_STAT • Dcache Status Register – DC_STAT • Cbox Read Register • Miscellaneous Register (MISC) • Device Interrupt Request Register (DIRn, n=0,1) • Pchip Error Register (PERROR) • Failure Register • Function Register
3.1 Ibox Status Register - I_STAT The Ibox Status Register (I_STAT) is a read/write-1-to-clear register that contains Ibox status information. The register is read only by PALcode and is an element in the CPU or System Uncorrectable Machine Check Error Logout frame.
[Register layout figure: TPE and DPE fields]
Table 3-1 Ibox Status Register
Name       Bits      Type   Description
Reserved   <63:31>   RO     Reserved for Compaq.
3.2 Memory Management Status Register – MM_STAT The Memory Management Status Register (MM_STAT) is a read-only register. When a Dstream TB miss or fault occurs, information about the error is latched in MM_STAT. This register is not updated when a LD_VPTE gets a DTB miss instruction. The register is read only by PALcode and is an element in the CPU or System Uncorrectable Machine Check Error Logout frame.
Table 3-2 Memory Management Status Register
Name          Bits      Type   Description
Reserved      <63:11>          Reserved for Compaq.
DC_TAG_PERR   <10>      RO     This bit is set when a Dcache tag parity error occurs during the initial tag probe of a load or store instruction. The error creates a synchronous fault to the D_FAULT PALcode entry point and is correctable. The virtual address associated with the error is available in the VA register.
OPCODE        <9:4>     RO     Opcode of the instruction that caused the error.
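When reading an MM_STAT value out of a logout frame, the two fields described above can be extracted with shifts and masks. The sketch below decodes only bit <10> (the Dcache tag parity error bit) and bits <9:4> (OPCODE); the function name is illustrative, and fields in bits <3:0> are not decoded because they are not described in this excerpt.

```python
def decode_mm_stat(value):
    """Extract the Dcache tag parity error bit <10> and OPCODE <9:4>."""
    return {
        "DC_TAG_PERR": bool((value >> 10) & 0x1),
        "OPCODE": (value >> 4) & 0x3F,  # 6-bit field
    }

# Hypothetical register value, chosen only to exercise both fields.
fields = decode_mm_stat(0x690)
print(f"DC_TAG_PERR={fields['DC_TAG_PERR']} OPCODE=0x{fields['OPCODE']:02X}")
```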
3.3 Dcache Status Register – DC_STAT The Dcache Status Register (DC_STAT) is a read-write register. If a Dcache tag parity error or data ECC error occurs, information about the error is latched in this register. The register is read only by PALcode and is an element in the CPU or System Uncorrectable Machine Check Error Logout frame.
Table 3-3 Dcache Status Register
Name         Bits     Type   Description
Reserved     <63:5>
SEO          <4>      W1C    Second error occurred. When set, indicates that a second Dcache store ECC error occurred within 6 cycles of the previous Dcache store ECC error.
ECC_ERR_LD   <3>      W1C    ECC error on load. When set, indicates that a single-bit ECC error occurred while processing a load from the Dcache or any fill.
ECC_ERR_ST   <2>      W1C    ECC error on store. When set, indicates that an ECC error occurred while processing a store.
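Because these bits are write-1-to-clear (W1C), software clears an error by writing a 1 back to the same bit position. The sketch below names the set bits in a DC_STAT value and builds the corresponding clear mask. It covers only the three bits listed above, and the helper names are hypothetical.

```python
# Bit positions from Table 3-3 (only the fields shown in this excerpt).
DC_STAT_BITS = {"SEO": 4, "ECC_ERR_LD": 3, "ECC_ERR_ST": 2}

def decode_dc_stat(value):
    """Return the names of the error bits that are set."""
    return [name for name, bit in DC_STAT_BITS.items() if value & (1 << bit)]

def w1c_mask(names):
    """Build the value to write back in order to clear the named bits."""
    mask = 0
    for name in names:
        mask |= 1 << DC_STAT_BITS[name]
    return mask

errors = decode_dc_stat(0b11000)
print(errors, hex(w1c_mask(errors)))
```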
3.4 Cbox Read Register The Cbox Read Register is read 6 bits at a time. Table 3-4 shows the ordering from LSB to MSB. The register is read only by PALcode and is an element in the CPU or System Uncorrectable Machine Check Error Logout frame. Table 3-4 Cbox Read Register Name Description C_SYNDROME_1 <7:0> Syndrome for the upper QW in the OW of victim that was scrubbed. C_SYNDROME_0 <7:0> Syndrome for the lower QW in the OW of victim that was scrubbed.
Table 3-4 Cbox Read Register (Continued)
Name           Description
C_STAT<3:0>    If C_STAT equals xxx_MEM_ERR or xxx_BC_ERR, then C_STAT contains the status of the block as follows; otherwise, the value of C_STAT is X.
               Bit value   Status of block
               7-4         Reserved
               3           Parity
               2           Valid
               1           Dirty
               0           Shared
C_ADDR<6:42>   Address of the last reported ECC or parity error. If C_STAT value is DSTREAM_DC_ERR, only bits <6:19> are valid.
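The block status bits listed for C_STAT can be unpacked as follows. This is a minimal sketch assuming the status has already been extracted as a small integer; the function name is illustrative, and the reserved bits 7-4 are ignored.

```python
def decode_block_status(status):
    """Decode the block status bits: 3 Parity, 2 Valid, 1 Dirty, 0 Shared."""
    return {
        "parity": bool(status & 0x8),
        "valid": bool(status & 0x4),
        "dirty": bool(status & 0x2),
        "shared": bool(status & 0x1),
    }

# Hypothetical status value: valid and dirty, not shared.
print(decode_block_status(0b0110))
```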
3.5 Miscellaneous Register (MISC) This register is designed so that only writes of 1 affect it. When a 1 is written to any bit in the register, the programmer does not need to be concerned with read-modify-write or the status of any other bits in the register. Once NXM is set, the NXS field is locked. It is unlocked when software clears the NXM field. The ABW (arbitration won) field is locked if either ABW bit is set, so the first CPU to write it locks out the other CPU.
Table 3-5 Miscellaneous Register
Name     Bits      Type       Initial State   Description
RES      <63:44>   MBZ, RAZ   0
DEVSUP   <43:40>   WO         0
REV      <39:32>   RO         1               Latest revision of the Cchip: 1 = Tsunami
NXS      <31:29>   RO         0               NXM source – Device that caused the NXM. Unpredictable if NXM not set. 0 = CPU0, 1 = CPU1.
NXM      <28>      R, W1C     0               Nonexistent memory address detected. Sets DRIR<63> and locks the NXS field until it is cleared.
RES      <27:25>   MBZ, RAZ   0               Reserved.
Table 3-5 Miscellaneous Register (Continued)
Name     Bits    Type       Initial State   Description
ITINTR   <7:4>   R, W1C     0               Interval timer interrupt pending – one bit per CPU. Pin irq<2> is asserted to the CPU corresponding to a 1 in this field.
RES      <3:2>   MBZ, RAZ   0               Reserved.
CPUID    <1:0>   RO         -               ID of the CPU performing the read.
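A generic bit-field helper makes a 64-bit MISC value easier to read. The sketch below decodes only the fields whose positions appear in Table 3-5 as shown here (REV, NXS, NXM, ITINTR, CPUID); the helper names are hypothetical. NXS is reported only when NXM is set, because the table calls it unpredictable otherwise.

```python
def field(value, high, low):
    """Extract bits <high:low> of value as an unsigned integer."""
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)

def decode_misc(value):
    """Decode selected MISC fields from a 64-bit register value."""
    nxm = bool(field(value, 28, 28))
    return {
        "REV": field(value, 39, 32),
        "NXM": nxm,
        "NXS": field(value, 31, 29) if nxm else None,  # valid only if NXM is set
        "ITINTR": field(value, 7, 4),                  # one pending bit per CPU
        "CPUID": field(value, 1, 0),
    }

# Hypothetical value: REV = 1, NXM set with source CPU1, read by CPU 1.
print(decode_misc((1 << 32) | (1 << 29) | (1 << 28) | 1))
```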
3.6 Device Interrupt Request Register (DIRn, n=0,1) These registers indicate which interrupts are pending to the CPUs and indicate the presence of an I/O error condition.
3.7 Pchip Error Register (PERROR) If any bits <11:0> are set, this register is frozen. Only bit <0> can be set after that. All other values are held until all bits <11:0> are clear. When an error occurs and one of the <11:0> bits is set, the associated information is captured in bits <63:16>. After the information is captured, the INV bit is cleared, but the information is not valid and should not be used if INV is set.
Table 3-7 Pchip Error Register
Name   Bits      Type               Initial State   Description
SYN    <63:56>   RO                 0               ECC syndrome of error if CRE or UECC.
CMD    <55:52>   RO                 0               PCI command when error occurred if not CRE or UECC. If CRE or UECC, then:
                                                    Value    Command
                                                    0000     DMA read
                                                    0001     DMA read-modify-write
                                                    0011     SGTE read
                                                    Others   Reserved
INV    <51>      RO Rev1 RAZ Rv0    0               Info Not Valid – meaningful when one of bits <11:0> is set. Indicates the validity of SYN, CMD, and ADDR bits. Valid = 0, Invalid = 1.
Table 3-7 Pchip Error Register (Continued)
Name    Bits   Type     Initial State   Description
RDPE    <7>    R, W1C   0               PCI read data parity error as PCI master.
TA      <6>    R, W1C   0               Target abort as PCI master.
APE     <5>    R, W1C   0               Address parity error detected as potential PCI target.
SGE     <4>    R, W1C   0               Scatter-gather had invalid page table entry.
DCRTO   <3>    R, W1C   0               Delayed completion retry timeout as PCI target.
PERR    <2>    R, W1C   0               b_perr_l sampled asserted.
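The following sketch decodes a PERROR value using the fields shown above. It honors the INV rule by returning the captured SYN and CMD values only when INV is clear. The function name is illustrative, and only error bits <7:2> are named because the remaining error bits are not listed in this excerpt.

```python
# Error bit positions from Table 3-7 (only the bits shown in this excerpt).
PERROR_ERROR_BITS = {"RDPE": 7, "TA": 6, "APE": 5, "SGE": 4, "DCRTO": 3, "PERR": 2}

def decode_perror(value):
    """Return the set error bits and, if INV is clear, the captured SYN and CMD."""
    errors = [name for name, bit in PERROR_ERROR_BITS.items() if value & (1 << bit)]
    info_valid = not ((value >> 51) & 0x1)  # INV <51>: Valid = 0, Invalid = 1
    info = None
    if info_valid:
        info = {"SYN": (value >> 56) & 0xFF, "CMD": (value >> 52) & 0xF}
    return {"errors": errors, "info_valid": info_valid, "info": info}

# Hypothetical value: target abort with captured command 0001.
print(decode_perror((0x1 << 52) | (1 << 6)))
```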
3.8 Failure Register This register, on the I²C bus, is locked when there is a power supply or fan failure. Together with the Function Register, it identifies fan and power supply failures and reports them to the operating system, thus notifying it that the system will shut down in 30 seconds. The results of reading this register are displayed by the SRM show power console command.
Table 3-8 Failure Register
Name                     Bits   Type   Initial State   Description
PS0_PRESENT_L            <7>    RO     X               If the bit is clear, power supply 0 is present.
Reserved                 <6>    RO     1               Reserved
C/SFAN1_L                <5>    RO     X               When set, indicates that either the system fan 1 or the fan on the heatsink on CPU1 failed. Which failed is determined by the state of SYSFAN_OK and CPUFANS_OK in the Function Register.
PS1_PRESENT_L/FAN TRAY   <4>    RO     X               If the bit is clear, either power supply 1 or the system fan tray is present.
3.9 Function Register The Function Register generates an interrupt on the I²C bus if one of the critical functions monitored (power, temperature, fan operation) goes beyond predetermined limits. When such an interrupt is generated, the contents of bits <0, 1, 2, and 5> in the Failure Register are frozen. The system will shut down 30 seconds after the interrupt is posted. The results of reading this register are displayed by the SRM show power console command.
Table 3-9 Function Register Name Bits Type Initial State Description Reserved <7> RO 0 Reserved PS1_OK_L <6> RO X When set, indicates that power supply 1 is functioning properly. PS0_OK_L <5> RO X When set, indicates that power supply 0 is functioning properly. FANTRAY_FAIL_H <4> RO X When clear, indicates that the fan tray, if present, is functioning properly. CPUFANS_OK <3> RO X When set, indicates that the fans on CPU heatsinks are functioning properly.
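The mixed polarities in Table 3-9 (some bits mean "OK" when set, one means "OK" when clear) are easy to misread, so a small sketch may help. The function name and message strings are hypothetical, and only bits <6:3> from the rows above are checked. Because power supply 1 is optional, a "not OK" result for it may simply mean it is not installed; that interpretation is an assumption, not something the table states.

```python
# Hypothetical helper: lists the conditions that Table 3-9 describes as
# "not functioning properly" for bits <6:3> of the Function Register.
def function_register_faults(value):
    faults = []
    if not value & (1 << 6):      # PS1_OK_L: set means power supply 1 is OK
        faults.append("power supply 1 not OK")
    if not value & (1 << 5):      # PS0_OK_L: set means power supply 0 is OK
        faults.append("power supply 0 not OK")
    if value & (1 << 4):          # FANTRAY_FAIL_H: clear means fan tray is OK
        faults.append("fan tray failure")
    if not value & (1 << 3):      # CPUFANS_OK: set means CPU heatsink fans are OK
        faults.append("CPU heatsink fan failure")
    return faults


print(function_register_faults(0x68))   # bits 6, 5, 3 set; bit 4 clear
print(function_register_faults(0x70))   # bits 6, 5, 4 set; bit 3 clear
```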
Chapter 4 Removal and Replacement This chapter describes removal and replacement procedures for field-replaceable units (FRUs). 4.1 System Safety Observe the safety guidelines in this section to prevent personal injury. CAUTION: Wear an antistatic wrist strap whenever you work on a system. WARNING: When the system is off and plugged into an AC outlet, auxiliary power is still supplied to the system. To remove all power, unplug the power supply.
4.2 FRU List Figure 4-1 shows the FRU locations and Table 4-1 lists the part numbers of the field-replaceable units.
Table 4-1 Field-Replaceable Unit Part Numbers CPU Modules 54-24758-01 C01 500 MHz CPU, 4 Mbyte cache Memory Modules 54-25066-BA 32 Mbyte DIMM 54-25053-BA 64 Mbyte DIMM 54-25941-KA 128 Mbyte DIMM 54-25941-BA 256 Mbyte DIMM System Backplane, Display, and Support Hardware 54-25756-01 D02 System board 54-25580-01 Server feature module RX23L-AC Floppy RRD47-AC CD-ROM 54-23302-02 OCP assembly 70-31349-01 Speaker assembly Fans 70-31351-01 & -02 Cooling fan 120x120 70-33195-02 Auxiliary coo
Table 4-1 Field-Replaceable Unit Part Numbers (Continued) Power Cords BN26J-1K North America, Japan 12V, 75-inches long BN19H-2E Australia, New Zealand, 2.5m long BN19C-2E Central Europe, 2.5m long BN19A-2E UK, Ireland, 2.5m long BN19E-2E Switzerland 2.5m long BN19K-2E Denmark, 2.5m long BN19Z-2E Italy, 2.5m long BN19S-2E Egypt, India, South Africa, 2.5m long BN18L-2E Israel, 2.
Table 4-1 Field-Replaceable Unit Part Numbers (Continued) System Cables and Jumpers From To 70-31348-01 Interlock switch and pigtail cable Interlock switch assembly Twisted pair (red and black) OCP DC enable power cable from OCP connector 17-04796-01 20 pin signal cable RCM con on system board RCM connector on server feature module 17-04886-01 SCSI CD-ROM signal cable SCSI backplane CD-ROM signal connector 17-04735-01 24 pin power harness Power supply Power transition module 70-33578-01
4.3 System Access Removing the three sheet metal covers, one on top and one on each side, provides access to the system card cage and the power/SCSI sections of the system.
Exposing the System CAUTION: Be sure the system On/Off button is in the “off” position before removing system covers. 1. Shut down the operating system. 2. Press the On/Off button to turn the system off. 3. Unlock and open the door that exposes the storage shelf. 4. Pull down the top cover latch shown in Figure 4-2 until it latches in the down position. 5. Grasp the finger groove at the rear of the top cover and pull it straight back about 2 inches and then lift it off the cabinet. 6.
4.4 CPU Removal and Replacement CAUTION: Make sure all CPU modules are the same variant. Figure 4-3 Removing CPU Module PK1477-98 WARNING: CPU modules and memory modules have parts that operate at high temperatures. Wait 2 minutes after power is removed before touching any module.
Removal 1. Shut down the operating system and turn the system off. 2. Expose the card cage side of the system (see Section 4.3). 3. Detach the power cable from the CPU. 4. Loosen the two captive screws holding the module to the card cage. 5. Pull the CPU module from the system. Replacement Reverse the steps in the Removal procedure. Verification — DIGITAL UNIX and OpenVMS Systems 1. Bring the system up to the SRM console by pressing the Halt button, if necessary. 2.
4.5 Memory Module Removal and Replacement CAUTION: Several different memory DIMMs work in these systems. Be sure you are replacing the broken DIMM with the same variant. Figure 4-4 Removing Memory IP00315A WARNING: CPU modules and memory DIMMs have parts that operate at high temperatures. Wait 2 minutes after power is removed before touching any module.
Removal 1. Shut down the operating system and turn the system off. 2. Expose the card cage side of the system (see Section 4.3). 3. There are levers on the connectors in each memory slot on the system board. Press both levers in an arc away from the DIMM and gently pull the DIMM from the connector. Replacement Reverse the steps in the Removal procedure. NOTE: Memory DIMMs are installed in banks of four modules of the same size.
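The bank rule in the note above (four modules of the same size per bank) can be expressed as a simple check. This is a minimal sketch for planning purposes only; the function name is hypothetical, and the list of valid sizes is taken from the DIMM entries in Table 4-1.

```python
# DIMM sizes (in Mbytes) listed under Memory Modules in Table 4-1.
VALID_DIMM_SIZES_MB = {32, 64, 128, 256}


def check_bank(dimm_sizes_mb):
    """Return a list of problems for one bank; an empty list means none found."""
    problems = []
    if len(dimm_sizes_mb) != 4:
        problems.append("bank has %d DIMMs, expected 4" % len(dimm_sizes_mb))
    if len(set(dimm_sizes_mb)) > 1:
        problems.append("DIMMs in the bank are not all the same size")
    unlisted = sorted(set(dimm_sizes_mb) - VALID_DIMM_SIZES_MB)
    if unlisted:
        problems.append("unlisted DIMM size(s): %s" % unlisted)
    return problems


print(check_bank([64, 64, 64, 64]))
print(check_bank([64, 64, 128, 128]))
```

The check covers size only; it does not distinguish DIMM variants of the same size, which the caution in this section also requires to match.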
4.
Removal 1. Shut down the operating system and turn the system off. 2. Unplug the AC power cord. (Auxiliary power is applied to the server feature module and parts of the system board even when the system is turned off.) 3. Expose the card cage side of the system (see Section 4.3). 4. Remove memory. 5. Remove all CPUs. 6. Remove all PCI and ISA options. 7. From the back of the cabinet, using a Phillips head screwdriver, unscrew the four screws holding the CPU module brace from the system frame.
4.7 PCI/ISA Option Removal and Replacement Figure 4-6 Removing PCI/ISA Option Slot Cover Screws Option Card IP00225 WARNING: To prevent fire, use only modules with current limited outputs. See National Electrical Code NFPA 70 or Safety of Information Technology Equipment, Including Electrical Business Equipment EN 60 950.
Removal 1. Shut down the operating system and turn the system off. 2. Expose the card cage side of the system (see Section 4.3). 3. To remove the faulty option: Disconnect cables connected to the option. Remove cables to other options that obstruct the option you are removing. Unscrew the small Phillips head screw securing the option to the card cage. Slide it from the system. Replacement Reverse the steps in the Removal procedure. Verification — DIGITAL UNIX and OpenVMS Systems 1.
4.
Removal 1. Shut down the operating system and turn the system off. 2. Unplug the AC power cords. (Auxiliary power is applied to the server feature module and parts of the system board even when the system is turned off.) 3. Expose the card cage section of the system (see Section 4.3). 4. Unplug all cables connected to the server feature module. 5.
4.9 Power Supply Removal and Replacement Figure 4-8 Removing Power Supply 4 rear screws 6/32 inch Power Supply 1 (Optional) Power Supply 0 Internal screw 3.
Removal 1. Shut down the operating system and turn the system off. 2. Expose the power section of the system (see Section 4.3). 3. Unplug the AC power cord. (Auxiliary power is applied to the server feature module and parts of the system board even when the system is turned off.) 4. Unplug all the cables to the power supply and unplug the power cables to the transition module. 5.
4.
Removal 1. Shut down the operating system and power down the system. 2. Remove the AC power cords. (Auxiliary power is applied to the server feature module and parts of the system board even when the system is turned off.) 3. Expose both the card cage section and the power section of the system (see Section 4.3). 4. Remove the cable clip between the power section and the card cage section of the system. 5.
4.
Removal 1. Shut down the operating system and turn the system off. 2. Remove the AC power cords. 3. Expose both the card cage section and the power section of the system (see Section 4.3). 4. Remove the power supply(s) (see Section 4.9). 5. Unplug the fan cable connected to the power transition module. 6. Fold the power harness up over the top of the system so that it does not interfere with access to the module. 7.
4.
Removal 1. Shut down the operating system and power down the system. 2. Unplug the AC power cord. 3. Expose the power section of the system (see Section 4.3). 4. Unplug all cables connected to the power transition module. 5. From the rear, remove the four screws holding the auxiliary fan in place. 6. Remove the fan. Replacement Reverse the steps in the Removal procedure. Verification Power up the system.
4.
Removal 1. Shut down the operating system and turn the system off. 2. Unplug the AC power cord. (Auxiliary power is applied to the server feature module and parts of the system board even when the system is turned off.) 3. Expose the card cage side of the system (see Section 4.3). Removing Fan 0 4. Remove the CPU module(s). 5. Unplug the power cord to fan 0 from the server feature module. 6. Unscrew the fan from the frame and remove it from the system. Removing Fan 1 4.
4.
Removal 1. Shut down the operating system and turn the system off. 2. Expose the card cage side of the system (see Section 4.3). 3. Unplug the AC power cord. 4. Loosen the screw that holds the CD-ROM bracket to the system (➊ in Figure 4-13). 5. Detach both the power and the signal connectors at the rear of the CD-ROM. 6. Pull the CD-ROM and the bracket a short distance toward the rear of the system and lift them out of the cabinet. 7.
4.
Removal 1. Shut down the operating system and turn the system off. 2. Expose the card cage side of the system (see Section 4.3). 3. To remove the StorageWorks door: a. Open the door slightly and grab the left edge of the door with your left hand and the right edge of the door with your right hand. b. While pushing the door up, bend it by pulling it away from the system. The door compresses enough so its bottom post slips out of its retaining hole. c.
4.
Removal 1. Shut down the operating system and turn the system off. 2. Expose the card cage side of the system (see Section 4.3). 3. Loosen the two screws holding the CD-ROM to its bracket (see Figure 4-15). 4. Detach both the power and signal connectors at the rear of the CD-ROM. 5. Pull the CD-ROM forward out of the system. Replacement Reverse the steps in the Removal procedure. Verification Power up the system.
4.
Removal 1. Shut down the operating system and turn the system off. 2. Unplug the AC power cords. 3. Expose the card cage side of the system (see Section 4.3). 4. Detach the power and signal cables from the back of the floppy. 5. Remove the two Phillips head screws holding the floppy in the system (➊ in Figure 4-16). 6. Slide the floppy out the front of the system. Replacement Reverse the steps in the Removal procedure.
4.
Removal 1. Shut down the operating system and turn the system off. 2. Open the front door exposing the StorageWorks disks. 3. Pinch the clips on both sides of the disk and slide it out of the shelf. Replacement Reverse the steps in the Removal procedure. Verification Power up the system. Use the show device console commands to verify that the system sees the disk you replaced.
4.
Removal 1. Shut down the operating system and turn the system off. 2. Unplug the AC power cords. 3. Expose the power section of the system (see Section 4.3). 4. Remove the power and signal cables from the UltraSCSI bus extender on the side of the StorageWorks shelf. 5. Remove the power harness and all signal cables from the StorageWorks backplane and the power transition module and lift it out of the way. 6.
4.
Removal 1. Shut down the operating system and turn the system off. 2. Unplug the AC power cords. 3. Expose the power section of the system. See Section 4.3. 4. Remove the power and signal cables from the UltraSCSI bus extender on the side of the StorageWorks shelf. 5. The UltraSCSI bus extender is mounted on plastic standoffs to which it snaps. Pinch each snap with a pair of needle nose pliers, free the corners, and pull the bus extender off. Replacement Reverse the steps in the Removal procedure.
Appendix A Halts, Console Commands, and Environment Variables This appendix discusses halting the system and provides a summary of the SRM console commands and environment variables. The test command is described in Chapter 2 of this document. For complete reference information on other SRM commands and environment variables, see your system User’s Guide.
A.1 Halt Button Functions The effect of pressing the Halt button depends on the state of the system at the time the button is pressed. Table A-1 describes the full function of the Halt button. Table A-1 Results of Pressing the Halt Button Machine State OpenVMS running/hung Compaq Tru64 UNIX running/hung Windows NT running/hung AlphaBIOS running/hung SRM console running SROM (1st 2 secs.
A.2 Using the Halt Button Use the Halt button to halt the Compaq Tru64 UNIX or OpenVMS operating system when it hangs or you want to use the SRM console. Use the Halt button to force Windows NT systems to bring up the SRM console rather than booting or halting in AlphaBIOS. Using Halt to Shut Down the Operating System You can use the Halt button if the Compaq Tru64 UNIX or OpenVMS operating system hangs. Pressing the Halt button halts the operating system back to the SRM console firmware.
A.3 Halt Assertion A halt assertion allows you to disable automatic boots of the operating system so that you can perform tasks from the SRM console. Under certain conditions, you might want to force a “halt assertion.” A halt assertion differs from a simple halt in that the SRM console “remembers” the halt. The next time you power up, the system ignores the SRM power-up script (nvram) and ignores any environment variables that you have set to cause an automatic boot of the operating system.
by the RCM reset command to force a halt assertion. Upon reset, the system powers up to the SRM console, but the SRM console does not load the AlphaBIOS console. Clearing a Halt Assertion Clear a halt assertion as follows: • If the halt assertion was caused by pressing the Halt button or remotely entering the RCM halt command, the console uses the halt assertion once, then clears it.
A.4 Summary of SRM Console Commands The SRM console commands are used to examine or modify the system state. Table A-2 Summary of SRM Console Commands Command Function alphabios Loads and starts the AlphaBIOS console. boot Loads and starts the operating system. clear envar Resets an environment variable to its default value. clear password Sets the password to 0. continue Resumes program execution. crash Forces a crash dump at the operating system level.
Table A-2 Summary of SRM Console Commands (Continued) Command Function login Turns off secure mode, enabling access to all SRM console commands during the current session. man Displays information about the specified console command. more Displays a file one screen at a time. prcache Initializes and displays status of the PCI NVRAM. set envar Sets or modifies the value of an environment variable. set host Connects to an MSCP DUP server on a DSSI device.
A.5 Summary of SRM Environment Variables Environment variables pass configuration information between the console and the operating system. Their settings determine how the system powers up, boots the operating system, and operates. Environment variables are set or changed with the set envar command and returned to their default values with the clear envar command. Their values are viewed with the show envar command. The SRM environment variables are specific to the SRM console.
Table A-3 Environment Variable Summary (Continued) Environment Variable Function memory_test Specifies the extent to which memory will be tested. For Compaq Tru64 UNIX systems only. ocp_text Overrides the default OCP display text with specified text. os_type Specifies the operating system and sets the appropriate console interface. pci_parity Disables or enables parity checking on the PCI bus. pk*0_fast Enables fast SCSI mode.
A.6 Recording Environment Variables This worksheet lists all environment variables. Copy it and record the settings for each system. Use the show* command to list environment variable settings.
Table A-4 Environment Variables Worksheet (Continued) Environment Variable System Name System Name System Name pk*0_soft_term sys_model_num sys_serial_num sys_type tga_sync_green tt_allow_login
Appendix B Managing the System Remotely This appendix describes how to manage the system from a remote location using the remote console manager (RCM). You can use the RCM from a console terminal at a remote location or from a local console terminal connected to the COM1 port.
B.1 RCM Overview The remote console manager (RCM) monitors and controls the system remotely. The control logic resides on the system board. The RCM is a separate console from the SRM and AlphaBIOS consoles. The SRM and AlphaBIOS firmware reside on the system board. The RCM firmware resides on the server feature module and can only be accessed through COM1. The RCM is run from a serial console terminal or terminal emulator.
B.2 First-Time Setup To set up the RCM to monitor a system remotely, connect the modem to the COM1 port at the back of the system, configure the modem for autoanswer and 9600 baud, and dial in.
B.2.1 Using RCM Locally or with a Modem on COM1 Use the default escape sequence to invoke the RCM mode locally for the first time. You can invoke RCM from the SRM console, the operating system, or an application. The RCM quit command reconnects the terminal to the system console port. 1. To invoke the RCM locally, type the RCM escape sequence. See ➊ in Example B–1 for the default sequence. The escape sequence is not echoed on the terminal or sent to the system.
B.3 RCM Commands The RCM commands given in Table B-1 are used to control and monitor a system remotely. Table B-1 RCM Command Summary Command Function halt Halts the server. Emulates pressing the Halt button and immediately releasing it. haltin Causes a halt assertion. Emulates pressing the Halt button and holding it in. haltout Terminates a halt assertion created with haltin. Emulates releasing the Halt button after holding it in. help or ? Displays the list of commands.
Command Conventions
• The commands are not case sensitive.
• A command must be entered in full.
• You can delete an incorrect command with the Backspace key before you press Enter.
• If you type a valid RCM command, followed by extra characters, and press Enter, the RCM accepts the correct command and ignores the extra characters.
• If you type an incorrect command and press Enter, the command fails with the message: *** ERROR - unknown command ***
halt The halt command halts the managed system.
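The command conventions can be modeled with a short sketch, for example when scripting against an RCM session. This is not the RCM firmware's parser: the function name is hypothetical, the command set is only the subset named in this appendix, and it assumes that "extra characters" are separated from the command by whitespace.

```python
# Subset of RCM commands named in this appendix; not a complete list.
RCM_COMMANDS = {"halt", "haltin", "haltout", "help", "?",
                "poweroff", "quit", "reset", "setesc", "status"}

UNKNOWN_COMMAND = "*** ERROR - unknown command ***"


def parse_rcm_command(line):
    """Return the command name, following the conventions listed above."""
    words = line.split()
    # Commands are not case sensitive and must be entered in full.
    if not words or words[0].lower() not in RCM_COMMANDS:
        raise ValueError(UNKNOWN_COMMAND)
    # Assumption: anything after the first word counts as "extra characters"
    # and is ignored.
    return words[0].lower()


print(parse_rcm_command("STATUS"))
print(parse_rcm_command("halt now please"))
```

Abbreviations such as `stat` are rejected because a command must be entered in full.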
help or ? The help or ? command displays the RCM firmware commands. poweroff The poweroff command requests the RCM to power off the system. The poweroff command is equivalent to pressing the On/Off button on the control panel to the off position. RCM>poweroff If the system is already powered off or if switch 3 (RPD DIS) on the switchpack has been set to the on setting (disabled), this command has no immediate effect.
quit The quit command exits the user from command mode and reconnects the serial terminal to the system console port. The following message is displayed: Focus returned to COM port Upon entering a carriage return, the system returns to either the console or the operating system depending upon which was running when the RCM was invoked. reset The reset command requests the RCM to reset the hardware. The reset command is equivalent to pressing the Reset button on the control panel.
status The status command displays the current state of the system sensors, as well as the current escape sequence and alarm information. The following is an example of the display. RCM>status Firmware Rev: V2.0 Escape Sequence: ^]^]RCM Remote Access: ENABLE Temp (C): 26.0 RCM Power Control: ON RCM Halt: Deasserted External Power: ON Server Power: ON RCM> The status fields are explained in Table B-2. Table B-2 RCM Status Command Fields Item Description Firmware Rev: Revision of RCM firmware.
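When status output is captured from a terminal session, the fields can be pulled into a dictionary for logging or comparison. The sketch below is illustrative only: the function name is hypothetical, and it assumes the display places one "Name: value" pair per line, as in the example above.

```python
# Sample text mirrors the example display, assuming one field per line.
SAMPLE_STATUS = """RCM>status
Firmware Rev: V2.0
Escape Sequence: ^]^]RCM
Remote Access: ENABLE
Temp (C): 26.0
RCM Power Control: ON
RCM Halt: Deasserted
External Power: ON
Server Power: ON
RCM>"""


def parse_rcm_status(text):
    """Collect 'Name: value' lines into a dict, skipping RCM> prompt lines."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("RCM>"):
            continue
        name, sep, value = line.partition(":")  # split at the first colon only
        if sep:
            fields[name.strip()] = value.strip()
    return fields


status = parse_rcm_status(SAMPLE_STATUS)
print(status["Temp (C)"], status["Server Power"])
```

Values are kept as strings; converting the temperature to a number is left to the caller.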
B.4 Using the RCM Switchpack The RCM operating mode is controlled by a switchpack on the server feature module located in the fan area between the system card cage and the front of the system. Use the switches to enable or disable certain RCM functions, if desired. Figure B-2 Location of RCM Switchpack on Server Feature Module 12 34 RCM Switchpack PK1472-98 Switch Name Description 1 EN RCM Enables or disables the RCM. The default is ON (RCM enabled). The OFF setting disables RCM.
Uses of the Switchpack You can use the RCM switchpack to change the RCM operating mode or disable the RCM altogether. The following are conditions when you might want to change the factory settings. • Switch 1 (EN RCM)—Set this switch to OFF (disable) if you want to reset the baud rate of the COM1 port to a value other than the system default of 9600. You must disable RCM to select a baud rate other than 9600. • Switch 2 (Reserved)—Reserved. • Switch 3 (RPD DIS).
Resetting the RCM to Factory Defaults You can reset the RCM to factory settings, if desired. You would need to do this if you forgot the escape sequence for the RCM. Follow the steps below. 1. Turn off the system. 2. Unplug the AC power cords. NOTE: If you do not unplug the power cords, the reset will not take effect when you power up the system. 3. Remove the system covers. See Section 4.3. 4. Locate the RCM switchpack on the server feature module and set switch 4 to ON. 5.
B.5 Troubleshooting Guide Table B-3 is a list of possible causes and suggested solutions for symptoms you might see. Table B-3 RCM Troubleshooting Symptom Possible Cause Suggested Solution The local console terminal is not accepting input. Cables not correctly installed. Check external cable installation. Switch 1 on switchpack set to disable. Set switch 1 to ON. The console terminal is displaying garbage. System and terminal baud rate set incorrectly.
Appendix C Firmware Update This appendix provides instructions on updating firmware.
C.1 Updating Firmware and Consoles Start the Loadable Firmware Update (LFU) utility by issuing the lfu command at the SRM console prompt, booting it from the CD-ROM while in the SRM console, or selecting Update AlphaBIOS in the AlphaBIOS Setup screen. Example C–1 Starting LFU from the SRM Console P00>>> lfu ***** Loadable Firmware Update Utility ***** Select firmware load device (cda0, dva0, ewa0), or Press to bypass loading and proceed to LFU: cda0 . .
• From the SRM console, start LFU by issuing the lfu command (see Example C–1). Also from the SRM console, LFU can be booted from the Alpha CD-ROM (V5.4 or later), as shown in Example C–2. • From the AlphaBIOS console, select Update AlphaBIOS from the AlphaBIOS Setup screen (see Figure C–1). A typical update procedure is: 1. Start LFU. 2. Use the LFU list command to show the revisions of modules that LFU can update and the revisions of update firmware. 3.
C.1.1 Updating Firmware from the CD-ROM Insert the Alpha CD-ROM, start LFU, and select cda0 as the load device. Example C–3 Updating Firmware from the CD-ROM ***** Loadable Firmware Update Utility ***** Select firmware load device (cda0, dva0, ewa0), or Press to bypass loading and proceed to LFU: cda0 ➊ Please enter the name of the options firmware files list, or Press to use the default filename [AS1400FW]: AS1400CP ➋ Copying Copying Copying Copying AS1400CP from DKA500.5.0.1.1 .
➊ Select the device from which firmware will be loaded. The choices are the internal CD-ROM, the internal floppy disk, or a network device. In this example, the internal CD-ROM is selected. ➋ Select the file that has the firmware update, or press Enter to select the default file. The file options are: AS1400FW (default) SRM console, AlphaBIOS console, and I/O adapter firmware. AS1400CP SRM console and AlphaBIOS console firmware only. AS1400IO I/O adapter firmware only.
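The three file options described in ➋ map directly onto what needs updating. The sketch below only restates that mapping for the CD-ROM case; the function name and its parameters are hypothetical and are not part of LFU.

```python
# Hypothetical helper: picks the options firmware files list named in this
# section, based on which firmware is to be updated from the CD-ROM.
def choose_update_file(update_consoles, update_io):
    if update_consoles and update_io:
        return "AS1400FW"   # SRM console, AlphaBIOS console, and I/O adapters
    if update_consoles:
        return "AS1400CP"   # SRM console and AlphaBIOS console only
    if update_io:
        return "AS1400IO"   # I/O adapter firmware only
    raise ValueError("nothing selected to update")


print(choose_update_file(update_consoles=True, update_io=False))
```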
Example C–3 Updating Firmware from the CD-ROM (Continued)
UPD> update * ➎
WARNING: updates may take several minutes to complete for each device.
Confirm update on: AlphaBIOS [Y/(N)] y ➏
AlphaBIOS DO NOT ABORT! Updating to V6.40-1... Verifying V6.40-1... PASSED.
Confirm update on: srmflash [Y/(N)] y
srmflash DO NOT ABORT! Updating to V6.0-3... Verifying V6.0-3... PASSED.
➎ The update command updates the device specified or all devices. In this example, the wildcard indicates that all devices supported by the selected update file will be updated. ➏ For each device, you are asked to confirm that you want to update the firmware. The default is no. Once the update begins, do not abort the operation. Doing so will corrupt the firmware on the module. ➐ The exit command returns you to the console from which you entered LFU (either SRM or AlphaBIOS).
C.1.2 Updating Firmware from Floppy Disk — Creating the Diskettes Create the update diskettes before starting LFU. See Section C.1.3 for an example of the update procedure. Table C–1 File Locations for Creating Update Diskettes on a PC Console Update Diskette I/O Update Diskette AS1400FW.TXT AS1400IO.TXT AS1400CP.TXT TCREADME.SYS TCREADME.SYS CIPCA315.SYS TCSRMROM.SYS DFPAA310.SYS TCARCROM.SYS KZPAAA11.
Example C–4 Creating Update Diskettes on an OpenVMS System
Console update diskette
$ inquire ignore "Insert blank HD floppy in DVA0, then continue"
$ set verify
$ set proc/priv=all
$ init /density=hd/index=begin dva0: tcods2cp
$ mount dva0: tcods2cp
$ create /directory dva0:[as1400]
$ copy tcreadme.sys dva0:[as1400]tcreadme.sys
$ copy AS1400fw.txt dva0:[as1400]as1400fw.txt
$ copy AS1400cp.txt dva0:[as1400]as1400cp.txt
$ copy tcsrmrom.sys dva0:[as1400]tcsrmrom.sys
$ copy tcarcrom.
C.1.3 Updating Firmware from Floppy Disk — Performing the Update Insert an update diskette (see Section C.1.2) into the floppy drive. Start LFU and select dva0 as the load device.
➊ Select the device from which firmware will be loaded. The choices are the internal CD-ROM, the internal floppy disk, or a network device. In this example, the internal floppy disk is selected. ➋ Select the file that has the firmware update, or press Enter to select the default file. When the internal floppy disk is the load device, the file options are: AS1400CP (default) SRM console and AlphaBIOS console firmware only. AS1400IO I/O adapter firmware only.
Example C–5 Updating Firmware from the Floppy Disk (Continued) UPD> update pfi0 ➍ WARNING: updates may take several minutes to complete for each device. Confirm update on: pfi0 pfi0 [Y/(N)] y ➎ DO NOT ABORT! Updating to 3.10... Verifying to 3.10... PASSED.
➍ ➎ The update command updates the device specified or all devices. ➏ The lfu command restarts the utility so that console firmware can be updated. (Another method is shown in Example C–6, where the user specifies the file AS1400FW and is prompted to insert the second diskette.) ➐ The default update file, AS1400CP, is selected. The console firmware can now be updated, using the same procedure as for the I/O firmware.
C.1.4 Updating Firmware from a Network Device Copy files to the local MOP server’s MOP load area, start LFU, and select ewa0 as the load device.
Before starting LFU, download the update files from the Internet. You will need the files with the extension .SYS. Copy these files to your local MOP server’s MOP load area. ➊ Select the device from which firmware will be loaded. The choices are the CD-ROM, the internal floppy disk, or a network device. In this example, a network device is selected. ➋ Select the file that has the firmware update, or press Enter to select the default file.
Example C–7 Updating Firmware from a Network Device (Continued) UPD> update * -all ➍ WARNING: updates may take several minutes to complete for each device. AlphaBIOS DO NOT ABORT! Updating to V6.40-1... Verifying V6.40-1... PASSED. kzpsa0 DO NOT ABORT! Updating to A11 ... Verifying A11... PASSED. kzpsa1 DO NOT ABORT! Updating to A11 ... Verifying A11... PASSED. srmflash DO NOT ABORT! Updating to V6.0-3... Verifying V6.0-3... PASSED.
➍ The update command updates the device specified or all devices. In this example, the wildcard indicates that all devices supported by the selected update file will be updated. Typically LFU requests confirmation before updating each console’s or device’s firmware. The -all option eliminates the update confirmation requests. ➎ The exit command returns you to the console from which you entered LFU (either SRM or AlphaBIOS).
C.1.5 LFU Commands The commands summarized in Table C–2 are used to update system firmware. Table C–2 LFU Command Summary Command Function display Shows the physical configuration of the system. exit Terminates the LFU program. help Displays the LFU command list. lfu Restarts the LFU program. list Displays the inventory of update firmware on the selected device. readme Lists release notes for the LFU program. update Writes new firmware to the module.
display The display command shows the physical configuration of the system. Display is equivalent to issuing the SRM console command show configuration. Because it shows the slot for each module, display can help you identify the location of a device. exit The exit command terminates the LFU program, causes system initialization and testing, and returns the system to the console from which LFU was called. help The help (or ?) command displays the LFU command list, shown below.
list The list command displays the inventory of update firmware on the CD-ROM, network, or floppy. Only the devices listed at your terminal are supported for firmware updates. The list command shows three pieces of information for each device: • Current Revision — The revision of the device’s current firmware • Filename — The name of the file used to update that firmware • Update Revision — The revision of the firmware update image readme The readme command lists release notes for the LFU program.