AlphaServer 1000 Service Guide Order Number: EK–DTLSV–SV.
First Printing, February 1995 Second Printing, July 1995 Digital Equipment Corporation makes no representations that the use of its products in the manner described in this publication will not infringe on existing or future patent rights, nor do the descriptions contained in this publication imply the granting of licenses to make, use, or sell equipment or software in accordance with the description.
Preface This guide describes the procedures and tests used to service AlphaServer 1000 systems. AlphaServer 1000 systems use a deskside ‘‘wide-tower’’ enclosure. Intended Audience This guide is intended for use by Digital Equipment Corporation service personnel and qualified self-maintenance customers.
Convention Meaning Return A key name enclosed in a box indicates that you press that key. Ctrl/x Ctrl/x indicates that you hold down the Ctrl key while you press another key, indicated here by x. In examples, this key combination is enclosed in a box, for example, Ctrl/C . Warning Warnings contain information to prevent personal injury. Caution Cautions provide information to prevent damage to equipment or software.
1 Troubleshooting Strategy This chapter describes the troubleshooting strategy for AlphaServer 1000 systems. • Section 1.1 provides questions to consider before you begin troubleshooting an AlphaServer 1000 system. • Tables 1–1 through 1–5 provide a diagnostic flow for each category of system problem. • Section 1.2 lists the product tools and utilities. • Section 1.3 lists available information services. 1.
1.1.1 Problem Categories System problems can be classified into the following five categories. Using these categories, you can quickly determine a starting point for diagnosis and eliminate the unlikely sources of the problem. 1. Power problems (Table 1–1) 2. No access to console mode (Table 1–2) 3. Console-reported failures (Table 1–3) 4. Boot failures (Table 1–4) 5.
Table 1–1 Diagnostic Flow for Power Problems Symptom Action System does not power on. Power supply shuts down after a few seconds (fan failure). • Check the power source and power cord. • Check that the system’s top cover is properly secured. A safety interlock switch shuts off power to the system if the top cover is removed. • If there are two power supplies, make sure both power supplies are plugged in. • Check the On/Off switch setting on the operator control panel.
Table 1–2 Diagnostic Flow for Problems Getting to Console Mode Symptom Action Power-up screen is not displayed. Interpret the error beep codes at power-up (Section 2.1) for a failure detected during self-tests. Check that the keyboard and monitor are properly connected and turned on. If the power-up screen is not displayed, yet the system enters console mode when you press Return , check that the console environment variable is set correctly.
Table 1–3 Diagnostic Flow for Problems Reported by the Console Program Symptom Action Power-up tests do not complete. Interpret the error beep codes at power-up (Section 2.1) and check the power-up screen (Section 2.2) for a failure detected during self-tests. If the power-up display stops on e6, an EISA or PCI board is causing the system to hang. Console program reports error: • Error beep codes report an error at power-up. • Power-up screen includes error messages.
Table 1–4 Diagnostic Flow for Boot Problems Symptom Action System cannot find boot device. Check the system configuration for the correct device parameters (node ID, device name, and so on). • For DEC OSF/1 and OpenVMS, use the show config and show device commands (Section 5.1). • For Windows NT, use the Display Hardware Configuration display and the Set Default Environment Variables display (Section 5.1). Check the system configuration for the correct environment variable settings.
Table 1–5 Diagnostic Flow for Errors Reported by the Operating System Symptom Action System is hung or has crashed. Examine the crash dump file. Refer to the OpenVMS AXP Alpha System Dump Analyzer Utility Manual for information on how to interpret OpenVMS crash dump files. Refer to the Guide to Kernel Debugging (AA–PS2TA–TE) for information on using the DEC OSF/1 Krash Utility. Errors have been logged and the operating system is up. Examine the operating system error log files to isolate the problem.
RECOMMENDED USE: ROM-based diagnostics are the primary means of testing the console environment and diagnosing the CPU, memory, Ethernet, I/O buses, and SCSI subsystems. Use ROM-based diagnostics in the acceptance test procedures when you install a system, add a memory module, or replace the following: CPU module, memory module, motherboard, I/O bus device, or storage device. Refer to Chapter 3 for information on running ROM-based diagnostics.
Crash Dumps For fatal errors, such as fatal bugchecks, DEC OSF/1 and OpenVMS operating systems will save the contents of memory to a crash dump file. RECOMMENDED USE: Crash dump files can be used to determine why the system crashed. To save a crash dump file for analysis, you need to know the proper system settings. Refer to the OpenVMS AXP Alpha System Dump Analyzer Utility Manual (AA-PV6UB-TE) or the Guide to Kernel Debugging (AA–PS2TA–TE) for DEC OSF/1. 1.
2 Power-Up Diagnostics and Display This chapter provides information on how to interpret error beep codes and the power-up display on the console screen. In addition, a description of the power-up and firmware power-up diagnostics is provided as a resource to aid in troubleshooting. • Section 2.1 describes how to interpret error beep codes at power-up. • Section 2.1.1 describes SROM memory tests that can be run at power-up to isolate failing SIMM memory. • Section 2.
2.1 Interpreting Error Beep Codes If errors are detected at power-up, audible beep codes are emitted from the system. For example, if the SROM code could not find any good memory, you would hear a 1-3-3 beep code (one beep, a pause, a burst of three beeps, a pause, and another burst of three beeps). The beep codes are the primary diagnostic tool for troubleshooting problems when console mode cannot be accessed. Refer to Table 2–1 for information on interpreting error beep codes.
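The way a beep code is read aloud can be sketched in a few lines of Python. This is only an illustration of the notation described above (bursts of beeps separated by pauses); the function name is hypothetical and is not part of any Digital tool.

```python
def describe_beep_code(code):
    """Spell out a beep code such as "1-3-3" as bursts separated by pauses."""
    # Each number between hyphens is the count of beeps in one burst.
    bursts = [int(part) for part in code.split("-")]
    if any(count < 1 for count in bursts):
        raise ValueError("not a valid beep code: %r" % code)
    phrases = ["1 beep" if count == 1 else "burst of %d beeps" % count
               for count in bursts]
    return ", pause, ".join(phrases)


print(describe_beep_code("1-3-3"))
```

For the 1-3-3 code, the sketch spells out one beep, a pause, a burst of three beeps, a pause, and another burst of three beeps, matching the description in the text.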
Table 2–1 (Cont.) Interpreting Error Beep Codes Beep Code Problem Corrective Action 1-3-3 No usable memory detected. 1. Verify that the memory modules are properly seated and try powering up again. 2. Swap bank 0 memory with known good memory and run SROM memory tests at power-up (Section 2.1.1). 3. If populating bank 0 with known good memory does not solve the problem, replace the CPU daughter board (Chapter 6). 4.
Table 2–2 SROM Memory Tests, CPU Jumper J1 Bank # 6 Test Description Test Results Backup Cache Tag Test Test status displays on OCP: 1.2.3.done. If the tests take longer than a few seconds between each number displayed in the test count, there is a problem with the cache—replace the CPU daughter board (Chapter 6). 2 Cache Test: Tests backup cache. Test status displays on OCP: ....done.
Table 2–2 (Cont.) SROM Memory Tests, CPU Jumper J1 Bank # 5 Test Description Test Results Memory Test, Cache Enabled: Tests memory with backup and data cache enabled. Test status displays on OCP: 12345.done. If an error is detected, the bank number and failing SIMM position are displayed. The following OCP message indicates a failing SIMM at bank 0, SIMM position 2. FAIL B:0 S:2 Test duration: Approximately 2 seconds per 8 megabytes of memory.
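When OCP messages are being logged by hand, the failure message and the test duration estimate can be worked out as in the following Python sketch. It assumes the message always has the form shown above (FAIL B:bank S:position) and treats the duration figure as a simple linear estimate; both helper names are hypothetical.

```python
import re

# Pattern for the OCP failure text shown in Table 2-2, e.g. "FAIL B:0 S:2".
OCP_FAIL = re.compile(r"FAIL\s+B:(\d+)\s+S:(\d+)")


def parse_ocp_failure(message):
    """Return (bank, simm_position) from an OCP failure message, or None."""
    match = OCP_FAIL.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def estimated_test_seconds(memory_mb):
    """Estimate test time at roughly 2 seconds per 8 MB of memory."""
    return memory_mb / 8 * 2


print(parse_ocp_failure("FAIL B:0 S:2"))
print(estimated_test_seconds(48))
```

With the sample message, the sketch reports bank 0, SIMM position 2; a 48-MB system works out to about 12 seconds.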
Figure 2–1 Jumper J1 on the CPU Daughter Board J1 0 1 2 3 4 5 6 7 MA00328 Bank Jumper Setting 0 Standard boot setting (default) 1 Mini-console setting: Internal use only 2 SROM CacheTest: backup cache test 3 SROM BCacheTest: backup cache and memory test 4 SROM memTest: memory test with backup and data cache disabled 5 SROM memTestCacheOn: memory test with backup and data cache enabled 6 SROM BCache Tag Test: backup cache tag test 7 Fail-Safe Loader setting: selects fail-safe loader firmwa
Figure 2–2 AlphaServer 1000 Memory Layout Bank 3 Bank 2 Bank 1 Bank 0 ECC Banks SIMM 1 SIMM 3 SIMM 0 SIMM 2 SIMM 1 SIMM 3 SIMM 0 SIMM 2 SIMM 1 SIMM 3 SIMM 0 SIMM 2 SIMM 1 SIMM 3 SIMM 0 SIMM 2 ECC SIMM for Bank 2 ECC SIMM for Bank 3 ECC SIMM for Bank 0 ECC SIMM for Bank 1 MA00327 2.2 Power-Up Screen During power-up self-tests, the test status and result are displayed on the console terminal. Information similar to the following example should be displayed on the screen. ff.fe.fd.fc.fb.
Windows NT Systems The Windows NT operating system is supported by the ARC firmware (see Section 5.1.1). Systems using Windows NT power up to the ARC boot menu as follows: ARC Multiboot Alpha AXP Version n.nn Copyright (c) 1994 Microsoft Corporation Copyright (c) 1994 Digital Equipment Corporation Boot menu: Boot Windows NT Boot an alternate operating system Run a program Supplementary menu... Use the arrow keys to select, then press Enter. 2.2.
2.3 Mass Storage Problems Indicated at Power-Up Mass storage failures at power-up are usually indicated by read fail messages. Other problems are indicated by storage devices missing from the show config display. • Table 2–3 provides information for troubleshooting mass storage problems indicated at power-up or storage devices missing from the show config display. • Table 2–4 provides troubleshooting tips for AlphaServer systems that use the SWXCR-xx controller. • Section 2.
Table 2–3 (Cont.) Mass Storage Problems Problem Symptom Corrective Action SCSI bus length exceeded Drives may disappear intermittently from the show config and show device displays. A SCSI bus extended to the internal StorageWorks shelf with the backplane configured as a single bus, cannot be extended outside of the enclosure. A SCSI bus extended to the internal StorageWorks shelf with the backplane configured as a dual bus, can be extended 1 meter outside of the enclosure.
Table 2–4 Troubleshooting Problems with SWXCR-xx RAID Controller Symptom Action Some RAID drives do not appear on the show device d display. Valid configured RAID logical drives will appear as DRA0–DRAn, not as DKn. Configure the drives by running the RAID Configuration Utility (RCU). Follow the instructions in the StorageWorks RAID Array 200 Subsystem Family Installation and Configuration Guide, EK-SWRA2-IG. Reminder: Several physical disks can be grouped as a single logical DRAn device.
For information on other storage devices, refer to the documentation provided by the manufacturer or vendor.
Figure 2–5 CD–ROM Drive Activity LED
2.5 EISA Bus Problems Indicated at Power-Up EISA bus failures at power-up are usually indicated by the following messages displayed during power-up: EISA Configuration Error. Run the EISA Configuration Utility. Run the EISA Configuration Utility (ECU) (Section 5.4) when this message is displayed. Other problems are indicated by EISA devices missing from the show config display. Table 2–5 provides steps for troubleshooting EISA bus problems that persist after you run the ECU.
2.5.1 Additional EISA Troubleshooting Tips The following tips can aid in isolating EISA bus problems. • Peripheral device controllers need to be seated (inserted) carefully, but firmly, into their slot to make all necessary contacts. Improper seating is a common source of problems for EISA modules. • Be sure you run the correct version of ECU for the operating system. For Windows NT, use ECU diskette DECpc AXP (AK-PYCJ*-CA); for DEC OSF/1 and OpenVMS, use ECU diskette DECpc AXP (AK-Q2CR*-CA).
2.6 PCI Bus Problems Indicated at Power-Up PCI bus failures at power-up are usually indicated by the inability of the system to see the device. Table 2–6 provides steps for troubleshooting PCI bus problems. Use the table to diagnose the likely cause of the problem. Note Some PCI devices do not implement PCI parity, and some have a parity-generating scheme in which parity is sometimes incorrect or is not compliant with the PCI Specification.
2.7 Fail-Safe Loader The fail-safe loader (FSL) allows you to attempt to recover when one of the following is the cause of a problem getting to the console program under normal power-up: • A power failure or accidental power down during a firmware upgrade • A checksum failure or flash ROM header error while the SROM code is trying to load the SRM/ARC console firmware Note The fail-safe loader should be used only when a failure at power-up prohibits you from getting to the console program.
Figure 2–6 Jumper J1 on the CPU Daughter Board J1 0 1 2 3 4 5 6 7 MA00328 Bank Jumper Setting 0 Standard boot setting (default) 1 Mini-console setting: Internal use only 2 SROM CacheTest: backup cache test 3 SROM BCacheTest: backup cache and memory test 4 SROM memTest: memory test with backup and data cache disabled 5 SROM memTestCacheOn: memory test with backup and data cache enabled 6 SROM BCache Tag Test: backup cache tag test 7 Fail-Safe Loader setting: selects fail-safe loader firmwa
2.8 Power-Up Sequence During the AlphaServer 1000 power-up sequence, the power supplies are stabilized and the system is initialized and tested through the firmware power-on self-tests. The power-up sequence includes the following:
• Power supply power-up:
  – AC power-up
  – DC power-up
• Two sets of power-on diagnostics:
  – Serial ROM diagnostics
  – Console firmware-based diagnostics
Caution The AlphaServer 1000 enclosure will not power up if the top cover is not securely attached.
2.8.2 DC Power-Up Sequence DC power is applied to the system with the DC On/Off button on the operator control panel. A summary of the DC power-up sequence follows: 1. When the DC On/Off button is pressed, the power supply checks for a POK_H condition. 2. 12V, 5V, 3.3V, and -12V outputs are energized and stabilized. If the outputs do not come into regulation, the power-up is aborted and the power supply enters the latching-shutdown mode. 2.
4. Configure the memory in the system and test only the first 4 MB of memory. If there is more than one memory module of the same size, the lowest numbered memory module (one closest to the CPU) is tested first. If the memory test fails, the failing bank is mapped out and memory is reconfigured and re-tested. Testing continues until good memory is found. If good memory is not found, an error beep code (1-3-3) is generated and the power-up tests are terminated. 5.
5. Enter console mode or boot the operating system. This action is determined by the auto_action environment variable. If the os_type environment variable is set to NT, the ARC console is loaded into memory, and control is passed to the ARC console.
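The choice of console at the end of power-up can be summarized with a short Python sketch. It only restates the rule given here and in Table 5–4 (os_type set to nt selects the ARC console; vms or osf selects the SRM console); the function name is hypothetical and the sketch does not model the auto_action variable.

```python
def console_for_os_type(os_type):
    """Return which console firmware takes control for a given os_type value."""
    value = os_type.strip().lower()
    if value == "nt":
        return "ARC"   # Windows NT systems power up to the ARC boot menu
    if value in ("vms", "osf"):
        return "SRM"   # OpenVMS and DEC OSF/1 systems use the SRM console
    raise ValueError("unexpected os_type value: %r" % os_type)


print(console_for_os_type("NT"))
print(console_for_os_type("osf"))
```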
3 Running System Diagnostics This chapter provides information on how to run system diagnostics. • Section 3.1 describes how to run ROM-based diagnostics, including error reporting utilities and loopback tests. • Section 3.4 describes acceptance testing and initialization procedures. • Section 3.5 describes the DEC VET operating system exerciser. 3.
3.2 Command Summary Table 3–1 provides a summary of the diagnostic and related commands. Table 3–1 Summary of Diagnostic and Related Commands Command Function Reference Acceptance Testing test Quickly tests the core system. The test command is the primary diagnostic for acceptance testing and console environment diagnosis. Section 3.3.1 cat el Displays the console event log. Section 3.3.2 more el Displays the console event log one screen at a time. Section 3.3.
Table 3–1 (Cont.) Summary of Diagnostic and Related Commands Command Function Reference test lb Conducts loopback tests for COM2 and the parallel port in addition to quick core system tests. Section 3.3.1 netew Runs external MOP loopback tests for specified EISA- or PCI-based ew* (DECchip 21040, TULIP) Ethernet ports. Section 3.3.4 network Runs external MOP loopback tests for specified EISA- or PCI-based er* (DEC 4220, LANCE) Ethernet ports. Section 3.3.
3.3.1 test The test command runs firmware diagnostics for the entire core system. The tests are run concurrently in the background. Fatal errors are reported to the console terminal. The cat el command should be used in conjunction with the test command to examine test/error information reported to the console event log. Because the tests are run concurrently and indefinitely (until you stop them with the kill_diags command), they are useful in flushing out intermittent hardware problems.
5. VGA console tests. These tests are run only if the console environment variable is set to ‘‘serial.’’ The VGA console test displays rows of the letter ‘‘H’’. Synopsis: test [lb] Argument: [lb] The loopback option includes console loopback tests for the COM2 serial port and the parallel port during the test sequence. Examples: The system is tested and the tests complete successfully. Note Examine the console event log after running tests.
ID       Program      Device         Pass Hard/Soft Bytes Written    Bytes Read
-------- ------------ ------------ ------ --------- ------------- -------------
00000001 idle         system            0    0    0             0             0
0000002d exer_kid     tta1              0    0    0             1             0
0000003d nettest      era0.0.0.2.1     43    0    0          1376          1376
00000045 memtest      memory            7    0    0     424673280     424673280
00000052 exer_kid     dka100.1.0.6      0    0    0             0       2688512
00000053 exer_kid     dka200.2.0.6      0    0    0             0        922624
>>> kill_diags
>>>
The system is tested and the system reports a fatal error message.
3.3.2 cat el and more el The cat el and more el commands display the current contents of the console event log. Status and error messages (if problems occur) are logged to the console event log at power-up, during normal system operation, and while running system tests. Standard error messages are indicated by asterisks (***). When cat el is used, the contents of the console event log scroll by. You can use the Ctrl/S combination to stop the screen from scrolling, Ctrl/Q to resume scrolling.
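If the console event log has been captured to a file from a serial console session, the error entries can be picked out by their asterisks. The Python sketch below assumes only that error lines contain the *** marker described above; the sample text is a made-up placeholder, not actual console output, and the function name is hypothetical.

```python
def error_lines(event_log_text):
    """Return the lines of a captured event log that carry the *** marker."""
    return [line for line in event_log_text.splitlines() if "***" in line]


# Placeholder text standing in for a captured log (not real console output).
captured = "status line one\n*** placeholder error text\nstatus line two"
for line in error_lines(captured):
    print(line)
```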
3.3.3 memory The memory command tests memory by running a memory exerciser each time the command is entered. The exercisers are run in the background and nothing is displayed unless an error occurs. The number of exercisers, as well as the length of time for testing, depends on the context of the testing. Generally, running three to five exercisers for 15 minutes to 1 hour is sufficient for troubleshooting most memory problems.
Example with a memory compare error indicating bad SIMMs.
3.3.4 netew The netew command is used to run MOP loopback tests for any EISA- or PCI-based ew* (DECchip 21040, TULIP) Ethernet ports. The command can also be used to test a port on a ‘‘live’’ network. The loopback tests are set to run continuously (-p pass_count set to 0). Use the kill command (or Ctrl/C) to terminate an individual diagnostic or the kill_diags command to terminate all diagnostics. Use the show_status display to determine the process ID when terminating an individual diagnostic test.
Testing an Ethernet Port: >>> netew >>> show_status ID Program -------- -----------00000001 idle 000000d5 nettest >>> kill_diags >>> Device Pass Hard/Soft Bytes Written Bytes Read ------------ ------ --------- ------------- ------------system 0 0 0 0 0 ewa0.0.0.0.
3.3.5 network The network command is used to run MOP loopback tests for any EISA- or PCI-based er* (DEC 4220, LANCE) Ethernet ports. The command can also be used to test a port on a ‘‘live’’ network. The loopback tests are set to run continuously (-p pass_count set to 0). Use the kill command (or Ctrl/C) to terminate an individual diagnostic or the kill_diags command to terminate all diagnostics. Use the show_status display to determine the process ID when terminating an individual diagnostic test.
Testing an Ethernet Port: >>> network >>> show_status ID Program -------- -----------00000001 idle 000000d5 nettest >>> kill_diags >>> Device Pass Hard/Soft Bytes Written Bytes Read ------------ ------ --------- ------------- ------------system 0 0 0 0 0 era0.0.0.0.
3.3.6 net -s The net -s command displays the MOP counters for the specified Ethernet port.
3.3.7 net -ic The net -ic command initializes the MOP counters for the specified Ethernet port.
3.3.8 kill and kill_diags The kill and kill_diags commands terminate diagnostics that are currently executing. Note A serial loopback connector (12-27351-01) must be installed on the COM2 serial port for the kill_diags command to successfully terminate system tests. • The kill command terminates a specified process. • The kill_diags command terminates all diagnostics. Synopsis: kill_diags kill [PID . . . ] Argument: [PID . . . ] The process ID of the diagnostic to terminate.
3.3.9 show_status The show_status command reports one line of information per executing diagnostic. The information includes ID, diagnostic program, device under test, error counts, passes completed, bytes written, and bytes read. Many of the diagnostics run in the background and provide information only if an error occurs. Use the show_status command to display the progress of diagnostics.
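A show_status display captured from a serial console can be turned into records for review, as in the following Python sketch. It assumes the eight-column layout of the example in Section 3.3.1 (ID, program, device, passes, hard errors, soft errors, bytes written, bytes read) and that the ID is hexadecimal; the record and function names are hypothetical.

```python
from typing import NamedTuple


class DiagStatus(NamedTuple):
    pid: int            # process ID (shown in hexadecimal on the console)
    program: str
    device: str
    passes: int
    hard_errors: int
    soft_errors: int
    bytes_written: int
    bytes_read: int


def parse_show_status(text):
    """Parse data rows of a captured show_status display; skip other lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 8:
            continue  # header, separator, and prompt lines do not have 8 fields
        try:
            pid = int(fields[0], 16)
            counts = [int(field) for field in fields[3:]]
        except ValueError:
            continue
        rows.append(DiagStatus(pid, fields[1], fields[2], *counts))
    return rows


# Two rows taken from the example in Section 3.3.1.
SAMPLE = """\
00000001 idle system 0 0 0 0 0
0000003d nettest era0.0.0.2.1 43 0 0 1376 1376
"""
for row in parse_show_status(SAMPLE):
    print(row.program, row.device, row.passes, row.hard_errors + row.soft_errors)
```

A nonzero hard or soft error count for a device is the cue to examine the console event log with cat el.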
3.4 Acceptance Testing and Initialization Perform the acceptance testing procedure listed below after installing a system or whenever adding or replacing the following:
Memory modules
Motherboard
CPU daughter board
Storage devices
EISA or PCI options
1. Run the RBD acceptance tests using the test command.
2. If you have added, moved, or removed an EISA or ISA option, run the EISA Configuration Utility (ECU).
3. Bring up the operating system.
4.
4 Error Log Analysis This chapter provides information on how to interpret error logs reported by the operating system. • Section 4.1 describes machine check/interrupts and how these errors are detected and reported. • Section 4.2 describes the entry format used by the error formatters. • Section 4.3 describes how to generate a formatted error log using the DECevent Translation and Reporting Utility available with OpenVMS and Digital UNIX. 4.
Table 4–1 AlphaServer 1000 Fault Detection and Correction Component Fault Detection/Correction Capability KN22A Processor Module DECchip 21064 and 21064A microprocessors Contains error detection and correction (EDC) logic for data cycles. There are check bits associated with all data entering and exiting the 21064(A) microprocessor. A singlebit error on any of the four longwords being read can be corrected (per cycle).
Processor Machine Check (SCB: 670) Processor machine check errors are fatal system errors that result in a system crash. The error handling code for these errors is common across all platforms using the DECchip 21064 and 21064A microprocessors.
• B-cache tag address parity error • B-cache tag control parity error • Non-existent memory error • ESC NMI: IOCHK Processor-Corrected Machine Check (SCB: 630) Processor-corrected machine checks are caused by B-cache errors that are detected and corrected by the DECchip 21064 or 21064A microprocessor. These are nonfatal errors that result in an error log entry. The error handling code for these errors is common across all platforms using the DECchip 21064 and 21064A microprocessors.
4.3 Event Record Translation Systems running Digital UNIX and OpenVMS operating systems use the DECevent management utility to translate events into ASCII reports derived from system event entries (bit-to-text translations).
4.3.2 Digital UNIX Translation Using DECevent The kernel error log entries are translated from binary to ASCII using the dia command. To invoke the DECevent utility, enter the dia command. Format: dia [-a -f infile[ . . . ]] Example: % dia -t s:14-jun-1995:10:00 For more information on generating error log reports using DECevent, refer to DECevent Translation and Reporting Utility for Digital UNIX User and Reference Guide.
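When a start time has to be supplied repeatedly, the time argument in the example above can be generated rather than typed. The Python sketch below only builds the string; it assumes the argument follows the pattern seen in the example (s: followed by day, lowercase three-letter month, year, hour, and minute) and that days are zero-padded. Check the DECevent guide for the authoritative syntax. The function name is hypothetical.

```python
from datetime import datetime

# Lowercase month abbreviations, as in the example "s:14-jun-1995:10:00".
MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")


def since_argument(when):
    """Format a datetime in the style of the -t argument shown in the example."""
    return "s:{:02d}-{}-{:04d}:{:02d}:{:02d}".format(
        when.day, MONTHS[when.month - 1], when.year, when.hour, when.minute)


print("dia -t " + since_argument(datetime(1995, 6, 14, 10, 0)))
```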
5 System Configuration and Setup This chapter provides configuration and setup information for AlphaServer 1000 systems and system options. • Section 5.1 describes how to examine the system configuration using the console firmware. – Section 5.1.1 describes the function of the two firmware interfaces used with AlphaServer 1000 systems. – Section 5.1.2 describes how to switch between firmware interfaces. – Sections 5.1.3 and 5.1.
5.1 Verifying System Configuration Figure 5–1 illustrates the system architecture for AlphaServer 1000 systems.
SRM Command Line Interface Systems running DEC OSF/1 or OpenVMS access the SRM firmware through a command line interface (CLI). The CLI is a UNIX-style shell that provides a set of commands and operators, as well as a scripting facility. The CLI allows you to configure and test the system, examine and alter system state, and boot the operating system. The SRM console prompt is >>>.
5.1.2 Switching Between Interfaces For a few procedures it is necessary to switch from one console interface to the other. • The test command is run from the SRM interface. • The EISA Configuration Utility (ECU) and the RAID Configuration Utility (RCU) are run from the ARC interface. Switching from SRM to ARC Two SRM console commands are used to switch to the ARC console: • The arc command loads the ARC firmware and switches to the ARC menu interface.
5.1.3.1 Display Hardware Configuration The hardware configuration display provides the following information: • The first screen displays the boot devices. • The second screen displays processor information, the amount of memory installed, and the type of video card installed. • The third and fourth screens display information about the adapters installed in the system’s EISA and PCI slots. Table 5–1 lists the steps to view the hardware configuration display.
Table 5–2 (Cont.) ARC Firmware Device Names Name Description scsi(0)disk(0)rdisk(0) scsi(0)cdrom(5)fdisk(0) The scsi( ) devices are SCSI disk or CD–ROM devices. These examples represent installed SCSI devices. The disk drives are set to SCSI ID 0, and the CD–ROM drive is set to SCSI ID 5. The devices have logical unit numbers of 0.
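The structure of an ARC device name, a chain of keyword(number) components, can be illustrated with a short Python sketch. It handles the names shown in this section, including components whose parentheses are left empty; the function name is hypothetical and the sketch is not a full ARC name parser.

```python
import re

# One component of an ARC name: a keyword followed by an optional number
# in parentheses, for example "scsi(0)" or "eisa( )".
COMPONENT = re.compile(r"([a-z]+)\(\s*(\d*)\s*\)")


def parse_arc_name(name):
    """Split an ARC device name into (keyword, number or None) pairs."""
    parts = []
    position = 0
    for match in COMPONENT.finditer(name):
        if match.start() != position:
            raise ValueError("unexpected text in ARC name: %r" % name)
        number = int(match.group(2)) if match.group(2) else None
        parts.append((match.group(1), number))
        position = match.end()
    if not parts or position != len(name):
        raise ValueError("unexpected text in ARC name: %r" % name)
    return parts


print(parse_arc_name("scsi(0)cdrom(5)fdisk(0)"))
```

For the CD–ROM example, the middle component carries the SCSI ID (5) and the last component carries the logical unit number (0).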
Example 5–1 (Cont.) Sample Hardware Configuration Display Slot 0 1 2 5 6 7 0 Device Other Disk Network Network Network Display Disk Identifier DEC2A01 ADP0001 DEC4220 DEC3002 DEC4250 CPQ3011 FLOPPY Press any key to continue... Wednesday, 8-31-1994 10:51:32 AM PCI slot information: Bus Virtual Slot Function Vendor Device Revision Device type 0 6 0 1000 1 1 SCSI 0 7 0 8086 482 3 EISA bridge 0 7 0 1011 2 23 Ethernet Press any key to continue... Extended Firmware Information: Version: 4.1-19950117.
Table 5–3 lists and explains the default ARC firmware environment variables. Table 5–3 ARC Firmware Environment Variables Variable Description A: The default floppy drive. The default value is eisa( )disk( )fdisk( ). AUTOLOAD The default startup action, either YES (boot) or NO or undefined (remain in Windows NT firmware). CONSOLEIN The console input device. The default value is multi( )key( )keyboard( )console( ). CONSOLEOUT The console output device.
5.1.4 Verifying Configuration: SRM Console Commands for DEC OSF/1 and OpenVMS The following SRM console commands are used to verify system configuration on DEC OSF/1 and OpenVMS systems:
• show config (Section 5.1.4.1): Displays the buses on the system and the devices found on those buses.
• show device (Section 5.1.4.2): Displays the devices and controllers in the system.
• show memory (Section 5.1.4.3): Displays main memory configuration.
• set and show (Section 5.1.4.
Synopsis: show config
Example:
>>> show config
Firmware
SRM Console:    V1.1-1
ARC Console:    3.5-14
PALcode:        VMS PALcode X5.55, OSF PALcode X1.35-53
Serial Rom:     1.1

Processor
DECchip (tm) 21064-2

MEMORY
48 Meg of System Memory
Bank 0 = 16 Mbytes(4 MB Per Simm) Starting at 0x00000000
Bank 1 = 16 Mbytes(4 MB Per Simm) Starting at 0x01000000
Bank 2 = 16 Mbytes(4 MB Per Simm) Starting at 0x02000000
Bank 3 = No Memory Detected

PCI Bus
Bus 00 Slot 06: NCR Bus 00 Slot 07: Intel 810 Scsi Controller pka0.7.0.6.
5.1.4.2 show device The show device command displays the devices and controllers in the system. The device name convention is shown in Figure 5–2. Figure 5–2 Device Name Convention dka0.0.0.0.0 Hose Number: 0 PCI_0 (32-bit PCI); 1 EISA; 2 PCI_1 Slot Number: For EISA options---Correspond to EISA card cage slot numbers (1--*) For PCI options---Slot 0 = Ethernet adapter (EWA0) or reserved on AlphaServer 2000 systems.
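The dotted fields of a console device name can be separated with a short Python sketch. The meaning of the last two fields (slot number, then hose number) and the hose values are taken from Figure 5–2. The interpretation of the remaining fields (two-letter driver ID, adapter letter, unit number, bus node number, channel number) is an assumption based on the usual SRM naming pattern and should be checked against the complete figure. The function name is hypothetical.

```python
import re

# Hose numbers as listed in Figure 5-2.
HOSES = {0: "PCI_0 (32-bit PCI)", 1: "EISA", 2: "PCI_1"}

# Assumed layout of the first field: driver ID, adapter letter, unit number.
HEAD = re.compile(r"^([a-z]{2})([a-z])(\d+)$")


def parse_srm_device(name):
    """Split a console device name such as dka400.4.0.6.0 into its fields."""
    head, *tail = name.lower().split(".")
    match = HEAD.match(head)
    if match is None or len(tail) != 4 or not all(t.isdigit() for t in tail):
        raise ValueError("unrecognized device name: %r" % name)
    node, channel, slot, hose = (int(t) for t in tail)
    return {
        "driver": match.group(1),     # assumed: driver ID
        "adapter": match.group(2),    # assumed: adapter letter
        "unit": int(match.group(3)),  # assumed: device unit number
        "node": node,                 # assumed: bus node number
        "channel": channel,           # assumed: channel number
        "slot": slot,                 # from Figure 5-2
        "hose": hose,                 # from Figure 5-2
        "bus": HOSES.get(hose, "unknown"),
    }


print(parse_srm_device("dka400.4.0.6.0"))
```

Under these assumptions, dka400.4.0.6.0 resolves to slot 6 on hose 0 (PCI_0), which agrees with the SCSI controller reported at PCI slot 06 in the show config example.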
Example:
>>> show device
dka400.4.0.6.0     DKA400     RRD43  2893
dva0.0.0.0.1       DVA0
era0.0.0.2.1       ERA0       08-00-2B-BC-93-7A
pka0.7.0.6.0       PKA0       SCSI Bus ID 7
>>>
The columns, from left to right, are: console device name; node name (alphanumeric, up to 6 characters); device type; firmware version (if known).
5.1.4.3 show memory The show memory command displays information for each bank of memory in the system.
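The bank starting addresses reported by the console follow directly from the bank sizes. The Python sketch below reproduces the addresses in the show config example of Section 5.1.4.1 (three 16-MB banks and one empty bank), assuming the populated banks are mapped contiguously in ascending bank order. That ordering is an assumption for illustration; the firmware may arrange mixed-size banks differently. The function name is hypothetical.

```python
MEGABYTE = 1024 * 1024


def bank_start_addresses(bank_sizes_mb):
    """Return the starting address of each bank, or None for an empty bank."""
    addresses = []
    next_start = 0
    for size_mb in bank_sizes_mb:
        if size_mb == 0:
            addresses.append(None)   # reported as "No Memory Detected"
            continue
        addresses.append(next_start)
        next_start += size_mb * MEGABYTE
    return addresses


for bank, start in enumerate(bank_start_addresses([16, 16, 16, 0])):
    label = "No Memory Detected" if start is None else "0x%08x" % start
    print("Bank %d: %s" % (bank, label))
```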
show envar Arguments: envar The name of the environment variable to be modified. value The value that is assigned to the environment variable. This may be an ASCII string. Options: -default Restores variable to its default value. -integer Creates variable as an integer. -string Creates variable as a string (default).
Table 5–4 (Cont.) Environment Variables Set During System Configuration

Variable: bootdef_dev
Attributes: NV
Function: The device or device list from which booting is to be attempted, when no path is specified on the command line. Set at factory to disk with Factory Installed Software; otherwise NULL.

Variable: boot_file
Attributes: NV,W
Function: The default file name used for the primary bootstrap when no file name is specified by the boot command. The default value when the system is shipped is NULL.

Variable: console
Attributes: NV
Function: Sets the device on which power-up output is displayed.
  GRAPHICS: Sets the power-up output to be displayed at a graphics terminal or device connected to the VGA module at the rear of the system.
  SERIAL: Sets the power-up output to be displayed on the device that is connected to the COM1 port at the rear of the system.

Variable: os_type
Attributes: NV
Function: Sets the default operating system.
  "vms" or "osf": Sets system to boot the SRM firmware.
  "nt": Sets system to boot the ARC firmware.

Variable: pci_parity
Attributes: NV
Function: Disables or enables parity checking on the PCI bus.
  "on": Enables parity checking for all devices on the PCI bus.
  "off": Disables parity checking for all devices on the PCI bus.
Note
Whenever you use the set command to reset an environment variable, you must initialize the system to put the new setting into effect. Initialize the system by entering the init command or pressing the Reset button.

5.2 System Bus Options
The system bus interconnects the CPU and memory modules. Figure 5–3 shows the card cage and bus locations.
Figure 5–3 Card Cages and Bus Locations
(Callouts: VGA Jumper J27; Bank 3, Bank 2, Bank 1, Bank 0; ECC Banks; CPU Daughter Board; PCI Option Slots; PCI or EISA/ISA Option Slots; EISA/ISA Option Slots; NVRAM Chip (E14); NVRAM TOY Clock Chip (E78))

Note
If the top EISA connector is used (slot 8), the bottom PCI slot (slot 11) cannot be used. If the bottom PCI slot is used, the top EISA slot cannot be used.
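The shared-slot restriction in the note can be expressed as a simple check. The following Python sketch is illustrative only: the function name `check_slot_usage` is hypothetical, and the slot numbering (EISA slots 1 through 8, PCI slots 11 through 13) is taken from the motherboard connector list in this chapter.

```python
EISA_SLOTS = range(1, 9)   # EISA/ISA option slots 1-8
PCI_SLOTS = (11, 12, 13)   # PCI option slots 11, 12, and 13
SHARED_PAIR = (8, 11)      # top EISA slot and bottom PCI slot


def check_slot_usage(used_slots):
    """Return a list of problems found in a set of occupied slot numbers."""
    problems = []
    for slot in sorted(set(used_slots)):
        if slot not in EISA_SLOTS and slot not in PCI_SLOTS:
            problems.append(f"slot {slot} is not an EISA or PCI option slot")
    eisa_top, pci_bottom = SHARED_PAIR
    if eisa_top in used_slots and pci_bottom in used_slots:
        problems.append("EISA slot 8 and PCI slot 11 cannot both be used")
    return problems
```

An empty list means the proposed option placement does not violate the restriction; it says nothing about whether a given option is supported.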
5.2.1 CPU Daughter Board
AlphaServer 1000 systems use a CPU daughter board. The daughter board provides:
• The DECchip 21064 processor
• 2 megabytes of backup cache
• APECS chipset, which provides logic for external access to the cache, for main memory control, and for the PCI bus interface
• SROM code (SROM tests are controlled by jumper J1 on the CPU daughter board; see Section A.3)

5.2.2 Memory Modules
AlphaServer 1000 systems can support from 16 megabytes to 512 megabytes of memory.
Table 5–5 Operating System Memory Requirements

Operating System         Memory Requirements
DEC OSF/1 and OpenVMS    32 MB minimum; 64 MB recommended
Windows NT               16 MB minimum; 32 MB recommended
Windows NT Server        32 MB minimum; 64 MB recommended

Figure 5–4 Memory Layout on the Motherboard
(Callouts: Banks 3, 2, 1, and 0, each holding SIMM 0, SIMM 1, SIMM 2, and SIMM 3; ECC Banks, holding one ECC SIMM for each bank)
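The 16 MB to 512 MB range follows from the bank layout: four banks, each holding four SIMMs plus one ECC SIMM, with SIMM capacities of 4, 8, 16, or 32 MB (Table 6–1). The Python sketch below works through that arithmetic. It is a minimal illustration, not a configuration tool: the function name `total_memory_mb` is hypothetical, and it assumes, as the show config display suggests ("16 Mbytes(4 MB Per Simm)"), that the ECC SIMM adds no usable capacity.

```python
SIMM_SIZES_MB = (4, 8, 16, 32)  # SIMM capacities listed in the FRU table
DATA_SIMMS_PER_BANK = 4         # SIMMs 0-3; the ECC SIMM is assumed to add no capacity
BANKS = 4                       # banks 0-3


def total_memory_mb(bank_simm_sizes):
    """Total memory for a per-bank list of SIMM sizes in MB (None = empty bank).

    Passing one size per bank reflects the rule that all SIMMs within
    a bank must be of the same capacity.
    """
    if len(bank_simm_sizes) != BANKS:
        raise ValueError("expected one entry per bank (0-3)")
    if bank_simm_sizes[0] is None:
        raise ValueError("bank 0 must contain a memory option")
    total = 0
    for bank, size in enumerate(bank_simm_sizes):
        if size is None:
            continue
        if size not in SIMM_SIZES_MB:
            raise ValueError(f"bank {bank}: unsupported SIMM size {size} MB")
        total += DATA_SIMMS_PER_BANK * size
    return total
```

One bank of 4 MB SIMMs gives the 16 MB minimum, four banks of 32 MB SIMMs give the 512 MB maximum, and three banks of 4 MB SIMMs give the 48 MB reported in the show config example.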
• The speaker interface
• PCI-to-EISA bridge chip set
• Time-of-year (TOY) clock
• Connectors:
  – EISA bus connectors (Slots 1-8)
  – PCI bus connectors (Slots 11, 12, and 13)
    Note: If the top EISA connector is used (slot 8), the bottom PCI slot (slot 11) cannot be used. If the bottom PCI slot is used, the top EISA slot cannot be used.
  – Memory module connectors (20 SIMM connectors)
  – CPU daughter board connector
5.
Up to eight ISA (or EISA) modules can reside in the EISA bus portion of the card cage. Refer to Section 5.6 for information on using the EISA Configuration Utility (ECU) to configure ISA options.

Warning: For protection against fire, only modules with current-limited outputs should be used.

5.5.1 Identifying ISA and EISA options
By examining the contacts of the option board, you can determine whether a board is EISA or ISA (Figure 5–5):
• ISA boards have one row of contacts and no more than one gap.
The ECU is supplied on the two System Configuration Diskettes shipped with the system. Make a backup copy of the system configuration diskette and keep the original in a safe place. Use the backup copy when you are configuring the system. The system configuration diskette must have the volume label ‘‘SYSTEMCFG.’’ Note The CFG files supplied with the option you want to install may not work on this system if the option is not supported. Before you install an option, check that the system supports the option.
5.6.2 How to Start the ECU Complete the following steps to run the ECU: 1. Invoke the console firmware. • For systems running Windows NT—Shut down the operating system or power up to the console Boot menu. • For systems running OpenVMS or DEC OSF/1—Shut down the operating system and press the Halt button or power up with the Halt button set to the ‘‘in’’ position. When the console prompt >>> is displayed, set the Halt button to the ‘‘out’’ position. 2.
Note If you are configuring only EISA options, do not perform Step 2 of the ECU, ‘‘Add or remove boards.’’ (EISA boards are recognized and configured automatically.) • If you are configuring an EISA bus that contains both ISA and EISA options, refer to Table 5–7. 4. After you have saved configuration information and exited from the ECU: • For systems running Windows NT—Remove the ECU diskette from the diskette drive and boot the operating system.
Table 5–6 Summary of Procedure for Configuring EISA Bus (EISA Options Only) Step Explanation Install EISA option. Use the instructions provided with the EISA option. Power up and run ECU. If the ECU locates the required CFG configuration files, it displays the main menu. The CFG file for the option may reside on a configuration diskette packaged with the option or may be included on the system configuration diskette. Note It is not necessary to run Step 2 of the ECU, ‘‘Add or remove boards.
Table 5–7 Summary of Procedure for Configuring EISA Bus with ISA Options Step Explanation Install or move EISA option. Do not install ISA boards. Use the instructions provided with the EISA option. ISA boards are installed after the configuration process is complete. Power up and run ECU. If you have installed an EISA option, the ECU needs to locate the CFG file for that option.
5.7 PCI Bus Options PCI (Peripheral Component Interconnect) is an industry-standard expansion I/O bus that is the preferred bus for high-performance I/O options. The AlphaServer 1000 provides three slots for 32-bit PCI options. A PCI board is shown in Figure 5–6. Figure 5–6 PCI Board MA00080 Install PCI boards according to the instructions supplied with the option.
• The entire SCSI bus length, from terminator to terminator, must not exceed 6 meters for single-ended SCSI-2 at 5 MB/sec, or 3 meters for single-ended SCSI-2 at 10 MB/sec. The Fast SCSI-2 adapter on the motherboard supports up to two 5.25-inch, internal half-height removable-media devices. This bus can be extended to the internal StorageWorks shelf or an external expander to support up to seven drives.
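The bus length limits can be checked by summing every cable segment between the two terminators. The Python sketch below is a minimal illustration; the function name `bus_length_ok` and the idea of passing per-segment lengths are assumptions, and only the two limits stated above are encoded.

```python
# Maximum single-ended SCSI-2 bus length in meters, keyed by transfer rate in MB/sec.
MAX_BUS_LENGTH_M = {5: 6.0, 10: 3.0}


def bus_length_ok(segment_lengths_m, rate_mb_per_sec):
    """True if the terminator-to-terminator length is within the stated limit."""
    try:
        limit = MAX_BUS_LENGTH_M[rate_mb_per_sec]
    except KeyError:
        raise ValueError(f"no limit listed for {rate_mb_per_sec} MB/sec") from None
    return sum(segment_lengths_m) <= limit
```

Internal cabling counts toward the total, so segments inside the enclosure should be included along with any external cables.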
Figure 5–7 Single Controller Configuration with Dual Bus StorageWorks Shelf
(Cabling diagram; cables shown: 17-03959-01, 17-03962-01)

Figure 5–8 Single Controller Configuration with Single Bus StorageWorks Shelf
(Cabling diagram; cables shown: 17-03959-01, 17-03960-01, 17-03962-01)
5.8.2 Multiple Controller Configurations Figure 5–9 shows a configuration using two controllers. In this configuration the StorageWorks shelf is configured as a single bus. Figure 5–10 shows a configuration using two controllers. In this configuration the StorageWorks shelf is configured as a dual bus.
Figure 5–10 Dual Controller Configuration with Dual Bus StorageWorks Shelf
(Cabling diagram showing Bus A and Bus B; parts shown: 17-03959-01, 17-03960-02, 17-03962-01, 17-03962-02, 12-41667-02)
5.9 Power Supply Configurations
AlphaServer 1000 systems offer added reliability through redundant power options and UPS options. The power supplies support two modes of operation (refer to Figure 5–11):
1. Single power supply
2. Dual power supply (redundant mode): Provides redundant power (n + 1). In redundant mode, the failure of one power supply does not cause the system to shut down.
Figure 5–11 Power Supply Configurations
(Configurations shown: Single and Redundant, each 400 Watts DC or less, each with a UPS)
Figure 5–12 Power Supply Cable Connections Signal/Misc. Harness (22-Pin/15-Pin) + 3.3V Harness (20-Pin) + 5V Harness (24-Pin) 17-03969-01 Current Sharing Harness (3-Pin) J12 Storage Harness (12-Pin) + 5V Harness (24-Pin) J13 + 3.3V Harness (20-Pin) Signal/Misc.
5.10 Console Port Configurations Power-up information is typically displayed on the system’s console terminal. The console terminal may be either a graphics terminal or a serial terminal (connected through the COM1 serial port). The setting of the console environment variable determines where the system will display power-up output. Set this environment variable according to the console terminal that you are using.
Using a VGA Controller Other than the Standard On-Board VGA
When the system is configured to use a PCI- or EISA-based VGA controller instead of the standard on-board VGA (CIRRUS), consider the following:
• The on-board CIRRUS VGA options must be set to disabled through the ECU.
• The VGA jumper (J27) on the upper-left corner of the motherboard must then be set to disable (off).
• The console environment variable should be set to graphics.
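The three conditions above can be treated as a checklist. The following Python sketch shows one way to record them; the function `vga_option_checklist` and its arguments are hypothetical, and the values must be gathered by hand from the ECU, the motherboard, and the console, since nothing here reads the hardware.

```python
def vga_option_checklist(onboard_vga_enabled_in_ecu, j27_enabled, console_envar):
    """Return the settings that still need changing for an add-in VGA controller."""
    todo = []
    if onboard_vga_enabled_in_ecu:
        todo.append("disable the on-board CIRRUS VGA options through the ECU")
    if j27_enabled:
        todo.append("set VGA jumper J27 to disable (off)")
    if console_envar.lower() != "graphics":
        todo.append("set the console environment variable to graphics")
    return todo
```

An empty result means all three conditions are already met.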
6 AlphaServer 1000 FRU Removal and Replacement This chapter describes the field-replaceable unit (FRU) removal and replacement procedures for AlphaServer 1000 systems, which use a deskside ‘‘wide-tower’’ enclosure. • Section 6.1 lists the FRUs for AlphaServer 1000-series systems. • Section 6.2 provides the removal and replacement procedures for the FRUs. 6.
Table 6–1 AlphaServer 1000 FRUs Part # Description Section 17-03970-02 Floppy drive cable (34-pin) Figure 6–5 17-03971-01 OCP module cable (10-pin) Figure 6–6 17-00083-09 Power cord Figure 6–7 17-03964-01 Power supply current sharing cable (3-pin) Figure 6–8 70-31346-01 Power supply DC cable assembly (signal /misc, 15-pin), (+5V, 24-pin), (+3.
Table 6–1 (Cont.) AlphaServer 1000 FRUs

Part #        Description                                   Section
Internal StorageWorks
RZnn-VA       StorageWorks disk drive                       Section 6.2.4
54-23365-01   Internal StorageWorks backplane               Section 6.2.5
17-03960-01   Internal StorageWorks jumper cable (50-pin)   Figure 6–12
Memory Modules
ME524-DE      1 x 4MB SIMM                                  Section 6.2.6
ME534-DE      1 x 8MB SIMM                                  Section 6.2.6
ME644-DE      1 x 16MB SIMM                                 Section 6.2.6
ME654-DE      1 x 32MB SIMM                                 Section 6.2.
Table 6–1 (Cont.) AlphaServer 1000 FRUs Part # Description Section RRDnn -CA CD–ROM drives Section 6.2.13 TLZnn -LG Tape drives Section 6.2.13 TZKnn -LG Tape drives Section 6.2.13 RXnn -AA Floppy drive Section 6.2.
Figure 6–2 FRUs, Rear Left
(Callouts: Memory; Upper Fan; SCSI Cables; Speaker; Lower Fan; Power Cord; SCSI Multinode Cable; CPU Daughter Board; Motherboard; NVRAM Chip (E14); NVRAM TOY Clock Chip (E78))
6.2 Removal and Replacement
This section describes the procedures for removing and replacing FRUs for AlphaServer 1000 systems, which use the deskside "wide-tower" enclosure.

Caution: Before removing the top cover and side panels:
1. Perform an orderly shutdown of the operating system.
2. Set the On/Off button on the operator control panel to off.
3. Unplug the AC power cords.

Caution
Static electricity can damage integrated circuits.
Figure 6–4 Removing Top Cover and Side Panels
(Callouts: Top Cover; Release Latch)
6.2.1 Cables This section shows the routing for each cable in the system.
Figure 6–6 OCP Module Cable (10-Pin)
(Cable shown: 17-03971-01)
Figure 6–7 Power Cord

Table 6–2 lists the country-specific power cables.

Table 6–2 Power Cord Order Numbers

Country                                                                 Power Cord BN Number   Digital Number
U.S., Japan, Canada                                                     BN09A-1K               17-00083-09
Australia, New Zealand                                                  BN019H-2E              17-00198-14
Central Europe (Aus, Bel, Fra, Ger, Fin, Hol, Nor, Swe, Por, Spa)       BN19C-2E               17-00199-21
U.K.
Figure 6–8 Power Supply Current Sharing Cable (3-Pin)
Figure 6–9 Power Supply DC Cable Assembly DC Cable Assembly Signal/Misc. Harness (22-Pin/15-Pin) + 3.3V Harness (20-Pin) + 5V Harness (24-Pin) + 5V Harness (24-Pin) + 3.3V Harness (20-Pin) Signal/Misc.
• Power supply +3.
Figure 6–11 Interlock/Server Management Cable (2-pin)
(Connector shown: J254)

Figure 6–12 Internal StorageWorks Jumper Cable (50-Pin)
(Cable shown: 17-03960-01)
Figure 6–13 SCSI (J15 StorageWorks Shelf to Bulkhead Connector or Bulkhead to Multinode) Cable (50-Pin)

Note
Figure 6–13 shows the SCSI cable in the bulkhead to multinode cable configuration; Figure 6–14 shows the SCSI cable in the bulkhead to J15 StorageWorks shelf configuration.
Figure 6–14 SCSI (J15 StorageWorks Shelf to Bulkhead Connector or Bulkhead to Multinode) Cable (50-Pin)

Figure 6–15 SCSI (J1 or J14 StorageWorks Shelf to Bulkhead Connector) Cable (50-Pin)

Figure 6–16 SCSI (Embedded 8-bit) Multinode Cable (50-Pin)

Figure 6–17 SCSI RAID Internal Cable (50-Pin) (Single-Channel)

Figure 6–18 SCSI RAID Internal Cable (50-Pin) (Dual-Channel)
6.2.2 CPU Daughter Board
Figure 6–19 Removing CPU Daughter Board

Warning: CPU and memory modules have parts that operate at high temperatures. Wait 2 minutes after power is removed before handling these modules.
6.2.3 Fans STEP 1: REMOVE THE CPU DAUGHTER BOARD AND ANY OTHER OPTIONS BLOCKING ACCESS TO THE FAN SCREWS. See Figure 6–19 for removing the CPU daughter board. STEP 2: DISCONNECT THE FAN CABLE FROM THE MOTHERBOARD AND REMOVE FAN.
Figure 6–20 Removing Fans
(Callouts: Upper Fan; Lower Fan)
6.2.4 StorageWorks Drive Note If the StorageWorks drives are plugged into an SWXCR-xx controller, you can ‘‘hot swap’’ drives; that is, you can add or replace drives without first shutting down the operating system or powering down the server hardware. For more information, see StorageWorks RAID Array 200 Subsystem Family Installation and Configuration Guide, EK-SWRA2-IG.
6.2.5 Internal StorageWorks Backplane STEP 1: REMOVE POWER SUPPLIES. Figure 6–22 Removing Power Supply Current Sharing Harness (3-Pin) Storage Harness (12-Pin) + 5V Harness (24-Pin) + 3.3V Harness (20-Pin) Signal/Misc.
STEP 2: REMOVE INTERNAL STORAGEWORKS BACKPLANE.
6.2.6 Memory Modules
The positions of the failing single in-line memory modules (SIMMs) are reported by SROM power-up scripts (Section 2.1.1).

Note
• Bank 0 must contain a memory option (five SIMMs: 0, 1, 2, 3, and one ECC SIMM).
• A memory option consists of five SIMMs (0, 1, 2, 3, and one ECC SIMM for the bank).
• All SIMMs within a bank must be of the same capacity.

STEP 1: RECORD THE POSITION OF THE FAILING SIMMS.
STEP 2: LOCATE THE FAILING SIMM ON THE MOTHERBOARD.
Warning: Memory and CPU modules have parts that operate at high temperatures. Wait 2 minutes after power is removed before handling these modules. Caution Do not use any metallic tools or implements including pencils to release SIMM latches. Static discharge can damage the SIMMs.
Note SIMMs can only be removed and installed in successive order. For example, to remove a SIMM at bank 0, SIMM 1, SIMMs 0 and 1 for banks 3, 2, and 1 must first be removed.
Note When installing SIMMs, make sure that the SIMMs are fully seated. The two latches on each SIMM connector should lock around the edges of the SIMMs.
6.2.
6.2.8 Motherboard STEP 1: RECORD THE POSITION OF EISA AND PCI OPTIONS. STEP 2: REMOVE EISA AND PCI OPTIONS. STEP 3: REMOVE CPU DAUGHTER BOARD.
Figure 6–29 Removing CPU Daughter Board

Warning: CPU and memory modules have parts that operate at high temperatures. Wait 2 minutes after power is removed before handling these modules.
STEP 4: DETACH MOTHERBOARD CABLES, REMOVE SCREWS AND MOTHERBOARD.
Caution When replacing the system bus motherboard, install the screws in the order indicated.
STEP 5: MOVE THE NVRAM CHIP (E14) AND NVRAM TOY CHIP (E78) TO THE NEW MOTHERBOARD. Move the socketed NVRAM chip (position E14) and NVRAM TOY chip (E78) to the replacement motherboard and set the jumpers to match previous settings.
6.2.9 NVRAM Chip (E14) and NVRAM TOY Clock Chip (E78) See Figure 6–31 for the motherboard layout. 6.2.10 OCP Module STEP 1: REMOVE FRONT DOOR. STEP 2: REMOVE FRONT PANEL. STEP 3: REMOVE OCP MODULE.
Figure 6–33 Removing Front Panel
(Callouts: Remove Hidden Screws; Remove Screws)

Figure 6–34 Removing the OCP Module
(Callouts: J254; Black/Red (To Interlock Switch); Green/Yellow (To Motherboard))
6.2.11 Power Supply
STEP 1: DISCONNECT POWER SUPPLY CABLES.
STEP 2: REMOVE POWER SUPPLY.

Figure 6–35 Removing Power Supply
(Callouts: Current Sharing Harness (3-Pin); Storage Harness (12-Pin); +5V Harness (24-Pin); +3.3V Harness (20-Pin); Signal/Misc. Harness (15-Pin))

Warning: Hazardous voltages are contained within. Do not attempt to service. Return to factory for service.
6.2.
6.2.
Figure 6–38 Removing a Tape Drive

Figure 6–39 Removing a Floppy Drive
A Default Jumper Settings This appendix provides the location and default setting for all jumpers in AlphaServer 1000 systems: • Section A.1 provides location and default settings for jumpers located on the motherboard. • Section A.2 provides the location and supported settings for jumpers J3 and J4 on the CPU daughter board. • Section A.3 provides the location and default setting for the J1 jumper on the CPU daughter board.
A.1 Motherboard Jumpers Figure A–1 shows the location and default settings for jumpers located on the motherboard.
Jumper: J27
Name: VGA Enable
Description: When enabled (as shown in Figure A–1), the on-board VGA logic is activated.
Default Setting: Enabled for on-board VGA; Disabled if an EISA- or PCI-based VGA option is installed.

Jumper: J49
Name: SCSI Termination
Description: Allows the internal SCSI terminator to be disabled.
Default Setting: Enabled (as shown in Figure A–1).

Jumper: J50
Name: Flash ROM VPP Enable
Description: Permits the 12V voltage needed to update the Flash ROMs.
Default Setting: Jumper installed.
A.2 CPU Daughter Board (J3 and J4) Supported Settings Figure A–2 shows the supported AlphaServer 1000 4/200 settings for the J3 and J4 jumpers on the CPU daughter board. These jumpers affect clock speed and other critical system settings. Figure A–3 shows the supported AlphaServer 1000 4/233 settings for the J3 and J4 jumpers on the CPU daughter board. These jumpers affect clock speed and other critical system settings.
Figure A–3 AlphaServer 1000 4/233 CPU Daughter Board (Jumpers J3 and J4) J4 J3 MA00791 Supported settings: • J4 Jumper: Off On Off Off On • J3 Jumper: Off Default Jumper Settings A–5
A.3 CPU Daughter Board (J1 Jumper) Figure A–4 shows the default setting for the J1 jumper on the CPU daughter board. For information on SROM tests and the fail-safe loader, which are activated through the J1 jumper, refer to Chapter 2.
Glossary 10BASE-T Ethernet network IEEE standard 802.3-compliant Ethernet products used for local distribution of data. These networking products characteristically use twisted-pair cable. ARC User interface to the console firmware for operating systems that require firmware compliance with the Windows NT Portable Boot Loader Specification. ARC stands for Advanced RISC Computing. AUI Ethernet network Attachment unit interface. An IEEE standard 802.
backup cache A second, very fast cache memory that is closely coupled with the processor. bandwidth The rate of data transfer in a bus or I/O channel. The rate is expressed as the amount of data that can be transferred in a given time, for example megabytes per second. battery backup unit A battery unit that provides power to the entire system enclosure (or to an expander enclosure) in the event of a power failure. Another term for uninterruptible power supply (UPS). boot Short for bootstrap.
bystander A system bus node (CPU or memory) that is not addressed by a current system bus commander. byte A group of eight contiguous bits starting on an addressable byte boundary. The bits are numbered right to left, 0 through 7. cache memory A small, high-speed memory placed between slower main memory and the processor. A cache increases effective memory transfer rates and processor speed.
cluster A group of networked computers that communicate over a common interface. The systems in the cluster share resources, and software programs work in close cooperation. cold bootstrap A bootstrap operation following a power-up or system initialization (restart). On Alpha based systems, the console loads PALcode, sizes memory, and initializes environment variables. commander In a particular bus transaction, a CPU or standard I/O that initiates the transaction.
data cache A high-speed cache memory reserved for the storage of data. Abbreviated as D-cache. DECchip 21064 processor The CMOS, single-chip processor based on the Alpha architecture and used on many AlphaGeneration computers. DEC OSF/1 Version 3.0b for AXP systems A general-purpose operating system based on the Open Software Foundation OSF/1 2.0 technology. DEC OSF/1 V3.x runs on the range of AlphaGeneration systems, from workstations to servers. DEC VET Digital DEC Verifier and Exerciser Tool.
DUP server Diagnostic Utility Program server. A firmware program on board DSSI devices that allows a user to set host to a specified device in order to run internal tests or modify device parameters. ECC Error correction code. Code and algorithms used by logic to facilitate error detection and correction. EEPROM Electrically erasable programmable read-only memory. A memory device that can be byte-erased, written to, and read from. EISA bus Extended Industry Standard Architecture bus.
fail-safe loader (FSL)
A program that allows you to power up without initiating drivers or running power-up diagnostics. From the fail-safe loader you can perform limited console functions.

Fast SCSI
An optional mode of SCSI-2 that allows transmission rates of up to 10 megabytes per second.

FDDI
Fiber Distributed Data Interface. A high-speed networking technology that uses fiber optics as the transmission medium.

FIB
Flexible interconnect bridge.
halt The action of transferring control of the computer system to the console program. hose The interface between the card cage and the I/O subsystems. hot swap The process of removing a device from the system without shutting down the operating system or powering down the hardware. initialization The sequence of steps that prepare the computer system to start. Occurs after a system has been powered up. instruction cache A high-speed cache memory reserved for the storage of instructions.
loopback test Internal and external tests that are used to isolate a failure by testing segments of a particular control or data path. A subset of ROM-based diagnostics. machine check/interrupts An operating system action triggered by certain system hardware-detected errors that can be fatal to system operation. Once triggered, machine-check handler software analyzes the error. mass storage device An input/output device on which data is stored.
motherboard The main circuit board of a computer. The motherboard contains the base electronics for the system (for example, base I/O, CPU, ROM, and console serial line unit) and has connectors where options (such as I/Os and memories) can be plugged in. multiprocessing system A system that executes multiple tasks simultaneously. node A device that has an address on, is connected to, and is able to communicate with other devices on a bus.
operator control panel The panel located on the front of the system, which contains the power-up /diagnostic display, DC On/Off button, Halt button, and Reset button. PALcode Alpha Privileged Architecture Library code, written to support Alpha processors. PALcode implements architecturally defined behavior. PCI Peripheral Component Interconnect. An industry-standard expansion I/O bus that is the preferred bus for high-performance I/O options. Available in a 32-bit and a 64-bit version.
RAID Redundant array of inexpensive disks. A technique that organizes disk data to improve performance and reliability. RAID has three attributes: • It is a set of physical disks viewed by the user as a single logical device. • The user’s data is distributed across the physical set of drives in a defined manner. • Redundant disk capacity is added so that the user’s data can be recovered even if a drive fails.
serial control bus A two-conductor serial interconnect that is independent of the system bus. This bus links the processor modules, the I/O, the memory, the power subsystem, and the operator control panel. serial ROM In the context of the CPU module, ROM read by the DECchip microprocessor after reset that contains low-level diagnostic and initialization routines. SIMM Single in-line memory module.
system disk The device on which the operating system resides. TCP/IP Transmission Control Protocol/Internet Protocol. A set of software communications protocols widely used in UNIX operating environments. TCP delivers data over a connection between applications on different computers on a network; IP controls how packets (units of data) are transferred between computers on a network.
wide area network (WAN) A high-speed network that connects a server to a distant host computer, PC, or other server, or that connects numerous computers in numerous distant locations. Windows NT ‘‘New technology’’ operating system owned by Microsoft Corp. The AlphaServer systems currently support the Windows NT, OpenVMS, and DEC OSF/1 operating systems.
How to Order Additional Documentation Technical Support If you need help deciding which documentation best meets your needs, call 800-DIGITAL (800-344-4825) and press 2 for technical assistance. Electronic Orders If you wish to place an order through your account at the Electronic Store, dial 800-234-1998, using a modem set to 2400- or 9600-baud. You must be using a VT terminal or terminal emulator set at 8 bits, no parity.