HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Second Edition (October 1999) Part Number: EK–HSG84–SV.
Notice
While Compaq Computer Corporation believes the information included in this manual is correct as of the date of publication, it is subject to change without notice. Compaq makes no representations that the interconnection of its products in the manner described in this document will not infringe on existing or future patent rights, nor do the descriptions contained in this document imply the granting of licenses to make, use, or sell equipment or software in accordance with the description.
Contents
About This Guide
Chapter 1 General Description
    Subsystem Components - Exploded Views
    HSG80 Subsystem
    HSG80 Array Controller
    Cache Module
Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide
    Fibre Channel Optical Cable Cleaning Instructions
    Cleaning the GLM
    Shutting Down the Subsystem
    Replacing a Fiber Cable, Switch, or Hub
        Remove a Fiber Cable, Switch, or Hub
        Install a Fiber Cable, Switch, or Hub
    Replacing a Program Card
    Significant Event Reporting
        Events That Cause Controller Operation to Terminate
            Flashing OCP Pattern Display Reporting
            Solid OCP Pattern Display Reporting
Chapter 5 Event Reporting: Templates and Codes
    Passthrough Device Reset Event Sense Data Response
    Last Failure Event Sense Data Response (Template 01)
    Multiple-Bus Failover Event Sense Data Response (Template 04)
    Failover Event Sense Data Response (Template 05)
Figures
Figure 1–1. HSG80 subsystem
Figure 1–2. HSG80 array controller - fibre channel optical cabling
Figure 1–3. Cache module
Figure 1–4. EMU
Figure 3–1. Program (PCMCIA) card
Figure 3–2. Location of write-protection switch
Figure 3–3. Upgrading device firmware
Figure 3–4. Figure 3–5. Figure 4–1. Figure 4–2. Figure 4–3. Figure 4–4. Figure 4–5. Figure 5–1. Figure 5–2.
Tables
Table 1–1 HSG80 Subsystem
Table 1–2 HSG80 Fibre Channel Array Controller
Table 1–3 Cache Module
Table 1–4 EMU
Table 4–15 Fibre Channel Host Status Display - Port Status
Table 4–16 Fibre Channel Host Status Display - Link Error Counters
Table 4–17 First Digit on the TACHYON Chip
Table 4–18 Second Digit on the TACHYON Chip
About This Guide
This guide describes the features and part numbers of the HSG80 array controller running Array Controller Software (ACS) Version 8.5F, 8.5S, and 8.5P. It also contains replacement procedures, subsystem upgrade procedures, and troubleshooting resources, including event reporting codes. This guide does not contain information about the operating environments to which the controller might be connected, nor does it contain detailed information about subsystem enclosures or their components.
Conventions
This guide uses the text conventions in Table 1 and special notices provided within this section.

Text Conventions

Table 1 Text Conventions
Convention     Meaning
Bold           Keyboard keys appear in boldface. For example: Enter/Return or Y(es) key
SMALL CAPS     Used to indicate the status of an LED.
Special Notices
This guide does not contain detailed descriptions of standard safety procedures. However, it does contain warnings for procedures that might cause personal injury and cautions for procedures that might damage the controller or its related components. Look for these symbols when performing the procedures in this guide:

WARNING: A warning indicates the presence of a hazard that can cause personal injury if precautions in the text are not observed.
Chapter 1 General Description
This chapter illustrates and describes, in general terms, the subsystem and its major components, plus connectors, switches, and light emitting diodes (LEDs) for the following components:
■ HSG80 array controller
■ Cache module
■ Environmental monitoring unit (EMU)
See the Fibre Channel Switch Documentation that came with the switch kit for specifics about how the switch operates.
HSG80 Subsystem
Figure 1–1. HSG80 subsystem
Table 1–1 HSG80 Subsystem
Description                        Compaq Part Number   DIGITAL Part Number
BA370 rack-mountable enclosure     401914-001           DS-BA370-MA
Cooling fan, blue                  400293-001           FC-BA35X-MK
Cooling fan, gray                  402602-001           FC-BA35X-ML
Power cable kit, white             401916-001           17-03718-10
Input/output (I/O) module, blue    400294-001           FC-BA35X-MN
I/O module, gray                   401911-001           70-32856-S2
Fibre channel hub, 7-port          234454-001           FE-09061-01
Fibre channel hub, 12-p
HSG80 Array Controller
Figure 1–2. HSG80 array controller - fibre channel optical cabling

Cache Module
Figure 1–3. Cache module

EMU
Figure 1–4. EMU

Controller Front Panel
Figure 1–5.

OCP LEDs
Figure 1–6.

Gigabit Link Module (GLM)
Figure 1–7.

PVA Module
Figure 1–8.

EMU
Figure 1–9.
Chapter 2 Replacement Procedures
This chapter describes the procedures for replacing the following items:
■ Array controller
■ Cache module
■ External cache battery (ECB)
■ GLM
■ PVA module
■ I/O module
■ EMU
■ DIMMs
■ Fiber cable or switch
■ Program card
■ Failed storageset member
Procedures for shutting down and restarting the subsystem are also included. See the enclosure documentation for information about replacing power supplies, cooling fans, bus cables, and power cables.
■ Before touching any circuit board or component, always touch a verifiable earth ground to discharge any static electricity that might be present in clothing.
■ Always keep circuit boards and components away from nonconducting material.
■ Always keep clothing away from circuit boards and components.
■ Always use antistatic bags and grounding mats for storing circuit boards or components during replacement procedures.
Use the following steps to establish a local connection for setting the controller initial configuration:
1. Turn off the PC or terminal, and connect it to the controller, as shown in Figure 2–1.
   a. For a PC connection, plug one end of the maintenance port cable into the terminal; plug the other end into the controller maintenance port.
   b. For a terminal connection, refer to Figure 2–1 for cabling information.
3. Configure the terminal emulation software for 9600 baud, 8 data bits, 1 stop bit, and no parity.
4. Press the Enter or Return key. The command line interface (CLI) prompt appears, indicating that a local connection was established with the controller.
NOTE: The default data transfer rate of a new controller is 9600 baud. The maximum transfer rate is 19200 baud. If the current configuration used 19200 baud, use step 5 to establish this rate.
5.
CAUTION: It is only necessary to clean the Fibre Channel optical cable when replacing a controller. Overcleaning might cause damage to the ferrules.
NOTE: When installing a cable for the first time, it is not necessary to follow this procedure.
1. Using the polyester cleaning cloth that came with the cable cleaning kit, cover your fingers and squeeze one ferrule between two fingers.
2.
Figure 2–3. Cleaning procedure for GLM
3. Carefully dust out the cavity by rotating the swab tip back and forth one or two times.
4. Repeat step 1 through step 3 for the receiving side of the optical GLM cavity.

Shutting Down the Subsystem
Use the following steps to shut down a subsystem:
1. From a host console, stop all host activity and dismount the logical units in the subsystem.
When the controllers shut down, the reset buttons and the first three LEDs are lit continuously (see Figure 2–4). Receiving this indication can take several minutes, depending on the amount of data that needs to be flushed from the cache modules.
Figure 2–4. Identifying the controller reset button and first three LEDs
4.
Figure 2–5. ECB battery disable switch location
Callouts: ECB 1, ECB 2, Power connector, Status LED, Battery disable switch (SHUT OFF)
NOTE: To return to normal operation, apply power to the storage subsystem. The ECB will be enabled when the subsystem is powered on.

Restarting the Subsystem
Use the following steps to restart a subsystem.
1. Refer to enclosure documentation for specific procedures to follow for restarting the subsystem.
Replacing Controller and Cache Modules in a Single-Controller Configuration
Follow the instructions in this section to replace modules in a single-controller configuration (see Figure 2–6). To upgrade a single-controller configuration to a dual-redundant controller configuration, see Chapter 3.
Replacing a Controller and Cache Module in a Single-Controller Configuration
If both the controller and cache module need to be replaced, first follow the steps for replacing a controller, and then the steps for replacing a cache module.

Replacing a Controller in a Single-Controller Configuration
Use the procedures in “Removing the Controller in a Single-Controller Configuration” and “Installing the Controller in a Single-Controller Configuration” to replace a controller.
CAUTION: The cache module might contain unwritten data if the controller crashes and the controller cannot be shut down with the SHUTDOWN THIS_CONTROLLER command.
5. Remove the program card ESD cover and program card. Save them in a static-free place for the replacement controller.
6. Disconnect all host bus cables from the controller.
1. Insert the new controller into its bay, and engage its retaining levers.
2. Connect all host bus cables to the new controller.
3. Connect a PC or terminal to the controller maintenance port.
4. Press and hold the reset button while inserting the program card into the new controller.
5. Release the reset button and replace the program card ESD cover.
6.
12. Set the subsystem date and time using the following command in its entirety:
    SET THIS_CONTROLLER TIME=dd-mmm-yyyy:hh:mm:ss
13. Disconnect the PC or terminal from the controller maintenance port.
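The TIME value in step 12 follows the fixed layout dd-mmm-yyyy:hh:mm:ss. As a minimal sketch, the command string could be assembled from a date and time as shown here. The function name build_time_command is hypothetical, and the uppercase three-letter English month abbreviation is an assumption, because this guide shows only the mmm placeholder.

```python
from datetime import datetime

# Three-letter English month abbreviations. Uppercase is an assumption;
# the guide shows only the "mmm" placeholder.
MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")


def build_time_command(when: datetime) -> str:
    """Return the SET THIS_CONTROLLER TIME command for the given moment."""
    stamp = (f"{when.day:02d}-{MONTHS[when.month - 1]}-{when.year:04d}"
             f":{when.hour:02d}:{when.minute:02d}:{when.second:02d}")
    return f"SET THIS_CONTROLLER TIME={stamp}"


if __name__ == "__main__":
    # Example value only; substitute the current date and time.
    print(build_time_command(datetime(1999, 10, 5, 14, 3, 9)))
```

Building the month from an explicit table, rather than from locale-dependent formatting, keeps the result the same on any PC used as the maintenance terminal.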
5. Disable the ECB by pressing the battery disable switch until the status light stops blinking (approximately five seconds).
6. Disconnect the ECB cable from the cache module.
7. Disengage both retaining levers, remove the cache module, and place the cache module into an antistatic bag or onto a grounded antistatic mat.
NOTE: Remove the DIMMs from the cache module for use within the replacement cache module.
8.
8. If not already connected, connect a PC or terminal to the controller maintenance port.
9. Restart the controller by pressing its reset button.
10. When the CLI prompt reappears, display details about the configured controller using the following command:
    SHOW THIS_CONTROLLER FULL
11. Mount the logical units on the host. If using a Windows NT platform, restart the server.
12.
Replacing Controller and Cache Modules in a Dual-Redundant Controller Configuration
Follow the instructions in this section to replace modules in a dual-redundant controller configuration (see Figure 2–9).
Figure 2–9.
Callouts: EMU, Controller A, Controller B, Cache module A, Fibre channel optical cables with extender clips, Cache module B, PVA module
IMPORTANT: Note the following before starting the replacement procedures:
■ The new controller hardware must be compatible with the remaining controller hardware. See the product-specific release notes that accompanied the software release for information regarding hardware compatibility.
■ The software versions and patch levels must be the same on both controllers.
The following display appears:
    Do you intend to replace this controller’s cache battery? Y/N
5. Enter N(o). The following menu appears:
    FRUTIL Main Menu:
    1. Replace or remove a controller or cache module
    2. Install a controller or cache module
    3. Replace a PVA module
    4. Replace an I/O module
    5. Exit
    Enter choice: 1, 2, 3, 4, or 5 ->
6. Enter option 1. The following menu appears:
    Replace or remove Options:
    1. Other controller and cache module
    2. Other controller module
    3.
8. Enter Y(es). The following display appears:
    Quiescing all device ports. Please wait...
    Device Port 1 quiesced.
    Device Port 2 quiesced.
    Device Port 3 quiesced.
    Device Port 4 quiesced.
    Device Port 5 quiesced.
    Device Port 6 quiesced.
    All device ports quiesced.
    Remove the slot A [or B] controller (the one without a blinking green LED) within 4 minutes.
Once the cache module is removed, the following display appears:
    Restarting all device ports. Please wait...
    Device Port 1 restarted.
    Device Port 2 restarted.
    Device Port 3 restarted.
    Device Port 4 restarted.
    Device Port 5 restarted.
    Device Port 6 restarted.
    Do you have a replacement controller and cache module? Y/N
14. Enter N(o) if a replacement controller and cache module are not available.
■ FRUTIL will exit.
5. Connect a PC or terminal to the maintenance port of the operational controller. The controller connected to becomes “this controller”; the controller being installed becomes the “other controller.”
6. Start FRUTIL with the following command:
    RUN FRUTIL
The following display appears:
    Do you intend to replace this controller’s cache battery? Y/N
7. Enter N(o). The following menu appears:
    FRUTIL Main Menu:
    1.
10. Enter Y(es). The following display appears:
    Quiescing all device ports. Please wait...
    Device Port 1 quiesced.
    Device Port 2 quiesced.
    Device Port 3 quiesced.
    Device Port 4 quiesced.
    Device Port 5 quiesced.
    Device Port 6 quiesced.
    All device ports quiesced.
    . . .
    Perform the following steps:
    1. Turn off the battery for the new cache module by pressing the battery’s shut off button for five seconds
    2. Connect the battery to the new cache module.
    3.
14. Make sure that the program card is seated in the replacement controller, insert the new controller into its bay, and engage its retaining levers. When fully seated, the newly installed controller boots automatically. The following display appears:
    If the other controller did not restart, follow these steps:
    1. Press and hold the other controller’s reset button.
    2. Reseat the other controller’s program card.
Replacing a Controller in a Dual-Redundant Controller Configuration
Use the steps in “Removing a Controller in a Dual-Redundant Controller Configuration” and “Installing a Controller in a Dual-Redundant Controller Configuration” to replace a controller.

Removing a Controller in a Dual-Redundant Controller Configuration
Use the following steps to remove a controller:
1. Connect a PC or terminal to the maintenance port of the operational controller.
6. Enter option 1. The following menu appears:
    Replace or remove Options:
    1. Other controller and cache module
    2. Other controller module
    3. Other cache module
    4. Exit
    Enter choice: 1, 2, 3, or 4 ->
7. Enter option 2.
NOTE: A countdown timer allows a total of two minutes to remove the controller. After two minutes, “this controller” will exit FRUTIL and resume operations. If this happens, return to step 4 and proceed.
9. Remove all host bus cables from the “other controller” using needle-nose pliers (see inset on Figure 2–9).
10. Disengage both retaining levers, remove the “other controller,” and place the controller into an antistatic bag or onto a grounded antistatic mat.
The following display appears:
    Do you intend to replace this controller’s cache battery? Y/N
3. Enter N(o). The following menu appears:
    FRUTIL Main Menu:
    1. Replace or remove a controller or cache module
    2. Install a controller or cache module
    3. Replace a PVA module
    4. Replace an I/O module
    5. Exit
    Enter choice: 1, 2, 3, 4, or 5 ->
4. Enter option 2. The following menu appears:
    Install Options:
    1.
NOTE: A countdown timer allows a total of two minutes to install the controller. After two minutes, “this controller” will exit FRUTIL and resume operations. If this happens, return to step 2 and proceed.
CAUTION: ESD can easily damage a controller. Wear a snug-fitting, grounded ESD wrist strap. Carefully align the controller in the appropriate guide rails. Misalignment might damage the backplane.
7.
12. Disconnect the PC or terminal from the controller maintenance port.

Replacing a Cache Module in a Dual-Redundant Controller Configuration
Use the steps in “Removing a Cache Module in a Dual-Redundant Controller Configuration” and “Installing a Cache Module in a Dual-Redundant Controller Configuration” to replace a cache module.
5. Enter option 1. The following menu appears:
    Replace or remove Options:
    1. Other controller and cache module
    2. Other controller module
    3. Other cache module
    4. Exit
    Enter choice: 1, 2, 3, or 4 ->
6. Enter option 3.
NOTE: A countdown timer allows a total of two minutes to remove the cache module. After two minutes, “this controller” will exit FRUTIL and resume operations. If this happens, return to step 3 and proceed.
8. Disengage both retaining levers and partially remove the “other controller” cache module, pulling it about halfway out.
Installing a Cache Module in a Dual-Redundant Controller Configuration
Use the following steps to install a cache module:
CAUTION: ESD can easily damage a cache module or a DIMM. Wear a snug-fitting, grounded ESD wrist strap.
1. Connect a PC or terminal to the maintenance port of the operational controller. The controller connected to becomes “this controller”; the controller for the cache module being installed becomes the “other controller.”
2.
6. Insert each DIMM straight into the appropriate slot of the cache module, ensuring that the notches in the DIMM align with the tabs in the slot (see Figure 2–15).
7. Press the DIMM gently into the slot until seated at both ends.
8. Engage the two retaining clips for the DIMM.
9. Repeat step 6 through step 8 for each DIMM.
10. Enter Y(es). The following display appears:
    Quiescing all device ports. Please wait...
CAUTION: Carefully align the cache module in the appropriate guide rails. Misalignment might damage the backplane.
13. Insert the new cache module into its bay and engage its retaining levers.
NOTE: In mirrored mode, FRUTIL initializes the mirrored portion of the new cache module, checks for old data on the cache module, then restarts all device ports. After the device ports restart, FRUTIL tests the cache module and the ECB.
Replacing an ECB
The ECB can be replaced with cabinet power on or off. A dual ECB is shown in Figure 2–10 and contains two batteries. A single ECB contains only one battery.
The following display appears:
    Do you intend to replace this controller’s cache battery? Y/N
3. Enter Y(es). The following display appears:
    If the batteries were replaced while the cabinet was powered down, press return. Otherwise follow this procedure:
    WARNING: Ensure that at least one battery is connected to the Y cable at all times during this procedure.
    1.Connect the new battery to the unused end of the 'Y' cable attached to cache A [or B].
    2.Disconnect the old battery.
■ Repeat step 2 through step 7.
9. Remove the old ECB.
NOTE: If an empty bay was not available, and the new ECB was placed on the top of the enclosure, carefully insert it now into the empty bay.

Replacing an ECB With Cabinet Powered Off
Use the following steps to replace the ECB with the cabinet powered off:
1. If the controller and cache module are not operating, go to step 4. Otherwise, proceed to step 2.
2.
6. Connect the open end of the ECB Y-cable to the new ECB and then disconnect the ECB cable from the old ECB.
7. Restore power to the subsystem. The controller automatically restarts.
8. Start FRUTIL with the following command:
    RUN FRUTIL
The following display appears:
    Do you intend to replace this controller’s cache battery? Y/N
9. Type Y(es). The following display appears:
    If the batteries were replaced while the cabinet was powered down, press return.
Replacing a GLM
Use the steps in “Removing a GLM” and “Installing a GLM” to replace a GLM in a controller. Figure 2–11 shows the location and orientation of the GLMs.
Figure 2–11. Location of GLMs inside a controller
1 Access door
2 Port 1 GLM
3 Release lever
4 Locking tab
5 Guide holes
6 GLM connector
7 Port 2 GLM
CAUTION: ESD can easily damage a controller and GLM.
Removing a GLM
Use the following steps and Figure 2–11 to remove a GLM:
1. Remove the controller using either the steps in “Removing the Controller in a Single-Controller Configuration,” page 2–11, or “Removing a Controller in a Dual-Redundant Controller Configuration,” page 2–25.
2. Remove the screw that secures the access door (callout 1) on the top of the controller.
3. Remove the access door and set it aside.
4. Disengage the GLM locking tabs on the bottom side of the controller.
Replacing a PVA Module
Use the following steps to replace a PVA module in the master enclosure (ID 0), the first expansion (ID 2), or second expansion enclosure (ID 3). The master enclosure contains the controllers and the cache modules.
NOTE: This procedure is not applicable for the M1 shelf. The HSG80 controller can support up to three BA370 enclosures: a master enclosure and two expansion enclosures.
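The enclosure IDs listed above, together with the three FRUTIL PVA Replacement Menu options used later in this procedure, can be summarized as a small lookup table. This is only an illustrative sketch: the names PVA_MENU and describe_pva_option are hypothetical and are not part of FRUTIL or the CLI.

```python
# FRUTIL PVA Replacement Menu option -> (enclosure, enclosure ID),
# taken from the option list and enclosure IDs given in this procedure.
PVA_MENU = {
    1: ("master enclosure", 0),
    2: ("first expansion enclosure", 2),
    3: ("second expansion enclosure", 3),
}


def describe_pva_option(option: int) -> str:
    """Describe which PVA a menu option replaces (hypothetical helper)."""
    try:
        name, enclosure_id = PVA_MENU[option]
    except KeyError:
        raise ValueError(f"no PVA replacement option {option}") from None
    return f"Option {option}: replace the PVA in the {name} (ID {enclosure_id})"


if __name__ == "__main__":
    for option in sorted(PVA_MENU):
        print(describe_pva_option(option))
```

Note that the IDs are not consecutive: the first expansion enclosure is ID 2, not ID 1.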
NOTE: The FRUTIL PVA Replacement Menu provides options for three enclosures regardless of how many enclosures are actually connected.
6. From the menu, select one of the following options:
■ Enter option 1 to replace the PVA in the master enclosure.
■ Enter option 2 to replace the PVA in the first expansion enclosure.
■ Enter option 3 to replace the PVA in the second expansion enclosure.
12. Press Return to resume device port activity and restart the “other controller.” When all port activity has restarted, the following display appears:
    PVA replacement complete. Please wait . . .
    If the other controller did not restart, press its reset button.
    Field Replacement Utility terminated.
13. If the “other controller” did not restart, press its reset button.
14.
Replacing an I/O Module
Figure 2–12 shows a rear view of the BA370 enclosure and the relative location of the six I/O modules (also referred to as ports). Figure 2–13 shows the six I/O modules and the location of the connectors and securing screws.
Use the following steps to replace an I/O module:
NOTE: This procedure is not applicable for the M1 enclosure.
1. Connect a PC or terminal to the maintenance port of an operational controller.
2. In a dual-redundant controller configuration, disable failover with the following command:
    SET NOFAILOVER
3. Start FRUTIL with the following command:
    RUN FRUTIL
The following display appears:
    Do you intend to replace this controller’s cache battery? Y/N
4. Enter N(o). The following menu appears:
    FRUTIL Main Menu:
    1.
5. Enter option 4.
14. Enable failover and re-establish the dual-redundant configuration with the following command:
    SET FAILOVER COPY=THIS_CONTROLLER
This command copies the subsystem configuration from “this controller” to the “other controller.”
15. Disconnect the PC or terminal from the controller maintenance port.

Replacing an EMU
Use the steps in “Removing an EMU” and “Installing an EMU” to replace the EMU.
Installing an EMU
CAUTION: Carefully align the EMU in the appropriate guide rails. Misalignment might damage the backplane.
After installing the EMU, check the PVA SCSI ID number on the master enclosure to make sure it represents the correct enclosure number (ID 0). If the SCSI ID number is not 0, reset it to ID 0 before starting the controller.
1. Insert the EMU into its bay (see Figure 2–9, callout 1, on page 2–17) and engage its retaining levers.
2.
Replacing DIMMs
Use the steps in “Removing DIMMs” and “Installing DIMMs” to replace DIMMs in a cache module. DIMM locations are shown in Figure 2–14 and supported configurations are shown in Table 2–1.
Figure 2–14.
Refer to Figure 2–15 to identify components during the removal and installation procedures.

Removing DIMMs
Use the following steps to remove a DIMM from a cache module:
1. Remove the cache module using the steps in either “Removing the Cache Module in a Single-Controller Configuration” on page 2–14, or “Removing a Cache Module in a Dual-Redundant Controller Configuration” on page 2–30.
2. Press the DIMM retaining clip (see Figure 2–15).
Figure 2–15.
Replacing a Fiber Cable, Switch, or Hub
Use the steps in “Remove a Fiber Cable, Switch, or Hub” and “Install a Fiber Cable, Switch, or Hub” to replace a fiber cable, switch, or hub.

Remove a Fiber Cable, Switch, or Hub
Use the following steps to remove a cable connected to either side of your switch or hub, or to remove the switch or hub:
1. Shut down the host system using host documentation.
2. Shut down the controllers.
1. If replacing a cable, connect the replacement cable to the ports previously used by the old cable. If replacing a switch or hub, reconnect all cables removed from the old switch or hub.
2. Restart each controller by pressing its reset button. The controllers automatically restart and the subsystem is now ready for operation.
3. Restart the host system using host documentation.
3. Shut down the controllers.
■ In single-controller configurations, shut down “this controller” with the following command:
    SHUTDOWN THIS_CONTROLLER
■ In dual-redundant controller configurations, shut down the “other controller” first, then shut down “this controller” with the following commands:
    SHUTDOWN OTHER_CONTROLLER
    SHUTDOWN THIS_CONTROLLER
When the controllers shut down, the reset buttons and the first three LEDs are lit continuously (see Figure 2–4).
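The ordering rule in this step (in a dual-redundant configuration, the “other controller” first, then “this controller”) can be written down as a short sketch. The function name shutdown_commands is hypothetical. The sketch only returns the command strings in the order this procedure gives them; it does not send anything to a controller.

```python
from typing import List


def shutdown_commands(dual_redundant: bool) -> List[str]:
    """Return the CLI shutdown commands in the order given in this procedure."""
    if dual_redundant:
        # The "other controller" is shut down first, then "this controller".
        return ["SHUTDOWN OTHER_CONTROLLER", "SHUTDOWN THIS_CONTROLLER"]
    # A single-controller configuration has only "this controller".
    return ["SHUTDOWN THIS_CONTROLLER"]


if __name__ == "__main__":
    for command in shutdown_commands(dual_redundant=True):
        print(command)
```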
Replacing a Failed Storageset Member
If a disk drive fails in a RAIDset or mirrorset, the controller automatically places it into the failedset. If the spareset contains a replacement drive that satisfies the storageset replacement policy, the controller automatically replaces the failed member with the replacement drive.
Chapter 3 Upgrading the Subsystem This chapter provides instructions for upgrading the controller software, installing software patches, upgrading firmware on a device, upgrading from a single-controller configuration to a dual-redundant controller configuration, and upgrading cache memory. IMPORTANT: See Chapter 2 to review the list of required tools and the precautions to follow prior to performing any procedure within this chapter.
Upgrading Controller Software
Upgrade controller software in one of two ways:
■ Install a new program card (see Figure 3–1) that contains the new software.
■ Download a new software image, and use the menu-driven Code Load/Code Patch (CLCP) utility to write it onto the existing program card. Also use this utility to install, delete, and list patches to the controller software.
■ In single-controller configurations, shut down “this controller” with the following command:
SHUTDOWN THIS_CONTROLLER
■ In dual-redundant controller configurations, shut down the “other controller” first, then shut down “this controller” with the following commands:
SHUTDOWN OTHER_CONTROLLER
SHUTDOWN THIS_CONTROLLER
When the controllers shut down, the reset buttons and the first three LEDs are lit continuously (see Figure 2–4).
2. Load the image onto a PC or workstation using its file- or network-transfer capabilities.
3. From a host console, quiesce all port activity and dismount the storage units in the subsystem.
IMPORTANT: Do not remove the program card in the next step.
4. Remove the program card ESD cover.
7. Enter option 1. The following display appears:
You have selected the Code Load Utility. This utility is used to load a new software image into the program card currently inserted in the controller. Type ^Y or ^C (then RETURN) at any time to abort code load. The code image may be loaded using SCSI Write Buffer commands through the SCSI Host Port, or using KERMIT through the maintenance terminal port.
14. Use KERMIT to transfer the binary image from the PC to the controller. When the download is complete, CLCP automatically writes the new image to the program card and restarts the controller.
15. Verify that the controller is running the new software version with the following command:
SHOW THIS_CONTROLLER
16.
4. Start CLCP with the following command:
RUN CLCP
The following menu appears:
Select an option from the following list:
Code Load & Patch local program Main Menu
0: Exit
1: Enter Code LOAD local program
2: Enter Code PATCH local program
3: Enter EMU Code LOAD utility
Enter option number (0..3) [0] ?
5. Enter option 2. The following menu appears:
You have selected the Code Patch local program. This program is used to manage software code patches.
9. For dual-redundant controller configurations, repeat step 2 through step 8 for the second controller.

Deleting a Software Patch
Use the following steps to delete a software patch:
1. From a host console, quiesce all port activity.
2. Connect a PC or terminal to the controller maintenance port.
3.
patches are also selected for deletion. The program lists your deletion selections and asks if you wish to continue. Type ^Y or ^C (then RETURN) at any time to abort Code Patch.
The following patches are currently stored in the patch area:
Software Version - Patch number(s)
xxxx xxxx
Currently, xx% of the patch area is free.
Software Version of patch to delete?
6. Enter the software version of the patch to delete and press Enter/Return.
The following menu appears:
Select an option from the following list:
Code Load & Patch local program Main Menu
0: Exit
1: Enter Code LOAD local program
2: Enter Code PATCH local program
3: Enter EMU Code LOAD utility
Enter option number (0..3) [0] ?
3. Enter option 2. The following menu appears:
You have selected the Code Patch local program. This program is used to manage software code patches.
Upgrading Firmware on a Device
Use the format and device code load utility (HSUTIL) to upgrade a device with firmware located in contiguous blocks at specific logical block numbers (LBNs) on a source disk drive configured as a unit on the same controller. Upgrading firmware on a disk is a two-step process (see Figure 3–3):
1. Copy the new firmware from the host to a disk drive configured as a unit in the subsystem.
2.
■ During the installation, the source disk drive is not available for other subsystem operations.
■ Some devices might not reflect the new firmware version number when viewed from the “other controller” in a dual-redundant controller configuration. If this occurs, enter the following CLI command:
CLEAR_ERRORS device-name UNKNOWN
6. Choose the single-disk unit as the source disk for the download.
7. Enter the starting LBN of the firmware image (usually LBN 0).
8. Enter the product ID of the device being upgraded. This ID corresponds to the product information reported in the Type column when issuing a SHOW DISK FULL command. HSUTIL lists all devices that correspond to the product ID entered.
9. Enter the disk or tape name of the device being upgraded.
10.
Upgrading to a Dual-Redundant Controller Configuration
Use the following steps to upgrade a single-controller configuration subsystem to a dual-redundant configuration subsystem. To replace failed components, see Chapter 2 for more information.
4. Enter option 2. The following menu appears:
Install Options:
1. Other controller and cache module
2. Other controller module
3. Other cache module
4. Exit
Enter choice: 1, 2, 3, or 4 ->
5. Enter option 1. The following display appears:
Insert both the slot A [or B] controller and cache module? Y/N
6. Enter Y(es). The following display appears:
Quiescing all device ports. Please wait...
Device Port 1 quiesced.
Device Port 2 quiesced.
Device Port 3 quiesced.
CAUTION: The ECB must be disabled (the status light is not lit and is not blinking) before disconnecting the ECB cable from the cache module. Failure to disable the ECB might damage the cache module.
8. Disable the ECB by pressing the battery disable switch until the status light stops blinking (approximately five seconds).
9. Connect the new ECB cable to the new cache module.
14. Enable failover, and establish the dual-redundant controller configuration with the following command:
SET FAILOVER COPY=THIS_CONTROLLER
This command copies the subsystem configuration from “this controller” to the new controller.
15. See the Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 CLI Reference Guide to configure the controller.
16. Disconnect the PC or terminal from the controller maintenance port.
Table 3–1 Cache Module Memory Configurations
Memory    DIMMs     Quantity    Location
64 MB     32 MB     2           1, 3
128 MB    32 MB     4           1, 2, 3, 4
256 MB    128 MB    2           1, 3
512 MB    128 MB    4           1, 2, 3, 4

IMPORTANT: For ACS V8.5P installations, the required cache memory configuration is 512 MB. For ACS V8.5S, Compaq strongly recommends using 512 MB of cache memory.
To upgrade cache module memory, its controller must be shut down.
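As a planning aid, Table 3–1 can be written out as a lookup. The sketch below is illustrative only: the names CACHE_CONFIGS, dimm_plan, and meets_v85p_requirement are hypothetical, and the data is copied from Table 3–1 and the IMPORTANT note above.

```python
from typing import Dict, Tuple

# Table 3-1: total cache memory (MB) -> (DIMM size in MB, DIMM locations).
CACHE_CONFIGS: Dict[int, Tuple[int, Tuple[int, ...]]] = {
    64: (32, (1, 3)),
    128: (32, (1, 2, 3, 4)),
    256: (128, (1, 3)),
    512: (128, (1, 2, 3, 4)),
}


def dimm_plan(total_mb: int) -> dict:
    """Return the DIMM size, quantity, and locations for a Table 3-1 row."""
    if total_mb not in CACHE_CONFIGS:
        raise ValueError(f"{total_mb} MB is not a configuration in Table 3-1")
    dimm_mb, locations = CACHE_CONFIGS[total_mb]
    return {"dimm_mb": dimm_mb, "quantity": len(locations), "locations": locations}


def meets_v85p_requirement(total_mb: int) -> bool:
    """ACS V8.5P installations require the 512 MB configuration."""
    return total_mb == 512


if __name__ == "__main__":
    print(dimm_plan(256))
```

Each row is self-consistent: the DIMM size multiplied by the quantity gives the total memory.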
CAUTION: The ECB must be disabled (the status light is not lit and is not blinking) before disconnecting the ECB cable from the cache module. Failure to disable the ECB might result in cache module damage.
5. Disable the ECB by pressing the battery disable switch until the status light stops blinking (approximately five seconds).
6. Disconnect the ECB cable from the cache module.
7.
Figure 3–5. DIMM components (1: DIMM; 2: DIMM slot; 3: DIMM retaining clip)
9. If replacing DIMMs (see Figure 3–5):
a. Press down on the DIMM retaining clip (3) at both ends of the DIMM (1) being removed.
b. Gently remove the DIMM (1) from the DIMM slot (2).
c. Insert the new DIMM straight into the slot, ensuring that the notches in the DIMM align with the tabs in the slot.
d.
IMPORTANT: In a dual-redundant controller configuration, both cache modules must contain the same memory configuration. DO NOT proceed unless both cache modules contain identical amounts of cache memory.
CAUTION: Carefully align the cache module in the appropriate guide rails. Misalignment might damage the backplane.
11. Insert the cache module into its bay and engage the retaining levers.
12. Connect the ECB cable to the cache module.
13.
Chapter 4 Troubleshooting Resources This chapter provides guidelines for troubleshooting the controller, cache module, and ECB. It also describes the utilities and exercisers available to aid in troubleshooting these components. See Chapter 5 for a list of event codes. See enclosure documentation for information on troubleshooting its hardware, such as the power supplies, cooling fans, and EMU.
event report. This report contains an instance code, located at offset 32 through 35, that can be used to determine the cause of the error. See “Translating Event Codes” on page 4–29 for help on translating instance codes.

ECB Charging Diagnostics
Whenever the controller restarts, its diagnostic routines automatically check the charge of each ECB battery.
Typical Installation Troubleshooting Checklist
The following checklist provides a general procedure for diagnosing the controller and its supporting modules. Following this checklist identifies many of the problems that occur in a typical installation. After identifying a problem, use Table 4–1 to confirm the diagnosis and fix the problem.
Show these codes and translate the “last failure” codes they contain. See “Displaying Failure Entries” on page 4–28 and “Translating Event Codes” on page 4–29. If the controller failed to the extent that it cannot support a local terminal for FMU, check the host error log for the “instance” or “last failure” codes. See Chapter 5 to interpret the event codes.
7.
Troubleshooting Table
After diagnosing a problem, use Table 4–1 to resolve it.

Table 4–1 Troubleshooting Table (Sheet 1 of 7)
Symptom: Reset button not lit.
Possible Cause: No power to subsystem.
Investigation: Check power to subsystem and power supplies on controller’s shelf.
Remedy: Replace cord or AC input power module.
Investigation: Make sure that all cooling fans are installed.
Table 4–1 Troubleshooting Table (Sheet 2 of 7)
Symptom: Cannot set failover to create dual-redundant configuration.
Possible Cause: Incorrect command syntax.
Investigation: See the HSG80 Array Controller ACS Version 8.5 CLI Reference Guide for the SET FAILOVER command.
Remedy: Use the correct command syntax.
Possible Cause: Different software versions on controllers.
Investigation: Check software versions on both controllers.
Table 4–1 Troubleshooting Table (Sheet 3 of 7)
Symptom: Nonmirrored cache; controller reports failed DIMM in cache module A or B.
Possible Cause: Improperly installed DIMM.
Investigation: Remove cache module and make sure that the DIMM is fully seated in its slot.
Remedy: Reseat DIMM.
Possible Cause: Failed DIMM.
Investigation: If the foregoing check fails to produce a remedy, check for OCP LED codes.
Remedy: Replace DIMM.
Symptom: Mirrored cache; “this controller” reports DIMM 1 or 2 failed in cache module A or B.
4–8 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 4–1 Troubleshooting Table (Sheet 4 of 7) Symptom Invalid cache. Possible Cause Investigation Mirrored-cache mode discrepancy. This may occur after you’ve installed a new controller. Its existing cache module is set for mirrored caching, but the new controller is set for unmirrored caching. (It may also occur if the new controller is set for mirrored caching but its existing cache module is not.
Troubleshooting Resources 4–9 Table 4–1 Troubleshooting Table (Sheet 5 of 7) Symptom Cannot add device. Cannot configure storagesets. Possible Cause Investigation Remedy Illegal device. See product-specific release notes that accompanied the software release for the most recent list of supported devices. Replace device. Device not properly installed in shelf. Check that SBB is fully seated. Firmly press SBB into slot. Failed device. Check for presence of device LEDs.
4–10 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 4–1 Troubleshooting Table (Sheet 6 of 7) Symptom Possible Cause Investigation Remedy Incorrect command syntax. See the HSG80 Array Controller ACS Version 8.5 CLI Reference Guide for correct syntax. Reassign the unit number with the correct syntax. Incorrect SCSI target ID numbers set for controller that accesses desired unit.
Troubleshooting Resources 4–11 Table 4–1 Troubleshooting Table (Sheet 7 of 7) Symptom Possible Cause Investigation Remedy Host’s log file or maintenance terminal indicates that a forced error occurred when the controller was reconstructing a RAIDset or mirrorset. Unrecoverable read errors may have occurred when controller was reconstructing the storageset. Errors occur if another member fails while the controller is reconstructing the storageset.
Caching Techniques
The cache module supports the following caching techniques to increase subsystem read and write performance:
■ Read caching
■ Read-ahead caching
■ Write-through caching
■ Write-back caching

Read Caching
When the controller receives a read request from the host, it reads the data from the disk drives, delivers it to the host, and stores the data in its cache module.
Write-Through Caching
When the controller receives a write request from the host, it places the data in its cache module, writes the data to the disk drives, then notifies the host when the write operation is complete. This process is called write-through caching because the data actually passes through, and is stored in, the cache memory on its way to the disk drives.
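The ordering of operations in read caching and write-through caching can be illustrated with a toy model. This is a sketch for explanation only and does not represent the controller software: the class ToyController and its event strings are hypothetical, and serving a repeated read from the cache is an assumption about how cached data is later used.

```python
class ToyController:
    """Toy model of the read caching and write-through caching order."""

    def __init__(self):
        self.disk = {}    # stands in for the disk drives
        self.cache = {}   # stands in for the cache module
        self.events = []  # records the order of operations

    def read(self, block):
        # Assumption: a block already in the cache is served from the cache.
        if block in self.cache:
            self.events.append("cache hit")
            return self.cache[block]
        data = self.disk[block]
        self.events.append("read disk")
        self.events.append("deliver to host")
        self.cache[block] = data
        self.events.append("store in cache")
        return data

    def write_through(self, block, data):
        # Data is placed in the cache, written to disk, and only then
        # is the host notified that the write is complete.
        self.cache[block] = data
        self.events.append("store in cache")
        self.disk[block] = data
        self.events.append("write disk")
        self.events.append("notify host")
        return True


if __name__ == "__main__":
    controller = ToyController()
    controller.write_through(7, b"abc")
    print(controller.events)
```

The point of the model is the sequence: with write-through caching the host is notified only after the data has reached the disk.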
Cache Policies Resulting from Cache Module Failures
If the controller detects a full or partial failure of its cache module or ECB, it automatically reacts to preserve the unwritten data in its cache module. Depending upon the severity of the failure, the controller chooses an interim caching technique (also called the cache policy), which it uses until the cache module or ECB is repaired or replaced.
Troubleshooting Resources 4–15 Table 4–2 Cache Policies—Cache Module Status (Continued) Cache Module Status Cache A DIMM or cache memory controller chip failure. Cache B Good. Cache Policy Unmirrored Cache Mirrored Cache Data integrity: Write-back data that was not written to media when failure occurred was not recovered. Data integrity: Controller A recovers all of its write-back data from the mirrored copy on cache B.
4–16 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 4–3 Resulting Cache Policies—ECB Status Cache Module Status Cache A At least 50% charged. Less than 50% charged. Cache B At least 50% charged. At least 50% charged. Cache Policy Unmirrored Cache Mirrored Cache Data loss: No. Data loss: No. Cache policy: Both controllers continue to support write-back caching. Cache policy: Both controllers continue to support write-back caching. Failover: No.
Troubleshooting Resources 4–17 Table 4–3 Resulting Cache Policies—ECB Status (Continued) Cache Module Status Cache A Less than 50% charged. Failed. Cache B Less than 50% charged. Less than 50% charged. Cache Policy Unmirrored Cache Mirrored Cache Data loss: No. Data loss: No. Cache policy: Both controllers support write-through caching only. Cache policy: Both controllers support write-through caching only. Failover: No. Failover: No. Data loss: No. Data loss: No.
❏ Have an ECB connected and the UPS switch is set to one of the following:
▲ NOUPS (no UPS is connected)
▲ NODE_ONLY (a UPS is connected)
❏ Do not have an ECB connected and the UPS switch is set to DATACENTER_WIDE
■ No unit errors are outstanding (for example, lost data or data that cannot be written to devices).
■ Both controllers are started and configured in failover mode.
Use the following legend for both tables:
■ = reset button FLASHING (in Table 4–4) or ON (in Table 4–5)
❏ = reset button OFF
● = LED FLASHING (in Table 4–4) or ON (in Table 4–5)
❍ = LED OFF
NOTE: If the reset button is flashing and an LED is lit continuously, either the devices on that LED bus do not match the controller configuration, or an error occurred in one of the devices on that bus. Also, a single LED that is lit indicates a failure of the drive on that port.
Table 4–4 Flashing OCP Patterns (Continued)
Pattern  OCP Code  Error and Repair Action
■❍❍●●●❍  E  Error: Memory error in the JSRAM. Repair Action: Replace controller.
■❍❍●●●●  F  Error: Wrong image found on program card. Repair Action: Replace program card or replace controller if needed.
■❍●❍❍❍❍  10  Error: Controller Module memory is bad. Repair Action: Replace controller.
■❍●❍❍●❍  12  Error: Controller Module memory addressing is malfunctioning. Repair Action: Replace controller.
Table 4–4 Flashing OCP Patterns (Continued)
Pattern  OCP Code  Error and Repair Action
■●●●●●●  3F  Error: An invalid process ran during initialization. Repair Action: Replace controller.

Solid OCP Pattern Display Reporting
Certain events cause a solid display of the OCP LEDs. The events and their resulting patterns are described in Table 4–5.
Table 4–5 Solid OCP Patterns (Sheet 2 of 5)
Pattern  OCP Code  Error and Repair Action
■●❍❍●❍●  25  Error: Recursive Bugcheck detected. The same bugcheck has occurred three times within ten minutes, and controller operation has terminated. Repair Action: Reset the controller.
■●❍❍●●❍  26
Table 4–5 Solid OCP Patterns (Sheet 3 of 5)
Pattern  OCP Code  Error and Repair Action
■●❍●●❍●  2D  Error: All master cabinet SCSI buses are not set to ID 0. Repair Action: Set PVA ID to 0 for the cabinet with the controllers. If problem persists, try the following repair actions: 1. Replace the PVA module. 2. Replace the EMU. 3. Remove all devices. 4. Replace the cabinet.
■●❍●●●❍  2E  Error: Multiple cabinets have the same SCSI ID. More than one cabinet has the same SCSI ID.
Table 4–5 Solid OCP Patterns (Sheet 4 of 5)
Pattern  OCP Code  Error and Repair Action
■●●❍●❍●  35  Error: An unexpected bugcheck occurred during Last Failure processing. Last Failure Processing interrupted by another Last Failure event. Repair Action: Reset controller.
■●●❍●●❍  36  Error: Hardware-induced controller reset expected. Automatic hardware reset failed. Repair Action: Replace controller.
Table 4–5 Solid OCP Patterns (Sheet 5 of 5)
Pattern  OCP Code  Error and Repair Action
■●●●●●●  3F  Error: DAEMON diagnostic failed hard in non-fault tolerant mode. DAEMON diagnostic detected critical hardware component failure; controller can no longer operate. Repair Action: Verify that cache module is present. If the error persists, replace controller.
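In the rows of Table 4–4 and Table 4–5 shown in this chapter, the OCP code matches the six LED symbols read as a binary number, with the leftmost LED as the most significant bit (for example, ❍❍●●●❍ is binary 001110, or hexadecimal E). The following sketch applies that reading; it is an observation drawn from the table rows, not a documented algorithm, and the function name ocp_code is hypothetical.

```python
LED_ON = "●"   # LED flashing (Table 4-4) or on (Table 4-5)
LED_OFF = "❍"  # LED off


def ocp_code(pattern: str) -> str:
    """Convert a 7-symbol OCP pattern to its hexadecimal OCP code.

    The first symbol is the reset button and is ignored; the remaining
    six symbols are the LEDs, read left to right as a binary number.
    """
    if len(pattern) != 7:
        raise ValueError("expected the reset button symbol plus six LED symbols")
    value = 0
    for symbol in pattern[1:]:
        if symbol not in (LED_ON, LED_OFF):
            raise ValueError(f"unexpected LED symbol: {symbol!r}")
        value = (value << 1) | (1 if symbol == LED_ON else 0)
    return format(value, "X")


if __name__ == "__main__":
    print(ocp_code("■●❍❍●❍●"))
```

Always confirm a decoded value against the tables before acting on it, because the same code has different meanings in the flashing and solid tables.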
Spontaneous Event Log
Spontaneous event logs are automatically displayed on the maintenance terminal (unless disabled via the FMU) using %EVL formatting, as illustrated in the following examples:
%EVL--HSG> --13-JAN-1999 04:32:47 (time not set)--
Instance Code: 0102030A (not yet reported to host)
Template: 1.(01)
Power On Time: 0.Years, 14.Days, 19.Hours, 58.Minutes, 43.
CLI Event Reporting
CLI event reports are automatically displayed on the maintenance terminal (unless disabled via the FMU) using %CER formatting, as shown in the following example:
%CER--HSG> --13-JAN-1999 04:32:20 (time not set)--
Previous controller operation terminated with display of solid fault code, OCP Code: 3F
HSG>

Utilities and Exercisers
Controller software includes utilities and exercisers to assist in troubleshooting and maintaining the controller and the other
■ Display the instance codes that identify and accompany significant events which do not cause the controller to terminate operation.
■ Display the last-failure codes that identify and accompany failure events which cause the controller to stop operating. Last-failure codes are sent to the host only after the affected controller is restarted successfully.
Troubleshooting Resources 4–29 Last Failure Entry: 4. Flags: 006FF300 Template: 1.(01) Description: Last Failure Event Power On Time: 0. Years, 14. Days, 19. Hours, 51. Minutes, 31. Seconds Controller Model: HSG80 Serial Number: AA12345678 Hardware Version: 0000(00) Software Version: V085F(55) Informational Report Instance Code: 0102030A Description: An unrecoverable software inconsistency was detected or an intentional restart or shutdown of controller operation was requested. Reporting Component: 1.
Controlling the Display of Significant Events and Failures
Use the SET command to control how the fault management software displays significant events and failures. Table 4–7 describes the SET commands that can be entered while running FMU. These commands remain in effect only as long as the current FMU session remains active, unless the PERMANENT qualifier is entered (the last entry in the table).
4–32 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 4–7 FMU SET Commands (Continued) Command Result SET PROMPT SET NOPROMPT Enable and disable the display of the CLI prompt string following the log identifier “%EVL,” or “%LFL,” or “%FLL.” This command is useful if the CLI prompt string is used to identify the controllers in a dual-redundant configuration (see the HSG80 Array Controller ACS Version 8.
Using VTDPY to Check for Communication Problems
Use the VTDPY utility to obtain information about the following:
■ Communication between the controller and its hosts.
■ Communication between the controller and subsystem devices.
■ State and I/O activity of logical units, devices, and device ports in the subsystem.
■ Communication between local and remote controllers in a Data Replication Manager configuration.
Use the following steps to run VTDPY:
1.
Table 4–8 VTDPY Key Sequences and Commands (Continued)
Ctrl/R  Refreshes current screen display
Ctrl/Y  Exits VTDPY

Commands can be abbreviated to the minimum number of characters necessary to identify the command. Enter a question mark (?) after a partial command to see the values that can follow the supplied command. For example, if DISP ? is entered, the utility lists CACHE, DEFAULT, and so forth.
Troubleshooting Resources 4–35 Table 4–9 VTDPY Default Display Columns Column Pr Name Stk/Max Typ Sta CPU% Port Target Contents Process priority Priority name or NULL (idle) Stack size in 512 byte pages and maximum number of stack pages actually used Process type: FNC = functional process DUP = resident device utility/exerciser in use Status: Bl = waiting for completion of a process currently running Io = waiting for input or output Rn = actively running Percentage of central processing unit
Table 4–9 VTDPY Default Display Columns (Continued)
Column  Contents
A  Availability of the unit:
a = available to “other controller”
d = disabled for servicing, offline
e = mounted for exclusive access by a user
f = media format error
i = inoperative
m = maintenance mode for diagnostic purposes
o = online. Host can access this unit through “this controller”.
Table 4–9 VTDPY Default Display Columns (Continued)
Column  Contents
KB/S  Average amount of data transferred to and from the unit during the last update interval in 1000-byte increments.
Rd%  Percentage of data transferred between the host and the unit that was read from the unit.
Wr%  Percentage of data transferred between the host and the unit that was written to the unit.
CM%  Percentage of data transferred between the host and the unit that was compared.
4–38 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide VTDPY>DISPLAY DEVICE HSG80 S/N: ZG92712820 SW: SSDRS-0 HW: E-06 99.
Table 4–10 Device Map Columns
Column  Contents
Port  SCSI ports 1 through 6.
Target  SCSI targets 0 through 15. Single controllers occupy 7; dual-redundant controllers occupy 6 and 7.
4–40 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 4–11 Device Status Columns (Continued) Column S Contents Spindle state of the device: ^ = disk spinning at correct speed; tape loaded > = disk spinning up < = disk spinning down v = disk not spinning = unknown spindle state W Write-protection state of the device. For disk drives, a W in this column indicates that the device is hardware write-protected.
Table 4–12 Device-Port Status Columns (Continued)
Column  Contents
RdKB/S  Average data transfer rate from the devices on the port (reads) during the last update interval.
WrKB/S  Average data transfer rate to the devices on the port (writes) during the last update interval.
CR  Number of SCSI command resets that occurred since VTDPY was started.
BR  Number of SCSI bus resets that occurred since VTDPY was started.
4–42 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 4–13 Unit Status Columns (Continued) Column A S Contents Availability of the unit: a = available to “other controller” d = disabled for servicing, offline e = mounted for exclusive access by a user f = media format error i = inoperative m = maintenance mode for diagnostic purposes o = online. Host can access this unit through “this controller”.
Table 4–13 Unit Status Columns (Continued)
Column  Contents
KB/S  Average amount of data transferred to and from the unit during the last update interval in 1000-byte increments.
Rd%  Percentage of data transferred between the host and the unit that was read from the unit.
Wr%  Percentage of data transferred between the host and the unit that was written to the unit.
CM%  Percentage of data transferred between the host and the unit that was compared.
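The relationship among the KB/S, Rd%, Wr%, and CM% columns can be illustrated with a short calculation. This is a sketch under stated assumptions, not the VTDPY implementation: it assumes that KB/S is the total bytes moved in the interval divided by 1000 and by the interval length, and that the three percentages are shares of that same total. The function name unit_stats and its parameters are hypothetical.

```python
def unit_stats(read_bytes: int, written_bytes: int, compared_bytes: int,
               interval_s: float) -> dict:
    """Compute KB/S (1000-byte units) and Rd%, Wr%, CM% for one interval.

    Assumption: the percentages are taken over the sum of read, written,
    and compared bytes; VTDPY itself may compute them differently.
    """
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    total = read_bytes + written_bytes + compared_bytes
    if total == 0:
        return {"KB/S": 0.0, "Rd%": 0.0, "Wr%": 0.0, "CM%": 0.0}
    return {
        "KB/S": total / 1000 / interval_s,
        "Rd%": 100 * read_bytes / total,
        "Wr%": 100 * written_bytes / total,
        "CM%": 100 * compared_bytes / total,
    }


if __name__ == "__main__":
    print(unit_stats(600_000, 300_000, 100_000, interval_s=10))
```

Note that the display uses 1000-byte increments, so the divisor is 1000 rather than 1024.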
Troubleshooting Resources 4–45 Table 4–14 Fibre Channel Host Status Display — Known Hosts (Connections) (Continued) Field Label S Description Status: N = online F = offline The following tables detail the remaining portions of the Fibre Channel Host Status Display. Table 4–15 includes the labels that report the status of ports one and two, and Table 4–16 describes the Link Error Counters.
4–46 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 4–16 Fibre Channel Host Status Display — Link Error Counters (Continued) Field Label Description Bad Rx Chars This field represents the number of times the 8B/10B decode detected an invalid 10-bit code. FC-PH denotes this value as “Invalid Transmission Word during frame reception.” This field may be non-zero after initialization.
TACHYON Chip Status
The number that appears in the TACHYON Status field represents the current state of the TACHYON, or Fibre Channel control, chip. It consists of a two-digit hexadecimal number: the first digit is explained in Table 4–17 and the second digit in Table 4–18. Refer to the Hewlett-Packard© TACHYON user manual for a more detailed explanation of the TACHYON chip definitions.
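Splitting the two-digit TACHYON Status value into the digit looked up in Table 4–17 and the digit looked up in Table 4–18 can be sketched as follows. The function name split_tachyon_status is hypothetical, and the sketch only separates the digits; it does not interpret them.

```python
import string


def split_tachyon_status(status: str) -> tuple:
    """Split a two-digit hexadecimal status into (first digit, second digit).

    The first value is looked up in Table 4-17 and the second in
    Table 4-18. Both are returned as integers in the range 0 to 15.
    """
    if len(status) != 2 or any(ch not in string.hexdigits for ch in status):
        raise ValueError("expected exactly two hexadecimal digits")
    value = int(status, 16)
    return value >> 4, value & 0xF


if __name__ == "__main__":
    first, second = split_tachyon_status("3A")
    print(f"Table 4-17 digit: {first:X}, Table 4-18 digit: {second:X}")
```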
Checking Runtime Status of Remote Copy Sets
Use the remote display to see the runtime status of all remote copy sets (see Figure 4–5). This feature is supported only in ACS V8.5P.
Troubleshooting Resources Table 4–19 Remote Display Columns—ACS V8.5P only (Continued) Column INIT U Contents Initiator unit number Availability of the unit: a = available to “other controller” d = disabled for servicing, offline e = mounted for exclusive access by a user f = media format error i = inoperative m = maintenance mode for diagnostic purposes o = online. Host can access this unit through “this controller”.
DILX Checking for Disk Drive Problems
Use DILX to check the data-transfer capability of disk drives. DILX generates intense read/write loads to the disk drive while monitoring drive performance and status. Run DILX on as many disk drives as desired, but because this utility creates substantial I/O loads on the controller, Compaq recommends stopping host-based I/O during the test.
IMPORTANT: Use the auto-configure option if testing the read and write capabilities of every disk drive in the subsystem.
4. Decline the auto-configure option to allow testing of a specific disk drive.
5. Accept the default test settings and run the test in read-only mode.
6. Enter the unit number of the specific disk drive to test. For example, to test D107, enter the number 107.
7. If testing more than one disk drive, enter the appropriate unit numbers when prompted.
4. Decline the auto-configure option to allow testing of a specific disk drive.
5. Decline the default settings.
NOTE: To ensure that DILX accesses the entire disk space, enter 120 minutes or more in the next step. The default setting is 10 minutes.
6. Enter the number of minutes desired for running the DILX Basic Function test.
7. Enter the number of minutes between the display of performance summaries.
8. Choose to include performance statistics in the summary.
9.
DILX Error Codes
Table 4–22 explains the error codes that DILX might display during and after testing.

Table 4–22 DILX Error Codes
Error Code | Explanation
1 | Illegal Data Pattern Number found in data pattern header. DILX read data from the disk and discovered that the data did not conform to the pattern in which it was previously written.
2 | No write buffers correspond to data pattern.
Table 4–23 HSUTIL Messages and Inquiries (Continued)
Message | Description
Unit is in maintenance mode. | Device cannot be formatted or code loaded because it is being used by another subsystem function or local program.
Exclusive access is declared for unit. | Another subsystem function has reserved the unit shown.
The other controller has exclusive access declared for unit. | The companion controller has locked out this controller from accessing the unit shown.
4–56 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide CLCP Utility Use the CLCP utility to upgrade the controller software and the EMU software. Also use it to patch the controller software. When installing a new controller, the correct (or current) software version and patch numbers must be available. See Chapter 3 for more information about using this utility. NOTE: Only Compaq field service personnel are authorized to upload EMU microcode updates.
CHVSN Utility
The CHVSN utility generates a new volume serial number (VSN) for the specified device and writes it on the media. Use it to eliminate duplicate volume serial numbers by giving each duplicate a different VSN.
NOTE: Only Compaq authorized service personnel can use this utility.
Chapter 5
Event Reporting: Templates and Codes

This chapter describes the event codes that the fault management software provides for spontaneous events and last failure events. The HSG80 controller uses various codes to report different types of events, and these codes are presented in template displays.
Event Reporting: Templates and Codes 5–3 Table 5–2 Template 01—Last Failure Event Sense Data Response Format ↓ offset bit → 0 1 2 3–6 7 8–11 12 13 14 15–17 18–31 32–35 36 37 38–53 54–69 70–73 74 75 76 77–103 104–107 108–111 112–115 116–119 120–123 124–127 128–131 132–135 136–139 140–159 7 Unused 6 5 Unused 4 3 Error Code Unused 2 1 Sense Key Unused Additional Sense Length Unused Additional Sense Code (ASC) Additional Sense Code Qualifier (ASCQ) Unused Unused Reserved Instance Code Template Templa
Multiple-Bus Failover Event Sense Data Response (Template 04)
The HSG80 SCSI Host Interconnect Services software component reports Multiple Bus Failover events via the Multiple Bus Failover Event Sense Data Response (see Table 5–3). The error or condition is signaled to all host systems on all logical units.
■ ASC and ASCQ codes (byte offsets 12 and 13) are detailed in the “ASC/ASCQ Codes” section on page 5–17.
Table 5–3 Template 04: Multiple-Bus Failover Event Sense Data Response Format (Continued)
Offset   Field (bits 7–0)
77–103   Reserved
104–131  Affected LUNs Extension (TM0)
132–159  Reserved

Failover Event Sense Data Response (Template 05)
The HSG80 controller Failover Control software component reports errors and other conditions encountered during redundant controller communications and failover operation via the Failover Event Sense Data Response (see Ta
Event Reporting: Templates and Codes 5–7 Table 5–5 Template 11—Nonvolatile Parameter Memory Component Event Sense Data Response Format ↓ offset bit → 0 1 2 3–6 7 8–11 12 13 14 15–17 18–31 32–35 36 37 38–53 54–69 70–73 74 75 76 77–103 104–107 108–111 112–114 115 116–159 7 Unused 6 5 Unused 4 3 Error Code Unused 2 1 Sense Key Unused Additional Sense Length Unused Additional Sense Code (ASC) Additional Sense Code Qualifier (ASCQ) Unused Unused Reserved Instance Code Template Template Flags Reserved
Backup Battery Failure Event Sense Data Response (Template 12)
The HSG80 controller Value Added Services software component reports backup battery failure conditions for the various hardware components that use a battery to maintain state during power failures via the Backup Battery Failure Event Sense Data Response (see Table 5–6). The failure condition is signaled to all host systems on all logical units.
Table 5–6 Template 12: Backup Battery Failure Event Sense Data Response Format (Continued)
Offset   Field (bits 7–0)
104–107  Memory Address
108–159  Reserved

Subsystem Built-In Self Test Failure Event Sense Data Response (Template 13)
The HSG80 controller Subsystem Built-In Self Tests software component reports errors detected during test execution via the Subsystem Built-In Self Test Failure Event Sense Data Response (see Table 5–7).
Event Reporting: Templates and Codes 5–11 Table 5–8 Template 14—Memory System Failure Event Sense Data Response Format ↓ offset bit → 0 1 2 3–6 7 8–11 12 13 14 15–17 18–19 20–23 24–27 28–31 32–35 36 37 38–39 40–43 44–47 48–51 52–53 54–69 70–73 74 75 76 77–79 80–83 84–87 88–91 92–95 96–99 100–103 7 Unused 6 5 Unused 4 3 Error Code Unused 2 1 Sense Key Unused Additional Sense Length Unused Additional Sense Code (ASC) Additional Sense Code Qualifier (ASCQ) Unused Unused Reserved Reserved or RDR2 (T
Event Reporting: Templates and Codes 5–13 Table 5–9 Template 41—Device Services Non-Transfer Error Event Sense Data Response Format ↓ offset bit → 0 1 2 3–6 7 8–11 12 13 14 15–17 18–31 32–35 36 37 38–53 54–69 70–73 74 75 76 77–103 104 105 106 107 108–159 7 Unused 6 5 Unused 4 3 Error Code Unused 2 1 Sense Key Unused Additional Sense Length Unused Additional Sense Code (ASC) Additional Sense Code Qualifier (ASCQ) Unused Unused Reserved Instance Code Template Template Flags Reserved Controller Boa
Disk Transfer Error Event Sense Data Response (Template 51)
The HSG80 controller Device Services and Value Added Services software components report errors detected while performing work related to disk (including CD-ROM and optical memory) device transfer operations via the Disk Transfer Error Event Sense Data Response (see Table 5–10).
Table 5–10 Template 51: Disk Transfer Error Event Sense Data Response Format (Continued)
Offset   Field (bits 7–0)
77–78    Reserved
79–82    Device Firmware Revision Level
83–98    Device Product ID
99–100   Reserved
101      Device Type
102–103  Reserved
104–121  Device Sense Data
122–159  Reserved

Data Replication Manager Services Event Sense Response (Template 90)
This section only applies to ACS version 8.5P.
5–16 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 5–11 Template 90—Data Replication Manager Services Event Sense Data Response Format (ACS V8.
ASC/ASCQ Codes
Table 5–12 lists HSG80-specific SCSI ASC and ASCQ codes. These codes are Template-specific and appear at byte offsets 12 and 13.
NOTE: Additional codes that are common to all SCSI devices can be found in the SCSI specification.

Table 5–12 ASC and ASCQ Codes (Sheet 1 of 3)
ASC Code  ASCQ Code  Description
04        80         Logical unit is disaster tolerant failsafe locked (inoperative).
3F        85         Test Unit Ready or Read Capacity Command failed.
Table 5–12 ASC and ASCQ Codes (Sheet 2 of 3)
ASC Code  ASCQ Code  Description
A0        07         RAID membership event report.
A0        08         Multiple Bus failover event.
A0        09         Multiple Bus failback event.
A0        0A         Disaster Tolerance failsafe error mode can now be enabled.
A1        00         Shelf OK is not properly asserted.
A1        01         Unable to clear SWAP interrupt. Interrupt disabled.
A1        02         Swap interrupt re-enabled.
Table 5–12 ASC and ASCQ Codes (Sheet 3 of 3)
ASC Code  ASCQ Code  Description
D1        07         Unexpected disconnect.
D1        08         Unexpected message.
D1        09         Unexpected Tag message.
D1        0A         Channel busy.
D1        0B         Device initialization failure. Device sense data available.
D2        00         Miscellaneous SCSI driver error.
D2        03         Device services had to reset the bus.
D3        00         Drive SCSI chip reported gross error.
D4        00         Non-SCSI bus parity error.
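The lookup described above (ASC at byte offset 12, ASCQ at byte offset 13) can be sketched in a few lines of Python. This is an illustrative sketch only, not part of any Compaq tool: the function name `describe_asc_ascq` and the dictionary `HSG80_ASC_ASCQ` are hypothetical, the dictionary holds only a handful of rows copied from Table 5–12, and the sense data is assumed to be available as a raw byte buffer.

```python
# Hypothetical helper: look up the HSG80-specific ASC/ASCQ pair in a raw
# sense data buffer. Only a few rows of Table 5-12 are reproduced here.
HSG80_ASC_ASCQ = {
    (0x04, 0x80): "Logical unit is disaster tolerant failsafe locked (inoperative).",
    (0x3F, 0x85): "Test Unit Ready or Read Capacity Command failed.",
    (0xA0, 0x08): "Multiple Bus failover event.",
    (0xD1, 0x0A): "Channel busy.",
    (0xD4, 0x00): "Non-SCSI bus parity error.",
}

ASC_OFFSET = 12   # byte offset of the Additional Sense Code
ASCQ_OFFSET = 13  # byte offset of the Additional Sense Code Qualifier


def describe_asc_ascq(sense: bytes) -> str:
    """Return the Table 5-12 description for the ASC/ASCQ pair in `sense`."""
    if len(sense) <= ASCQ_OFFSET:
        raise ValueError("sense data is too short to contain ASC and ASCQ")
    asc, ascq = sense[ASC_OFFSET], sense[ASCQ_OFFSET]
    # Pairs not in this excerpt may be in the full table or the SCSI spec.
    return HSG80_ASC_ASCQ.get(
        (asc, ascq),
        f"ASC {asc:02X} / ASCQ {ascq:02X}: not in this excerpt",
    )


# Usage: build a 160-byte buffer (offsets 0 through 159) and set the pair.
example = bytearray(160)
example[ASC_OFFSET] = 0xA0
example[ASCQ_OFFSET] = 0x08
print(describe_asc_ascq(bytes(example)))
```

Codes that are common to all SCSI devices are not in the dictionary, so a miss does not mean the pair is invalid.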
Instance Codes
An instance code is a number that uniquely identifies an event being reported.

Instance Code Structure
Figure 5–1 shows the structure of an instance code. Once you understand this structure, you can translate each code without using the FMU.

[Figure: example instance code 01010302 with four callouts, one per byte from left to right: 1 = Component ID number, 2 = Event number, 3 = Repair action, 4 = Notification/recovery (NR) threshold]
Figure 5–1.
Event Reporting: Templates and Codes 5–21 Notification/Recovery (NR) Threshold Located at byte offset {8}32 is the NR threshold assigned to the event. This value is used during Symptom-Directed Diagnosis procedures to determine when to take notification/recovery action. For a description of event notification/recovery threshold classifications, see Table 5–14.
Component ID
A component ID is located at byte offset {11}35. This number uniquely identifies the software component that detected the event. For details about component ID numbers, see the “Component Identifier Codes” section on page 5–93.

Table 5–15 contains the numerous instance codes, in ascending order, that might be issued by the controller fault management software.
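As a rough illustration of the structure in Figure 5–1, the following Python sketch splits a printed instance code into its four one-byte fields. The names `InstanceCode` and `decode_instance_code` are hypothetical, and the field order (Component ID, Event number, Repair action, NR threshold, reading left to right) is taken from the figure callouts and from the repair action codes listed alongside the entries in Table 5–15.

```python
from typing import NamedTuple


class InstanceCode(NamedTuple):
    """The four one-byte fields of an instance code, left to right."""
    component_id: int
    event_number: int
    repair_action: int
    nr_threshold: int


def decode_instance_code(code: str) -> InstanceCode:
    """Split an 8-hex-digit instance code such as '01010302' into fields."""
    if len(code) != 8:
        raise ValueError("an instance code is exactly 8 hexadecimal digits")
    # bytes.fromhex raises ValueError if the text is not valid hexadecimal.
    return InstanceCode(*bytes.fromhex(code))


# Usage: the example code from Figure 5-1.
fields = decode_instance_code("01010302")
print(f"component {fields.component_id:02X}, event {fields.event_number:02X}, "
      f"repair action {fields.repair_action:02X}, "
      f"NR threshold {fields.nr_threshold:02X}")
```

The decoded repair action can be checked against the Repair Action Code column of Table 5–15; for example, instance code 020B2201 decodes to repair action 22.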
Table 5–15 Instance Codes (Sheet 2 of 24)
Instance Code | Description | Template | Repair Action Code
020B2201 | Failed read test of a write-back metadata page residing in cache. Dirty write-back cached data exists and cannot be flushed to media. The dirty data is lost. The Memory Address field contains the starting physical address of the CACHEA0 memory. | 14 | 22
020C2201 | Cache Diagnostics have declared the cache bad during testing.
5–24 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 5–15 Instance Codes (Sheet 3 of 24) Instance Code Description Template Repair Action Code 021B0064 Disk Bad Block Replacement attempt completed for a read of controller metadata from a location outside the user data area of the disk.
Table 5–15 Instance Codes (Sheet 4 of 24)
Instance Code | Description | Template | Repair Action Code
023E2401 | Metadata residing in the controller and on the two cache modules disagree as to the mirror node. Note that in this instance, the Memory Address, Byte Count, FX Chip register, Memory Controller register, and Diagnostic register fields are undefined. | 14 | 24
023F2301 | The cache backup battery covering the mirror cache is insufficiently charged.
5–26 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 5–15 Instance Codes (Sheet 5 of 24) Description Template Repair Action Code 024B2401 Write-back caching has been disabled either due to a cache or battery-related problem. The exact nature of the problem is reported by other instance codes. Note that in this instance, the Memory Address, Byte Count, FX Chip register, Memory Controller register, and Diagnostic register fields are undefined.
Table 5–15 Instance Codes (Sheet 6 of 24)
Instance Code | Description | Template | Repair Action Code
025A000A | The command failed because the unit became inoperative prior to command completion. The Information field of the Device Sense Data contains the block number of the first block in error. | 51 | 00
025B000A | The command failed because the unit became unknown to the controller prior to command completion.
Table 5–15 Instance Codes (Sheet 7 of 24)
Instance Code | Description | Template | Repair Action Code
0268530A | The device specified in the Device Locator field failed to be added to the RAIDset associated with the logical unit. The device will remain in the Spareset. | 51 | 53
02695401 | The device specified in the Device Locator field failed to be added to the RAIDset associated with the logical unit.
Event Reporting: Templates and Codes 5–29 Table 5–15 Instance Codes (Sheet 8 of 24) Description Template Repair Action Code 02755601 The device specified in the Device Locator field had a read error. Attempts to repair the error with data from another mirrorset member failed due to a write error on the original device. The original device will be removed from the mirrorset.
Table 5–15 Instance Codes (Sheet 9 of 24)
Instance Code | Description | Template | Repair Action Code
02864002 | The controller has set the specified unit Data Safety Write Protected due to an unrecoverable device failure which prevents writing cached data. | 51 | 40
02872301 | The CACHE backup battery has exceeded the maximum number of deep discharges. Battery capacity may be below specified values.
Table 5–15 Instance Codes (Sheet 10 of 24)
Instance Code | Description | Template | Repair Action Code
02931101 | The Uninterruptible Power Supply (UPS) signaled a two-minute warning (TMW) before it signaled AC line failure. UPS signals will be ignored until this condition clears. | 12 | 11
0294000A | A requested block of data contains a forced error. A forced error occurs when a disk block is successfully reassigned, but the data in that block is lost.
5–32 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 5–15 Instance Codes (Sheet 11 of 24) Description Template Repair Action Code 03080101 Miscellaneous SCSI Port Driver coding error detected during disk operation. Note that in this instance, the Associated Additional Sense Code and Associated Additional Sense Code Qualifier fields are undefined.
Table 5–15 Instance Codes (Sheet 12 of 24)
Instance Code | Description | Template | Repair Action Code
03224002 | Unexpected message. | 51 | 40
03234002 | Unexpected Tag message. | 51 | 40
03244002 | Channel busy. | 51 | 40
03254002 | Message Reject received on a valid message. | 51 | 40
0326450A | The disk device reported Vendor Unique SCSI Sense Data. | 51 | 45
03270101 | A disk related error code was reported which was unknown to the Fault Management software.
5–34 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 5–15 Instance Codes (Sheet 13 of 24) Instance Code Description Template Repair Action Code Passthrough 40 03434002 During device initialization the device reported unexpected standard SCSI Sense Data. 03BE0701 The EMU for the cabinet indicated by the Associated Port field has powered down the cabinet because there are less than four working power supplies present.
Table 5–15 Instance Codes (Sheet 14 of 24)
Instance Code | Description | Template | Repair Action Code
03C80101 | No command control structures available for operation to a device which is unknown to the controller. Note that in this instance, the Associated Additional Sense Code and Associated Additional Sense Code Qualifier fields are undefined. | 41 | 01
03C92002 | SCSI interface chip command timeout during operation to a device which is unknown to the controller.
5–36 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 5–15 Instance Codes (Sheet 15 of 24) Description Template Repair Action Code 03D14002 The identification of a device does not match the configuration information. The actual device type is unknown to the controller. Note that in this instance, the Associated Additional Sense Code, and Associated Additional Sense Code Qualifier fields are undefined.
Event Reporting: Templates and Codes 5–37 Table 5–15 Instance Codes (Sheet 16 of 24) Description Template Repair Action Code 03D8450A During device initialization, the device reported the SCSI Sense Key ILLEGAL REQUEST. Indicates that there was an illegal parameter in the command descriptor block or in the additional parameters supplied as data for some commands (FORMAT UNIT, SEARCH DATA, etc.).
5–38 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 5–15 Instance Codes (Sheet 17 of 24) Description Template Repair Action Code 03E0450A During device initialization, the device reported the SCSI Sense Key VOLUME OVERFLOW. This indicates a buffered peripheral device has reached the end-of-partition and data may remain in the buffer that has not been written to the medium.
Event Reporting: Templates and Codes 5–39 Table 5–15 Instance Codes (Sheet 18 of 24) Instance Code Description Template Repair Action Code 03F20064 The SWAP interrupts have been cleared and re-enabled for all device ports. 41 00 41 00 41 00 41 04 41 04 Note that in this instance, the Associated Port, Associated Target, Associated Additional Sense Code, and Associated Additional Sense Code Qualifier fields are undefined.
5–40 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 5–15 Instance Codes (Sheet 19 of 24) Instance Code 03F80701 Description The EMU has detected one or more bad power supplies. Template Repair Action Code 41 07 41 06 41 0D 41 0E 41 0F Note that in this instance, the Associated Target, Associated Additional Sense Code, and Associated Additional Sense Code Qualifier fields are undefined. 03F90601 The EMU has detected one or more bad fans.
Table 5–15 Instance Codes (Sheet 20 of 24)
Instance Code | Description | Template | Repair Action Code
07040B0A | Failover Control detected a transmit packet sequence number mismatch. The controllers are out of synchronization with each other and are unable to communicate. Note that in this instance, the Last Failure Code and Last Failure Parameters fields are undefined. | 05 | 0B
07050064 | Failover Control received a Last Gasp message from the other controller.
Table 5–15 Instance Codes (Sheet 21 of 24)
Instance Code | Description | Template | Repair Action Code
0C203E02 | The Quadrant 0 Memory Controller (CACHEA0) detected a Data Parity error. | 14 | 3E
0C213E02 | The Quadrant 1 Memory Controller (CACHEA1) detected a Data Parity error. | 14 | 3E
0C223E02 | The Quadrant 2 Memory Controller (CACHEB0) detected a Data Parity error.
Table 5–15 Instance Codes (Sheet 22 of 24)
Instance Code | Description | Template | Repair Action Code
0E098901 | The remote copy set specified by the Remote Copy Set Name field has gone inoperative due to a disaster tolerance failsafe locked condition. | 90 | 89
0E0A8D01 | The unit is not made available to the host for the remote copy set specified in the Remote Copy Set Name field.
Table 5–15 Instance Codes (Sheet 23 of 24)
Instance Code | Description | Template | Repair Action Code
0E238F01 | The logical unit specified by the Log Unit Number field has failed. | 90 | 8F
0E258F01 | Write history logging encountered a write error on the log unit. | 90 | 8F
0E260064 | There is no more space left at the end of the log unit for write history logging.
Event Reporting: Templates and Codes 5–45 Table 5–15 Instance Codes (Sheet 24 of 24) Instance Code 820B2002 Description An unrecoverable error was detected during execution of the Device Port Subsystem Built-In Self Test. One or more of the device ports on the controller module has failed; some/all of the attached storage is no longer accessible via this controller.
Last Failure Codes and FMU
The format of a Last Failure Code is shown in Table 5–16.

Table 5–16 Last Failure Code Format
offset  bit → 7 6 5 4 3 2 1 0
104     HW | Restart Code | Parameter Count
105     Repair Action
106     Error Number
107     Component ID

NOTE: Do not confuse the Last Failure Code with an Instance Code (shown on page 5–20). They appear at different byte offsets and convey different information.
Event Reporting: Templates and Codes 5–47 Repair Action The Repair Action code at byte offset 105 indicates the recommended repair action code assigned to the failure. This value is used during Symptom-Directed Diagnosis procedures to determine what notification/recovery action should be taken. For details about recommended repair action codes, see the “Recommended Repair Action Codes” section on page 5–88. Error Number The Error Number is located at byte offset 106.
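The byte layout in Table 5–16 can be illustrated with a short Python sketch that splits a printed Last Failure Code into its fields. The function name `decode_last_failure_code` is hypothetical. Two points are assumptions rather than statements from this guide: the printed code is read left to right as Component ID, Error Number, Repair Action, then the byte at offset 104 (this matches entries such as 018F2087, which Table 5–18 lists with repair action code 20), and the Restart Code is taken to occupy bits 4 through 6. The Parameter Count in the low four bits agrees with the number of Last Failure Parameters listed for fully shown entries such as 019A2093.

```python
def decode_last_failure_code(code: str) -> dict:
    """Split an 8-hex-digit Last Failure Code into its Table 5-16 fields."""
    if len(code) != 8:
        raise ValueError("a Last Failure Code is exactly 8 hexadecimal digits")
    # Assumed print order: offsets 107, 106, 105, 104 (left to right).
    component_id, error_number, repair_action, flags = bytes.fromhex(code)
    return {
        "component_id": component_id,            # byte offset 107
        "error_number": error_number,            # byte offset 106
        "repair_action": repair_action,          # byte offset 105
        "hw": bool(flags & 0x80),                # byte offset 104, bit 7
        "restart_code": (flags >> 4) & 0x07,     # ASSUMPTION: bits 4-6
        "parameter_count": flags & 0x0F,         # low four bits
    }


# Usage: a code taken from Table 5-18.
for name, value in decode_last_failure_code("018F2087").items():
    print(f"{name}: {value}")
```

If the Restart Code field is narrower than assumed here, only the `restart_code` mask needs to change; the other fields are unaffected.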
Table 5–18 Last Failure Codes (Sheet 2 of 41)
Code | Description | Repair Action Code
01090105 | An NMI occurred during EXEC$BUGCHECK processing. | 01
■ Last Failure Parameter[0] contains the executive flags value.
■ Last Failure Parameter[1] contains the RIP from the NMI stack.
■ Last Failure Parameter[2] contains the read diagnostic register 0 value.
■ Last Failure Parameter[3] contains the FX Chip CSR value.
Event Reporting: Templates and Codes 5–49 Table 5–18 Last Failure Codes (Sheet 3 of 41) Code 01140102 Description DEBUG, ASSUME, or ASSUME_LE macro executed. Repair Action Code 01 ■ Last Failure Parameter [0] contains the address of the module name where the macro is located. ■ Last Failure Parameter [1] contains the line number within the module where the macro is located. The high order byte of this value identifies the macro type: 0 = DEBUG, 1 = ASSUME, 2 = ASSUME_LE.
5–50 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 5–18 Last Failure Codes (Sheet 4 of 41) Code 011B0108 Description The I960 reported a machine fault (nonparity error). Repair Action Code 01 ■ Last Failure Parameter [0] contains the Fault Data (2) value. ■ Last Failure Parameter [1] contains the Fault Data (1) value. ■ Last Failure Parameter [2] contains the Fault Data (0) value. ■ Last Failure Parameter [3] contains the Number of Faults value.
Table 5–18 Last Failure Codes (Sheet 5 of 41)
Code | Description | Repair Action Code
018F2087 | An NMI interrupt was generated with an indication that a controller system problem occurred. | 20
■ Last Failure Parameter [0] contains the value of read diagnostic register 0.
■ Last Failure Parameter [1] contains the value of read diagnostic register 1.
■ Last Failure Parameter [2] contains PCI status.
5–52 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 5–18 Last Failure Codes (Sheet 6 of 41) Code 01932588 Description An error has occurred on the CDAL. Repair Action Code 25 ■ Last Failure Parameter [0] contains the value of read diagnostic register 0. ■ Last Failure Parameter [1] contains the value of read diagnostic register 1. ■ Last Failure Parameter [2] contains the value of write diagnostic register 0.
Event Reporting: Templates and Codes 5–53 Table 5–18 Last Failure Codes (Sheet 7 of 41) Code 01970188 Description Software indicates all NMI causes cleared, but some remain. Repair Action Code 01 ■ Last Failure Parameter [0] contains the value of read diagnostic register 0. ■ Last Failure Parameter [1] contains the value of read diagnostic register 1. ■ Last Failure Parameter [2] contains the value of read diagnostic register 2.
Table 5–18 Last Failure Codes (Sheet 8 of 41)
Code | Description | Repair Action Code
019A2093 | Hardware Port Hardware failure - TACHYON. | 20
■ Last Failure Parameter [0] contains failed port number.
■ Last Failure Parameter [1] contains gluon status.
■ Last Failure Parameter [2] contains TACHYON status.
02010100 | Initialization code was unable to allocate enough memory to set up the send data descriptors.
Event Reporting: Templates and Codes 5–55 Table 5–18 Last Failure Codes (Sheet 9 of 41) Code 023A2084 Description A processor interrupt was generated by the controller’s XOR engine (FX), indicating an unrecoverable error condition. Repair Action Code 20 ■ Last Failure Parameter [0] contains the FX Control and Status Register (CSR). ■ Last Failure Parameter [1] contains the FX DMA Indirect List Pointer register (DILP). ■ Last Failure Parameter [2] contains the FX DMA Page Address register (DADDR).
5–56 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 5–18 Last Failure Codes (Sheet 10 of 41) Code 028A0100 028B0100 Description Invalid return status from DIAG$CACHE_MEMORY_TEST. Repair Action Code 01 028C0100 Invalid error status given to cache_fail. 01 028E0100 Invalid DCA state detected in init_crashover. 01 02910100 Invalid metadata combination detected in build_raid_node.
Table 5–18 Last Failure Codes (Sheet 11 of 41)
Code | Description | Repair Action Code
02A90100 | Too many pending FOC$SEND requests by the Cache Manager. Code is not designed to handle more than one FOC$SEND to be pending because there’s no reason to expect more than one pending. | 01
02AA0100 | An invalid call was made to CACHE$DEALLOCATE_CLD. Either that device had dirty data or it was bound to a RAIDset.
Table 5–18 Last Failure Codes (Sheet 12 of 41)
Code | Description | Repair Action Code
02BF0100 | Report_error routine encountered an unexpected failure status returned from DIAG$LOCK_AND_TEST_CACHE_B. | 01
02C00100 | Copy_buff_on_this routine expected the given page to be marked bad and it wasn’t. | 01
02C10100 | Copy_buff_on_other routine expected the given page to be marked bad and it wasn’t.
Event Reporting: Templates and Codes 5–59 Table 5–18 Last Failure Codes (Sheet 13 of 41) Code 02E11016 Description While attempting to restore saved configuration information, data for two unrelated controllers was found. The restore code is unable to determine which disk contains the correct information. The Port/Target/LUN information for the two disks is contained in the parameter list.
Table 5–18 Last Failure Codes (Sheet 14 of 41)
Code | Description | Repair Action Code
02EF0102 | A CLD is free when it should be allocated. | 01
■ Last Failure Parameter [0] contains the requesting entity.
■ Last Failure Parameter [1] contains the CLD index.
02F00100 | The controller has insufficient free resources for the configuration restore process to obtain a facility lock.
Event Reporting: Templates and Codes 5–61 Table 5–18 Last Failure Codes (Sheet 15 of 41) Code 02F60103 Description An invalid modification to the no_interlock VSI flag was attempted. Repair Action Code 01 ■ Last Failure Parameter [0] contains the nv_index of the config on which the problem was found. ■ Last Failure Parameter [1] contains modification flag. ■ Last Failure Parameter [2] contains the current value of the no_interlock flag.
Table 5–18 Last Failure Codes (Sheet 16 of 41)
Code | Description | Repair Action Code
02FD0100 | The controller has insufficient free memory to restore saved configuration information from disk. | 01
02FE0105 | A field in the VSI was not cleared when an attempt was made to clear the interlock. | 01
■ Last Failure Parameter [0] contains the NV index of the VSI on which the problem was found.
Event Reporting: Templates and Codes 5–63 Table 5–18 Last Failure Codes (Sheet 17 of 41) Code 030B0188 Description A dip error was detected when pcb_busy was set. Repair Action Code 01 ■ Last Failure Parameter [0] contains the PCB port_ptr value. ■ Last Failure Parameter [1] contains the new info NULL-SSTAT0-DSTAT-ISTAT. ■ Last Failure Parameter [2] contains the PCB copy of the device port DBC register. ■ Last Failure Parameter [3] contains the PCB copy of the device port DNAD register.
5–64 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 5–18 Last Failure Codes (Sheet 18 of 41) Code 03370108 Description A device port detected an illegal script instruction. Repair Action Code 01 ■ Last Failure Parameter [0] contains the PCB port_ptr value. ■ Last Failure Parameter [1] contains the PCB copy of the device port TEMP register. ■ Last Failure Parameter [2] contains the PCB copy of the device port DBC register.
Event Reporting: Templates and Codes 5–65 Table 5–18 Last Failure Codes (Sheet 19 of 41) Code 03390108 Description An unknown interrupt code was found in a device port’s DSPS register. Repair Action Code 01 ■ Last Failure Parameter [0] contains the PCB port_ptr value. ■ Last Failure Parameter [1] contains the PCB copy of the device port TEMP register. ■ Last Failure Parameter [2] contains the PCB copy of the device port DBC register.
5–66 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 5–18 Last Failure Codes (Sheet 20 of 41) Code 033F0108 Description An EDC error was detected on a read of a soft-sectored device path not yet implemented. Repair Action Code 01 ■ Last Failure Parameter [0] contains the PCB port_ptr value. ■ Last Failure Parameter [1] contains the PCB copy of the device port TEMP register. ■ Last Failure Parameter [2] contains the PCB copy of the device port DBC register.
Table 5–18 Last Failure Codes (Sheet 21 of 41)
Code | Description | Repair Action Code
034B0100 | Insufficient memory available for DS init buffer allocation. | 01
034C0100 | Insufficient memory available for static structure allocation. | 01
034D0100 | DS init DWDs exhausted. | 01
034E2080 | Diagnostics report all device ports are broken. | 20
034F0100 | Insufficient memory available for reselect target block allocation.
5–68 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 5–18 Last Failure Codes (Sheet 22 of 41) Code 03790188 Description A PCI bus fault was detected by a device port. Repair Action Code 01 ■ Last Failure Parameter [0] contains the PCB port_ptr value. ■ Last Failure Parameter [1] contains the PCB copy of the device port TEMP register. ■ Last Failure Parameter [2] contains the PCB copy of the device port DBC register.
Event Reporting: Templates and Codes 5–69 Table 5–18 Last Failure Codes (Sheet 23 of 41) Code 03A08093 Description A configuration or hardware error was reported by the EMU. Repair Action Code 80 ■ Last Failure Parameter [0] contains the solid OCP pattern which identifies the type of problem encountered. ■ Last Failure Parameter [1] contains the cabinet ID reporting the problem. ■ Last Failure Parameter [2] contains the SCSI Port number where the problem exists (if port-specific).
5–70 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 5–18 Last Failure Codes (Sheet 24 of 41) Code 04040103 Description The event log format found in V_fm_template_table is not supported by the Fault Manager. The bad format was discovered while trying to fill in a supplied Event Information Packet (EIP). Repair Action Code 01 ■ Last Failure Parameter[0] contains the instance code value. ■ Last Failure Parameter[1] contains the format code value.
Event Reporting: Templates and Codes 5–71 Table 5–18 Last Failure Codes (Sheet 25 of 41) Code Description 04170102 The template value found in the esd is not supported by the Fault Manager. The bad template value was discovered while trying to translate an esd into an eip. Repair Action Code 01 ■ Last Failure Parameter [0] contains the instance code value. ■ Last Failure Parameter [1] contains the template code value.
5–72 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 5–18 Last Failure Codes (Sheet 26 of 41) Code Description 07070100 The other controller killed this, but could not assert the kill line because nindy on or in debug. So it killed this now. Repair Action Code 01 07080000 The other controller crashed, so this one must crash too. 00 07090100 A call to EXEC$ALLOCATE_MEM_ZEROED failed to return memory when allocating VA Request Items.
Event Reporting: Templates and Codes 5–73 Table 5–18 Last Failure Codes (Sheet 27 of 41) Code 08100101 Description A call to NVFOC$TRANSACTION had a from field (id) that was out of range for the NVFOC communication utility. Repair Action Code 01 ■ Last Failure Parameter [0] contains the bad id value. 08110101 NVFOC tried to defer more than one FOC send. 01 ■ Last Failure Parameter[0] contains the master ID of the connection that had the multiple delays.
Table 5–18 Last Failure Codes (Sheet 28 of 41) Code Description Repair Action Code 09670101 Local FLM detected an invalid facility to act upon. 01 ■ Last Failure Parameter [0] contains the facility found. 09680101 Remote FLM detected an error and requested the local controller to restart. 01 ■ Last Failure Parameter [0] contains the reason for the request.
Event Reporting: Templates and Codes 5–75 Table 5–18 Last Failure Codes (Sheet 29 of 41) Code 0A190102 Description ilf_depopulate_DWD_to_cache first page guard check failed. Repair Action Code 01 ■ Last Failure Parameter [0] contains the DWD address value. ■ Last Failure Parameter [1] contains the buffer address value. 0A1C0102 0A1D0102 0A1E0102 ILF$LOG_ENTRY page guard check failed. 0A1F0100 ilf_rebind_cache_buffs_to_DWDs found duplicate buffer for current DWD.
Table 5–18 Last Failure Codes (Sheet 30 of 41) Code Description Repair Action Code 0A340101 ilf_output_error, no memory for message display. 01 ■ Last Failure Parameter [0] contains the message address value. 0A360100 Duplicate entry found in ilf_populate_DWD_from_cache buffer stack. 01 0A370100 Duplicate entry found in ilf_rebind_cache_buffs_to_DWDs buffer stack.
Table 5–18 Last Failure Codes (Sheet 31 of 41) Code Description Repair Action Code 0B0B0100 Unable to find any unused partition group. With 128 available, we should be able to find at least one. 01 0B0C0100 Unable to allocate memory to use for communication with the DT manager. 01 0D000011 The EMU firmware returned a bad status when told to power off. 00 ■ Last Failure Parameter [0] contains the value of the bad status.
5–78 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 5–18 Last Failure Codes (Sheet 32 of 41) Code 0E0F0101 Description An illegal failover response was given to the Write History Log response handler. Repair Action Code 01 ■ Last Failure Parameter [0] contains failover response. 0E100100 The Write History Log failover control had a bad send count. 01 0E110100 Unable to allocate memory for WHL DBs. 01 0E120100 Unable to allocate memory for WHL HTBs.
Event Reporting: Templates and Codes 5–79 Table 5–18 Last Failure Codes (Sheet 33 of 41) Code 12070102 Description vsi_ptr->allocated_this not set. Repair Action Code 01 ■ Last Failure Parameter [0] contains the ASSUME instance address. ■ Last Failure Parameter [1] contains nv_index value. 12080102 vsi_ptr->cs_interlocked not set. 01 ■ Last Failure Parameter [0] contains the ASSUME instance address. ■ Last Failure Parameter [1] contains nv_index value. 12090102 Unhandled switch case.
5–80 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 5–18 Last Failure Codes (Sheet 34 of 41) Code 200D0101 Description After many calls to DS$PORT_BLOCKED, we never got a FALSE status back (which signals that nothing is blocked). Repair Action Code 01 ■ Last Failure Parameter[0] contains the port number (1 - n) that we were waiting on to be unblocked.
Table 5–18 Last Failure Codes (Sheet 35 of 41) Code Description Repair Action Code 20640000 Nindy was turned on. 00 20650000 Nindy was turned off. 20692010 To enter dual-redundant mode, both controllers must be of the same type. 20 206A0000 Controller restart forced by DEBUG CRASH REBOOT command. 00 206B0010 Controller restart forced by DEBUG CRASH NOREBOOT command. 206C0020 Controller was forced to restart in order for new controller code image to take effect.
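Table 5–18 Last Failure Codes (Sheet 36 of 41) Code Description 44660100 Unable to allocate enough abort requests for Fibre Channel Host Port Transport software layer. 44670100 Unable to allocate enough command HTBs for Fibre Channel Host Port Transport software layer. 44680100 Unable to allocate enough FC HTBs for Fibre Channel Host Port Transport software layer. 44690100 Unable to allocate enough work requests for Fibre Channel Host Port Transport software layer. 446A0100 Unable to allocate enough HTBs for Fibre Channel Host Port Transport software layer. 446B0100 Unable to allocate enough TIS structures for Fibre Channel Host Port Transport software layer. 446C0100 Unable to allocate enough MFSs for Fibre Channel Host Port Transport software layer. 446D0100 Unable to allocate enough TACHYON headers for Fibre Channel Host Port Transport software layer.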
Event Reporting: Templates and Codes 5–83 Table 5–18 Last Failure Codes (Sheet 37 of 41) Code Description Repair Action Code 44790102 An illegal script return value was received by the Host Port Transport response script handler. 01 ■ Last Failure Parameter [0] contains the rsp function. ■ Last Failure Parameter [1] contains return value. The Host Port Transport ran out of work requests. 447A0102 An illegal script return value was received by the Host Port Transport error script handler.
Table 5–18 Last Failure Codes (Sheet 38 of 41) Code Description Repair Action Code 44892091 Host Port Hardware diagnostic failed at system initialization. 20 ■ Last Failure Parameter [0] contains failed port number. 448B0100 Host Port Transport software layer unable to allocate work item for updating NV memory during LOGI.
Event Reporting: Templates and Codes 5–85 Table 5–18 Last Failure Codes (Sheet 39 of 41) Code 64030104 Description A DD is already in use by an RCV DIAG command—cannot get two RCV_DIAGs without sending the data for the first. Repair Action Code 01 ■ Last Failure Parameter [0] contains DD_PTR. ■ Last Failure Parameter [1] contains blocking HTB_PTR. ■ Last Failure Parameter [2] contains HTB_PTR flags. ■ Last Failure Parameter [3] contains this HTB_PTR. 64040100 An attempt to allocate a free VAR failed.
5–86 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 5–18 Last Failure Codes (Sheet 40 of 41) Code 83020100 Description Repair Action Code An unsupported message type or terminal request was received by the CONFIG virtual terminal code from the CLI. 01 83030100 Not all alter_device requests from the CONFIG utility completed within the timeout interval.
Table 5–18 Last Failure Codes (Sheet 41 of 41) Code Description 8B000186 A single-bit error was found by software scrubbing. ■ Last Failure Parameter [0] contains the address of the first single-bit ECC error found. ■ Last Failure Parameter [1] contains the count of single-bit ECC errors found in the same region below this address. ■ Last Failure Parameter [2] contains the lower 32 bits of the actual data read at the Parameter [0] address.
5–88 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Recommended Repair Action Codes Recommended Repair Action Codes are embedded in Instance and Last Failure codes. See “Instance Codes” on page 5–20 and “Last Failure Codes” on page 5–45 for a more detailed description of the relationship between these codes. Table 5–19 contains the repair action codes assigned to each significant event in the system.
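As a rough illustration of how a repair action code can be read out of a Last Failure code, the sketch below splits an eight-digit code into fields. The field layout used here (component ID in the top byte, then error number, then repair action, then a final byte holding a hardware flag, a restart code, and a parameter count) is an assumption. It is consistent with the entries listed in Table 5–18, for example 034E2080 with repair action 20 and 64030104 with four Last Failure Parameters, but the authoritative layout is the one given in “Last Failure Codes” on page 5–45. The function and field names are hypothetical.

```python
from typing import NamedTuple


class LastFailure(NamedTuple):
    """Fields of a Last Failure code (layout assumed, see lead-in)."""
    component_id: int
    error_number: int
    repair_action: int
    hw_flag: bool
    restart_code: int
    parameter_count: int


def decode_last_failure(code: str) -> LastFailure:
    """Split an 8-hex-digit Last Failure code into its assumed fields."""
    if len(code) != 8:
        raise ValueError("expected eight hexadecimal digits")
    value = int(code, 16)          # raises ValueError on non-hex input
    flags = value & 0xFF           # low byte: HW flag, restart code, count
    return LastFailure(
        component_id=(value >> 24) & 0xFF,
        error_number=(value >> 16) & 0xFF,
        repair_action=(value >> 8) & 0xFF,
        hw_flag=bool(flags & 0x80),
        restart_code=(flags >> 4) & 0x07,
        parameter_count=flags & 0x0F,
    )


if __name__ == "__main__":
    fields = decode_last_failure("034E2080")
    print(f"repair action {fields.repair_action:02X}")
```

The repair action value obtained this way is the key to look up in Table 5–19.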
Event Reporting: Templates and Codes 5–89 Table 5–19 Recommended Repair Action Codes (Sheet 2 of 6) Code Description 0C Both controllers in a dual-redundant configuration are attempting to use the same SCSI ID (either 6 or 7 as indicated in the event report). The other controller of the dual-redundant pair has been reset with the “Kill” line by the controller that reported the event. Two possible problem sources are indicated: ■ A controller hardware failure. ■ A controller backplane failure.
5–90 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 5–19 Recommended Repair Action Codes (Sheet 3 of 6) Code Description 37 The Memory System Failure translator could not determine the failure cause. Follow repair action 01. 38 Replace the indicated cache memory DIMM. 39 Check that the cache memory DIMMs are properly configured. 3A This error applies to this controller’s mirrored cache.
Event Reporting: Templates and Codes 5–91 Table 5–19 Recommended Repair Action Codes (Sheet 4 of 6) Code 51 Description The mirrorset is inoperative for one of the following reasons: ■ The last NORMAL member has malfunctioned. Perform repair actions 55 and 59. ■ The last NORMAL member is missing. Perform repair action 58. ■ The members have been moved around and the consistency checks show mismatched members. Perform repair action 58.
5–92 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Table 5–19 Recommended Repair Action Codes (Sheet 5 of 6) Code Description 69 An unrecoverable fault occurred at the host port. There may be more than one entity attempting to use the same SCSI ID, or some other bus configuration error, such as improper termination, may exist. If no host bus configuration problems are found, follow repair action 01. 80 An EMU fault has occurred.
Event Reporting: Templates and Codes 5–93 Table 5–19 Recommended Repair Action Codes (Sheet 6 of 6) Code Description 8D It is not safe to present the WWLID to the host because a site failover may have taken place, but cannot confirm with the remote controller. Perform one of the following repair actions: ■ Follow repair action 8B. ■ If a site failover took place, and you don’t plan to perform a future site failback, then delete the remote copy set on this controller.
Table 5–20 Component Identifier Codes (Continued) Code Description 0B Configuration Manager Process 0C Memory Controller Event Analyzer 0D Poweroff Process 0E Data Replication Manager Services (ACS V8.
Appendix A Controller Specifications This appendix contains physical, electrical, and environmental specifications for the HSG80 array controller. Physical and Electrical Specifications for the Controller Table A–1 lists the physical and electrical specifications for the controller and cache modules. Voltage measurements in Table A–1 are nominal measurements (at +5 and +12 VDC) without tolerances.
A–2 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide Environmental Specifications The HSG80 array controller is intended for installation in a Class A environment. The optimum operating environmental specifications are listed in Table A–2; the maximum operating environmental specifications are listed in Table A–3; and the maximum nonoperating environmental specifications are listed in Table A–4. These are the same as for other Compaq storage devices.
Table A–4 Maximum Nonoperating Environmental Specifications
Condition: Specification
Temperature: -40 °C to +66 °C (-40 °F to +151 °F) (during transportation and associated short-term storage)
Relative Humidity: 8% to 95% in original shipping container (noncondensing); otherwise, 50% (noncondensing)
Altitude: From -300 m (-1000 ft) to +3600 m (+12,000 ft) Mean Sea Level (MSL)
Glossary This glossary defines terms pertaining to the HSG80 Fibre Channel array controller. It is not a comprehensive glossary of computer terms. 8B/10B A type of byte encoding and decoding to reduce errors in data transmission patented by the IBM Corporation. This process of encoding and decoding data for transmission has been adopted by ANSI. adapter A device that converts the protocol and hardware interface of one bus type into another without changing the function of the bus.
arbitrated loop physical address Abbreviated AL_PA. A one-byte value used to identify a port in an Arbitrated Loop topology. The AL_PA value corresponds to bits 7:0 of the 24-bit Native Address Identifier. array controller See controller. array controller software Abbreviated ACS. Software contained on a removable ROM program card that provides the operating system for the array controller.
Glossary GL–3 block Also called a sector. The smallest collection of consecutive bytes addressable on a disk drive. In integrated storage elements, a block contains 512 bytes of data, error codes, flags, and the block’s address header. bootstrapping A method used to bring a system or device into a defined state by means of its own action. For example, a machine routine whose first few instructions are enough to bring the rest of the routine into the computer from an input device.
GL–4 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide command line interpreter The configuration interface to operate the controller software. configuration file A file that contains a representation of a storage subsystem’s configuration. container 1) Any entity that is capable of storing data, whether it is a physical device or a group of physical devices.
device See node and peripheral device. differential I/O module A 16-bit I/O module with SCSI bus converter circuitry for extending a differential SCSI bus. See also I/O module. differential SCSI bus A bus in which a signal’s level is determined by the potential difference between two wires. A differential bus is more robust and less subject to electrical noise than is a single-ended bus. DILX Disk inline exerciser.
EIA Electronic Industries Association. A standards organization specializing in the electrical and functional characteristics of interface equipment. EMU Environmental monitoring unit. A unit that provides increased protection against catastrophic failures.
Glossary GL–7 FC–PH The Fibre Channel Physical and Signaling standard. FC–SB Fibre Channel Single Byte Command Code Set FC–SW Fibre Channel Switched Topology and Switch Controls FCC Federal Communications Commission. The federal agency responsible for establishing standards and approving electronic devices within the United States. FCC Class A This certification label appears on electronic devices that can only be used in a commercial environment within the United States.
GL–8 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide FRUTIL Field Replacement utility. full duplex (n) A communications system in which there is a capability for 2-way transmission and acceptance between two sites at the same time. full duplex (adj) Pertaining to a communications method in which data can be transmitted and received at the same time. FWD SCSI A fast, wide, differential SCSI bus with a maximum 16-bit data transfer rate of 20 MB/s.
hot swap A method of device replacement that allows normal I/O activity on a device’s bus to remain active during device removal and insertion. The device being removed or inserted is the only device that cannot perform operations during this process. See also cold swap and warm swap. HSUTIL Format and device code load utility. IBR Initial Boot Record. ILF Illegal function. INIT Initialize input and output.
GL–10 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide JBOD Just a bunch of disks. A term used to describe a group of single-device logical units. kernel The most privileged processor access mode. LBN Logical Block Number. L_port A node or fabric port capable of performing arbitrated loop functions and protocols. NL_Ports and FL_Ports are loop-capable ports. LED Light Emitting Diode.
loop tenancy The period of time between the following events: when a port wins loop arbitration and when the port returns to a monitoring state. L_Port A node or fabric port capable of performing Arbitrated Loop functions and protocols. NL_Ports and FL_Ports are loop-capable ports. LRU Least recently used. A cache term used to describe the block replacement policy for read cache. Mbps Approximately one million (10^6) bits per second; that is, megabits per second.
network In data communication, a configuration in which two or more terminals or devices are connected to enable information transfer. node In data communications, the point at which one or more functional units connect transmission lines. Non-L_Port A node or fabric port that is not capable of performing Arbitrated Loop functions and protocols. N_Ports and F_Ports are not loop-capable ports.
Glossary GL–13 NVM Non-Volatile Memory. A type of memory where the contents survive power loss. Also sometimes referred to as NVMEM. OCP Operator control panel. The control or indicator panel associated with a device. The OCP is usually mounted on the device and is accessible to the operator. offset A relative address referenced from the base element address. Event Sense Data Response Templates use “offsets” to identify various information contained within the one byte of memory (bits 0 through 7).
GL–14 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide parity A method of checking if binary numbers or characters are correct by counting the ONE bits. In odd parity, the total number of ONE bits must be odd; in even parity, the total number of ONE bits must be even. Parity information can be used to correct corrupted data. RAIDsets use parity to improve the availability of data.
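The parity definition above can be made concrete with a small sketch. The first function counts the ONE bits in a value and returns the bit that makes the total even or odd. The second shows the XOR form of parity across equal-sized blocks, which allows one missing block to be rebuilt when it is known which block is missing. This is a generic illustration only: the function names are hypothetical, and nothing here is taken from the controller software.

```python
from functools import reduce
from typing import Sequence


def parity_bit(value: int, odd: bool = False) -> int:
    """Return the parity bit that makes the count of ONE bits even (or odd)."""
    ones = bin(value).count("1")
    even_bit = ones % 2            # 1 only when the count of ONEs is odd
    return even_bit ^ 1 if odd else even_bit


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """XOR equal-length blocks byte by byte (the parity block for the set)."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


if __name__ == "__main__":
    data = [b"\x01\x02", b"\x0f\x10", b"\xaa\x55"]
    parity = xor_blocks(data)
    # Rebuild the second block from the surviving blocks plus parity.
    rebuilt = xor_blocks([data[0], data[2], parity])
    print(rebuilt == data[1])
```

A single parity bit only detects an odd number of flipped bits. Rebuilding data, as in the second function, relies on knowing which member was lost.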
Glossary GL–15 program card The PCMCIA card containing the controller’s operating software. protocol The conventions or rules for the format and timing of messages sent and received. PTL Port-Target-LUN. The controller’s method of locating a device on the controller’s device bus. PVA module Power Verification and Addressing module. quiesce The act of rendering bus activity inactive or dormant. For example, “quiesce the SCSI bus operations during a device warm-swap.
GL–16 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide RAID level 3/5 A DIGITAL-developed RAID storageset that stripes data and parity across three or more members in a disk array. A RAIDset combines the best characteristics of RAID level 3 and RAID level 5. A RAIDset is the best choice for most applications with small to medium I/O requests, unless the application is write intensive. A RAIDset is sometimes called parity RAID.
Glossary GL–17 remote copy A feature intended for disaster tolerance and replication of data from one storage subsystem or physical site to another subsystem or site. It also provides methods of performing a backup at either the local or remote site. With remote copy, user applications continue to run while data movement goes on in the background. Data warehousing, continuous computing, and enterprise applications all require remote copy capabilities.
GL–18 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide SCSI bus signal converter Sometimes referred to as an adapter. (1) A device used to interface between the subsystem and a peripheral device unable to be mounted directly into the SBB shelf of the subsystem. (2) a device used to connect a differential SCSI bus to a single-ended SCSI bus. (3) A device used to extend the length of a differential or single-ended SCSI bus. See also I/O module.
Glossary GL–19 single-ended SCSI bus An electrical connection where one wire carries the signal and another wire or shield is connected to electrical ground. Each signal’s logic level is determined by the voltage of a single wire in relation to ground. This is in contrast to a differential connection where the second wire carries an inverted signal. spareset A collection of disk drives made ready by the controller to replace failed members of a storageset.
striping The technique used to divide data into segments, also called chunks. The segments are striped, or distributed, across members of the stripeset. This technique spreads I/O across the array of physical devices, which helps prevent hot spots and hot disks. Each stripeset member receives an equal share of the I/O request load, improving performance.
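The striping definition above can be sketched as a mapping from a logical block number to a stripeset member and a block on that member. The sketch assumes a plain round-robin layout with a fixed chunk size. The function name, parameter names, and layout are illustrative assumptions and are not a description of how the controller actually places chunks.

```python
from typing import Tuple


def locate_block(lbn: int, chunk_blocks: int, members: int) -> Tuple[int, int]:
    """Map a logical block number to (member index, block on that member).

    Assumes chunks are dealt out to members in simple round-robin order.
    """
    if lbn < 0 or chunk_blocks <= 0 or members <= 0:
        raise ValueError("lbn must be >= 0; chunk_blocks and members must be > 0")
    chunk_index, offset = divmod(lbn, chunk_blocks)   # which chunk, where in it
    stripe, member = divmod(chunk_index, members)     # which row, which member
    return member, stripe * chunk_blocks + offset


if __name__ == "__main__":
    # With 4-block chunks on 3 members, consecutive chunks land on
    # members 0, 1, 2, 0, 1, 2, ...
    for lbn in (0, 5, 13):
        print(lbn, locate_block(lbn, chunk_blocks=4, members=3))
```

Because consecutive chunks rotate through the members, a run of sequential requests is shared among all of them rather than concentrated on one disk.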
unit A container made accessible to a host. A unit may be created from a single disk drive or tape drive. A unit may also be created from a more complex container such as a RAIDset. The controller supports a maximum of eight units on each target. See also target and target ID number. unwritten cached data Sometimes called unflushed data. UPS Uninterruptible power supply.
GL–22 Compaq StorageWorks HSG80 Array Controller ACS Version 8.5 Maintenance and Service Guide write hole The period of time in a RAID level 1 or RAID level 5 write operation when an opportunity emerges for undetectable RAIDset data corruption. Write holes occur under conditions such as power outages, where the writing of multiple members can be abruptly interrupted.
Index A AC input box part number 1–3 adding cache memory 3–17 DIMMs 3–17 array controller. See controller.
Index single-controller configuration 2–11 controller specifications. See also specifications.
Index dual-redundant controller configuration 2–53 single-configuration controller 2–53 fibre channel host status display 4–44 hub, part number 1–3 link error 4–43 optical cable, cleaning instructions 2–5 switch, part number 1–3 field replacement utility. See FRUTIL.
Index removing 2–56 mirrorsets, duplicating data with the CLONE utility 4–56 N nonvolatile memory, fault-tolerance for write-back caching 4–13 note defined xv O other controller defined xiv P part numbers AC input box 1–3 BA370 rack-mountable enclosure 1–3 cache module 1–3 cooling fan 1–3 disk drives 1–3 dual-battery ECB 1–3 ECB 1–3 ECB Y-cable BA370 enclosure 1–5 data center cabinet 1–5 EMU 1–3 fibre channel hub 1–3 optical cabling, parts used in configuring the controller 1–4 switch 1–3 GBIC 1–3 I/O m
Index PVA module 2–42 PVA module, master enclosure 2–42 switch 2–53 ECB with cabinet powered off 2–38 ECB with cabinet powered on 2–36 fiber cable dual-redundant controller configuration 2–53 single-controller configuration 2–53 GLM 2–40 hub dual-redundant controller configuration 2–53 single-controller configuration 2–53 I/O module 2–45 modules dual-redundant controller configuration 2–17 single-controller configuration 2–10 program (PCMCIA) card 2–54 PVA module 2–42 single-controller configuration cache
Index checking I/O 4–41 checking status 4–41 exercising 4–50 unpartitioned mirrorsets, duplicating data with the CLONE utility 4–56 upgrading cache memory 3–17 controller software 3–2 controller software with the CLCP utility 4–56 device firmware 3–11 DIMMs 3–17 downloading new software 3–3 EMU software with the CLCP utility 4–56 from a single controller to a dual-redundant controller configuration 3–14 installing controller, cache module, and ECB 3–14 new program (PCMCIA) card 3–2 using CLCP 3–6 deleting