INAUGURAL DISSERTATION for the award of the doctoral degree of the Combined Faculties for the Natural Sciences and Mathematics of the Ruprecht-Karls-Universität Heidelberg, submitted by Diplom-Physiker Ralf Erich Panse from Mannheim. Date of the oral examination: 12.
CHARM-Card: Hardware Based Cluster Control And Management System

Referees: Prof. Dr. Volker Lindenstruth, Prof. Dr.
CHARM-Card: Hardware-Based Computer Cluster Control and Management System

The selection and analysis of event data of the heavy-ion experiment ALICE at CERN are performed by so-called trigger levels. The High Level Trigger (HLT) is the last trigger level of the experiment. It consists of a computer farm of currently more than 120 computers, which is to be expanded to 300 machines.
Contents

1 Introduction
  1.1 Outline
  1.2 ALICE Experiment
  1.3 HLT Computer Cluster
  1.4 Remote Management Tools
    1.4.1 KVM
    1.4.2 BIOS Console Redirection
    1.4.3 IPMI
    1.4.4 Remote Management Cards
  1.5 CHARM Card
    1.5.1 Features of the CHARM
    1.5.2 Usage of the CHARM
5 Device Emulation
  5.1 USB Device Emulation
    5.1.1 USB Bus System
    5.1.2 Cypress EZ-Host USB Controller
    5.1.3 Human Interface Device
    5.1.4 Mass Storage Device
    8.2.1 CHARM-Host Network Bridge
    8.2.2 Network Masquerading
9 Benchmarks and Verification
  9.1 VGA Function Performance
    9.1.1 Estimation of the VGA Data Throughput
    9.1.2 CHARM PCI Target Throughput
    9.1.3 CHARM VGA Processing Performance
    9.1.4 CHARM Graphical Output Performance
  9.2 USB CD-ROM Performance
  9.3 USB Compliance Test
List of Figures

1.1 Overview of the LHC ring at CERN.
1.2 The HLT cluster nodes.
1.3 Remote management of computer systems.
1.4 Screenshot of a VNC session while configuring the BIOS settings with the CHARM.
1.5 Screenshot of the web page provided by the CHARM.
5.1 USB logical pipes.
5.2 HPI Bridge.
5.3 USB keyboard implementation. The VNC server takes the user interaction and converts it to USB keycodes. These keycodes are written into the keycode buffer inside the USB controller.
8.5 CHARM-Host network communication. In principle, there is no direct network connection between the host and the CHARM. But the PCI bus is used to establish a network bridge between the CHARM and the host computer.
8.6 Block diagram of the network function of the CHARM.
8.7 Layout of the shared SRAM content [3]. The left side represents the lower addresses. The right side marks the end of the SRAM content.
B.1 CHARM card front view (model B).
B.2 CHARM card back view (model B).
E.1 SDRAM address map.
G.1 Test system #1 with an installed CHARM card. It is the topmost PCI card.
List of Tables

2.1 Features of the FPGA used in the EPXA1 chip where LE means Logic Element.
2.2 CHARM PCI Base Address Register.
3.1 Device driver of the CHARM.
3.2 MTD partitions of the CHARM’s flash memory.
3.3 Directory structure of the Root File System.
3.4 Default settings of the NFS connection.
3.5 Directories of the NFS share /mnt/charmserver.
3.6 Content of the card specific subdirectory.
9.4 Performance of the CHARM VGA function. The transfer time is the period of the successful PCI cycle. The CHARM cannot immediately accept data after a data transfer. The dead time defines the period while the CHARM rejects PCI accesses.
9.5 Power consumption and power limitation of the CHARM card.
B.1 Characteristics of the CHARM.
E.1 AHB address map.
1 Introduction

At present, computer clusters are the predominant type of supercomputer installation. They are used in a wide range of applications such as web search engines [4], weather forecasting [5], simulation of financial markets [6] and high-energy physics experiments. For example, the data analysis of future high-energy experiments like CMS and ALICE is carried out by computer clusters.
1.1 Outline

The following sections give an overview of the target system of this thesis. They also discuss existing remote access tools for computer systems. The heart of the hardware-based remote control presented in this thesis is the CHARM PCI card, which is referred to simply as CHARM in the rest of the text. The features of the card are summarized in section 1.5.1. The architecture of the card is described in chapter 2.
1.3 HLT Computer Cluster

Figure 1.1: Overview of the LHC ring at CERN.

Figure 1.2 shows several HLT cluster nodes in the ALICE counting room. The current setup of the cluster installed at CERN contains approximately a quarter of the foreseen nodes (>100). Installing and administering such a large computer farm is an extensive task. Therefore, automating periodic tasks is highly recommended. Another requirement of the HLT cluster is the remote control of its nodes.
Figure 1.2: The HLT cluster nodes.

1.4 Remote Management Tools

Remote management tools are used to connect to and manage one or more computers remotely. A number of software- and hardware-based remote management and remote control tools are on the market. Remote control software is available for virtually every operating system. SSH [21] and Telnet [22] are two of the best-known remote access tools.
of the device provides access to the serial port to emulate keystrokes or mouse movements. The screen content is fetched from the graphic card and provided to remote computers.

1.4.2 BIOS Console Redirection

Mainboard manufacturers equip their products with hardware-based remote maintenance units. The remote console is one of the most widespread remote access tools for computer systems.
one. A separate network environment has the advantage of providing a secure remote interface to the Internet, while the servers are only accessible via the local network.

Figure 1.3: Remote management of computer systems.

Several remote management cards are currently on the market, such as the Peppercon eRIC II [29], the AMI MegaRAC G4 [30] and the TYAN SMDC M329 [31].
1.5 CHARM Card

Common remote maintenance devices rely on existing management facilities of the mainboard. For the most part, these devices access the BMC of the mainboard via an IPMB. The absence of an IPMB limits the features of such a device or makes it unusable. Furthermore, most remote control devices provide only a KVM function. Monitoring features and capabilities to inspect the computer are missing on these remote control cards.
• Detection of the PCI devices via PCI bus scanning.
• PCI master capability to read out the host computer’s memory space.
• Reconfiguration to change the function of the card, if needed.
• Operating system Linux.
• Automatic installation and configuration of the host computer.

1.5.2 Usage of the CHARM

Several standard interfaces provide access to the CHARM and its functions. The card can be used via SSH, VNC or HTTP.
the host name and the revision date of the card. Additionally, the last ten POST codes of the host computer are shown on a web page (see section 6.1 for more information about POST). The CHARM can obtain real-time information about the host computer, such as the BIOS CMOS content or the PCI device list. This information is also provided by the web server. Furthermore, an embedded Java VNC applet provides interactive remote access to the host computer.
2 CHARM Architecture

This chapter explains the hardware units of the CHARM and their organization. Section 2.1 gives an overview of the board architecture. The hardware units are then explained in sections 2.2 and 2.3.

2.1 Overview of the CHARM Board

The hardware components that form the CHARM system are mounted on a multi-chip module board whose PCB consists of 8 layers.
RS232 Connector: The RS232 connector provides access to the operating system of the card.
EZ-Host USB Controller: The USB controller is used to emulate peripheral devices for the host computer. It is explained in more detail in section 5.1.2.
Flash Memory: The flash memory is the sole non-volatile memory of the CHARM. It contains the kernel of the embedded system, the root file system and a configuration file for the FPGA.
SRAM: The SRAM is used for fast data storage by the FPGA unit.
SDRAM: The SDRAM is the main memory of the embedded system.
Excalibur EPXA1 Chip: The EPXA1 contains the CPU and an FPGA unit. It is described in section 2.2.
32-bit PCI Connector: The card can be plugged into any PCI or PCI-X slot. Bus switches allow the card to be used in both 5 V and 3.3 V PCI slots.
Ethernet Chip: A 10/100 Mbit Ethernet chip provides the network interface of the CHARM.
Figure 2.2: Structure of the Excalibur Embedded Processor Stripe [1].

The internal SDRAM controller is connected to the AHB bus system.
2.2.3 FPGA Device

The embedded stripe of the Excalibur described in section 2.2 interfaces with a programmable logic architecture similar to that of an APEX 20KE [35] device. Altera’s APEX 20KE devices are designed with the MultiCore architecture, which combines LUT-based and product-term-based logic. Additionally, the device contains an enhanced memory structure to provide a variety of memory functions, including CAM, RAM and dual-port RAM.
Figure 2.3: Structure of the CHARM PLD design.

Altera PCI Core: The Altera PCI MegaCore is a soft IP core sold by the Altera Corporation.
2.3 FPGA Design of the CHARM

BAR No.  Type    Size    Function
0        Memory  1 MB    Enhanced video function
1        I/O     64 KB   BIOS RPC functions
2        Memory  128 KB  VGA memory region
3        I/O     32 KB   VGA I/O region

Table 2.2: CHARM PCI Base Address Register.

cation port between the CHARM and the VGA BIOS running on the host computer. The VGA BIOS is explained in section 4.3. The last two BARs implement the VGA address windows. The VGA protocol and the related address window are explained in section 4.1.
(SOPC). Furthermore, the Avalon Bus architecture consists of logic and routing resources inside a PLD. The principal design goals of the Avalon Bus are simplicity, optimized resource utilization and synchronous operation.

AHB Bus System: The Advanced High-performance Bus (AHB) is a high-performance bus defined as part of the AMBA specification. A typical AMBA-based system contains a microcontroller, high-bandwidth on-chip RAM and a bridge interfacing low-bandwidth devices.
3 Software of the CHARM

The CHARM is an embedded system with a hard-core CPU, main memory and non-volatile storage. The operating system of the card is Linux, which is started by ARMboot, the boot loader of the CHARM. The following sections describe the boot procedure, the Linux system and the file system of the card.

3.1 Boot Loader

ARMboot [43] is the boot loader of the CHARM. It is available as free software under the GNU General Public License (GPL).
[Figure residue: boot sequence. Executed from: SDRAM, FLASH. Tasks: configure the Excalibur chip (PLL, memory map, embedded stripe I/O, cache, SDRAM controller, PLD logic). Console output:] ARMboot 1.0.
pcimaster: The pcimaster driver provides access to the PCI master functionality of the card. For example, the programs "lspci" and "dmidecode" use this interface to inject PCI cycles.
barSwitch: The barSwitch driver communicates with the PCI target control unit (see section 2.3) and distributes the PCI requests to the appropriate submodules.
vga: The vga module processes the requests according to the IBM VGA specification. Section 4.2.3 describes the processing of the driver in more detail.
utilities. It replaces most of the utilities usually found in GNU fileutils, shellutils, etc. However, the utilities in BusyBox generally have fewer options than their full-featured GNU counterparts. Third-party software such as ssh or the web server axhttpd is located in the directory /usr/local/bin or /usr/local/sbin.
3.4 NFS-Directory

Figure 3.2: The CHARM connects to the NFS server after boot-up.

Description                    Value
Default NFS server host name   charmserver
Default NFS directory name     /mnt/charmserver
Mount point on the CHARM       /mnt/charmserver

Table 3.4: Default settings of the NFS connection.
The shared NFS directory is mounted at /mnt/charmserver on the local file system. The NFS share contains additional software, web pages and a boot script. Table 3.5 shows the content of the NFS share.

Directory     Description
./cards       Contains card-specific settings and boot scripts.
./webpage     Contains project-specific web pages of the internal web server.
./temp        Additional temporary directory.
./tools       Directory for additional software for the CHARM.
./Autostart
4 Graphic Card Implementation

The CHARM implements a VGA graphic card. It replaces the primary graphic card of the host system in order to obtain the screen content of the host computer. However, the CHARM is not connected to a local monitor. Instead, the graphic data are processed on the CHARM and sent via the network to an arbitrary remote computer. The card does not use a commercial graphic processor chip, because the raw graphic data can be inspected more easily than the video output signal.
today is based on the VGA standard and all VGA video modes are supported by the common operating systems.

4.1.1 VGA Components

The VGA system has four main functional areas: the Cathode Ray Tube (CRT) controller, the sequencer, the graphics controller, and the attribute controller. Figure 4.1 shows a diagram of the VGA functional areas and the connections between the video memory and the video Digital-to-Analog Converter (DAC).
Graphics Controller: The graphics controller is the interface between the video memory and the attribute controller during active display time. The graphics controller can perform logical operations on the memory data.
Attribute Controller: The attribute controller contains the color look-up table, which determines what color will be displayed for a given pixel value in the video memory.
font of the character has to be loaded. The third video plane contains the fonts of the 256 ASCII characters.

Figure 4.3: Organization of the video planes in alphanumeric mode (plane 0: characters, plane 1: attributes, plane 2: fonts, plane 3: unused).

Three default fonts are contained in the read-only memory (ROM) of the VGA card: an 8x8 font, an 8x14 font, and an 8x16 font.
Figure 4.4: The screen is divided into odd and even columns.

both mechanisms: classification of the pixels into two parts and fragmentation of the pixel itself. These modes use two video planes with two separate pixel areas per plane.

Four Plane Modes: Video modes using four video planes separate a pixel into four parts.
Port Address     Description
0x3B0 - 0x3BB    MDA register
0x3C0 - 0x3DF    VGA register

Table 4.2: VGA I/O ports controlling the video mode.

To remain backward compatible with older display standards, the address window and I/O ports of the MDA are part of the VGA standard, but they are very rarely used in modern computer systems. The address window of a VGA card is smaller than the size of the video memory.
4.2 Graphic Card Implementation Layout

software based VGA processing is sufficient to display the screen content. The results of the performance measurements are presented in chapter 9. However, hardware units pass the incoming VGA requests to the VGA software. The hardware part handles the low-level PCI protocol, while the software manages and processes the data content. The hardware modules which receive the VGA requests are explained in section 4.2.2.
BARs. This means that the base address register contains a fixed value, that is, a fixed address window. This window cannot be changed at runtime and cannot be initialized by the computer BIOS. Two PCI BARs of the Altera PCI core are set up with the VGA address window 0xA0000-0xBFFFF and the I/O port range 0x3C0-0x3DF. However, these BARs have to be hidden from the computer system.
Figure 4.6: PCI Configuration Space hiding.

CHARM. The module drives zero onto the PCI AD (address/data) bus during the PCI data phase, because unimplemented Base Address Registers have to be hardwired to zero [40]. The signal of the PCI BAR Hide module overrides the signal of the PCI core, which would otherwise return the hardwired VGA address.
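The effect of the BAR hiding can be illustrated with a small software model. This is only a sketch, not the PLD logic of the CHARM: the register values in CORE_CONFIG and the function names are hypothetical, and the only fact taken from the PCI specification is that BAR n of a type 0 header lives at configuration offset 0x10 + 4n.

```python
# Minimal model of PCI configuration space hiding (illustrative only).
BAR0_OFFSET = 0x10   # configuration offset of BAR 0 in a type 0 header
NUM_BARS = 6


def bar_offset(bar_number):
    """Return the configuration space offset of the given BAR."""
    if not 0 <= bar_number < NUM_BARS:
        raise ValueError("a type 0 header has BARs 0..5")
    return BAR0_OFFSET + 4 * bar_number


def host_view(core_config, offset, hidden_bars):
    """Return the dword the host reads at a configuration offset.

    Reads of hidden BARs are overridden with zero, so the host treats
    them as unimplemented registers.
    """
    if offset in {bar_offset(n) for n in hidden_bars}:
        return 0x00000000
    return core_config.get(offset, 0x00000000)


# Hypothetical values as the PCI core itself would return them.
CORE_CONFIG = {
    bar_offset(0): 0xFE000000,  # assumed address assigned by the BIOS
    bar_offset(1): 0x0000E001,  # assumed I/O BAR (bit 0 set)
    bar_offset(2): 0x000A0000,  # hardwired VGA memory window
    bar_offset(3): 0x000003C1,  # hardwired VGA I/O range (bit 0 set)
}
HIDDEN_BARS = {2, 3}

for number in range(4):
    value = host_view(CORE_CONFIG, bar_offset(number), HIDDEN_BARS)
    print("BAR%d as seen by the host: 0x%08X" % (number, value))
```

In this model the PCI core still decodes the hardwired windows internally, while the host only ever sees zero for the two VGA BARs.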
4.2.2 Hardware Implementation of the PCI Target Interface

The VGA-related interface between the host computer and the CHARM is the PCI bus. All VGA requests of the host are sent over the PCI bus. A VGA request is, for example, a screen update or a video mode setting. The CHARM contains a PCI core which interfaces with the PCI bus and handles the low-level PCI protocol. More information about the PCI core can be found in chapter 2 and in [39].
Figure 4.8: Structure of the Request Buffer. Each entry occupies 0x10 bytes: one 32-bit word holding the PCI BAR number (bits 11-8), the PCI command (bits 7-4) and the PCI byteenable (bits 3-0), followed by the PCI address, the PCI data and a gap.

An entry of the Request Buffer must contain all information about the PCI cycle. The necessary information is: PCI address, PCI command, PCI byteenable and the PCI data.
Figure 4.9: Two sample Request Buffer contents. The yellow frames mark the valid content of the buffer.

stops accepting further PCI requests from the host.
Figure 4.10: Timing of the access to the Request Buffer.

released from the software after data processing.
Figure 4.11: Timing of the access to the Request Buffer.

the location of the CHARM Register. It interfaces with the AHB Bus. The register entry which is used to acknowledge the interrupt is called VGA_ACK_REGISTER. Table E lists all entries of the CHARM Register.
Figure 4.12: Request Buffer access synchronization.

PCI Read Access: In principle, a PCI read access to the CHARM is handled in the same manner as a PCI write access. The Request Buffer is used to pass on information about the incoming PCI read request.
in form of PCI signals. These signals are processed according to the VGA protocol. Figure 4.13 illustrates the processing queue of the data inside the Request Buffer. The Request Buffer does not contain only VGA-related PCI requests, because the CHARM also uses the PCI interface for features beyond the VGA function. The BAR Switch driver reads out the Request Buffer according to the FIFO principle. It distributes every PCI request to the related processing driver.
for BAR 1 (RPC is explained in section 4.3.3). The following is an example of the Request Buffer content:

...
0000033d 000003d5 00003600 ffffffff
0000027c 000b806c 00007053 ffffffff
0000033e 000003d4 0000000e ffffffff
...

This example is organized according to the data structure illustrated in figure 4.8. The first double word contains the BAR number, the byteenable and the type of PCI request.
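The example dump can be decoded with a few lines of code. The following is a minimal sketch, assuming the bit layout read from figure 4.8 (BAR number in bits 11-8, PCI command in bits 7-4, byteenable in bits 3-0) and the BAR assignment of table 2.2. The class and function names are hypothetical and do not stem from the CHARM drivers. The command names and the active-low byte enables follow the PCI specification.

```python
from dataclasses import dataclass

# PCI bus commands (C/BE# encoding of the address phase).
PCI_COMMANDS = {0x2: "I/O Read", 0x3: "I/O Write",
                0x6: "Memory Read", 0x7: "Memory Write"}

# BAR assignment as listed in table 2.2.
BAR_FUNCTIONS = {0: "Enhanced video function", 1: "BIOS RPC functions",
                 2: "VGA memory region", 3: "VGA I/O region"}

# The three entries quoted in the text above.
DUMP = """
0000033d 000003d5 00003600 ffffffff
0000027c 000b806c 00007053 ffffffff
0000033e 000003d4 0000000e ffffffff
"""


@dataclass
class PciRequest:
    bar: int
    command: int
    byte_enable: int
    address: int
    data: int


def decode_entry(words):
    """Decode one four-word entry: header, address, data, gap."""
    header, address, data, _gap = words
    return PciRequest(bar=(header >> 8) & 0xF,
                      command=(header >> 4) & 0xF,
                      byte_enable=header & 0xF,
                      address=address,
                      data=data)


def enabled_lanes(byte_enable):
    """PCI byte enables are active low: a 0 bit selects the byte lane."""
    return [lane for lane in range(4) if not (byte_enable >> lane) & 1]


def parse_dump(text):
    words = [int(token, 16) for token in text.split()]
    return [decode_entry(words[i:i + 4]) for i in range(0, len(words) - 3, 4)]


REQUESTS = parse_dump(DUMP)
for request in REQUESTS:
    print("%-12s BAR %d (%s) addr=0x%05X data=0x%08X lanes=%s" % (
        PCI_COMMANDS.get(request.command, "other"), request.bar,
        BAR_FUNCTIONS.get(request.bar, "unknown"), request.address,
        request.data, enabled_lanes(request.byte_enable)))
```

Read this way, the first and third entries are I/O writes to the VGA ports 0x3D5 and 0x3D4, and the second entry is a two-byte memory write into the VGA window at 0xB806C.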
As a reminder, the video plane contains ASCII characters while an alphanumeric (text) video mode is active (section 4.1.2 explains the alphanumeric mode). The host computer can change the offset of the visible part of the video plane. Changing the start of the visible video content avoids copying the video content when the cursor is located on the last line and the text on the screen has to be scrolled up.
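The scrolling mechanism can be sketched as follows. The function name, the default screen geometry and the wrap-around at the end of the plane are assumptions made for illustration. The sketch only shows that moving the start offset by one line changes the visible window without copying any character.

```python
def visible_text(char_plane, start_offset, cols=80, rows=25):
    """Return the visible text lines of a character plane.

    char_plane   -- bytes-like object holding one ASCII code per cell
    start_offset -- index of the first visible cell (the start address)
    Indices wrap around at the end of the plane (an assumption here).
    """
    size = len(char_plane)
    lines = []
    for row in range(rows):
        base = start_offset + row * cols
        cells = bytes(char_plane[(base + col) % size] for col in range(cols))
        lines.append(cells.decode("ascii", errors="replace").rstrip())
    return lines


# A tiny plane: four lines of four characters, two of them visible.
plane = bytearray(b"row0row1row2row3")
print(visible_text(plane, 0, cols=4, rows=2))   # first two lines
print(visible_text(plane, 4, cols=4, rows=2))   # scrolled by one line
```

Scrolling by one line only adds the line width to the start offset. The plane content itself stays untouched.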
Figure 4.14: Processing of read requests.

VNC Server: The video screen content is served via the VNC protocol [60]. A VNC server running on the CHARM interfaces with the VGA kernel driver.
of the host computer. Because there is no X server on the CHARM, the program can only display text-mode content. In text mode, the video planes contain ASCII characters instead of pixel data. The "terminal" program prints out these characters. It periodically reads the file /proc/charm/vga/text to obtain the screen content.

4.3 VGA BIOS

The VGA BIOS is a library of functions providing a basic interface to a VGA adapter.
related operating system to obtain the information of the host computer. To be as flexible as possible, the CHARM uses another way to start a program on the host computer. A PCI expansion device can provide a program that initializes the device and the related function on the host system [59]. The code is stored in the expansion ROM of the device and is read out by the computer BIOS during the POST phase.
4.3.2 Host Interface of the RPC

The CHARM provides three 32-bit I/O ports for the RPC communication: the command, data and status ports. The host computer sends and receives RPC commands through these I/O ports. The I/O ports are defined by a PCI Base Address Register of the PCI core.

RPC Command Port: Receives or provides RPC command IDs. An RPC command is a task or a piece of information for the receiver of the command. The command ID identifies a specific RPC command.
Figure 4.16: Sending of an RPC message. Flowchart nodes: Write In RPC Command Port, Read Out RPC Status Port, Write In RPC Data Port. Branch conditions: [No Data To Send], [Data To Send], [Status = FIFO Full], [Status != FIFO Full].

Figure 4.17: Receiving of an RPC message. Flowchart nodes: Read Out RPC Command Port, Read Out RPC Status Port, Read Out RPC Data Port, Execute RPC Command. Branch conditions: [Command = 0], [Command != 0], [Status = Data Available], [Status = Transfer End].
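The sending side of the RPC protocol can be sketched in software. This is one plausible reading of the flowchart, not the VGA BIOS code: the order "command first, then data", the status values and the FakeRpcPorts class that stands in for the three I/O ports are assumptions made for illustration.

```python
STATUS_READY = 0x0       # hypothetical status value
STATUS_FIFO_FULL = 0x1   # hypothetical status value


class FakeRpcPorts:
    """Software stand-in for the command, status and data ports."""

    def __init__(self, fifo_depth=2):
        self.fifo_depth = fifo_depth
        self.command = None
        self.fifo = []       # words waiting in the data FIFO
        self.received = []   # words already consumed by the CHARM side

    def write_command(self, command_id):
        self.command = command_id

    def read_status(self):
        if len(self.fifo) >= self.fifo_depth:
            # Model the CHARM consuming one word while the host sees "full".
            self.received.append(self.fifo.pop(0))
            return STATUS_FIFO_FULL
        return STATUS_READY

    def write_data(self, word):
        if len(self.fifo) >= self.fifo_depth:
            raise OverflowError("data port written while the FIFO is full")
        self.fifo.append(word & 0xFFFFFFFF)


def send_rpc(ports, command_id, payload=(), max_polls=1000):
    """Send a command ID followed by its payload words."""
    ports.write_command(command_id)
    for word in payload:
        for _ in range(max_polls):
            if ports.read_status() != STATUS_FIFO_FULL:
                break
        else:
            raise TimeoutError("RPC FIFO stayed full")
        ports.write_data(word)


ports = FakeRpcPorts(fifo_depth=2)
send_rpc(ports, 0x10, [1, 2, 3, 4, 5])
print(ports.command, ports.received + ports.fifo)
```

The bounded polling loop is a design choice of the sketch: it keeps a host program from hanging if the receiver never drains the FIFO.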
4.3.3 CHARM Interface of the RPC

The CHARM sends and receives RPC commands with the aid of the RPC driver and the RPC handler. The RPC driver is the RPC-related interface between the host and the CHARM. It handles accesses to the RPC I/O ports from the host and provides a Linux character device to the CHARM Linux system: /dev/charm/rpc/control. A program on the CHARM can initiate an RPC command for the host by writing to the RPC device.
5 Device Emulation

One of the most important features of a remote management card is the interaction with the host. Keyboard and mouse input from the controlling computer has to be transmitted to the remotely managed host. The CHARM card has USB interfaces to emulate these devices. Additionally, the CHARM card implements a USB mass storage device and can therefore provide a boot device to the host computer.
content. In principle, every USB port on the CHARM can be used to provide devices for the host, but the CHARM uses only one specific port.

5.1.1 USB Bus System

The Universal Serial Bus (USB) is a serial bus standard used to interface devices. USB can connect computer peripherals such as mouse devices, keyboards, PDAs, joysticks, scanners, digital cameras, printers and flash drives. Originally released in 1995 with a throughput of 12 Mbps, USB today operates at up to 480 Mbps.
Bulk-Endpoint: Used to transfer large, bursty data. Bulk transfers provide error detection and retransmission mechanisms.
Isochronous-Endpoint: Isochronous transfers occur continuously and periodically. They typically contain time-sensitive information, such as a video or audio stream.

Descriptors: All USB devices have a hierarchy of descriptors which contain information about the type and attributes of the device.
running on the chip are stored in the 16 KB on-chip SRAM. Additionally, an 8 KB ROM provides a built-in BIOS. It supports boot control and a set of basic low-level USB functions. The controller is therefore powerful and flexible enough to implement USB functions. The drawback is the complexity of the programs and the difficult development of the USB firmware. Figure 5.2 depicts the interfaces of the USB controller on the CHARM card.
Figure 5.2: HPI Bridge

illustrate this process. The VNC server (explained in section 4.2.3) converts the keyboard input from a connected VNC client into USB keycodes. The USB mouse emulation works in the same way as the keyboard emulation.
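The conversion into USB keycodes can be sketched as follows. The layout of the keycode buffer of the CHARM is not given here, so the sketch assumes the standard 8-byte HID boot keyboard report (modifier byte, reserved byte, six keycode slots) and the usage IDs of the HID keyboard page. The function names are hypothetical.

```python
LEFT_SHIFT = 0x02  # modifier bit of the left shift key in a HID report

# A few non-alphabetic keys from the HID keyboard usage page.
SPECIAL_KEYS = {"0": 0x27, "\n": 0x28, "\x1b": 0x29,
                "\b": 0x2A, "\t": 0x2B, " ": 0x2C}


def char_to_hid(ch):
    """Map one character to (modifier byte, HID usage ID)."""
    if len(ch) != 1:
        raise ValueError("expected a single character")
    if "a" <= ch <= "z":
        return 0, 0x04 + ord(ch) - ord("a")
    if "A" <= ch <= "Z":
        return LEFT_SHIFT, 0x04 + ord(ch) - ord("A")
    if "1" <= ch <= "9":
        return 0, 0x1E + ord(ch) - ord("1")
    if ch in SPECIAL_KEYS:
        return 0, SPECIAL_KEYS[ch]
    raise ValueError("character %r is not covered by this sketch" % ch)


def boot_report(modifiers, keycode):
    """Build an 8-byte boot keyboard report with at most one key."""
    return bytes([modifiers, 0x00, keycode, 0, 0, 0, 0, 0])


def key_press_reports(ch):
    """Return the press report and the release report for a character."""
    modifiers, keycode = char_to_hid(ch)
    return [boot_report(modifiers, keycode), boot_report(0, 0)]


for report in key_press_reports("A"):
    print(report.hex())
```

Each keystroke yields two reports: one with the key pressed and an all-zero report that releases it. Without the second report the host would see the key as held down.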
Figure 5.3: USB keyboard implementation. The VNC server takes the user interaction and converts it to USB keycodes. These keycodes are written into the keycode buffer inside the USB controller.

Device Controlling: Unlike the HID emulation, the mass storage emulation involves the ARM CPU.
command set is the basic unit of SCSI communication. It consists of a one-byte operation code followed by five or more bytes containing command-specific parameters. Figure 5.5 illustrates the process of encapsulating SCSI commands. The SCSI commands are sent over the bulk-out pipe. Every SCSI command is wrapped in a Command Block Wrapper (CBW) [72, 70], a packet containing the command block and additional information. A command block specifies a standardized command set.
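The encapsulation can be made concrete with a short parser. The field layout below is the 31-byte CBW of the USB Mass Storage Class Bulk-Only Transport specification. The function names and the dictionary keys are hypothetical and are not taken from the MSBO firmware.

```python
import struct

# dCBWSignature, dCBWTag, dCBWDataTransferLength, bmCBWFlags,
# bCBWLUN, bCBWCBLength, CBWCB[16]; all fields little-endian.
CBW_FORMAT = "<IIIBBB16s"
CBW_SIGNATURE = 0x43425355  # the bytes "USBC" on the wire


def parse_cbw(packet):
    """Split a Command Block Wrapper into its fields."""
    if len(packet) != struct.calcsize(CBW_FORMAT):
        raise ValueError("a CBW is exactly 31 bytes long")
    signature, tag, data_length, flags, lun, cb_length, block = \
        struct.unpack(CBW_FORMAT, packet)
    if signature != CBW_SIGNATURE:
        raise ValueError("bad CBW signature")
    if not 1 <= cb_length <= 16:
        raise ValueError("invalid command block length")
    return {
        "tag": tag,
        "data_length": data_length,
        "device_to_host": bool(flags & 0x80),
        "lun": lun & 0x0F,
        "command_block": block[:cb_length],
    }


def build_cbw(tag, data_length, device_to_host, command_block, lun=0):
    """Build a CBW around a SCSI command block (used here for testing)."""
    flags = 0x80 if device_to_host else 0x00
    return struct.pack(CBW_FORMAT, CBW_SIGNATURE, tag, data_length,
                       flags, lun, len(command_block), command_block)


# A READ(10) command for one block at logical block address 16.
read10 = bytes([0x28, 0, 0, 0, 0, 0x10, 0, 0, 1, 0])
cbw = parse_cbw(build_cbw(0x1234, 2048, True, read10))
print(hex(cbw["command_block"][0]), cbw["data_length"], cbw["device_to_host"])
```

The first byte of the extracted command block is the SCSI operation code, so a firmware or daemon can dispatch on it after stripping the wrapper.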
ARM-USB-Controller Communication: The software running on the USB controller is called the Mass Storage Bulk Only (MSBO) firmware. The program processing the high-level protocol is called the MSBO daemon and runs on the ARM CPU. Because the connection between the AHB bus and the USB controller is unidirectional, the MSBO firmware cannot contact the ARM CPU. Instead, the MSBO daemon periodically asks the MSBO firmware for new commands.
After the daemon detects an active interrupt, it acknowledges the request. Figure 5.6 illustrates this protocol.

Figure 5.6: Processing of an MSBO message. The numbers represent the order of the processing steps.
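The handshake can be modelled in a few lines. The step order used here (1 firmware sets the interrupt port, 2 daemon polls it, 3 daemon sets the acknowledge port, 4 firmware resets the interrupt port, 5 daemon reads the command, data pointer and size ports, 6 daemon resets the acknowledge port) is an interpretation of the labels in figure 5.6. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Mailbox:
    """Ports in the SRAM of the USB controller (modelled as fields)."""
    interrupt: int = 0
    acknowledge: int = 0
    command: int = 0
    data_pointer: int = 0
    size: int = 0


class FirmwareModel:
    """Stand-in for the MSBO firmware side of the handshake."""

    def __init__(self, mailbox):
        self.mailbox = mailbox

    def post(self, command, data_pointer, size):
        self.mailbox.command = command
        self.mailbox.data_pointer = data_pointer
        self.mailbox.size = size
        self.mailbox.interrupt = 1          # step 1: announce a message

    def service(self):
        if self.mailbox.acknowledge and self.mailbox.interrupt:
            self.mailbox.interrupt = 0      # step 4: withdraw the request


def daemon_poll(mailbox, firmware):
    """One polling round of the daemon; returns a message or None."""
    if not mailbox.interrupt:               # step 2: nothing pending
        return None
    mailbox.acknowledge = 1                 # step 3: acknowledge
    firmware.service()                      # the firmware reacts (step 4)
    message = (mailbox.command, mailbox.data_pointer, mailbox.size)  # step 5
    mailbox.acknowledge = 0                 # step 6: ready for the next one
    return message


mailbox = Mailbox()
firmware = FirmwareModel(mailbox)
firmware.post(0x28, 0x1000, 512)
print(daemon_poll(mailbox, firmware), daemon_poll(mailbox, firmware))
```

Because only the daemon initiates accesses, the firmware never has to reach the ARM CPU. It merely changes ports that the daemon inspects on its next poll.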
There are four basic SCSI commands for a mass storage device:

Operation Code  Meaning        Description
0x12            Inquiry        Requests basic information like the vendor name.
0x25            Read Capacity  Requests the data capacity information.
0x28            Read           Read request.
0x2A            Write          Write request.

More information about the supported SCSI commands can be found in [73]. The MSBO firmware manages a send and receive buffer for the incoming and outgoing SCSI commands. This buffer is called the Transfer Buffer.
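How a read request maps onto an image file can be sketched as follows. The command descriptor block layout is that of the standard SCSI READ(10) and WRITE(10) commands (big-endian logical block address in bytes 2-5, block count in bytes 7-8). The block size of 2048 bytes and all function names are assumptions for illustration, not the MSBO daemon's code.

```python
import struct

SCSI_OPCODES = {0x12: "Inquiry", 0x25: "Read Capacity",
                0x28: "Read", 0x2A: "Write"}


def parse_rw10(cdb):
    """Return (name, logical block address, block count) of a 10-byte CDB."""
    if len(cdb) < 10:
        raise ValueError("READ(10)/WRITE(10) need a 10-byte command block")
    opcode, _flags, lba, _group, blocks, _control = \
        struct.unpack(">BBIBHB", bytes(cdb[:10]))
    if opcode not in (0x28, 0x2A):
        raise ValueError("not a read or write command")
    return SCSI_OPCODES[opcode], lba, blocks


def image_byte_range(cdb, block_size=2048):
    """Translate a request into (offset, length) inside the image file."""
    _name, lba, blocks = parse_rw10(cdb)
    return lba * block_size, blocks * block_size


def serve_read(image, cdb, block_size=2048):
    """Return the bytes a read request asks for from an in-memory image."""
    offset, length = image_byte_range(cdb, block_size)
    return image[offset:offset + length]


READ10 = bytes([0x28, 0, 0, 0, 0, 0x10, 0, 0, 2, 0])  # 2 blocks at LBA 16
print(parse_rw10(READ10), image_byte_range(READ10))
```

With a real image file the same offset and length would be passed to a seek and read on the file, whatever data source provides it.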
USB read and write requests to HTTP or file system requests. Beforehand, the daemon has to be told which data source to use.

NFS Source: The Linux system running on the ARM CPU mounts an NFS share. The image files are stored in an NFS directory, so the MSBO daemon has direct access to them. In principle, however, every network file system available for Linux can be used to provide large files to the CHARM.
1. Keyboard sends one byte to the controller.
2. Controller stores the byte in the output buffer.
3. Controller activates the interrupt line.
4. Device driver reads out one byte at port 0x60.

Port  Mode   Description
0x64  read   Status Register
0x64  write  Command Register
0x60  read   Output Register
0x60  write  Data Register

Table 5.4: Register of the 8042 keyboard controller.

The CHARM card does not interface with the physical keyboard interface of the keyboard controller.
Figure 5.9: Organization of the keyboard buffer of the BIOS (addresses 0x41E to 0x43D; 16 entries, each consisting of a one-byte scancode and a one-byte ASCII code, with a start pointer and an end pointer).

can contain 16 characters. The BIOS keyboard buffer can be used to emulate keystrokes. Programmers refer to this technique as "stuff keys". The emulated keystrokes are written into the keyboard buffer behind the tail of the buffer.
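The technique can be sketched on a simulated memory image. The buffer range 0x41E to 0x43D is taken from figure 5.9. The locations of the head and tail pointers (0x41A and 0x41C, stored as offsets relative to 0x400) and the entry order (ASCII code in the low byte, scancode in the high byte) follow the common PC BIOS data area convention and are assumptions here, as are all function names.

```python
BDA_BASE = 0x400                    # start of the BIOS data area
HEAD_PTR, TAIL_PTR = 0x41A, 0x41C   # assumed pointer locations
BUF_START, BUF_END = 0x41E, 0x43E   # buffer range, end exclusive


def read_word(mem, addr):
    return mem[addr] | (mem[addr + 1] << 8)


def write_word(mem, addr, value):
    mem[addr] = value & 0xFF
    mem[addr + 1] = (value >> 8) & 0xFF


def new_memory():
    """Create a memory image with an empty keyboard buffer."""
    mem = bytearray(0x500)
    write_word(mem, HEAD_PTR, BUF_START - BDA_BASE)
    write_word(mem, TAIL_PTR, BUF_START - BDA_BASE)
    return mem


def stuff_key(mem, scancode, ascii_code):
    """Append one keystroke at the tail; return False if the buffer is full."""
    head = BDA_BASE + read_word(mem, HEAD_PTR)
    tail = BDA_BASE + read_word(mem, TAIL_PTR)
    next_tail = tail + 2
    if next_tail >= BUF_END:
        next_tail = BUF_START       # wrap around at the end of the buffer
    if next_tail == head:
        return False                # full: one slot always stays free
    mem[tail] = ascii_code
    mem[tail + 1] = scancode
    write_word(mem, TAIL_PTR, next_tail - BDA_BASE)
    return True


memory = new_memory()
for scancode, char in [(0x1E, "a"), (0x30, "b")]:
    stuff_key(memory, scancode, ord(char))
print(memory[BUF_START:BUF_START + 4].hex())
```

With this ring convention one slot always remains unused, so the sketch accepts 15 pending keystrokes before it reports a full buffer.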
6 Hardware Monitor Functionality
Computer systems are in general not fail-safe with respect to availability and reliability. Besides the errors produced by the software running on the system, the hardware is also error-prone. Moving parts are among the hardware components with the highest potential to fail over a specified period of time. In particular, these are the fans and the hard disks [76] of the computer system.
that the system is currently executing [79]. The error or checkpoint code is either a byte or a word value. The BIOS and mainboard manufacturers provide tables assigning the code value to a certain checkpoint or failure. Most of the POST code tables can be obtained at the website www.bioscentral.com. As a general rule, the BIOS vendor provides a set of basic POST codes and the motherboard manufacturers extend the POST code set with their own codes.
6.2 Host System Inspector
[Figure 6.1: The PCI bus provides the CHARM card access to the hardware units of the host computer. Diagram labels: Processor, Cache, Bridge/Memory Controller, DRAM, Audio, PCI Local Bus #0, LAN, CHARM Card, PCI-to-PCI Bridge, Other I/O Functions, PCI Local Bus #1.]
orchestrates the access to the PCI Master Control unit. This enables full PCI access of the Linux system of the CHARM card. The PCI Master Control registers are:
PCI MASTER ADDRESS contains the PCI address for the next master access.
series. The other signals like the address or the command are initialized before starting the PCI Master Control unit and they do not have to be synchronized. The start signal locks the input register of the PCI Master Control unit. Afterwards, the PCI Master Control unit commands the PCI core to initiate the related PCI cycle [39]. The done signal informs the PCI master driver about the end of the PCI transaction. The driver has to check this register periodically.
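The periodic check of the done signal can be sketched as a polling loop with a timeout. This is a generic illustration, not the CHARM driver: the callable read_state, the bit mask done_mask and the timing values are assumptions, since the text does not give the bit layout of the state register.

```python
import time


def wait_for_done(read_state, done_mask=0x1, timeout_s=0.1, poll_interval_s=0.001):
    """Poll a state register until the (assumed) done bit is set or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if read_state() & done_mask:
            return True          # transaction finished
        if time.monotonic() >= deadline:
            return False         # give up and let the caller handle the error
        time.sleep(poll_interval_s)
```

The timeout keeps the caller from blocking forever if a PCI transaction never completes.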
get repaired manually. But if the system of the node has failed, detecting the error source is difficult. The CHARM card provides information which can identify the error source. The following paragraphs list the status information which the CHARM card can obtain during computer state detection.
POST Code Provides information about hardware failures at boot time.
BIOS CMOS Content The CHARM card gets the BIOS CMOS content at boot time.
onboard ADC measures the related change of the voltage between the NTC and a constant reference resistor. The temperature is calculated in two steps. First, the measured voltage is converted into the related resistance of the NTC. Afterwards, the temperature is calculated from the resistance of the NTC using the Steinhart-Hart equation [81, 82].
Port 1 2 3 4 5 - 10 11 12 Usage Voltage measurement of 12V PCI pin. Temperature measurement using an external NTC.
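The two-step temperature calculation described on this page (voltage to resistance, resistance to temperature) can be sketched as follows. The sketch rests on assumptions that are not taken from the text: the NTC is assumed to sit between the supply voltage and the ADC input, with the reference resistor between the ADC input and ground, and the three coefficients are typical values for a 10 kOhm NTC rather than the ones used on the CHARM. The Steinhart-Hart equation itself is 1/T = A + B ln(R) + C (ln R)^3 with T in Kelvin.

```python
import math

# Assumed coefficients of a typical 10 kOhm NTC; a real sensor needs its data sheet values.
A, B, C = 1.129148e-3, 2.34125e-4, 8.76741e-8


def ntc_resistance(v_adc, v_supply, r_ref):
    """Step 1: voltage divider, assumed wiring supply - NTC - ADC node - r_ref - ground."""
    return r_ref * (v_supply / v_adc - 1.0)


def steinhart_hart_celsius(r_ntc, a=A, b=B, c=C):
    """Step 2: Steinhart-Hart equation, result converted from Kelvin to degrees Celsius."""
    ln_r = math.log(r_ntc)
    inv_t = a + b * ln_r + c * ln_r ** 3
    return 1.0 / inv_t - 273.15
```

With these example coefficients, a measured resistance of 10 kOhm corresponds to roughly 25 degrees Celsius.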
6.3 Display Screen Inspector
content of the screen. Generally, the alphanumeric content of the VGA data is used to get status information of the host computer, for example, to detect error keywords or to validate a keyboard entry which was sent by the CHARM. A Windows blue screen, for example, has a specific design [83], and a Linux kernel panic message is printed on the text console. These events can be detected via their typical screen content.
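Keyword detection on the textual screen content can be sketched in a few lines. The fragment below is an illustration only. The keyword list is an assumed example, and the function name is hypothetical. It expects the screen as a list of text lines, such as a screen inspection tool might return.

```python
ERROR_KEYWORDS = ("checksum error", "kernel panic")  # assumed example keywords


def find_error_keywords(screen_lines, keywords=ERROR_KEYWORDS):
    """Return (keyword, row, column) for every keyword found, ignoring case."""
    hits = []
    for row, line in enumerate(screen_lines):
        lowered = line.lower()
        for keyword in keywords:
            col = lowered.find(keyword.lower())
            if col >= 0:
                hits.append((keyword, row, col))
    return hits
```

The returned positions allow a caller to decide which recovery action fits the detected message.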
• Validation of the automatic BIOS CMOS setup by the CHARM card.
• Inspection of the BIOS boot messages.
• Analysis of the screen content after a system crash.
The advantage of this approach is the possibility of the automatic validation and analysis of the screen content. Figure 6.5 shows the boot screen of a computer running in a graphical video mode. The program getscreen inspects the VGA content and returns the textual representation of the screen content. Figure 6.
start of the valid screen content inside a video plane. Increasing this pointer will scroll the screen content to the top. Figure 6.7 illustrates this process.
[Figure 6.7: Diagram of the viewable part of the video plane. Diagram labels: Video Mode, CRTC Start Address Register, video plane from 0x0 to 0x2000, viewable screen content, non visible.]
Running an alphanumeric mode, the CRTC Start Address Register defines the start pointer of the current screen content.
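The role of the start pointer can be sketched as a window into the video plane. The fragment below is an illustration under assumptions: the character codes are assumed to lie in one plane with one byte per cell, the start address is assumed to count cells, the screen is assumed to be 80 x 25 cells, and the window is assumed to wrap around at the end of the plane. The function name is hypothetical.

```python
def visible_text(char_plane, start_address, columns=80, rows=25):
    """Return the visible text lines, beginning at the given start address."""
    size = len(char_plane)
    lines = []
    for row in range(rows):
        begin = (start_address + row * columns) % size
        cells = bytes(char_plane[(begin + i) % size] for i in range(columns))
        lines.append(cells.decode("cp437").rstrip())
    return lines
```

Increasing start_address by one line length moves every text line up by one row, which is the scrolling effect described above.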
Figure 6.8: Current content of the screen.
Figure 6.9: Previous content of the screen.
Figure 6.10: Menu bar of the BIOS setup utility of an AMI BIOS.
This information is used to validate and to control automated host interaction of the CHARM, like the setup of the BIOS settings. In an alphanumeric video mode, the detection of highlighted text content is very simple.
6.4 Monitoring Software 00000e0 00000f0 0000100 0000110 0000120 0000130 027 027 r 027 t 027 u 027 027 027 027 027 027 r 027 E 027 P 027 027 027 027 i 027 x 027 C 027 027 027 027 t 027 i 027 I 027 P 027 027 027 y 027 t 027 P o B S 027 027 027 027 027 027 n w o e 027 027 027 027 027 027 P e o c 027 027 027 027 027 027 As shown above, the address "00000a0" contains the entry of the "Main" menu item.
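The detection of a highlighted menu item can be sketched from a dump like the one above. In the dump, each character is followed by the value 027, which is octal for the attribute byte 0x17. The sketch assumes that the data consists of alternating character and attribute bytes and that a highlighted item carries an attribute different from the most frequent one. The function name and this heuristic are assumptions, not the CHARM implementation.

```python
from collections import Counter


def highlighted_text(cells):
    """cells: bytes of alternating character and attribute values; returns highlighted runs."""
    chars = cells[0::2]
    attrs = cells[1::2]
    if not attrs:
        return []
    normal = Counter(attrs).most_common(1)[0][0]   # most frequent attribute = not highlighted
    runs, current = [], []
    for ch, attr in zip(chars, attrs):
        if attr != normal:
            current.append(ch)
        elif current:
            runs.append(bytes(current).decode("cp437"))
            current = []
    if current:
        runs.append(bytes(current).decode("cp437"))
    return runs
```

Comparing the returned text with the expected menu entry is one way to validate that an emulated keystroke moved the selection as intended.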
System Management for Networked Embedded Systems and Clusters. It is used as the central cluster management system for the HLT computing cluster, and some of its features are also used to execute tasks on the CHARM. SysMES not only collects sensor information from the computers, but also provides a rule-based system management framework. The framework consists of two basic elements: events and tasks.
Figure 6.11: Screenshot of the Lemon GUI presenting the CHARM sensor information.
Figure 6.12: Screenshot of the HLT SysMES GUI.
Figure 6.13: Screenshot of the HLT SysMES GUI.
7 Automatic Cluster Management
The maintenance of a computer cluster requires a high administration effort. Often, there are periodic tasks, like setting up a new cluster node or inspecting certain hardware components, which have to be done manually. For example, the LDAP entry of the Ethernet MAC address of a new cluster node and the inspection of the screen content of a failed computer are handled by the administrator. The CHARM card takes over administration tasks of a computer cluster.
model name      : AMD Athlon(tm) XP 1600+
stepping        : 2
cpu MHz         : 1410.644
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep ...
bogomips        : 2823.16
CHARM$>_
The crsh.sh script is called on the CHARM but the command more /proc/cpuinfo is executed on the host system. There is a limitation using the crsh.sh script.
7.1 Complex Tasks
[Figure 7. Diagram labels: CHARM console with shell terminal (CHARM$> crsh.sh more /proc/cpuinfo), crsh.sh, keyb_cmd, keystrokes sent via USB, USB KNOPPIX ISO image, host computer with CPU running KNOPPIX (KNOPPIX$> more /proc/cpuinfo), host console, PCI VGA output, video memory, primary VGA of the host; steps numbered 1 to 7. Both consoles show the same command output: processor : 0, vendor_id : AuthenticAMD, cpu family : 6, cpuid level : 1, ...]
# Power on the node
hostPowerOn
while (hostPOSTcode != 0x37);   # 0x37 = (Displaying sign-on message)
# Send/push "delete key" to enter setup
keyb_cmd -u --key 0x7e335b1b -s

The POST code 0x37 represents the display of the BIOS start-up screen with the sign-on message [79]. This is the moment to push the "delete" key to enter the BIOS setup utility. With the aid of the program keyb_cmd, single keystrokes can be emulated.
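The key value 0x7e335b1b used above matches the four bytes of the ANSI escape sequence for the delete key (ESC [ 3 ~) packed in little-endian order. The following sketch reproduces that value. It is an observation on this single value, not a documented keyb_cmd interface. The helper name pack_key and the other constants are assumptions that follow the same pattern.

```python
def pack_key(sequence):
    """Pack up to four bytes of a key sequence into one little-endian integer."""
    if not 1 <= len(sequence) <= 4:
        raise ValueError("expected 1 to 4 bytes")
    return int.from_bytes(sequence, "little")


ANSI_DELETE = pack_key(b"\x1b[3~")   # reproduces 0x7e335b1b from the script above
ANSI_RIGHT = pack_key(b"\x1b[C")     # assumed value, by analogy
ANSI_DOWN = pack_key(b"\x1b[B")      # assumed value, by analogy
ASCII_ENTER = pack_key(b"\r")        # assumed value, by analogy
```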
for key in $ANSI_RIGHT $ANSI_RIGHT $ANSI_DOWN \
           $ANSI_DOWN $ASCII_ENTER
do
    keyb_cmd -u --key $key -s
done
...
# Switch off the PC
hostPowerOff

The setup of the BIOS settings is finally a sequence of arrow, escape and return keys which are sent to the host system. But first, the BIOS setup has to be set into an initial state. The key sequence of the BIOS setup depends on this initial state. 7.1.
not on the CHARM card. The main building blocks of the test script are explained step by step:
# Start CD-ROM emulation providing a KNOPPIX CD
msbod -i KNOPPIX.iso -t CDROM
The msbod daemon is the interface between the USB controller and the data source of the provided image. A KNOPPIX live CD provides the operating system for the computer tests.
# Set up the BIOS CMOS to boot from CD-ROM
sh setupCMOS.sh
# Power on the node
hostPowerOn
The script setupCMOS.
of the new cluster nodes have to be written to a mapping list. Hence, the node mapping of hundreds of computer nodes is very time consuming. The situation will get worse if the node has more than one Ethernet interface. The front end processor nodes of the HLT have three Ethernet ports, for example. To ease network setup, the HLT at CERN uses the CHARM to identify the node.
7.1.5 Automatic Operating System Installation
After the CHARM cards have tested the nodes, the operating system of the node has to be installed. The installation of the cluster nodes is also done automatically. The software tool SystemImager is used to install the nodes [95]. The main principle of SystemImager is that it clones a Linux system [96]. A master node, called the golden client, contains the prototype of the Linux system.
video output of the host computer is periodically inspected by the CHARM card. If an error keyword is detected, the CHARM card can take appropriate actions. Table 7.2 lists the failures which can be handled by the CHARM card. Failure BIOS CMOS checksum error Reboot and Select proper Boot device Detection/Condition Keyword "checksum error" detected. Action Setup CMOS Ping to the host failed. Host is in boot stage.
8 Special Implementations
The main function of the CHARM is the remote management of the HLT cluster nodes. However, the CHARM is also used in other fields, where special firmware implementations enable new functions of the card. The base system of the card (boot loader, Linux kernel and base root file system) is not changed in this process. The differences are the FPGA logic and the related Linux device driver. Two further functions are used with the CHARM card: a PCI Bus Analyzer and a network card.
Feature                       Description
Trigger Conditions            PCI address and a related bus command, manual trigger
Traced PCI Signals            AD, CBE, FRAME, IRDY, TRDY, STOP, PAR, PERR [40]
Memory (FIFO) Depth           49152 bits
Number of traced PCI cycles   1024
Table 8.1: Features of the CHARM PCI bus analyzer.

8.1.1 FPGA logic
The FPGA logic of the PCI bus analyzer is divided into three parts: a PCI trace engine, a FIFO and an interface to the AHB bus system of the card [2].
8.1 PCI Bus Analyzer 8.1.2 Controller Software The PCI trace software is a Linux console program. The program initializes, starts and stops the trace module via an address window on the AHB bus. After a successful trace, the program reads out the FIFO containing the traced PCI signals and stores them to a file. The data is formatted in a human readable way. Every line in the file represents one PCI cycle.
Figure 8.2: GUI of the CHARM PCI bus analyzer.
trace program and provides a bridge for the socket connection of the Yapt software with the standard input/output stream of the PCI trace program. The commands are sent to the input stream of the trace program. The results are received with the aid of the output stream of the trace program. The content of the configuration file /etc/services for the CHARM PCI bus analyzer is shown in the following:
# /etc/services:
#
pci_trace 45000/tcp
...
8.2 Network Card
[Figure 8.3: Data flow of Yapt and the PCI trace program. The Inet daemon builds a bridge between the TCP stream of the Yapt software and the standard console stream of the PCI trace program. Diagram labels: Yapt GUI (remote computer), TCP/IP, intranet, port 45000, Inet daemon, STDIN and STDOUT streams, PCI trace program (CHARM).]
• Setup of the trigger conditions.
• Presentation of the waveform output of the traced PCI signals.
• Filtering of the traced PCI signals.
The physical layer is part of the OSI model, which is an abstract description for network protocol design. Normally, the bus interface has a higher throughput than the physical link interface. Therefore, FIFOs are used to optimize the data traffic. The CHARM provides an additional network interface to the host computer. This interface is used to establish a network connection between the CHARM and the host computer without using a network cable.
unit with the aid of the Stripe-PLD bridge. The middle box of the picture represents the interface between the local I/O system and the physical layer which is the PCI bus in our case. The SRAM is shared between the CHARM card and the host computer.
Figure 8.7: Layout of the shared SRAM content [3]. The left side represents the lower addresses. The right side marks the end of the SRAM content.
behavior avoids race conditions while accessing the buffer. The buffer contains standard Linux network packets: the sk_buf [102, 3]. The sk_buf structure is defined in the Linux kernel source code.
        struct chn_packet packet[CHN_BUFFER_LENGTH];
};

struct chn_sharedmem {
        unsigned long status;
        unsigned long jiffies[2];
        struct chn_buffer buffer[2];
};
The structure chn_sharedmem describes the shared SRAM content. The two transfer FIFOs are represented by the structure chn_buffer. One FIFO contains 32 packets. A packet is defined by the structure chn_packet. It contains a Linux network packet (sk_buf) with a size of up to 1600 bytes.
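The behavior of one transfer FIFO can be sketched as a ring buffer in which the sender and the receiver each advance only their own counter. This is a common way to share a buffer between exactly one producer and one consumer without locks. Whether the CHARM driver works this way is an assumption, since the fields of chn_buffer other than the packet array are not shown here. The class and attribute names are hypothetical. The sizes of 32 packets and 1600 bytes are taken from the text.

```python
CHN_BUFFER_LENGTH = 32   # packets per FIFO, as stated in the text
CHN_MAX_PACKET = 1600    # maximum packet size in bytes, as stated in the text


class PacketFifo:
    """Single-producer, single-consumer packet FIFO (illustrative model only)."""

    def __init__(self, length=CHN_BUFFER_LENGTH):
        self.slots = [b""] * length
        self.written = 0   # advanced only by the sender
        self.read = 0      # advanced only by the receiver

    def put(self, packet):
        if len(packet) > CHN_MAX_PACKET:
            raise ValueError("packet exceeds the maximum packet size")
        if self.written - self.read == len(self.slots):
            return False                                   # FIFO is full
        self.slots[self.written % len(self.slots)] = bytes(packet)
        self.written += 1                                  # publish after the data is stored
        return True

    def get(self):
        if self.read == self.written:
            return None                                    # FIFO is empty
        packet = self.slots[self.read % len(self.slots)]
        self.read += 1
        return packet
```

Two such FIFOs, one per direction, would correspond to the two chn_buffer entries of chn_sharedmem.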
9 Benchmarks and Verification
This chapter details the performance of the functional units of the CHARM card. In particular, the throughput and the frame rate of the VGA card implementation are presented. Beforehand, the VGA accesses of two running operating systems are inspected to estimate the performance required of the CHARM card. Additionally, the throughput of the emulated USB mass storage device is discussed, and finally the power consumption of the CHARM is examined. 9.
9.1.1 Estimation of the VGA Data Throughput
A VGA card is not a simple device providing a linear framebuffer. Instead, the screen content results from a series of I/O and memory accesses to the VGA card. This section gives an overview of VGA accesses on a running system. The test system #1 (see appendix G) was equipped with an off-the-shelf VGA card. Additionally, a PCI Bus Analyzer was installed on the system to scan the PCI bus for VGA accesses.
9.1 VGA Function Performance Location High Register (CRTC register 0xE). The write to port 0x3D5 sets the value of the Cursor Location High Register. The next two I/O writes are similar to the last I/O writes. First, the host selects the Cursor Location Low Register (CRTC register 0xF). Subsequently, the content for this register is written to port 0x3D5.
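The cursor update sequence described above can be decoded with a short sketch. It assumes the standard VGA color-mode port pair, with the CRTC index register at 0x3D4 and the data register at 0x3D5, and an 80-column text screen. The function name and the input format, a list of (port, value) pairs such as a bus trace might yield, are assumptions for this example.

```python
CRTC_INDEX_PORT = 0x3D4          # assumed index port (standard VGA color mode)
CRTC_DATA_PORT = 0x3D5           # data port named in the text
CURSOR_HIGH, CURSOR_LOW = 0x0E, 0x0F


def cursor_from_io_writes(writes, columns=80):
    """Replay (port, value) I/O writes and return the cursor as (row, column)."""
    index = None
    high = low = 0
    for port, value in writes:
        if port == CRTC_INDEX_PORT:
            index = value                 # selects the CRTC register
        elif port == CRTC_DATA_PORT:
            if index == CURSOR_HIGH:
                high = value & 0xFF
            elif index == CURSOR_LOW:
                low = value & 0xFF
    return divmod((high << 8) | low, columns)
```

Four I/O writes are therefore needed for a single cursor movement, which illustrates why even a text console produces a steady stream of small PCI accesses.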
9.1.2 CHARM PCI Target Throughput
The PCI target interface is controlled by a Linux driver running on the ARM CPU (see section 4.2.3). The throughput of the PCI interface depends on the processing time of the VGA driver, the size of the Request Buffer and the flush period of the Request Buffer. Table 9.4 lists the throughputs of the PCI interface. The first entry depicts the write throughput to the Request Buffer which buffers the PCI commands.
PCI accesses also contain the overhead of the interrupt handling. An interrupt handler generally costs processing time because it causes the processor to save its state of execution via a context switch. However, the interrupt handler takes only approximately 20 µs of processor time, which has no relevant impact compared with processing the whole Request Buffer.
effect on the processing time. The number of video planes used has a low impact because the marking of the so-called dirty regions during the VGA processing requires a significant amount of processing time. The dirty regions function is used to improve the screen generation and is explained in the next section. The next section also discusses the impact of the VGA processing on the PCI throughput. Figure 9.
Figure 9.3 shows the PCI throughput of the CHARM card in relation to the running video mode. The throughput decreases from 3.2 MB/s (see table 9.3) without data processing to just under 400 KB/s with data processing. The colors of the bars specify the number of provided video planes of a video mode and the type of the video mode is printed on the bars. The throughput is measured with and without screen generation.
Figure 9.3: Write throughput to the CHARM card in relation to the running video mode. The color of the bars represents the number of provided video planes of the dedicated video mode. The bars filled with a pattern define the throughput of the CHARM card without screen generation. The solid-colored bars represent the throughput with a running VNC server generating the screen content. Additionally, the bars are labeled with the type of video mode: text or graphic mode.
Impact of the Dirty-Regions Function on the PCI Throughput
Building and transferring the VNC screen content is time consuming. To reduce the processing time of the VNC server and the amount of VNC data transferred to the client, the VNC server copies only new VGA data content into the VNC framebuffer. For this purpose, the VGA driver provides a map which marks the parts of the screen that contain new data. The screen content is segmented into regions of a fixed size.
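The dirty-region map can be sketched as a grid of flags, one per fixed-size screen region. The fragment below is an illustration only. The region size of 16 x 16 pixels and all names are assumptions, since this passage does not state the region size used by the VGA driver.

```python
class DirtyMap:
    """Grid of flags marking which fixed-size screen regions contain new data."""

    def __init__(self, width, height, region=16):
        self.region = region                              # assumed region edge in pixels
        self.cols = (width + region - 1) // region
        self.rows = (height + region - 1) // region
        self.flags = [False] * (self.cols * self.rows)

    def mark_pixel(self, x, y):
        """Called by the writer for every changed pixel."""
        self.flags[(y // self.region) * self.cols + x // self.region] = True

    def take_dirty(self):
        """Return the dirty regions as (column, row) pairs and clear the map."""
        dirty = [(i % self.cols, i // self.cols)
                 for i, flag in enumerate(self.flags) if flag]
        self.flags = [False] * len(self.flags)
        return dirty
```

A screen generator that consumes take_dirty() only has to rebuild the listed regions instead of the whole framebuffer.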
Figure 9.4: Write throughput to the CHARM card in relation to the running video mode. Here, the VGA driver uses the dirty-region function. The color of the bars represents the number of provided video planes of the dedicated video mode. The bars filled with a pattern define the throughput of the CHARM card without screen generation. The solid-colored bars represent the throughput with a running VNC server generating the screen content.
Figure 9.5: Input frame rate of the CHARM card in relation to the running video mode. The color of the bars represents the number of video planes used for the dedicated video mode. The bars filled with a pattern define the input frame rate of the CHARM card without screen generation. The solid-colored bars represent the input frame rate with a running VNC server generating the screen content.
protocol of the VNC framebuffer is the RFB protocol. It supports color depths of 8, 16 and 32 bits. However, the VNC client normally chooses the highest color depth of 32 bits per pixel. Figure 9.6 depicts the frame rate provided by the VNC server in relation to the VGA mode. In this case, the frame rate is the maximum number of frames per unit of time which can be generated by the VNC server.
9.2 USB CD-ROM Performance
Figure 9.6: Frame rate of the VNC server. The plain-colored bars mark the frame rate of the VNC server when it generates the full framebuffer, and the bars filled with a pattern represent the frame rate when only 15% of the framebuffer has to be updated. The color of the bars represents the number of video planes used for the dedicated video mode. Additionally, the bars are labeled with the corresponding screen resolution in pixels.
512 bytes. The endpoint descriptor of a transfer pipe defines the maximum packet size (wMaxPacketSize) which is accepted by the particular endpoint. However, the host controller can use data packets smaller than the allowed maximum packet size for the transfer.
Block Size The block size of a data transfer defines the size of the data packet which is requested by the initiator.
Figure 9.7 shows the read performance of the emulated CD-ROM device.
9.4 Power Consumption
the USB-IF has instituted a Compliance Program that provides reasonable measures of acceptability [105]. The USB-IF provides a compliance test tool, the USB Command Verifier (USBCV), which evaluates High, Full and Low-speed USB devices for conformance to the USB Device Framework, Hub device class, HID class, and OTG specifications [105]. The USB devices provided by the CHARM are tested with the USB Command Verifier.
PCI bus. Furthermore, the USB host function is neither used at the moment nor necessary for the remote control function. The third row in table 9.5 depicts the power consumption of the CHARM after reducing the input voltage of the CHARM. For this measurement, the input voltage is adjusted to reach the minimal dropout voltage of the power regulator of the CHARM. This gives an estimate of how much power can be saved by removing the 5V power source and the related power regulator.
10 Conclusion and Outlook
This thesis has presented a powerful remote management device which is suitable for most IBM-compatible computer systems. The basic concept of the system is to provide an independent and reliable out-of-band remote management facility. The remote device, called CHARM card, can be used in a heterogeneous computer cluster environment to provide a generic and uniform remote management interface for the administrator.
the CHARM has potential to improve the performance. With the aid of code optimization, space will become available inside the FPGA to add functions which support the processing of the VGA requests. Moreover, the VESA BIOS Extensions (VBE) provide a linear framebuffer format [57]. With a linear framebuffer, the pixel data can be transformed into a VNC framebuffer more easily than with the several complex VGA framebuffer formats.
A Abbreviations
ADC     Analog-to-Digital Converter
AGP     Accelerated Graphics Port
AHB     AMBA High Speed Bus
ALICE   A Large Ion Collider Experiment
AMBA    Advanced Microcontroller Bus Architecture
ASCII   American Standard Code for Information Interchange
ATLAS   A Toroidal LHC Apparatus
BAR     Base Address Register
BCV     Boot Connection Vector
BEV     Bootstrap Entry Vector
BIST    Built-in Self Test
BMC     Baseboard Management Controll
CGA CHARM CHN CMS CIA CERN CMOS COTS CPLD CPU DMA DMI DPRAM DRAM EBI FEP FIFO FPGA HPI IP IPL IPMI
KVM     Keyboard, Video, Mouse
LAN     Local Area Network
LHC     Large Hadron Collider
LHCb    Large Hadron Collider beauty
LUT     Look-Up Table
MAC     Media Access Control
MDA     Monochrome Display Adapter
MSBO    Mass Storage Bulk Only
MTD     Memory Technology Device
NFS     Network File System
NTC     Negative Temperature Coefficient Thermistor
OCR     Optical Character Recognition
PCB     Printed Circuit Board
PCI     Peripheral Component Interconnect
PCIe    Peripheral Compon
PLD POST RGB SCSI SDRAM SRAM SOPC USB VGA VHDL VHSIC WOL
B Characteristics of the CHARM System

Description                          Value
Operating system                     Linux 2.4.21
CPU frequency                        120 MHz
SDRAM (main memory)                  32 MB
Flash memory (non-volatile memory)   8 MB
Dhrystone 2.1 VAX MIPS               78.8
Dhrystone 2.1 per Seconds            136798.9
Whetstone 1.2                        497.5 KIPS
PCI Target write performance         3.3 MB/s
PCI Target read performance          177 KB/s
PCI Master write performance         330 KB/s
PCI Master read performance          330 KB/s
USB CD-ROM Read Performance          316 KB/s
Table B.1: Characteristics of the CHARM.
Figure B.1: CHARM card front view (model B).
Figure B.2: CHARM card back view (model B).
C Application of the CHARM
The following descriptions list the most important programs of the CHARM card. They are divided into third-party programs, which were adapted to run on the CHARM, and CHARM-specific applications.
C.1 Third Party Application
Web server The CHARM card uses two web servers: boa and axTLS [111]. Boa is a small embedded web server without access control features. It will be replaced by axTLS in a newer revision of the CHARM card.
msbod The Mass Storage Bulk Only Daemon (msbod) is the central application of the USB CD-ROM emulation. Section 5.1.4 explains the functioning of this program.
hostPOSTcode The program hostPOSTcode returns the last POST code of the host system.
hostPowerOn, hostPowerOff The power switch of the mainboard of the host computer is connected to the CHARM. The programs hostPowerOff and hostPowerOn use this connection to switch off or power on the computer.
D CHARM Register Map
#ifndef __CIA_PORT_MAP__
#define __CIA_PORT_MAP__

//
// AUTO-CREATED: Mi 11. Jun 15:58:51 CEST 2008
// based on cia_controller.vhd
// SVN Revision: unknow
//
// These are the offsets of the CHARM control register.
// All offsets are double word addresses.
#define POWER_SW_INPUT          (0x94/4)
#define EXTERN_LED              (0x98/4)
#define BAR_HIDING              (0x9c/4)
#define VGA_ENABLE              (0xA0/4)
#define PCI_TARGET_STATE        (0xA8/4)
#define PCI_IRQ                 (0xAC/4)
#define PCI_MASTER_BYTEENABLE_N (0xB4/4)
#define PCI_MASTER_STATE        (0xBC/4)
#define FAN1_SPEED              (0xC0/4)
#define FAN2_SPEED              (0xC4/4)
#define FAN3_SPEED              (0xC8/4)
#define VGA_ACK                 (0xCC/4)

#endif // __CIA_PORT_MAP__
E CHARM Internal Address Map The system bus of the Excalibur Chip is the AHB bus. Every hardware module has an address window inside the AHB bus. The address map is configured with the aid of the Altera SOPC builder. Table E.1 depicts the address windows and the related hardware modules.
Register Name            Offset
POST_CODE_ADDR           0x00
POST_CODE_DATA           0x04
PC_STATE_ACK             0x08
AHB_BRIDGE_STATE         0x0c
PCI_MASTER_DATA_OUT      0x10
PCI_MASTER_DATA_IN       0x14
PCI_MASTER_CONTROL       0x18
PCI_MASTER_RESET         0x20
VGA_INTERRUPT            0x28
SNIFFER_REQUEST          0x2c
CIA_CONTROLLER_VERSION   0x30
USB_RESET                0x34
ADC_INTERRUPT            0x38
POWER_SOURCE             0x3C
OPTOCOUPLER_1            0x40
OPTOCOUPLER_3            0x44
OPTOCOUPLER_6            0x48
OPTOCOUPLER_8            0x4c
CIA_CONTROLLER_DATE      0x50
PCI_BUS_NRESET           0x58

Register Name PCI_CORE_STATUS CHARM_IRQ FAN
[Figure E.1: SDRAM address map. Regions: Linux Main Memory (30 MB), Request Buffer (256 KB), Read Buffer (16 B), Expansion ROM Content (32 KB), VGA Plane 0 to VGA Plane 3 (64 KB each). Address labels: 0x0000000, 0x1E00000 (30 MB), 0x1F00000 (31 MB), 0x1F41000, 0x1F50000, 0x2000000 (32 MB).]
F Device Emulation

File: SIE1_keyb_mouse_msbo_intern.bin
USB Interface               Description
Onboard USB plug of SIE1    Composite Device
                            Interface 0: keyboard
                            Interface 1: mouse
                            Interface 2: mass storage

File: SIE1_msbo_SIE2_keyb_mouse_intern.bin
USB Interface               Description
Onboard USB plug of SIE1    Single Device
                            Interface 0: mass storage
Onboard USB plug of SIE2    Composite Device
                            Interface 0: keyboard
                            Interface 1: mouse

File: SIE1_keyb_mouse_msbo_extern.
File: SIE1_keyb_mouse_intern.bin
USB Interface                          Description
Onboard USB plug of SIE1               Composite Device
                                       Interface 0: keyboard
                                       Interface 1: mouse

File: SIE1_keyb_mouse_extern.bin
USB Interface                          Description
Card bracket Mini USB plug of SIE1     Composite Device
                                       Interface 0: keyboard
                                       Interface 1: mouse
Table F.2: USB controller firmware (continued).
G Test Setup
Two basic test setups were used to verify and test the CHARM card and its applications. Test system #1 (see figure G.1) is a server motherboard installed in a rack mount chassis, and test system #2 is a system board installed in a midi tower case. Tables G.1 and G.2 show the parameters of the test systems.
Figure G.1: Test system #1 with an installed CHARM card. It is the topmost PCI card.
G.1 Supported Mainboards
The initialization of the computer system depends on the BIOS of the motherboard. Table G.1 lists the motherboards which were successfully tested with the CHARM card.
Vendor TYAN ASRock DELL Siemens Supermicro Model S3992 (h2000M) TIGER HEsl S2567 S5397 Thunder K8S Pro (S2882) K7S41GX OptiPlex 210L Celsius 600 H8DCi BIOS AMI BIOS V2..04 07/10/2008 AMI BIOS 09/28/2001 Phoenix Technologies LTD BIOS V1.03 AMI BIOS 8.0 AMI BIOS P2.
H VGA
H.1 Video Modes

Mode No.   Type      Number of Planes   Resolution [Pixel]   Screen Size [KB]
0,1        text      2                  360 x 400 x 4        70.31
2,3        text      2                  720 x 400 x 4        140.62
4,5        graphic   2                  320 x 200 x 2        15.62
6          graphic   1                  640 x 200 x 1        15.62
7          text      2                  720 x 400 x 1        35.15
D          graphic   4                  320 x 200 x 4        31.25
E          graphic   4                  640 x 200 x 4        62.50
F          graphic   2                  640 x 350 x 1        27.34
10         graphic   4                  640 x 350 x 4        109.37
11         graphic   1                  640 x 480 x 1        37.50
12         graphic   4                  640 x 480 x 4        150.00
13         graphic   4                  320 x 200 x 8        62.50
Table H.1: VGA video modes.
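The screen sizes in Table H.1 are consistent with reading the third factor of the resolution column as bits per pixel: the size in KB is width x height x bits per pixel / 8 / 1024. The following sketch recomputes a few table rows under that reading. The function name is an assumption for this example.

```python
def screen_size_kb(width, height, bits_per_pixel):
    """Screen size in KB for a resolution given as width x height x bits per pixel."""
    return width * height * bits_per_pixel / 8 / 1024
```

For example, mode 12 (640 x 480 x 4) yields 150 KB and mode 13 (320 x 200 x 8) yields 62.5 KB, matching the table.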
Bibliography
[1] Excalibur Devices. Hardware Reference Manual, Altera Corporation, November 2002. URL http://www.altera.com/literature/
[2] Martin Le-Huu. CHARM PCI Tracer. Internship report, University of Heidelberg, Kirchhoff Institute of Physics, Chair of Computer Science and Computer Engineering, November 2007. URL http://www.kip.uni-heidelberg.de/ti
[3] Christoph Straehle and Marcel Schuh. Kommunikationsplattform zwischen CHARM und HOST.
[13] A Toroidal LHC Apparatus Experiment Homepage, 2008. URL http://atlas.web.cern.ch
[14] Large Hadron Collider beauty Experiment Homepage, 2008. URL http://lhcb.web.cern.ch
[15] H. Tilsner, Timm Steinbeck, and Volker Lindenstruth. The high-level trigger of ALICE. The European Physical Journal C, 33:1041–1043, 2004.
[16] S. Bablok, Matthias Richter, Dieter Roehrich, and Kjetil Ullaland. ALICE HLT interfaces and data organisation. In CHEP2006. CERN, "Mumbai (India)", 2006. URL http://indico.
[27] Remote Management with the Baseboard Management Controller in Eighth-Generation Dell PowerEdge Servers. October 2004. URL http://www.dell.com/downloads/global/power/ps4q04-20040110-Zhuo.pdf
[28] Douglas E. Comer. Computer Networks and Internets with Internet Applications. Prentice Hall, 4 edition, August 2003. ISBN 978-0131433519. URL http://www.netbook.cs.purdue.edu/index.htm
[29] Raritan, Inc. Peppercon eRIC II, 2008. URL http://www.raritan.
[42] EZ-Host(TM) Programmable Embedded USB Host/Peripheral Controller - data sheet. Technical report, Cypress Semiconductor Corporation, 2008. URL http://download.cypress.com.edgesuite.net/design_resources/datasheets/contents/cy7c67300_8.pdf
[43] Marius Groeger. Open-Source firmware suite for ARM based platforms. URL http://armboot.sourceforge.net/
[44] Altera Corporation. Using Run-From-Flash Mode with the Excalibur Bootloader. Application note, Altera Corporation, September 2002. Version 1.0.
[57] Video Electronics Standards Association. VESA BIOS EXTENSION Core Functions Standard. Technical report, Video Electronics Standards Association, September 1998. Version 3.0.
[58] Edward Solari and George Willse. PCI Hardware and Software Architecture and Design. Annabooks, February 1998. ISBN 0-929392-59-0. URL http://www.annabooks.com
[59] Conventional PCI Specification. December 1998. URL http://www.pcisig.com
[60] RealVNC Ltd. The original cross-platform remote control solution.
[71] Curtis E. Stevens. El Torito Bootable CD-ROM Format Specification. January 1995. URL http://www.phoenix.com
[72] USB Mass Storage Class Bulk-Only Transport. Technical report, USB Implementers Forum, Inc., September 1999. URL http://www.usb.org/developers/docs/
[73] SCSI Primary Commands - 4. September 2005. (SPC-4). URL http://www.t10.org
[74] PS/2 and PC BIOS Interface Technical Reference. September 1991. URL http://www.ibm.com
[75] Michael Tischler. PC intern 4 - Systemprogrammierung.
[84] Ethan Galstad. Nagios, 2008. URL http://www.nagios.org
[85] Miroslav Siket. LEMON - LHC Era Monitoring, 2008. URL http://lemon.web.cern.ch/lemon/index.shtml
[86] Camilo Lara. The SysMES Architecture: System Management for Networked Embedded Systems and Clusters. Date 2007 PhD Forum, Nice, France, 2007.
[87] American Standard Code for Information Interchange. Technical report, American Standards Association, 1963. URL http://www.ansi.org
[88] Control Functions for Coded Character Sets.
[99] Aeleen Frisch. Essential System Administration. O’Reilly & Associates, Inc., 1993. ISBN 0937175803.
[100] Mathias Hein. Ethernet. Internat. Thomson Publishing, 2 edition, 1998. ISBN 3-8266-4041-1.
[101] Ethernet Media Access Controller AllianceCORE(TM) - Product Specification. January 2004. URL http://www.xilinx.com/publications/3rd_party/products/CAST_MAC.pdf
[102] Alessandro Rubini and Jonathan Corbet. LINUX - Device Driver. O’Reilly & Associates, Inc., June 2001. ISBN 0-596-00008-1.
Acknowledgements
My thanks go to all those who contributed to the success of this work. A project of this size can hardly be accomplished by a single person. First of all, I would like to thank Prof. Lindenstruth for the kind welcome into his TI group and for the opportunity to work on this exciting topic.