System information

6 Hardware Monitor Functionality
series. The other signals, such as the address or the command, are initialized before the
PCI Master Control unit is started and do not have to be synchronized. The start signal locks
the input register of the PCI Master Control unit. Afterwards, the PCI Master Control
unit commands the PCI core to initiate the related PCI cycle [39]. The done signal informs
the PCI master driver about the end of the PCI transaction; the driver has to poll this
register periodically. The error signal reports a failed PCI cycle. It is only valid if the done
signal is also asserted.
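The start/done/error handshake described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the class SimulatedPciMasterControl, the function run_pci_cycle, the bit positions DONE_BIT and ERROR_BIT, and the polling limit are hypothetical and do not reflect the actual CHARM register layout. The command value 0xA is the standard PCI bus command encoding for a configuration read.

```python
# Hypothetical status bits; the real CHARM register layout is not given in the text.
DONE_BIT = 0x1
ERROR_BIT = 0x2


class SimulatedPciMasterControl:
    """Toy stand-in for the PCI Master Control unit (an assumption, not real hardware)."""

    def __init__(self, cycles_until_done=3, fail=False):
        self._cycles_until_done = cycles_until_done
        self._fail = fail
        self._remaining = None      # None means no cycle has been started yet
        self.address = 0
        self.command = 0
        self.data_in = 0
        self.byte_enable = 0xF
        self.latched = None         # copy of the inputs taken when start is asserted

    def start(self):
        # The start signal locks the input registers for the duration of the cycle.
        self.latched = (self.address, self.command, self.data_in, self.byte_enable)
        self._remaining = self._cycles_until_done

    def read_status(self):
        # Each status read models one polling interval of the driver.
        if self._remaining is None:
            return 0
        if self._remaining > 0:
            self._remaining -= 1
            return 0
        status = DONE_BIT
        if self._fail:
            status |= ERROR_BIT
        return status


def run_pci_cycle(unit, address, command, data=0, byte_enable=0xF, max_polls=100):
    """Set up the inputs, assert start, then poll until done; return True on success."""
    # Address, command and data are written before start, so they need no synchronization.
    unit.address = address
    unit.command = command
    unit.data_in = data
    unit.byte_enable = byte_enable
    unit.start()
    for _ in range(max_polls):
        status = unit.read_status()
        if status & DONE_BIT:
            # The error bit is only meaningful once done is asserted.
            return not (status & ERROR_BIT)
    raise TimeoutError("PCI cycle did not complete within the polling limit")


if __name__ == "__main__":
    ok = run_pci_cycle(SimulatedPciMasterControl(), address=0x0, command=0xA)
    print("cycle succeeded" if ok else "cycle failed")
```

The sketch evaluates the error bit only after done has been observed, which mirrors the rule that the error signal is valid only together with the done signal.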
[Figure 6.2, block diagram. Blocks: PCI Master Driver, AHB, CHARM Register, PCI Master Control, PCI Core. Signals: Address, Command, Data In, Data Out, Byteenable, Start, Done, Error, Control Signals. Regions: AHB Clock Domain, PCI Clock Domain.]
Figure 6.2: Communication flow of the PCI Master driver.
The PCI master driver creates Linux device files which provide access to the host computer
PCI space. Programs like lspci¹ or dmidecode² use this interface to return information
about the host.
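As an illustration of the kind of information such a tool extracts from PCI space, the following sketch decodes the first 16 bytes of a PCI configuration header. The field offsets follow the standard PCI configuration space layout. The function name parse_pci_config_header and the sample bytes are hypothetical, and the name of the device file created by the CHARM driver is not given here, so the example works on an in-memory byte string instead of opening a file.

```python
import struct


def parse_pci_config_header(raw):
    """Decode the common fields at the start of a PCI configuration header."""
    if len(raw) < 16:
        raise ValueError("need at least the first 16 bytes of configuration space")
    # Offsets 0x00-0x07: vendor ID, device ID, command, status (16-bit little-endian).
    vendor_id, device_id, command, status = struct.unpack_from("<HHHH", raw, 0)
    return {
        "vendor_id": vendor_id,
        "device_id": device_id,
        "command": command,
        "status": status,
        "revision": raw[0x08],
        "prog_if": raw[0x09],
        "subclass": raw[0x0A],
        "class_code": raw[0x0B],
        # Bit 7 of the header type byte flags a multi-function device.
        "header_type": raw[0x0E] & 0x7F,
        "multi_function": bool(raw[0x0E] & 0x80),
    }


if __name__ == "__main__":
    # Made-up sample bytes for illustration only.
    sample = bytes([0x86, 0x80, 0x37, 0x12, 0x06, 0x00, 0x80, 0x22,
                    0x02, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00, 0x00])
    header = parse_pci_config_header(sample)
    print("vendor %04x device %04x class %02x" % (
        header["vendor_id"], header["device_id"], header["class_code"]))
```

On the CHARM card the bytes would presumably be read from one of the device files created by the PCI master driver; that step is left out because the file names are not stated.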
6.2.2 Computer Health Analyzer
Computer systems have a range of error sources. Most of them can be detected and
corrected by the running operating system. Software failures are easier to handle
than hardware failures: software failures can be corrected by remote control, whereas
hardware failures cannot. In a computer cluster environment it is necessary to detect and
correct errors by remote control. But if a failure crashes the system or the network
connection fails, direct interaction with the computer is unavoidable. Sometimes a computer
system does not come up again after a reboot. The cause can be a trivial failure such as a bad
boot-loader configuration or a wrong BIOS CMOS setup. In this case the failure is easy to
repair if remote access to the boot console is possible. The CHARM card helps to inspect
failures and can also repair errors of this kind automatically.
Computer State Detector
Searching for the source of an error is severely restricted if a node has lost its remote access
capability. With the aid of the CHARM card a failed network setup can be repaired. The
VNC server of the CHARM provides access to the host computer. The network setup can
¹ Lists the devices which are connected to the PCI bus.
² Shows the DMI information of the host computer.