1.3 HLT Computer Cluster
Figure 1.1: Overview of the LHC ring at CERN.
Figure 1.2 shows several HLT cluster nodes in the ALICE counting room. The current setup of the cluster installed at CERN contains approximately a quarter of the foreseen nodes (>100).
The installation and administration of such a large computer farm is an extensive task. Therefore, the automation of periodic tasks is highly recommended. Another requirement for the HLT cluster is the remote control of its nodes. The counting rooms of the HLT are located near the ALICE detector. During beam time, access to these rooms is restricted and the computer cluster must be controlled remotely. Failed computers also have to be repaired by remote control. In particular, the front-end processors (FEPs), which receive the raw data from the detector, must remain operational in any case. Normally, an FEP node cannot be exchanged for a redundant node, because it is directly connected to the detector via an optical link. A broken FEP node has to be replaced completely by a new computer at the same physical location.
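As a minimal illustration of automating such a periodic administration task, the following sketch checks whether a set of cluster nodes is still reachable over the network. The hostnames, the check interval, and the use of a simple ping test are assumptions for this example only; the actual HLT monitoring and remote control mechanisms are discussed in the following sections.

#!/usr/bin/env python3
"""Sketch of a periodic node-availability check for a cluster.

The hostnames (fep01 ...) are hypothetical placeholders, not the actual
HLT naming scheme, and the nodes are assumed to answer ICMP echo requests.
"""

import subprocess
import time

# Hypothetical list of front-end processor nodes to monitor.
NODES = ["fep01", "fep02", "fep03"]

CHECK_INTERVAL = 300  # seconds between two monitoring rounds


def node_reachable(host: str) -> bool:
    """Send a single ping and report whether the node answered."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def main() -> None:
    while True:
        for node in NODES:
            if not node_reachable(node):
                # In a real setup this would raise an alarm or trigger a
                # remote power cycle via the management hardware.
                print(f"WARNING: {node} is not reachable")
        time.sleep(CHECK_INTERVAL)


if __name__ == "__main__":
    main()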
At the initial planning stage of the HLT, the model or type of the PCs for the cluster was not yet specified. To get a good price-performance ratio, the PCs should be purchased as late as possible. An important requirement for the cluster nodes was to provide good throughput and compatibility for the Read-Out Receiver Cards (RORC) [20]. These cards connect the detector with the HLT. Unfortunately, the computers which are suited for the RORC do not provide built-in remote control that fulfills our requirements. To be as flexible as possible, the CHARM was developed to provide full remote control of the cluster nodes independent of the final solution adopted for the computer components.
The following section summarizes common remote management facilities. The hardware-based remote control tools are discussed in particular with regard to whether they can be used for the HLT.