HP XC System Software Installation Guide Version 3.0

Table K-1 Diagnosing System Imaging Problems
Symptom / How To Diagnose / Possible Solution

Symptom: A node boots to local disk and runs through the node configuration phase (nconfigure) instead of imaging.
How To Diagnose: An nconfig starting entry appears in the imaging.log file.
Possible Solution: Verify BIOS settings to ensure that the node is set to network boot and that the correct network adapter is at the top of the boot order.

Symptom: A node hangs while imaging.
How To Diagnose: You can determine when a node hangs during imaging by monitoring the imaging.log file, which is described in “Monitoring An Imaging Session” (page 145). Further inspection can be done by setting the correct console parameter in the /tftpboot/pxelinux.cfg/default file before booting.
Possible Solution:
  Retry the imaging operation.
  Verify that the network is functioning properly.

Symptom: Disk device not found.
How To Diagnose: Identified by monitoring the imaging.log file or watching the console.
Possible Solution: Ensure that the disk is working correctly and is properly seated in the node.

Symptom: The node configuration phase (nconfig) fails, and the system is left in single-user mode.
How To Diagnose: Identified by monitoring the imaging.log file. The system will completely boot, but the node will not show up as available by the sinfo command.
Possible Solution: Correct the cluster configuration using the cluster_config utility. Then, you can use the startsys command to reimage, or you can rerun the nconfigure phase:
  # service nconfig nconfigure

Symptom: A node spontaneously reboots during imaging.
How To Diagnose: Verified by multiple “starting imaging” messages in the rsyncd log file.
Possible Solution: Verify hardware, BIOS, and kernel boot option settings.

Symptom: The network boot times out.
How To Diagnose: The system boots from local disk and runs nconfigure. You can verify this by checking messages written to the imaging.log file.
Possible Solution:
  Verify DHCP settings and the status of the DHCP daemon.
  Verify network status and connections.
  Monitor the /var/log/dhcpd.log file for DHCPREQUEST messages from the client node MAC address.
  Check boot order and BIOS settings.
  Rerun imaging/booting operations with fewer nodes.

Symptom: A node configuration (nconfigure) operation fails while attempting to access the database on the head node.
How To Diagnose: The system is placed in single-user mode.
Possible Solution:
  Ensure that the mysqld daemon is running on the head node:
  # service mysqld status
  Verify network connections.
  Boot fewer nodes in a single operation.

Symptom: An imaged node boots correctly, but the node hangs in the autoinstall script waiting for the first multicast operation.
How To Diagnose: Verify that the node has started imaging by looking for “imaging_started” messages in the rsyncd log file. Verify that no “finished” messages are in the imaging.log file.
Possible Solution:
  Ensure that startsys was used to image the nodes.
  Check for instances of flamethrower running on the head node:
  # ps -aef | fgrep flamethrower

Symptom: Multicast operation fails.
How To Diagnose: Verify that the imaging operation has failed by examining the imaging.log file and looking for multiple retries of flamethrower.
Possible Solution: Verify that the network is quiet. A very busy network can cause dropped multicast UDP packets. Try this:
  1. Stop the imaging operation.
  2. Verify that no flamethrower daemons are running.
  3. Open the /etc/systemimager/flamethrower.conf file.
  4. Comment out the line with FEC =
  5. Save the file.
  6. Retry the imaging operation.

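Steps 3 through 5 of the multicast procedure above can also be done non-interactively. The following is a minimal sketch, not an HP-supplied tool: the function name disable_fec is hypothetical, and it assumes the FEC setting sits on a line of its own that begins with "FEC =" (optionally indented). A backup of the file is kept with a .bak suffix so that the change can be reverted.

```shell
# Hypothetical helper: comment out every active "FEC =" line in a
# flamethrower configuration file. Lines that are already commented
# out (starting with '#') are left unchanged.
# Usage: disable_fec [CONFIG_FILE]
disable_fec() {
    fec_conf=${1:-/etc/systemimager/flamethrower.conf}
    # -i.bak edits the file in place and saves the original as FILE.bak
    sed -i.bak -e 's/^\([[:space:]]*FEC[[:space:]]*=\)/#\1/' "$fec_conf"
}
```

Run disable_fec as root on the head node after stopping the imaging operation and confirming that no flamethrower daemons are running, then retry the imaging operation.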
Monitoring An Imaging Session
To monitor an imaging operation, use the tail -f command in another window to view the imaging log
files.
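
Several of the log checks listed in Table K-1 can be scripted in addition to watching the logs with tail -f. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the locations of the imaging.log and rsyncd log files are not given in this section and must be passed as arguments, and it assumes that each relevant log line contains the name of the node it refers to.

```shell
# Hypothetical helpers for the log checks described in Table K-1.

# Count the lines in LOGFILE that contain both NODE and MESSAGE.
# Both are matched as fixed substrings, so "n5" also matches "n50".
# Assumption: each relevant log line names the node it refers to.
# Usage: count_node_messages NODE MESSAGE LOGFILE
count_node_messages() {
    grep -F -- "$1" "$3" | grep -c -F -- "$2"
}

# Print the DHCPREQUEST lines logged for one client MAC address.
# The MAC address is matched case-insensitively.
# Usage: dhcp_requests_for MAC [LOGFILE]
dhcp_requests_for() {
    grep -F 'DHCPREQUEST' "${2:-/var/log/dhcpd.log}" | grep -i -F -- "$1"
}
```

For example, if count_node_messages reports more than one “starting imaging” line for the same node in the rsyncd log file, that is consistent with the spontaneous-reboot symptom in the table. If dhcp_requests_for prints nothing for a node's MAC address, the node's requests might not be reaching the DHCP server.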
Troubleshooting the Imaging Process 145