HP XC System Software Installation Guide Version 3.0

Table K-1 Diagnosing System Imaging Problems

Symptom | How To Diagnose | Possible Solution

Symptom: A node boots to local disk and runs through the node configuration phase (nconfigure) instead of imaging.
How To Diagnose: An nconfig starting entry appears in the imaging.log file.
Possible Solution: Verify BIOS settings to ensure that the node is set to network boot and that the correct network adapter is at the top of the boot order.

Symptom: A node hangs while imaging.
How To Diagnose: You can determine when a node hangs during imaging by monitoring the imaging.log file, which is described in “Monitoring An Imaging Session” (page 145). For further inspection, set the correct console parameter in the /tftpboot/pxelinux.cfg/default file before booting.
Possible Solution:
• Retry the imaging operation.
• Verify that the network is functioning properly.

Symptom: Disk device not found.
How To Diagnose: Identified by monitoring the imaging.log file or watching the console.
Possible Solution: Ensure that the disk is working correctly and is properly seated in the node.

Symptom: The node configuration phase (nconfig) fails, and the system is left in single-user mode.
How To Diagnose: Identified by monitoring the imaging.log file. The system will completely boot, but the node will not show up as available by the sinfo command.
Possible Solution: Correct the cluster configuration using the cluster_config utility. Then, you can use the startsys command to reimage or you can rerun the nconfigure phase:
# service nconfig nconfigure

Symptom: A node spontaneously reboots during imaging.
How To Diagnose: Verified by multiple “starting imaging” messages in the rsyncd log file.
Possible Solution: Verify hardware, BIOS, and kernel boot option settings.

Symptom: The network boot times out.
How To Diagnose: The system boots from local disk and runs nconfigure. You can verify this by checking messages written to the imaging.log file.
Possible Solution:
• Verify DHCP settings and the status of the DHCP daemon.
• Verify network status and connections.
• Monitor the /var/log/dhcpd.log file for DHCPREQUEST messages from the client node MAC address.
• Check boot order and BIOS settings.
• Rerun imaging/booting operations with fewer nodes.

Symptom: A node configuration (nconfigure) operation fails while attempting to access the database on the head node.
How To Diagnose: The system is placed in single-user mode.
Possible Solution:
• Ensure that the mysqld daemon is running on the head node.
# service mysqld status
• Verify network connections.
• Boot fewer nodes in a single operation.

Symptom: An imaged node boots correctly, but the node hangs in the autoinstall script waiting for the first multicast operation.
How To Diagnose: Verify that the node has started imaging by looking for “imaging_started” messages in the rsyncd log file. Verify that no “finished” messages are in the imaging.log file.
Possible Solution:
• Ensure that startsys was used to image the nodes.
• Check for instances of flamethrower running on the head node.
# ps -aef | fgrep flamethrower

Symptom: Multicast operation fails.
How To Diagnose: Verify that the imaging operation has failed by examining the imaging.log file and looking for multiple retries of flamethrower.
Possible Solution:
• Verify that the network is quiet. A very busy network can cause dropped multicast UDP packets.
• Try this:
1. Stop the imaging operation.
2. Verify that no flamethrower daemons are running.
3. Open the /etc/systemimager/flamethrower.conf file.
4. Comment out the line with FEC =
5. Save the file.
6. Retry the imaging operation.
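
The numbered steps for the multicast failure can be sketched as a small shell helper. This is a minimal sketch under stated assumptions, not an HP XC or SystemImager tool: the function names flamethrower_running and disable_fec are hypothetical, and the sketch assumes that the FEC setting sits on its own line beginning with FEC =. It keeps a backup copy so that the original setting can be restored.

```shell
#!/bin/sh
# Sketch only. Both function names are hypothetical helpers.

# Step 2: succeed (exit 0) if any flamethrower process is still running.
# The [f] bracket keeps grep from matching its own command line.
flamethrower_running() {
    ps -aef | grep '[f]lamethrower' >/dev/null
}

# Steps 3 to 5: comment out the FEC line in the given configuration file.
# Assumption: the setting is a single uncommented line starting with "FEC =".
disable_fec() {
    conf=$1
    # Keep a backup so the change can be reverted.
    cp -p "$conf" "$conf.bak" || return 1
    # Insert "#" in front of "FEC =", preserving any leading whitespace.
    # Lines that are already commented out do not match and stay unchanged.
    sed 's/^\([[:space:]]*\)\(FEC[[:space:]]*=\)/\1#\2/' "$conf.bak" > "$conf"
}
```

A possible way to use it, as root on the head node after stopping the imaging operation: run flamethrower_running to confirm that it reports nothing running, then run disable_fec /etc/systemimager/flamethrower.conf and retry the imaging operation.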

Monitoring An Imaging Session

To monitor an imaging operation, use the tail -f command in another window to view the imaging log files.
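
As an illustration of the tail -f approach, the following sketch follows a log file and shows only lines that mention keywords used in Table K-1. It is a hedged example: the function names are hypothetical, the log path is passed as an argument because this section does not state it, and the keywords (nconfig, finished, flamethrower) are taken from the table wording rather than from a documented log format.

```shell
#!/bin/sh
# Sketch only. filter_imaging_events and watch_imaging_log are hypothetical names.

# Read log lines on standard input and keep those that mention the
# keywords named in Table K-1 (case-insensitive match; format assumed).
filter_imaging_events() {
    grep -i -E 'nconfig|finished|flamethrower'
}

# Follow the log file given as the first argument and filter it.
# Runs until interrupted with Ctrl-C, like tail -f itself.
watch_imaging_log() {
    tail -f "$1" | filter_imaging_events
}
```

For example, watch_imaging_log followed by the path to the imaging.log file on your system prints matching entries as nodes write them. Use plain tail -f when you want to see every line.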

Troubleshooting the Imaging Process 145