Users Guide
Power Troubleshooting
The following information helps you to troubleshoot power supply and power-related issues:
• Problem: Configured the Power Redundancy Policy to Grid Redundancy, and a Power Supply Redundancy Lost event was raised.
– Resolution A: This configuration requires the power supply in side 1 (the left slot) and the power supply in side 2 (the right slot) to
be present and functional in the enclosure. Additionally, the capacity of each supply must be enough to support the total power
allocations for the chassis to maintain Grid Redundancy.
– Resolution B: Check if all power supplies are properly connected to the two AC grids: the power supply in side 1 must be connected
to one AC grid, the one in side 2 must be connected to the other AC grid, and both AC grids must be working. Grid Redundancy is
lost when one of the AC grids is not functioning.
• Problem: The PSU state is displayed as Failed (No AC), even when an AC cord is connected and the power distribution unit is
producing good AC output.
– Resolution A: Check and replace the AC cord. Check and confirm that the power distribution unit providing power to the power
supply is operating as expected. If the failure still persists, call Dell customer service for replacement of the power supply.
– Resolution B: Check that the PSU is connected to the same voltage as the other PSUs. If CMC detects a PSU operating at a
different voltage, the PSU is turned off and marked Failed.
• Problem: Inserted a new server into the enclosure with sufficient power supplies, but the server does not power on.
– Resolution A: Check the System Input Power Cap setting. It might be configured too low to allow any additional servers to be
powered up.
• Problem: Available power keeps changing, even when the enclosure configuration has not changed.
– Resolution: CMC has dynamic fan power management that briefly reduces server allocations if the enclosure is operating near the
peak user-configured power cap. It allocates power to the fans by reducing server performance, which keeps the input power
draw below the System Input Power Cap. This is normal behavior.
• Problem: Overall server performance decreases when the ambient temperature increases in the data center.
– Resolution: This can occur if the System Input Power Cap is configured to a value at which the increased power needed by the
fans must be made up by reducing the power allocation to the servers. You can increase the System Input Power Cap
to a higher value that allows additional power to be allocated to the fans without an impact on server performance.
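The Grid Redundancy conditions and the power-cap trade-off described in the list above can be expressed as simple checks. The following Python sketch is illustrative only and is not CMC code: the Psu class, its fields, the grid labels, the wattages, and both function names are assumptions made for this example, and the model assumes one power supply per side.

```python
from dataclasses import dataclass


@dataclass
class Psu:
    side: int          # 1 = left slot, 2 = right slot
    functional: bool   # present and working
    capacity_w: float  # rated capacity in watts (illustrative values)
    grid: str          # hypothetical label of the AC grid feeding this PSU
    grid_up: bool      # whether that AC grid is delivering power


def grid_redundancy_ok(psus, total_allocation_w):
    """Return True when the Grid Redundancy conditions from the list hold."""
    by_side = {p.side: p for p in psus if p.functional}
    # Both sides must hold a present, functional power supply.
    if 1 not in by_side or 2 not in by_side:
        return False
    left, right = by_side[1], by_side[2]
    # Each side must be on its own AC grid, and both grids must be working.
    if left.grid == right.grid or not (left.grid_up and right.grid_up):
        return False
    # Each supply alone must cover the total chassis allocation.
    return min(left.capacity_w, right.capacity_w) >= total_allocation_w


def server_budget_w(input_power_cap_w, fan_power_w):
    """Power left for servers once fans are served under the input power cap."""
    return max(0.0, input_power_cap_w - fan_power_w)


psus = [Psu(1, True, 1600.0, "A", True), Psu(2, True, 1600.0, "B", True)]
print(grid_redundancy_ok(psus, 1200.0))   # True
psus[1].grid_up = False                   # one AC grid stops working
print(grid_redundancy_ok(psus, 1200.0))   # False
print(server_budget_w(2000.0, 300.0))     # 1700.0
print(server_budget_w(2000.0, 450.0))     # 1550.0 (fans need more, servers get less)
```

The last two lines show why server performance can drop as ambient temperature rises: with a fixed cap, every extra watt the fans need comes out of the server budget, and raising the cap restores it.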
Troubleshooting Alerts
Use the CMC log and the trace log to troubleshoot CMC alerts. The success or failure of each email and/or SNMP trap delivery attempt is
logged into the CMC log. Additional information describing the particular error is logged in the trace log. However, since SNMP does not
confirm delivery of traps, use a network analyzer or a tool such as Microsoft’s snmputil to trace the packets on the managed system.
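If no network analyzer is at hand, a minimal way to see whether trap packets reach the destination host at all is to listen on the trap port. The Python sketch below only reports that a UDP datagram arrived and does not decode SNMP. The function names are made up for this example. It assumes the traps are sent to the standard SNMP trap port, UDP 162, and binding to that port usually requires elevated privileges.

```python
import socket


def open_trap_socket(port=162, host="0.0.0.0"):
    """Bind a UDP socket on the trap port (162 is the standard SNMP trap port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def wait_for_datagram(sock, timeout_s=30.0):
    """Wait for one datagram; return (sender_ip, byte_count) or None on timeout."""
    sock.settimeout(timeout_s)
    try:
        data, (sender_ip, _sender_port) = sock.recvfrom(65535)
    except socket.timeout:
        return None
    return sender_ip, len(data)
```

To use it, call open_trap_socket() on the trap destination, trigger a test alert from CMC, and check whether wait_for_datagram() returns the CMC address. A None result means nothing arrived within the timeout, which points to a network or destination configuration issue rather than to CMC failing to send.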
Viewing Event Logs
You can view hardware and chassis logs for information on system-critical events that occur on the managed system.
Viewing Hardware Log
CMC generates a hardware log of events that occur on the chassis. You can view the hardware log using the web interface and remote
RACADM.
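As a rough illustration of the remote RACADM route, the Python sketch below wraps a remote call to the getsel subcommand, which displays the hardware log. It assumes the racadm client is installed on the management station and on the PATH. The function names and the host, user, and password values are placeholders for this example. Check the RACADM reference for your firmware version before relying on the exact options.

```python
import subprocess


def build_getsel_command(cmc_host, user, password):
    """Build the remote RACADM command line that reads the hardware log."""
    # -r, -u and -p are the remote RACADM options for host, user and password.
    return ["racadm", "-r", cmc_host, "-u", user, "-p", password, "getsel"]


def fetch_hardware_log(cmc_host, user, password, timeout_s=60):
    """Run the command and return its text output; raises if racadm fails."""
    result = subprocess.run(
        build_getsel_command(cmc_host, user, password),
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,
    )
    return result.stdout
```

Note that a password passed on the command line can be visible to other users on the management station, so this form suits a trusted host only.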
NOTE: To clear the hardware log, you must have Clear Logs Administrator privilege.
NOTE: You can configure CMC to send email or SNMP traps when specific events occur.
Examples of hardware log entries
critical System Software event: redundancy lost
Wed May 09 15:26:28 2007 normal System Software