Users Guide

First Steps to Troubleshoot a Remote System
The following questions are commonly used to troubleshoot high-level issues in the managed system:
Is the system turned on or turned off?
If turned on, is the operating system functioning, not responding, or stopped functioning?
If turned off, did the power turn off unexpectedly?
Power Troubleshooting
The following information helps you to troubleshoot power supply and power-related issues:
Problem: The Power Redundancy Policy is configured to Grid Redundancy, and a Power Supply Redundancy Lost event
was raised.
Resolution A: This configuration requires at least one power supply in side 1 (the left two slots) and one power supply
in side 2 (the right two slots) to be present and functional in the modular enclosure. Additionally, the capacity of each
side must be enough to support the total power allocations for the chassis to maintain Grid redundancy. (For full Grid
Redundancy operation, make sure that a full PSU configuration of four power supplies is available.)
Resolution B: Check if all power supplies are properly connected to the two AC grids; power supplies in side 1 must be
connected to one AC grid, those in side 2 must be connected to the other AC grid, and both AC grids must be working.
Grid Redundancy is lost when one of the AC grids is not functioning.
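The two conditions in Resolution A can be expressed as a small check. The following is a minimal sketch, not CMC code: the Psu class, the slot numbering (slots 1 and 2 as side 1, slots 3 and 4 as side 2), and the wattage values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Psu:
    slot: int             # assumed numbering: 1-2 are side 1, 3-4 are side 2
    capacity_w: int       # rated capacity in watts (illustrative value)
    functional: bool = True


def grid_redundancy_ok(psus, total_allocation_w):
    """Return True if either side alone can carry the chassis allocation."""
    side1 = sum(p.capacity_w for p in psus if p.functional and p.slot in (1, 2))
    side2 = sum(p.capacity_w for p in psus if p.functional and p.slot in (3, 4))
    # Each side needs at least one working PSU and, on its own, enough
    # capacity for the total power allocation of the chassis.
    return side1 > 0 and side2 > 0 and min(side1, side2) >= total_allocation_w
```

With four hypothetical 1000 W supplies, an allocation of 1800 W passes the check, while an allocation of 2200 W does not, because a single side provides only 2000 W.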
Problem: The PSU state is displayed as Failed (No AC), even when an AC cord is connected and the power distribution
unit is producing good AC output.
Resolution A: Check the AC cord and replace it if necessary. Confirm that the power distribution unit providing power to
the power supply is operating as expected. If the failure persists, contact Dell customer service to replace the
power supply.
Resolution B: Check that the PSU is connected to the same voltage as the other PSUs. If CMC detects a PSU
operating at a different voltage, the PSU is turned off and marked Failed.
Problem: Dynamic Power Supply Engagement is enabled, but none of the power supplies are displayed in the Standby state.
Resolution A: There is insufficient surplus power. One or more power supplies are moved into the Standby state only
when the surplus power available in the enclosure exceeds the capacity of at least one power supply.
Resolution B: Dynamic Power Supply Engagement cannot be fully supported with the power supply units present in the
enclosure. To check if this is the case, use the web interface to turn Dynamic Power Supply Engagement off, and then on
again. A message is displayed if Dynamic Power Supply Engagement cannot be fully supported.
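The surplus condition in Resolution A can be sketched as a single comparison. This is a simplified illustration that ignores the redundancy policy; the function name and the wattage values are assumptions, not part of CMC.

```python
def standby_possible(surplus_w, psu_capacities_w):
    """Return True if the surplus exceeds the capacity of at least one PSU."""
    if not psu_capacities_w:
        return False
    # "Exceeds" is read strictly: a surplus equal to one PSU's capacity
    # is not enough to move that PSU into the Standby state.
    return surplus_w > min(psu_capacities_w)
```

For example, a surplus of 1200 W with 1000 W supplies allows a supply to enter Standby, whereas a surplus of exactly 1000 W does not.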
Problem: Inserted a new server into the enclosure with sufficient power supplies, but the server does not power on.
Resolution A: Check the system input power cap setting. It might be configured too low to allow any additional
servers to be powered up.
Resolution B: Check whether the maximum power conservation setting is enabled. If it is, the server does not power on.
For more details, see the power configuration settings.
Resolution C: Check the server slot power priority of the slot associated with the newly inserted server, and then
make sure that it is not lower than any other server slot power priority.
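The first two checks above can be combined into a short diagnostic. The sketch below is an illustration under stated assumptions: the function and parameter names are hypothetical, the values are supplied by the caller rather than read from CMC, and the slot priority check from Resolution C is left out because this section does not define how priorities are numbered.

```python
def power_on_blockers(cap_w, allocated_w, request_w, max_conservation):
    """List the reasons a newly inserted server might not power on.

    cap_w            -- system input power cap, in watts
    allocated_w      -- power already allocated in the enclosure, in watts
    request_w        -- power the new server needs, in watts (assumed known)
    max_conservation -- True if maximum power conservation is enabled
    """
    blockers = []
    if max_conservation:
        blockers.append("maximum power conservation is enabled")
    if allocated_w + request_w > cap_w:
        blockers.append("system input power cap is too low")
    return blockers
```

An empty list means that neither of these two settings explains the failure, and the slot power priority is the next item to check.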
Problem: Available power keeps changing, even when the modular enclosure configuration has not changed.
Resolution: CMC uses dynamic fan power management, which briefly reduces server power allocations when the enclosure is
operating near the user-configured peak power cap. Power is allocated to the fans by reducing server
performance, which keeps the input power draw below the System Input Power Cap. This is normal behavior.
Problem: <number>W is reported as the Surplus for Peak Performance.
Resolution: The enclosure has <number>W of surplus power available in the current configuration, and the System
Input Power Cap can be safely reduced by the reported amount without impacting server performance.
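The arithmetic behind this resolution is a single subtraction. The sketch below uses a hypothetical function name and illustrative wattage values.

```python
def reduced_power_cap(current_cap_w, surplus_w):
    """Return the lowest cap that does not impact server performance."""
    if surplus_w < 0 or surplus_w > current_cap_w:
        raise ValueError("surplus must be between 0 and the current cap")
    # The reported surplus is the amount by which the cap can be lowered.
    return current_cap_w - surplus_w
```

For example, with a System Input Power Cap of 6000 W and a reported surplus of 1500 W, the cap can be lowered to 4500 W.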
Problem: A subset of servers lost power after an AC Grid failure, even when the chassis was operating in the Grid
Redundancy configuration with four power supplies.
Resolution: This can occur if the power supplies are improperly connected to the redundant AC grids at the time the AC
grid failure occurs. The Grid Redundancy policy requires that the left two power supplies be connected to one AC grid,
and the right two power supplies be connected to the other AC grid. If two PSUs are improperly connected, for example,
PSU 2 and PSU 3 are connected to the wrong AC grids, an AC grid failure causes a loss of power to the lowest-priority servers.
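The cabling rule can be verified against a record of which grid each PSU is plugged into. This is a minimal sketch: the mapping from slot number to grid label is assumed to come from a manual cabling audit, the slot numbering (1 and 2 on the left, 3 and 4 on the right) is an assumption, and the function name is hypothetical.

```python
def grid_wiring_ok(grid_by_slot):
    """Check that the left PSUs share one AC grid and the right PSUs the other.

    grid_by_slot -- dict mapping PSU slot number to an AC grid label,
                    for example {1: "A", 2: "A", 3: "B", 4: "B"}
    """
    left = {grid_by_slot[s] for s in (1, 2) if s in grid_by_slot}
    right = {grid_by_slot[s] for s in (3, 4) if s in grid_by_slot}
    # Each side must use exactly one grid, and the two sides must differ.
    return len(left) == 1 and len(right) == 1 and left != right
```

The case described above, in which PSU 2 and PSU 3 are connected to the wrong grids, corresponds to {1: "A", 2: "B", 3: "A", 4: "B"} and fails the check.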
Problem: The lowest-priority servers lost power after a PSU failure.
Resolution: To prevent a future PSU failure from causing servers to power off, make sure that the chassis has at least
three power supplies and is configured for the Power Supply Redundancy policy.
Problem: Overall server performance decreases when the ambient temperature increases in the data center.