Dell EMC Server PRO Management Pack Version 5.
Notes, cautions, and warnings
NOTE: A NOTE indicates important information that helps you make better use of your product.
CAUTION: A CAUTION indicates either potential damage to hardware or loss of data and tells you how to avoid the problem.
WARNING: A WARNING indicates a potential for property damage, personal injury, or death.
Copyright © 2009 - 2017 Dell Inc. or its subsidiaries. All rights reserved. Dell, EMC, and other trademarks are trademarks of Dell Inc. or its subsidiaries.
Contents
1 Introduction
What's new in this release
Overview
1 Introduction
This document is intended for system administrators who use the Dell EMC Server PRO Management Pack (Dell EMC PRO Pack) to monitor the Dell systems and take remedial action when an inefficient system is identified. The Dell EMC PRO Pack version 5.
What's new in this release
The release highlights of Dell EMC PRO Pack are:
• Support for OpenManage Server Administrator 8.4 to 9.0.
• Works with Operations Manager and VMM to detect events such as loss of power supply redundancy, temperature above threshold values, system storage battery error, and virtual disk failure. For more information on events supported by Dell EMC PRO Pack, see Alerts and Recovery Actions.
• Generates a PRO Tip when the monitored hardware moves to an unhealthy state.
• Performs VM live migration with no downtime. For more information, see VM Live Migration.
Sequence Number | Event
5 | VMM displays a corresponding entry in the PRO Tip window with remedial action.
6 | Implement the PRO Tip to enable recovery action on the managed system by placing the managed system in the Restrict mode or Restrict and Migrate mode.
7 | VMM notifies Operations Manager about the successful completion of the recovery action.
8 | The VMM console displays the status of the PRO Tip as Resolved after it is successfully implemented.
9 | The PRO Tip disappears from the VMM PRO Tip window.
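The sequence above can be read as a small state machine. The following Python sketch is illustrative only: the `TipState` names and the `ProTip` class are hypothetical and are not part of VMM, Operations Manager, or the PRO Pack. It models only steps 5 through 9 of the table.

```python
from enum import Enum


class TipState(Enum):
    # Hypothetical labels for steps 5-9; not taken from any VMM API.
    DISPLAYED = "Displayed"        # step 5: entry shown in the PRO Tip window
    IMPLEMENTING = "Implementing"  # step 6: recovery action in progress
    RESOLVED = "Resolved"          # steps 7-8: recovery action completed
    REMOVED = "Removed"            # step 9: entry gone from the window


# Allowed forward transitions in this simplified model.
_NEXT = {
    TipState.DISPLAYED: TipState.IMPLEMENTING,
    TipState.IMPLEMENTING: TipState.RESOLVED,
    TipState.RESOLVED: TipState.REMOVED,
}


class ProTip:
    """Tracks one PRO Tip through the simplified lifecycle."""

    def __init__(self, source: str) -> None:
        self.source = source
        self.state = TipState.DISPLAYED
        self.history = [self.state]

    def advance(self) -> TipState:
        """Move to the next state; raise if the tip was already removed."""
        if self.state not in _NEXT:
            raise ValueError(f"no transition from {self.state.value}")
        self.state = _NEXT[self.state]
        self.history.append(self.state)
        return self.state
```

Calling `advance()` three times on a new `ProTip` walks it from Displayed to Removed. A failed implementation is not modeled here.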
2 Using Dell EMC Server Performance Resource Optimization Pack
This chapter suggests steps to use PRO Pack.
Topics:
• Planning the Environment for PRO Tips
• Monitoring using VMM
• Monitoring using PRO specific alerts on Operations Manager
• Using Health Explorer to Reset Alerts
• Overriding Recovery Actions
• Alerts and Recovery Actions
Planning the Environment for PRO Tips
You can plan for enabling the PRO Monitors that are relevant for the environment.
Figure 2. PRO Tips
Alternatively, if you select the Show this window when new PRO Tips are created option in the PRO Tip window, the window automatically opens on the VMM console when a PRO Tip is generated. The PRO Tip window displays information such as the source, tip, and state of the PRO Tip in a tabular format. The window also displays a description of the problem that triggered the alert, the cause, and the suggested remedial action for recovery.
• An entry is displayed in the Jobs section on the VMM console. This entry displays the status of the job as Completed, as shown in the following figure:
Figure 3. Completed Jobs
A PRO Tip implementation that moves VMs can fail if no other healthy hosts are available in the host group or host cluster. In such a case, the PRO Tip window displays the state of the corresponding PRO Tip as Failed, and the reason is given in the Error section.
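The failure condition described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a flat list of hosts with a single health flag. The `Host` class and both functions are hypothetical, and the placement logic VMM actually uses is not described in this document.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Host:
    name: str
    healthy: bool


def pick_target(hosts: List[Host], source: str) -> Optional[Host]:
    """Return the first healthy host other than the source, or None."""
    for host in hosts:
        if host.healthy and host.name != source:
            return host
    return None


def implement_migration(hosts: List[Host], source: str) -> Tuple[str, str]:
    """Return a (state, detail) pair loosely mirroring the PRO Tip window."""
    target = pick_target(hosts, source)
    if target is None:
        # No other healthy host in the group: the tip ends up as Failed.
        return ("Failed", "no other healthy host available in the group")
    return ("Resolved", f"VMs moved from {source} to {target.name}")
```

With a group that contains only the unhealthy source host, or only unhealthy peers, `implement_migration` returns the Failed state together with a reason string, which corresponds to the Error section mentioned above.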
The Operations Manager console provides the following views:
• Alerts View
• State View
Alerts View
Alerts View displays Dell PRO specific alerts in a tabular format with information on the severity level, source, name, resolution state, and date and time of creation. To access the Alerts View:
1 Launch the Operations Manager console.
2 Click the Monitoring tab.
3 Click Dell Server PRO Pack > Dell Server PRO Alerts.
Using Health Explorer to Reset Alerts
Health Explorer enables you to view and take action on alerts. When you select Dismiss in the PRO Tip window, the alert is removed from it. To manually reset the alert:
1 On the Actions menu, click Health Explorer.
2 Right-click the alert that you want to close.
3 Select Reset Health. The alert disappears from the PRO Tip window.
Overriding Recovery Actions
PRO Pack 5.0 supports two recovery actions.
Figure 6. Override Properties
Alerts and Recovery Actions
The following table lists the alerts and the recommended remedial actions:
Table 2. Alerts and Recovery Actions
Dell Event ID | Alert Description on Operations Manager and PRO Tip in VMM | Severity | Alert Cause | Dell PRO Tip Recommended Remedial Action
1004;5004 | Thermal shutdown protection has been initiated | Error | This message is generated when a system is configured for thermal shutdown due to an error event.
specified system exceeded its failure threshold value.
1055;5055 | Temperature sensor detected a nonrecoverable value | Error | A temperature sensor on the backplane board, system board, or drive carrier in the specified system detected an error from which it cannot recover.
1204;5204 | Current sensor detected a failure value | Error | A current sensor in the specified system exceeded its failure threshold value. | Restrict and Migrate
1205;5205 | Current sensor detected a non-recoverable value | Error | A current sensor in the specified system detected an error from which it cannot recover.
redundant unit has been disconnected, has failed, or is not present. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.
1353;5353 | Power supply detected a warning | Warning | A power supply sensor reading in the specified system exceeded definable warning threshold. | Restrict
1454;5454 | Fan enclosure removed from system for an extended amount of time | Error | A fan enclosure has been removed from the specified system for a user-definable length of time. The sensor and chassis location information is provided.
state and processor sensor status are provided.
1605;5605 | Processor sensor detected a nonrecoverable value
1703;5703 | Battery sensor detected a warning value | Warning | A battery sensor in the specified system detected that a battery is in a predictive failure state.
2082 | Virtual Disk Rebuild Failure | Critical | A physical disk included in the virtual disk has failed or is corrupt. | Restrict
2083 | Physical Disk Rebuild Failed | Critical | A physical disk included in the virtual disk has failed or is corrupt. | Restrict
2094 | Predictive Failure reported | Warning | The physical disk is predicted to fail.
2145 | Controller battery low | Warning | The controller battery charge is low. | Restrict
2169 | The controller battery needs to be replaced | Critical | The controller battery cannot recharge. The battery may have been already recharged the maximum number of times. In addition, the battery charger may not be working.
redundancy. In the case of a virtual disk, one or more physical disks included in the virtual disk have failed.
2246 | The controller battery is degraded | Warning | The temperature of the battery is high. This may be due to the battery being charged. | Restrict
2264 | A device is missing | Warning | The controller cannot communicate with a device. The device may be removed.
2283 | A redundant path is broken | Warning | The controller has two connectors that are connected to the same enclosure. | Restrict and Migrate
2289 | Multi-bit ECC error on controller DIMM | Critical | An error involving multiple bits has been encountered during a read or write operation.
2307 | Bad block table is full. | Critical | The bad block table is the table used for remapping bad disk blocks. | Restrict
2310 | A virtual disk is permanently degraded | Critical | A redundant virtual disk has lost redundancy. This may occur when the virtual disk suffers the failure of more than one physical disk.
overheated and become warped and nonfunctional.
2327 | The NVRAM has corrupted data. The controller is reinitializing the NVRAM | Warning | The NVRAM has corrupted data. This may occur after a power surge, a battery failure, or for other reasons. The controller is reinitializing the NVRAM.
2355 | Enclosure firmware download failed. | Warning | The system was unable to download firmware to the enclosure. The controller may have lost communication with the enclosure. There may have been problems with the data transfer or the download media may be corrupt. | Restrict and Migrate
2699 | OMSS-FluidCache communications Failure | Critical | The communications connection between OMSS and the Fluid Cache service is no longer present. | Restrict and Migrate
2900 | A PCIeSSD cache device is no longer functional | Critical | The PCIeSSD cache device identified in the message is no longer functional.
failed. This condition can cause system performance issues and degradation in the monitoring capability of the system.
5200 | A current sensor has failed | Critical | The current sensor identified in the message has failed. This condition can cause system performance issues and degradation in the monitoring capability of the system.
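The table pairs each Dell event ID (sometimes two IDs separated by a semicolon) with a recommended recovery action. The Python sketch below shows one possible way to turn such rows into a lookup. The function names are hypothetical, and `ROWS` includes only those rows whose recommended action is visible in this document. It is not a complete copy of Table 2.

```python
from typing import Dict, List, Optional, Tuple

# (event ID cell, recommended action) pairs copied from the table above.
# Rows whose action is not visible in this document are left out.
ROWS: List[Tuple[str, str]] = [
    ("1204;5204", "Restrict and Migrate"),
    ("1353;5353", "Restrict"),
    ("2082", "Restrict"),
    ("2083", "Restrict"),
    ("2145", "Restrict"),
    ("2246", "Restrict"),
    ("2283", "Restrict and Migrate"),
    ("2307", "Restrict"),
    ("2355", "Restrict and Migrate"),
    ("2699", "Restrict and Migrate"),
]


def build_index(rows: List[Tuple[str, str]]) -> Dict[int, str]:
    """Split semicolon-separated ID cells so each ID maps to its action."""
    index: Dict[int, str] = {}
    for id_cell, action in rows:
        for event_id in id_cell.split(";"):
            index[int(event_id)] = action
    return index


ACTIONS = build_index(ROWS)


def recommended_action(event_id: int) -> Optional[str]:
    """Return the recommended action, or None if the ID is not indexed."""
    return ACTIONS.get(event_id)
```

For example, `recommended_action(5204)` returns `"Restrict and Migrate"`, and an ID that is not in `ROWS` returns `None`.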
3 Related Documentation and Resources
This chapter gives the details of documents and resources to help you work with PRO Pack 5.0.
Topics:
• Security Considerations
• Other Documents You May Need
Security Considerations
Operations Console access privileges are handled internally by Operations Manager. You can set this up using the User Roles option under the Administration > Security feature on the Operations Manager console.
4 Contacting Dell
NOTE: If you do not have an active Internet connection, you can find contact information on your purchase invoice, packing slip, bill, or Dell product catalog.
Dell provides several online and telephone-based support and service options. Availability varies by country and product, and some services may not be available in your area. To contact Dell for sales, technical support, or customer service issues:
1 Go to Dell.com/support.
2 Select your support category.
5 Accessing documents from the Dell EMC support site
You can access the required documents using the following links:
• For Dell EMC Enterprise Systems Management documents: Dell.com/SoftwareSecurityManuals
• For Dell EMC OpenManage documents: Dell.com/OpenManageManuals
• For Dell EMC Remote Enterprise Systems Management documents: Dell.com/esmmanuals
• For iDRAC and Dell EMC Lifecycle Controller documents: Dell.