Dell™ Server PRO Management Pack 1.0 For Microsoft® System Center Virtual Machine Manager 2008 User’s Guide
www.dell.com | support.dell.
Notes and Cautions NOTE: A NOTE indicates important information that helps you make better use of your computer. CAUTION: A CAUTION indicates potential damage to hardware or loss of data if instructions are not followed. ____________________ Information in this document is subject to change without notice. © 2009 Dell Inc. All rights reserved. Reproduction of these materials in any manner whatsoever without the written permission of Dell Inc. is strictly forbidden.
Contents
1 Introduction
  Overview
  Related Terms
  What is a PRO Tip?
  Feature Highlights
  Understanding PRO Tip Management
  Supported Operating Systems
  Other Documents You May Need
2 Getting Started With Dell PROPack
  Uninstalling PROPack
  Security Considerations
3 Using Dell PROPack
  Monitoring Using SCVMM
  Monitoring Using PRO Specific Alerts on SCOM/SCE
  Using Health Explorer to Reset Alerts
1 Introduction
This document is intended for system administrators who use the Dell™ Server PRO Management Pack (Dell PROPack) to monitor Dell systems and take remedial action when an inefficient system is identified. The integration of Dell PROPack with System Center Operations Manager (SCOM) 2007 SP1/System Center Essentials (SCE) 2007 SP1 and System Center Virtual Machine Manager (SCVMM) 2008 enables you to proactively manage virtual environments and ensure high availability of your Dell systems.
Related Terms
• A managed system is a Dell system that runs Dell™ OpenManage™ Server Administrator and is monitored and managed using SCOM/SCE and SCVMM. It can be managed locally or remotely through a supported Web browser.
• A management station (or managing station) can be a Microsoft® Windows®-based Dell system that is used to manage virtualized infrastructures.
• Generates a PRO Tip when the monitored hardware moves to an unhealthy state. The PRO Tip can be:
  • a remedial action, such as movement of virtual machines.
  • a recommended action, such as placing the host into maintenance mode.
  You can then choose to take remedial action, such as migrating the virtual machines to another healthy host.
• Minimizes downtime by implementing the remedial action provided in PRO Tips, if so configured.
generates relevant severity alerts for monitored objects when there is a transition to an unhealthy state. The Dell PROPack contains a mapping between Server Administrator alerts and the associated PRO Tips. The following table describes the sequence of events that occur in the generation and handling of a typical PRO Tip.
Table 1-1.
Supported Operating Systems
For the detailed operating system support matrix, see the Dell PROPack readme file, DellPROMP1.0_Readme.txt. The readme is packaged in the self-extracting executable Dell_PROPack_1.0.0_A00.exe. It is also posted on the Systems Management documentation page on the Dell Support website at support.dell.com.
Other Documents You May Need
Besides this User's Guide, you might need to refer to the following guides available on the Dell Support website at support.dell.
• The Dell OpenManage Server Administrator Command Line Interface User's Guide documents the complete command line interface for Server Administrator, including an explanation of the command line interface (CLI) commands to view system status, access logs, create reports, configure various component parameters, and set critical thresholds.
2 Getting Started With Dell PROPack
Minimum Requirements
To implement the Dell PROPack, ensure that the following minimum execution environment exists:
• Management Station running:
  • System Center Operations Manager (SCOM) 2007 SP1/System Center Essentials (SCE) 2007 SP1 installed on supported hardware and a supported operating system.
  • System Center Virtual Machine Manager (SCVMM) 2008 installed on supported hardware and a supported operating system.
Installing SCOM/SCE and SCVMM Agents
When you use the setup to monitor your infrastructure, SCOM/SCE and SCVMM agents installed on the hosts enable data transfer between the managed system and management stations. Agents of both SCVMM and SCOM/SCE are installed manually or automatically during the discovery process on all Hyper-V hosts.
Integrating SCOM/SCE with SCVMM
For the setup to support Dell PROPack, SCOM/SCE must be integrated with SCVMM.
Figure 2-1. Security Warning Message
7 Click Import. A confirmation dialog box is displayed.
8 Click Yes.
For alerts and PRO Tips to be generated, ensure that SCVMM discovery takes place and that SCVMM objects are displayed in the State View. For more information on the State View, see "Monitoring Using PRO Specific Alerts on SCOM/SCE".
Configuring PRO Tips
The Dell systems and virtual infrastructure are monitored for either Critical only or both Critical and Warning alerts.
The component may still be functioning, but it could potentially fail. Or, the component may be functioning in an impaired state. A Critical alert is generated when the component has either failed or failure is imminent. By default, the monitoring level is set to "Warning and Critical".
To enable PRO Tips for both Warning and Critical alerts and to enable automatic implementation of PRO Tips, do the following:
1 Open the SCVMM console.
2 In the Host Groups section, right-click All Hosts and select Properties.
3 Select the PRO tab and select the Enable PRO on this Host Group option.
4 By default, the monitoring level is set to Warning and Critical, which means that the application displays PRO Tips generated for both Warning and Critical alerts. To restrict the PRO Tips to Critical alerts only, select the Critical only option.
5 Select the Automatically implement PRO tips option.
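The effect of the monitoring level chosen in the steps above can be sketched in a few lines of Python. This is an illustrative model only: the class, field, and function names are hypothetical and are not SCVMM or SCOM identifiers. It assumes only what this guide states, namely that "Warning and Critical" shows PRO Tips for both severities, "Critical only" restricts them to Critical alerts, and automatic implementation applies only if so configured.

```python
from dataclasses import dataclass


@dataclass
class HostGroupProSettings:
    """Hypothetical stand-in for the options on the PRO tab of a host group."""
    enable_pro: bool = True
    monitoring_level: str = "Warning and Critical"  # or "Critical only"
    auto_implement: bool = False


def shows_pro_tip(settings: HostGroupProSettings, alert_severity: str) -> bool:
    """Return True if an alert of this severity would surface as a PRO Tip."""
    if not settings.enable_pro:
        return False
    if settings.monitoring_level == "Critical only":
        return alert_severity == "Critical"
    # Default level: both Warning and Critical alerts produce PRO Tips.
    return alert_severity in ("Warning", "Critical")


def implemented_automatically(settings: HostGroupProSettings, alert_severity: str) -> bool:
    """A PRO Tip is implemented without user action only if so configured."""
    return settings.auto_implement and shows_pro_tip(settings, alert_severity)
```

With the default settings a Warning alert is shown but not implemented automatically; switching the level to "Critical only" hides it altogether.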
Table 2-1. Checking recovery action for warning alert conditions. (continued)

Your Actions: Verify that the host is placed in the maintenance mode and the PRO Tip resolved the alert.
Expected System Response:
• After successful implementation of the PRO Tip, its status is changed to "Resolved" and PRO Tip entry is moved out of the PRO Tip window.
• Corresponding alert disappears in the SCOM/SCE Alert View.

Your Actions: Select the Dismiss option instead of ...
Expected System Response: PRO Tip is dismissed.
Table 2-2. Checking recovery action for failure alert conditions. (continued)

Your Actions: Verify that the virtual systems are moved to a healthy host and PRO Tip resolved the alert.
Expected System Response:
• After successful implementation of the PRO Tip, its status is changed to "Resolved" and PRO Tip entry is moved out of the PRO Tip window.
• Corresponding alert disappears in the SCOM/SCE Alert View.
3 Using Dell PROPack
Monitoring Using SCVMM
You can manage the health of your virtualized environment using PRO Tips displayed on the SCVMM console. To see the PRO Tip window, click the PRO Tips button on the toolbar located below the main menu, as shown in Figure 3-1. The button also shows the number of active PRO Tips in brackets.
Figure 3-1. PRO Tip Button on the SCVMM Console
Click the PRO Tips button.
Figure 3-2. PRO Tip Window
Implementation of Recovery Actions
The PRO Tip window provides an option to either implement or dismiss the recommended action. If you select the Implement option, one of these recovery tasks may be executed based on the type of alert:
Placing the host in maintenance mode
Placing a host in maintenance mode prevents future assignment of workload to the host until the problem is resolved.
Moving of virtual machines
The PRO Tip management pack uses SCVMM algorithms to move virtual machines from the affected system to a healthy one. The placement requirements for identifying a healthy system and moving the virtual machines are as follows:
• Hard requirements: requirements that a machine hosting the virtual machines must meet in order to run them, for example sufficient memory and storage.
Figure 3-3. Completed Job
PRO Tip implementation of moving virtual machines can fail if no other healthy hosts are available in the host group or host cluster. In such a case, the PRO Tip window displays the State of the corresponding PRO Tip as "Failed", and the reason is elaborated in the Error section. The status of the corresponding entry in the Jobs section on the SCVMM console also displays as "Failed".
NOTE: In the PRO Tip window the failure message is updated dynamically.
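The relationship between the hard placement requirements and a "Failed" PRO Tip can be illustrated with a minimal Python sketch. It does not reproduce the SCVMM placement algorithms, which this guide does not describe; the types and function names are hypothetical. It assumes only that a target host must be healthy and have sufficient memory and storage, and that the PRO Tip ends in the "Failed" state when no such host exists.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    name: str
    healthy: bool
    free_memory_mb: int
    free_storage_gb: int


@dataclass
class VirtualMachine:
    name: str
    memory_mb: int
    storage_gb: int


def pick_target_host(vm: VirtualMachine, hosts: List[Host], source: str) -> Optional[Host]:
    """Return the first healthy host that meets the hard requirements, or None."""
    for host in hosts:
        if host.name == source or not host.healthy:
            continue  # never place the VM back on the affected or an unhealthy host
        if host.free_memory_mb >= vm.memory_mb and host.free_storage_gb >= vm.storage_gb:
            return host
    return None


def pro_tip_state(vm: VirtualMachine, hosts: List[Host], source: str) -> str:
    """Illustrative PRO Tip state after an attempted move."""
    return "Resolved" if pick_target_host(vm, hosts, source) else "Failed"
```

In this model, a host group that contains only the affected host, or only hosts without enough free memory or storage, yields "Failed", which matches the failure case described above.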
Monitoring Using PRO Specific Alerts on SCOM/SCE
You can monitor the physical devices in your network using the SCOM/SCE console. The SCOM/SCE console provides the following views:
• Alert View - The Alert View on the SCOM/SCE console displays Dell PRO specific alerts in a tabular format with information on the severity level, source, name, and resolution state, along with the date and time of creation. To access the Alert View, do the following:
  a Open the SCOM/SCE console.
  b Select the Monitoring tab.
• State View - The State View displays the Dell system objects discovered in a tabular format. The State View displays objects with the name, path, storage health of the Dell system, and so on. You can personalize the State View by defining which objects you want displayed and customizing how the data looks.
Figure 3-5. State View
For more information on creating a State View, see the Microsoft website.
Using Health Explorer to Reset Alerts
Health Explorer enables you to view and take action on alerts.
Alert Cause and Recovery Action
The following table lists the alerts and the corresponding recommended remedial actions:

Table 3-1. Alert Cause and Recovery Action

Dell Event ID: 1053
Alert Description: Temperature sensor detected a warning value
Severity in SCOM/SCE & PRO Tip in SCVMM: Warning
Alert Cause: A temperature sensor on the backplane board, system board, CPU, or drive carrier in the specified system exceeded its warning threshold value.

Dell Event ID: 1203
Alert Description: Current sensor detected a warning value.
Severity in SCOM/SCE & PRO Tip in SCVMM: Warning
Alert Cause: A current sensor in the specified system exceeded its warning threshold value.
Dell PRO Tip Recommended Remedial Action: In SCVMM, PRO Tip implementation places the host in Maintenance mode so that it is no longer available for new virtual machine placements.

Dell Event ID: 1204
Alert Description: Current sensor detected a failure value.

Dell Event ID: 1353
Alert Description: Power supply detected a warning.
Severity in SCOM/SCE & PRO Tip in SCVMM: Warning
Alert Cause: A power supply sensor reading in the specified system exceeded a definable warning threshold.
Dell PRO Tip Recommended Remedial Action: In SCVMM, PRO Tip implementation places the host in Maintenance mode so that it is no longer available for new virtual machine placements.

Dell Event ID: 1354
Alert Description: Power supply detected a failure.

Dell Event ID: 1703
Alert Description: Battery sensor detected a warning value.
Severity in SCOM/SCE & PRO Tip in SCVMM: Warning
Alert Cause: A battery sensor in the specified system detected that a battery is in a predictive failure state.
Dell PRO Tip Recommended Remedial Action: In SCVMM, PRO Tip implementation places the host in Maintenance mode so that it is no longer available for new virtual machine placements.

Dell Event ID: 2048
Alert Description: Device Failed
Severity in SCOM/SCE & PRO Tip in SCVMM: Error

Dell Event ID: 2076
Alert Description: Virtual Disk Check Consistency Failed.
Severity in SCOM/SCE & PRO Tip in SCVMM: Error
Alert Cause: A physical disk included in the virtual disk failed or there is an error in the parity information.

Dell Event ID: 2100
Alert Description: Temperature exceeded Maximum Warning Threshold
Severity in SCOM/SCE & PRO Tip in SCVMM: Warning
Alert Cause: The physical disk enclosure is too hot. A variety of factors can cause the excessive temperature.

Dell Event ID: 2129
Alert Description: BGI (Back Ground Initialization) Failed
Severity in SCOM/SCE & PRO Tip in SCVMM: Error
Alert Cause: BGI of a virtual disk has failed.

Dell Event ID: 2300
Alert Description: Unstable Enclosure
Severity in SCOM/SCE & PRO Tip in SCVMM: Failure
Alert Cause: The controller is not receiving a consistent response from the enclosure.
Dell PRO Tip Recommended Remedial Action: In SCVMM, PRO Tip implementation places the host in Maintenance mode so that it is no longer available for new virtual machine placements.

Dell Event ID: 2301
Alert Description: Enclosure Hardware Error.
Severity in SCOM/SCE & PRO Tip in SCVMM: Error

Dell Event ID: 2314
Alert Description: SAS (Serial Attached SCSI) Components Failure.
Severity in SCOM/SCE & PRO Tip in SCVMM: Error
Alert Cause: Storage Management is unable to monitor or manage SAS devices.
Dell PRO Tip Recommended Remedial Action: In SCVMM, PRO Tip implementation migrates running virtual machines from unhealthy host to healthy host(s).

Dell Event ID: 2328
Alert Description: NVRAM (Non Volatile Random Access Memory) has corrupt data.
Severity in SCOM/SCE & PRO Tip in SCVMM: Error
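As an illustration of how the mapping in Table 3-1 might be represented, the following Python sketch builds a lookup from Dell Event ID to severity and recovery action. It includes only rows whose severity and remedial action both appear in full above; the constant and function names are hypothetical, and the structure is an assumption for illustration rather than the internal format of the Dell PROPack.

```python
from typing import Optional

# Hypothetical labels for the two recovery actions named in Table 3-1.
MAINTENANCE_MODE = "place host in maintenance mode"
MIGRATE_VMS = "migrate running virtual machines to healthy host(s)"

# Event ID -> (severity, recovery action), taken from rows visible in Table 3-1.
RECOVERY_ACTIONS = {
    1203: ("Warning", MAINTENANCE_MODE),  # Current sensor detected a warning value
    1353: ("Warning", MAINTENANCE_MODE),  # Power supply detected a warning
    1703: ("Warning", MAINTENANCE_MODE),  # Battery sensor detected a warning value
    2314: ("Error", MIGRATE_VMS),         # SAS components failure
}


def recommended_action(event_id: int) -> Optional[str]:
    """Return the recovery action for an event ID, or None if it is not listed here."""
    entry = RECOVERY_ACTIONS.get(event_id)
    return entry[1] if entry else None
```

For example, `recommended_action(2314)` returns the migration action, while an event ID that is not in this partial mapping returns `None`.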
Appendix A - Known Limitations in Dell PROPack
These are the known limitations in Dell PROPack:
1 How does Dell PROPack handle failure in SCOM/SCE/SCVMM infrastructure?
The SCE/SCOM-SCVMM infrastructure has multiple software services (for example, management station, SQL server, and so on) leading to a complex distributed setup.
3 Are there limitations on the number of virtual machines and systems that can be managed through Dell PROPack?
The number of hosts and virtual machines that can be managed depends on SCVMM and not Dell PROPack.
4 A security warning message is displayed when you import Dell PROPack. What does this indicate?
The warning message you see is a generic warning that SCOM/SCE provides when you manually install Dell PROPack and is part of its security processes.
Appendix B - Microsoft Knowledge Base Articles for Dell PROPack
The following tables list the Microsoft Knowledge Base articles along with the corresponding Knowledge Base IDs. For details, see the Microsoft support site at support.microsoft.com.
Required Hotfixes on Managed System
Table B-1.
Recommended Hotfixes on Managed System
Table B-2. Recommended Hotfixes for PROPack on Managed System

Applicable System: Hyper-V
Description: Stop error message on a Windows Server 2008 system that has the Hyper-V role installed: "STOP 0x0000001A".
Microsoft Knowledge Base Link: 957967

Applicable System: Hyper-V
Description: A wmiprvse.exe process may leak memory when a Windows Management Instrumentation (WMI) notification query is used heavily on a Windows Server 2008 or Windows Vista system.
Recommended Hotfixes on Management Station
Table B-3. Recommended Hotfixes for PROPack on Management Station

Applicable System: System Center Virtual Machine Manager 2008
Description: A wmiprvse.exe process may leak memory when a Windows Management Instrumentation (WMI) notification query is used heavily on a Windows Server 2008 or Windows Vista system.
Microsoft Knowledge Base Link: 958124
Glossary
The following list defines or identifies technical terms, abbreviations, and acronyms used in this document.
managed system
A managed system is any system that is monitored and managed using SCOM/SCE and SCVMM and running Dell OpenManage Server Administrator. Systems running Server Administrator can be managed locally or remotely through a supported Web browser.