HP Insight Cluster Management Utility v7.2 User Guide

Abstract

This guide describes how to install, configure, and use HP Insight Cluster Management Utility (CMU) v7.2 on HP systems. HP Insight CMU is software dedicated to the administration of HPC and large Linux clusters. This guide is intended primarily for administrators who install and manage a large collection of systems. This document assumes you have access to the documentation that comes with the hardware platform where the HP Insight CMU cluster will be installed, and you are familiar with installing and administering Linux operating systems.

HP Part Number: 5900-3115

Published: November 2013

Edition: 1

Summary of content (223 pages)

PAGE 2
© Copyright 2013 Hewlett-Packard Development Company, L.P. Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license. The information contained herein is subject to change without notice.
PAGE 3
Contents
1 Overview ... 11
1.1 Features ... 11
1.1.1 Compute node monitoring ... 11
1.1.2 HP Insight CMU configuration ... 11
1.1.
PAGE 4
2.5.4 Upgrading Java Runtime Environment ... 35
2.5.5 Removing the previous HP Insight CMU package ... 35
2.5.6 Installing the HP Insight CMU v7.2 package ... 35
2.5.7 Installing your HP Insight CMU license ... 36
2.5.
PAGE 5
.2.6 Customization ... 57
5.2.6.1 RHEL autoinstall customization for nodes configured with Dynamic Smart Array RAID (B120i, B320i RAID mode) ... 58
5.2.7 Restrictions ... 58
5.3 Backing up...
PAGE 6
.3.4 Resource view in the central frame ... 89
6.3.4.1 Resource view overview ... 89
6.3.4.2 Detail mode in resource view ... 90
6.3.5 Gauge widget ... 90
6.3.
PAGE 7
.13 Parallel distributed copy (pdcp) ... 125
7.14 User group management ... 125
7.14.1 Adding user groups ... 125
7.14.2 Deleting user groups ... 126
7.
PAGE 8
A.1.3 Backup log files ... 162
A.1.4 Monitoring log files ... 162
A.2 Network boot issues ... 162
A.2.1 Troubleshooting network boot ... 163
A.
PAGE 9
Figures
1 Typical HPC cluster ... 13
2 iLO server power controls ... 17
3 NIC2 on the SL2x170z G6 Server...
PAGE 10
52 cmu_get_ams_metrics ... 116
53 Instant view display ... 117
54 Contextual menu for administrator ... 118
55 Halt dialog...
PAGE 11
1 Overview HP Insight Cluster Management Utility (CMU) is a collection of tools that manage and monitor a large group of computer nodes, specifically HPC and large Linux Clusters. You can use HP Insight CMU to lower the total cost of ownership (TCO) of this architecture. HP Insight CMU helps manage, install, and monitor the compute nodes of your cluster from a single interface. You can access this utility through a GUI or a CLI. 1.1 Features HP Insight CMU is scalable and can be used for any size cluster.
PAGE 12
• Managing the system images stored by HP Insight CMU
• Configuring actions performed when a node status changes, such as displaying a warning, executing a command, or sending an email
• Exporting the HP Insight CMU node list in a simple text file for reuse by other applications
• Importing nodes from a simple text file into the HP Insight CMU database
1.1.3 Compute node administration
The HP Insight CMU GUI and CLI enable you to perform actions on any number of selected compute nodes.
PAGE 13
2 Installing and upgrading HP Insight CMU 2.1 Installing HP Insight CMU A typical HP Insight CMU cluster contains three kinds of nodes. Figure 1 (page 13) shows a typical HPC cluster. • The management node is the central point that connects all the compute nodes and the GUI clients. Installation, management, and monitoring are performed from the management node. The package cmu-v7.2-1.x86_64.rpm must be installed on the management node. All HP Insight CMU files are installed under the /opt/cmu directory.
PAGE 14
NOTE: The IP address of the NIC connected to the compute node administration network is needed during configuration of the HP Insight CMU management node.
2.1.2 Disk space requirements
A total of 400 MB of free disk space is necessary to install all the subsets or packages required for HP Insight CMU. Up to 4 GB of additional space is needed to store each master disk image.
2.1.3 Support for non-HP servers
IMPORTANT: You must obtain a valid license to run HP Insight CMU on non-HP hardware.
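As a rough planning aid, the disk space figures above can be turned into a small calculation. This is only a sketch: it assumes the worst case of 4 GB (taken as 4096 MB) for every master disk image, and the function name is hypothetical, not an HP Insight CMU command.

```shell
#!/bin/sh
# Hypothetical helper, not part of HP Insight CMU.
# Usage: cmu_required_mb NUMBER_OF_IMAGES
# Prints a worst-case estimate, in MB, of the free disk space needed.
cmu_required_mb() {
    images=${1:-0}
    # 400 MB for the software itself, plus up to 4096 MB per master image.
    echo $(( 400 + 4096 * images ))
}
```

For example, `cmu_required_mb 3` estimates the space for the software plus three master images. Actual image sizes depend on the content of the golden node and can be much smaller.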
PAGE 15
The management cards must be configured with a static IP address. All the compute node management cards must have a single login and password. NOTE: HP Insight CMU uses DHCP and PXE. Do not run other DHCP or PXE servers on the HP Insight CMU management network in the range of ProLiant MAC addresses belonging to the HP Insight CMU cluster.
PAGE 16
• Parameters that affect the behavior of the local disk controller. Parameter names can differ from one server to another and cannot be documented exhaustively. IMPORTANT: If the boot order is not correctly set, then cloning and backup fail on the cluster. Examples are provided in the following sections. 2.1.8.
PAGE 17
1. Access the iLO card.
2. Create the username and password. Each server must have the same username and password.
3. Select the Power Management tab.
4. For Automatically Power On Server, select No.
5. Select Submit.
Figure 2 iLO server power controls
2.1.8.2 DL160 G5, DL165c G5, DL165c G6, and DL180 G5 Servers
• IDE
  ◦ ATA/IDE Enhanced
  ◦ Configure SATA as IDE
IMPORTANT: The embedded SATA RAID Controller option is not supported. Do not select this option.
PAGE 18
  ◦ Terminal VT100
• Boot Configuration
  ◦ Boot Order
    1. Embedded NIC
    2. Disk or smart array
  ◦ Embedded NIC1 Enabled
2.1.8.3 DL160 G6 Servers
• IPMI
  ◦ Serial Port assigned to System
  ◦ Serial Port Connection Mode Direct
• PCI
  ◦ NIC1 control Enabled
  ◦ NIC1 PXE Enabled
• SATA
  ◦ SATA#1 Controller Mode AHCI
• Boot Configuration
  ◦ Boot Order
    1. NIC
    2. CD
    3. Disk
2.1.8.4 SL2x170z G6 and DL170h G6 Servers BIOS setting
IMPORTANT: To enable BIOS updates, you must restart the server.
PAGE 19
  ◦ Numlock Enabled
  ◦ Restore after AC loss Last state
  ◦ Post F1 prompt Delayed
• CPU setup
  ◦ Proc hyper threading Disabled
• IDE configuration
  ◦ SATA controller mode AHCI
  ◦ Drive cache Enabled
  ◦ IDE timeout 35
• Chipset ACPI configuration
  ◦ High Performance Event timer Enabled
• IPMI serial port configuration
  ◦ Serial port assignment BMC
  ◦ Serial port switching Enabled
  ◦ Serial port connection mode Direct
• LAN configuration
If your node is wired with the LO100i management port s
PAGE 20
2.2 Preparing for installation 2.2.1 HP Insight CMU kit delivery The HP Insight CMU kit is delivered on CD-ROM and is provided in the appropriate format for your operating system. These features enable HP Insight CMU files to be installed directly from the CD-ROM to your disk. The Linux versions of HP Insight CMU are in the Red Hat Package Manager (RPM) format. 2.2.2 Preinstallation limitations • HP Insight CMU monitors only the compute nodes and not the infrastructure of the cluster.
PAGE 21
2.2.3 Operating system support HP Insight CMU software is generally supported on Red Hat Enterprise Linux (RHEL) 5 and 6; and SUSE Linux Enterprise Server (SLES) 11. The HP Insight CMU diskless environment is supported on RHEL5, RHEL6, and SLES11. Ubuntu 12.x and 13.x are supported on the compute nodes only, on HP Ubuntu certified servers. Debian is supported on the compute nodes only, but requires active approval and verification from HP. Contact HP for support.
PAGE 22
Table 1 Directory structure (continued)
Subdirectory    Contents
Tools           Useful tools that can be used in conjunction with HP Insight CMU
Documentation   Documentation and release notes
Licenses        Contains the following licenses: Apache_LICENSE-2_0.txt, gluegen_LICENSE.txt, jogl_LICENSE.txt. Also contains system-config-netboot-legalnotice.html
2.2.
PAGE 23
are supported on the HP Insight CMU management node; see “Operating system support” (page 21).
The following rpms must be installed on the HP Insight CMU management node. Any missing rpms are flagged as dependencies when the HP Insight CMU rpm is installed and must be installed to continue the installation.
a. expect
b. dhcp
c. tftp client
d. tftp server
e. Oracle Java Runtime Environment 1.6 update 33 or newer
f. tcl-8
g. OpenSSL
h. NFS
i. xterm
j. libX11
k. libXau
l. libXdmcp
m. perl-IO-Socket-SSL
n.
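Before installing the HP Insight CMU rpm, it can be convenient to check which prerequisite packages are missing. The sketch below wraps the standard `rpm -q` query; the function name and the `RPM_QUERY` override are hypothetical, and the exact package names depend on your distribution (the list above gives descriptive names such as "tftp client", not necessarily rpm names).

```shell
#!/bin/sh
# Hypothetical helper, not part of HP Insight CMU.
# Usage: missing_rpms PACKAGE...
# Prints each package that the query command reports as not installed.
# RPM_QUERY can be set to replace "rpm -q" (an assumption made here so the
# function can be tried on systems without rpm).
missing_rpms() {
    for pkg in "$@"; do
        if ! ${RPM_QUERY:-rpm -q} "$pkg" >/dev/null 2>&1; then
            printf '%s\n' "$pkg"
        fi
    done
}
```

A possible invocation is `missing_rpms expect dhcp xterm libX11`. An empty result means every listed package was found.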
PAGE 24
Preparing... ########################################### [100%] 1:cmu ########################################### [100%] post-installation... post-installation of x86_64 tree.......done...
PAGE 25
The following is an example of executing the command on a management node running Red Hat Linux. In this example, the management node has the HP Insight CMU compute nodes connected to eth0 and has a second network on eth1 as a connection outside the cluster.
# /opt/cmu/bin/cmu_mgt_config -c
Checking that SELinux is not enforcing... [ OK ]
Checking for required RPMs... [ OK ]
Checking existence of root ssh key... [ OK ]
Checking if firewall is down/disabled... [ OK ]
Checking tftp for required configuration.
PAGE 26
This command can be rerun at any time to change your configuration without adversely affecting previously configured steps. You can also verify your current configuration by running /opt/cmu/bin/cmu_mgt_config -ti. For additional options and details on this command, run /opt/cmu/bin/cmu_mgt_config -h. 6. Start HP Insight CMU. After the initial rpm installation, HP Insight CMU is configured in audit mode. To run HP Insight CMU, unset audit mode and start the HP Insight CMU service. # /etc/init.
PAGE 27
high-availability Indicates whether the HP Insight CMU management node has been configured for high availability. 7. Configure HP Insight CMU to start automatically. IMPORTANT: This installation depends on the operating system installed and might have to be adapted to your specific installation. NOTE: a. The /etc/init.d/cmu file is available as a result of the HP Insight CMU installation.
PAGE 28
attached to the HP Insight CMU management server that is running the HP Insight CMU software at a given time. The HP Insight CMU management cluster is known on the site network by the address IP0, and on the compute network by the address IP1. IP0 and IP1 are the only addresses HP Insight CMU recognizes. If that server fails, then IP0 and IP1 migrate to the other server. The two servers each have one IP address per network (IP2, IP3, IP4, IP5).
PAGE 29
The address IP0 is attached to the server running the HP Insight CMU software. This is the unique address HP Insight CMU recognizes. Each HP Insight CMU management server has its own IP address on the site network, IP2 and IP3 respectively, unknown to HP Insight CMU.
2.4.1 HA hardware requirements
The hardware requirements for HP Insight CMU under HA control are:
• Two or more management servers.
• One shared storage accessed by both servers.
2.4.
PAGE 30
2.4.3.2 HP Insight CMU HA service requirements When you configure the HA software layer, configure the HP Insight CMU HA service with the following resources: • A shared file system. The mount point of this file system must be /opt/cmu-store and must be created on all HP Insight CMU management servers. • A shared IP address. • If your HP Insight CMU cluster uses separate site and compute networks, an additional IP address resource must be configured and assigned to your HP Insight CMU HA service.
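On the management server that currently runs the HA service, you may want to confirm that the shared file system is mounted at /opt/cmu-store. The sketch below assumes a Linux system where /proc/mounts lists mount points in its second column; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of HP Insight CMU.
# Usage: is_mounted MOUNT_POINT [MOUNTS_FILE]
# Succeeds if MOUNT_POINT appears as a mount point in MOUNTS_FILE
# (default: /proc/mounts, an assumption that holds on Linux).
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}
```

For example, `is_mounted /opt/cmu-store && echo mounted`. On servers where the HA service is not active, the directory must still exist but is not expected to be mounted.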
PAGE 31
* it must be NFS exportable (for kickstart/diskless/backup/cloning) * * * * 2] (at least) one alias IP address: * * * * this is the address used by the compute nodes to contact the mgt * * service, set CMU_CLUSTER_IP into /opt/cmu/etc/cmuserver.
PAGE 32
This command does not actually start HP Insight CMU. It only clears the audit mode to enable HP Insight CMU to be started by the HA tool.
7. Run the appropriate command for your HA software to start HP Insight CMU.
8. To verify that HP Insight CMU is still running correctly, review the /var/log/cmuservice_hostname.log file for errors.
9. Install and configure HP Insight CMU on additional management cluster members. Installing new cluster members is essentially the same as configuring the first member.
PAGE 33
e. Unset the audit mode on the new member:
# /etc/init.d/cmu unset_audit
cmu ha:cmu service needs (re)start
f. Start HP Insight CMU under HA control.
g. Use your HA tool to migrate the HP Insight CMU HA service to the new member.
2.4.
PAGE 34
13. Unset the audit mode on server 1. 14. Using the appropriate command for your HA software, restart the HP Insight CMU HA service. 2.5 Upgrading HP Insight CMU Complete the steps in this section if you are upgrading an existing HP Insight CMU system from a previous HP Insight CMU version. 2.5.1 Upgrading to v7.2 important information IMPORTANT: The HP Insight CMU v7.2 license key format is changed from previous versions. Older license keys will not work with HP Insight CMU v7.2.
PAGE 35
2.5.2.3 Java version dependency HP Insight CMU v7.2 depends on Oracle Java version 1.6 update 33 or later. HP strongly recommends upgrading the Java JVMs on both the management node and the endstations running the GUI to version 1.6u33 or later to avoid security problems with the remote file browser (used by the cmu_pdcp and autoinstall GUI dialogs). 2.5.2.4 Monitoring clients Upgrading the management node to HP Insight CMU v7.2 also requires upgrading the monitoring clients to v7.2 on the compute nodes.
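The Java requirement can be checked with a short version comparison. The sketch below assumes the Oracle version string format `1.6.0_33` (major.minor.patch_update) and accepts 1.6 update 33 or anything newer; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of HP Insight CMU.
# Usage: java_version_ok VERSION   (for example: java_version_ok 1.6.0_33)
# Succeeds if VERSION is 1.6 update 33 or later.
java_version_ok() {
    jv_major=${1%%.*}            # text before the first dot
    jv_rest=${1#*.}              # text after the first dot
    jv_minor=${jv_rest%%.*}      # second version component
    case $1 in
        *_*) jv_update=${1##*_} ;;   # text after the last underscore
        *)   jv_update=0 ;;          # no update suffix present
    esac
    jv_update=${jv_update%%[!0-9]*}  # drop build suffixes such as "-b06"
    if [ "$jv_major" -gt 1 ]; then return 0; fi
    if [ "$jv_major" -lt 1 ]; then return 1; fi
    if [ "$jv_minor" -gt 6 ]; then return 0; fi
    if [ "$jv_minor" -lt 6 ]; then return 1; fi
    [ "${jv_update:-0}" -ge 33 ]
}
```

The version string is typically printed by `java -version` on its first line of output, inside double quotes. Run the check on both the management node and the workstations that run the GUI.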
PAGE 36
* /etc/init.d/cmu start * * * ******************************************************************************** NOTE: HP Insight CMU has dependencies on other rpms (for example, dhcp). If any missing dependencies are reported, install the required rpms and repeat this step. 3. Install the HP Insight CMU Windows Moonshot add-on rpm: HP Insight CMU v7.2 supports autoinstall, backup, and cloning of select Windows images for supported HP Moonshot cartridges.
PAGE 37
2.5.9 Configuring the updated HP Insight CMU
To configure HP Insight CMU, run /opt/cmu/bin/cmu_mgt_config -c. The following is an example of executing the command on a management node running Red Hat Linux. In this example, the management node has the HP Insight CMU compute nodes connected to eth0 and has a second network on eth1 as a connection outside the cluster.
# /opt/cmu/bin/cmu_mgt_config -c
Checking that SELinux is not enforcing... [ OK ]
Checking for required RPMs...
PAGE 38
backup copy: # /etc/ssh/cmu_sshd_config_before_cmu_mgt_config This command can be rerun at any time to change your configuration without adversely affecting previously configured steps. You can also verify your current configuration by running /opt/cmu/bin/cmu_mgt_config -ti. For additional options and details on this command, run /opt/cmu/bin/cmu_mgt_config -h. 2.5.10 Starting HP Insight CMU After the initial rpm installation, HP Insight CMU is configured in audit mode.
PAGE 39
dhcpd.conf Indicates the status of the DHCPD configuration. high-availability Indicates whether the HP Insight CMU management node has been configured for high availability. 2.5.11 Deploying the monitoring client If you use HP Insight CMU monitoring, upgrade the monitoring client on your HP Insight CMU client nodes. For more information about deploying the monitoring client, see “Deploying the monitoring client” (page 85). 2.
PAGE 40
3 Launching the HP Insight CMU GUI 3.1 HP Insight CMU GUI The HP Insight CMU GUI can be used from any workstation connected through the network to the cluster management node. The HP Insight CMU GUI is composed of the following modules: • A Java GUI running on the client Windows or Linux workstation • A server module on the management node to run tasks on compute nodes IMPORTANT: If the server module is not running on the management node, the client module cannot perform any tasks.
PAGE 41
• The central frame displays the global cluster view. In Figure 4 (page 40), the global cluster view is empty because the cluster is not yet configured. • The bottom frame shows log information. 3.3 Administrator mode Click Options→Enter Admin Mode. You must have administrator privileges to perform the cluster configuration tasks described in this chapter. If you do not have administrator privileges, then you can monitor the cluster status, but you cannot perform all the tasks described in this chapter.
PAGE 42
HP Insight CMU automatically starts a minimal web server on port 80 of the management node that serves only the HP Insight CMU website. If an HTTP service is already running on this port on the management node, then the HP Insight CMU web service does not run. If you want to use a different port number, then edit the environment variable CMU_THTTPD_PORT in the /opt/cmu/etc/cmuserver.conf file. To launch the HP Insight CMU GUI: 1.
PAGE 43
  1. Edit /etc/ssh/sshd_config as follows:
     X11Forwarding yes
     PasswordAuthentication yes
  2. Restart sshd.
     # /etc/init.d/sshd restart
     Stopping sshd: [ OK ]
     Starting sshd: [ OK ]
• Localhost must be resolved and pingable.
2. Verify the ssh tunnel is working correctly.
a. From the GUI workstation, open an ssh connection to the HP Insight CMU management server.
# ssh x.x.x.x -l root
Where x.x.x.x is the IP address of the HP Insight CMU management server.
b.
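The two sshd settings required for the ssh tunnel (X11Forwarding and PasswordAuthentication) can be read back from the configuration file with a small helper. This is a sketch with a hypothetical function name; it relies on sshd using the first value it finds for a keyword and on keywords being case-insensitive, and it ignores Match blocks.

```shell
#!/bin/sh
# Hypothetical helper, not part of HP Insight CMU.
# Usage: sshd_option KEYWORD [FILE]
# Prints the first value set for KEYWORD in FILE (default: /etc/ssh/sshd_config).
# Commented-out lines are skipped because their first field starts with "#".
sshd_option() {
    awk -v key="$1" 'tolower($1) == tolower(key) { print $2; exit }' \
        "${2:-/etc/ssh/sshd_config}"
}
```

For example, `sshd_option X11Forwarding` should print `yes` on a management node prepared as described above. An empty result means the keyword is not set explicitly and the sshd default applies.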
PAGE 44
4 Defining a cluster with HP Insight CMU 4.1 HP Insight CMU service status Obtain the status of all HP Insight CMU service components with the following command on the management node: # /etc/init.d/cmu status HP Insight CMU must be properly configured before using the GUI. Ensure that the core and java services report configured. 4.
PAGE 45
4.3.1 Node management Figure 7 Node management window In Figure 7 (page 45), the node list of the cluster will appear as the node database is populated by adding, scanning, or importing nodes.
PAGE 46
4.3.1.1 Scanning nodes Cluster Administration→Node Management→Scan Node The HP Insight CMU Node Management component provides the capability to scan new nodes into the HP Insight CMU database. You can also manually add node information. Use this interface to scan nodes in the HP Insight CMU database to retrieve hardware addresses and configure IP addresses. The HP Insight CMU database is updated with the new nodes. Enter parameters in the initial Scan Node dialog box.
PAGE 47
NOTE: This is necessary only for the first scan operation. For subsequent scans, the Management card password window will not be displayed.
Figure 9 Management card password window
4. The Scan Node Result window appears. See Figure 10 (page 47).
5. Select to either add or replace scanned nodes.
Figure 10 Scan node result
4.3.1.2 Adding nodes
Cluster Administration→Node Management→Add Node
Use this interface to add a new node to the HP Insight CMU database.
4.
PAGE 48
Figure 11 Add node dialog
At the Node Dialog box:
1. Click OK. A dialog box confirms that the node was added successfully.
2. Click OK. A dialog box asks if you want to add another node.
NOTE: When you add a node, include it in a network entity using the Network Entity Management utility.
The newly added nodes appear in the node list.
Figure 12 Populated database node management window
4.3.1.
PAGE 49
To modify the attributes of a node, select the node in the Node Management list, and then select Modify Node. The same interface as Add Node appears. NOTE: The node name cannot be changed. 4.3.1.4 Importing nodes Cluster Administration→Node Management→Import Node To import nodes from a flat text file, select an existing text file and then click Open to import all the nodes from this file into the HP Insight CMU database. The following is a sample import/export file: cn001 cn002 cn003 cn004 cn005 16.16.
PAGE 50
You can use the Network Entity Management window to add and delete network entities. To perform tasks by using the Network Entity Management option, click Cluster Administration and then select Network Entity Management. 4.3.2.1 Adding network entities NOTE: The cloning process does not clone nodes that are not assigned to a network entity. Figure 13 Network entity management 1. Specify the name of the network entity to create. The length is limited to 15 characters.
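When network entity names are generated by a script, the 15-character limit is easy to exceed. The sketch below checks only the length rule stated above; the function name is hypothetical, and any other naming restrictions that HP Insight CMU might apply are not covered.

```shell
#!/bin/sh
# Hypothetical helper, not part of HP Insight CMU.
# Usage: valid_entity_name NAME
# Succeeds if NAME is non-empty and at most 15 characters long.
valid_entity_name() {
    [ -n "$1" ] && [ "${#1}" -le 15 ]
}
```

For example, `valid_entity_name rack1 || echo "name too long"`.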
PAGE 51
5 Provisioning a cluster with HP Insight CMU 5.1 Logical group management A logical group in HP Insight CMU represents a disk image that has been captured (backed up). Each logical group is associated with a single backup image. The logical group must contain the nodes with good hardware configurations that can be cloned with this image. The Logical Group Management window is used to add, modify, delete, or rename logical groups.
PAGE 52
• For the first smart array logical drive on ProLiant servers, use cciss/c0d0. IMPORTANT: For RHEL6, the smart array device name depends on the smart array controller. For additional information, see “HP Smart Array warning with RHEL6 and future Linux releases” (page 21). IMPORTANT: For Windows logical groups (supported only on specific Moonshot cartridges), use sda in the Associated devices field. 4. 5. Click OK.
PAGE 53
5.2 Autoinstall HP Insight CMU provides automated compute node installation from software distribution repositories available on the HP Insight CMU management node. The following distributions are supported: • RHEL5 • RHEL6 • SLES11 • Ubuntu 12.x, 13.x • Windows 7 Enterprise (on specific Moonshot cartridges only) • Windows 2012 Server Standard (on specific Moonshot cartridges only) • Windows 2012 R2 server Standard (on specific Moonshot cartridges only) 5.2.
PAGE 54
• autoinst_windows.templ The template autoinst_windows.templ is delivered with the rpm cmu-windows-moonshot-addon-7.2.1-1.noarch. This rpm is available on the HP Insight CMU CD. This template is supported on specific Moonshot cartridges only. In the templates provided by HP Insight CMU, special CMU keywords are automatically substituted by the autoinstall process. All HP Insight CMU keywords begin with CMU_. HP Insight CMU locates the correct values for these variables and makes the substitutions.
PAGE 55
IMPORTANT: When creating a Windows logical group, HP Insight CMU uses Samba for exporting the repository. However, this is done automatically and does not require any intervention from HP Insight CMU users. Exporting via NFS is unnecessary in this case.
• Autoinstall template file: The path to a Red Hat kickstart file, SLES autoyast file, Ubuntu preseed file, or Windows unattended installation xml file. Information can be entered in the text box, or browsed by clicking on the right side of the text box.
PAGE 56
# ls -l /opt/cmu/image/rh5u5_autoinstall/
total 24
-rw-r--r-- 1 root root 2881 Oct 11 15:38 autoinst-node1
-rw-r--r-- 1 root root 2861 Oct 11 15:38 autoinst.tmpl-cmu
-rw-r--r-- 1 root root 1313 Oct 11 15:35 autoinst.tmpl-orig
-rw-r--r-- 1 root root 13 Oct 11 15:38 node1.
PAGE 57
cmu> add_to_logical_group node to logical_group_name
For example:
cmu> add_to_logical_group node1 to rh5u5_autoinst
selected nodes: node1
processing 1 node ...
cmu>
Or:
# /opt/cmu/bin/cmu_add_to_logical_group_candidates -t rh5u5_autoinstall node1 node2
processing 2 nodes...
5.2.5.
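For larger groups, typing every node name on the command line becomes tedious. The sketch below generates sequential node names that can be passed to the command shown above. The function name is hypothetical, and the three-digit naming scheme (cn001, cn002, ...) is an assumption; adapt it to the names used in your HP Insight CMU database.

```shell
#!/bin/sh
# Hypothetical helper, not part of HP Insight CMU.
# Usage: node_names PREFIX FIRST LAST
# Prints one node name per line: PREFIX followed by a zero-padded
# three-digit number, from FIRST to LAST inclusive.
node_names() {
    nn_i=$2
    while [ "$nn_i" -le "$3" ]; do
        printf '%s%03d\n' "$1" "$nn_i"
        nn_i=$(( nn_i + 1 ))
    done
}
```

A possible use, with the group name from the example above: `/opt/cmu/bin/cmu_add_to_logical_group_candidates -t rh5u5_autoinstall $(node_names cn 1 5)`.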
PAGE 58
5.2.6.1 RHEL autoinstall customization for nodes configured with Dynamic Smart Array RAID (B120i, B320i RAID mode) To autoinstall RHEL6 on nodes with Dynamic Smart Array RAID configured, the following additional steps are required to enable the hpvsa driver diskette: • Download the appropriate hpvsa driver diskette image for the corresponding RHEL OS version. • Uncompress the driver diskette image and copy it to the RHEL repository directory, which is NFS exported.
PAGE 59
Figure 20 Backup dialog box IMPORTANT: When backing up a Windows golden node (supported only on specific Moonshot cartridges): • The backup image size (the total size of the compressed part-archi*.tar.bz2 files) must be <85% of RAM size on the nodes to be cloned. For example, on nodes with 8GB RAM, the maximum image size available for cloning is approximately 7GB. • Ensure that the root partition number is correct. Root partition is the partition containing the Windows system folder.
PAGE 60
IMPORTANT: If partitions to be backed up are less than 50% empty, you must configure HP Insight CMU to use the tmpfs file system for cloning partitions. To make this functionality work, two conditions must be satisfied:
• The size of the largest partition to back up and clone must be smaller than or equal to the compute node memory size.
• Cloning with tmpfs must be enabled by setting CMU_CLONING_USE_TMPFS to yes in /opt/cmu/etc/cmuserver.conf and then restarting HP Insight CMU.
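The two tmpfs conditions can be checked together before starting a backup. The sketch below is built on an assumption: that /opt/cmu/etc/cmuserver.conf stores settings as shell-style `KEY=value` lines. Both function names are hypothetical, and the sizes must be supplied by you in the same unit (for example MB).

```shell
#!/bin/sh
# Hypothetical helpers, not part of HP Insight CMU.

# Usage: cmu_conf_get KEY [FILE]
# Prints the last value assigned to KEY, assuming KEY=value lines.
cmu_conf_get() {
    sed -n "s/^[[:space:]]*$1=//p" "${2:-/opt/cmu/etc/cmuserver.conf}" 2>/dev/null |
        tail -n 1 | tr -d '"'
}

# Usage: tmpfs_cloning_ready LARGEST_PARTITION_MB NODE_MEMORY_MB [FILE]
# Succeeds if the largest partition fits in node memory and
# CMU_CLONING_USE_TMPFS is set to yes in the configuration file.
tmpfs_cloning_ready() {
    [ "$1" -le "$2" ] &&
        [ "$(cmu_conf_get CMU_CLONING_USE_TMPFS "${3:-/opt/cmu/etc/cmuserver.conf}")" = "yes" ]
}
```

For example, `tmpfs_cloning_ready 4096 8192` checks a 4096 MB partition against nodes with 8192 MB of memory. The check reads the file only; it cannot tell whether HP Insight CMU has been restarted since the setting was changed.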
PAGE 61
5.4 Cloning The HP Insight CMU cloning operation copies the complete contents of the golden image to other nodes. The copied image is the same except for two changes: • HP Insight CMU updates the host name of the node. • HP Insight CMU updates the IP address of the network used for cloning. All other configurations remain the same. Node-specific configuration changes can be made with the HP Insight CMU reconf.sh script.
PAGE 62
Figure 23 Cloning status When cloning is complete, a popup window displays the results. The correctly cloned compute nodes appear in the chosen logical group. The compute nodes that failed remain in the default logical group. The cloning feature duplicates the software installation configuration from an installed Linux system to systems with similar hardware configurations. This function eliminates the time-consuming task of system installation and configuration for each node in the cluster.
PAGE 63
The default content of pre_reconf.sh is:
#!/bin/bash
#keep this version tag here
CMU_PRE_RECONF_VERSION=1
#starting from cmu version 4.2 this script is dedicated to custom code
#it is running at cloning time after netboot is done and before the
#filesystems or even the partitioning is created.
exit 0
5.4.2 Reconfiguration
During cloning, automatic reconfiguration is performed on each node.
PAGE 64
# CMU_RCFG_IP = mgt network ip of this compute node
# CMU_RCFG_NTMSK = net mask
exit 0
5.4.3 Cloning Windows images
IMPORTANT: Windows is only available on HP ProLiant m300 and HP ProLiant m700 Server cartridges.
Limitations for cloning Windows images:
• The golden image size (the total size of compressed part-archi*.tar.bz2 files) must be less than 85% of RAM size on the nodes to be cloned. For example, on nodes with 8GB RAM, the maximum image size available for cloning is approximately 7GB.
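The 85% rule can be checked before cloning by adding up the compressed archive sizes and comparing the total with the memory of the target nodes. This is a sketch with hypothetical function names; it assumes the part-archi*.tar.bz2 files sit directly in the image directory you pass in, and that you supply the node RAM size in bytes.

```shell
#!/bin/sh
# Hypothetical helpers, not part of HP Insight CMU.

# Usage: image_bytes IMAGE_DIR
# Prints the total size in bytes of the compressed archive parts.
image_bytes() {
    cat "$1"/part-archi*.tar.bz2 2>/dev/null | wc -c | tr -d ' '
}

# Usage: image_fits_ram IMAGE_BYTES RAM_BYTES
# Succeeds if the image is strictly smaller than 85% of RAM.
image_fits_ram() {
    [ $(( $1 * 100 )) -lt $(( $2 * 85 )) ]
}
```

For example, `image_fits_ram "$(image_bytes IMAGE_DIR)" 8589934592` tests an image against nodes with 8 GB of RAM, where IMAGE_DIR is a placeholder for the directory that holds the archive parts.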
PAGE 65
Figure 24 Node static info 5.6 Rescan MAC Use this command only if you must replace a failing node. This command enables retrieving the new MAC address of the node after node replacement. Right-click on a node in the node tree. On the contextual menu, select Update. A submenu is displayed. Select Rescan MAC on the submenu. NOTE: The Rescan MAC option is only active when a single node is selected in the node tree. 5.
PAGE 66
Figure 25 Rescan MAC 5.7 HP Insight CMU image editor An existing HP Insight CMU cloning image can be modified directly on the HP Insight CMU management node, without making the modifications on a golden node and backing up the system. Image editing involves three steps: 1. Use the cmu_image_open command to expand the image. 2. Make changes. 3. Use the cmu_image_commit command to save the image. 5.7.1 Expanding an image An HP Insight CMU cloning image is stored in /opt/cmu/image.
PAGE 67
5.7.2 Modifying an image Modifications can consist of simple manual commands such as adding, removing, or modifying files. However, complex operations using chroot commands on the expanded image directory are also possible, such as installing a new rpm. IMPORTANT: When using chroot, HP recommends performing chroot mount /proc or chroot mount /sys in the image directory before executing other chroot commands.
PAGE 68
5.8 HP Insight CMU diskless environments 5.8.1 Overview HP Insight CMU provides two methods of provisioning a diskless OS with NFS: • The legacy method in HP Insight CMU, called system-config-netboot. • A new method based on the open-source oneSIS software package. Both methods configure a central read-only root file system on an NFS server. The system-config-netboot method also configures per-node read-write file systems on the NFS server.
PAGE 69
The oneSIS method also configures a central read-only root file system on the NFS server. But instead of writing back to the NFS server, the oneSIS software configures a local read-writable tmpfs file system on each diskless client, copies the read-writable files into that file system, and makes the appropriate soft links from the read-only root file system into this read-writable tmpfs file system.
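The copy-and-link idea behind this method can be illustrated on ordinary directories. The sketch below is not the oneSIS implementation: it merges image preparation and boot-time setup into one step, works on writable test directories rather than a read-only NFS root and a real tmpfs, and uses a hypothetical function name.

```shell
#!/bin/sh
# Illustration only; hypothetical helper, not oneSIS or HP Insight CMU code.
# Usage: relocate_to_rw ROOT_DIR RW_DIR RELATIVE_PATH
# Copies ROOT_DIR/RELATIVE_PATH into RW_DIR and replaces the original with
# a soft link that points at the copy, so writes land in RW_DIR.
relocate_to_rw() {
    mkdir -p "$2/$(dirname "$3")" &&
        cp -p "$1/$3" "$2/$3" &&
        rm -f "$1/$3" &&
        ln -s "$2/$3" "$1/$3"
}
```

After `relocate_to_rw /tmp/root /tmp/rw etc/motd`, the path /tmp/root/etc/motd is a soft link, and anything written through it ends up in /tmp/rw/etc/motd. Both directories in this example are placeholders.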
PAGE 70
2. Save and exit the file.
3. Restart the HP Insight CMU server:
# /etc/init.d/cmu restart
Also restart the HP Insight CMU GUI.
5.8.2 The system-config-netboot diskless method
The HP Insight CMU system-config-netboot diskless method implements a diskless feature to build diskless clusters derived from Red Hat. In the HP Insight CMU system-config-netboot implementation, the compute nodes share a common operating system from the NFS server, which by default is the HP Insight CMU management node.
PAGE 71
# default: off
# description: The tftp server serves files using the trivial file transfer \
# protocol. The tftp protocol is often used to boot diskless \
# workstations, download configuration files to network-aware printers, \
# and to start the installation process for some operating systems.
service tftp
{
disable = no
socket_type = dgram
protocol = udp
wait = yes
user = root
server = /usr/sbin/in.tftpd
server_args = /tftpboot /opt/cmu/ntbt/tftp -v
per_source = 11
cps = 100 2
flags = IPv4
}
4.
PAGE 72
Figure 26 Naming a logical group
3. Select the Diskless option to the right of the group name.
NOTE: If you cannot see the Diskless option, the diskless feature is not activated properly. To correct the error, see “Enabling diskless support in HP Insight CMU” (page 69).
4. Enter the IP address of the golden node.
5. Click Get Kernel List. This will retrieve the list of available kernels from the golden node.
6. Select one of these kernels as the kernel to boot diskless, and then click OK.
PAGE 73
Figure 27 Adding nodes to logical groups
From the CLI
Add the node into the logical group as follows:
cmu> add_to_logical_group node1 – noden to
5.8.2.8 Booting the compute nodes
From the GUI
1. Select the compute nodes you added to the diskless logical group.
2. Right-click to launch a boot command on these nodes.
3. Select network. The list of all the diskless images registered in HP Insight CMU appears. The cmu network image is also listed.
PAGE 74
5.8.2.9 Understanding the structure of a diskless image Like every HP Insight CMU image, all directories and files related to an HP Insight CMU diskless image are stored in /opt/cmu/image/. A diskless image is composed of the following directories: • root—Contains the root directory of the golden node and is mounted in read-only mode by the diskless compute nodes and used as '/'. • snapshot—Contains one subdirectory per node.
PAGE 75
#cmu_end_interface #-- custom code starts here -exit 0 This script is invoked at the end of the image creation process, when /opt/cmu/image//root is populated. The script is invoked with the following input parameters: • CMU_RCFG_PATH—The path to the root directory of the image. /opt/cmu/image//root • CMU_RCFG_OSTYPE—The type of operating system detected by HP Insight CMU. • CMU_RCFG_IMAGENAME—The name of the HP Insight CMU image.
PAGE 76
• CMU_RCFG_IP—The IP address of the node. • CMU_IMAGENAME—The name of the diskless image. 5.8.2.10.4 Templates and image file If the changes are valid for one image only, keep the modifications only in the reconfiguration files for the specific /opt/cmu/image/ directory. If, instead, the changes defined in the reconf files are valid for all the diskless images, copy them to the template files /opt/cmu/etc/reconf-diskless-image.sh and /opt/cmu/etc/reconf-diskless-snapshot.sh.
PAGE 77
the website and the oneSIS implementation included with HP Insight CMU is that the HP Insight CMU implementation does not require you to rebuild your kernel with NFS support. Instead, HP Insight CMU allows you to use the existing kernel from the golden node, and it rebuilds an initrd image containing the appropriate driver support plus NFS support for mounting the read-only root file system. Everything else is the same.
PAGE 78
• bind-utils (RHEL)
2. Install the oneSIS rpm on the golden node. Install the same oneSIS rpm that was installed on the HP Insight CMU management node.
3. Configure the DISTRO setting for oneSIS. The /etc/sysimage.conf file is present after the oneSIS rpm is installed on the golden node. This is the main oneSIS configuration file for this image. For now, the only configuration setting that must be made is the DISTRO setting.
PAGE 79
1. Log in with Administrator privileges.
2. Select Cluster Administration→Logical Group Management→Create a Logical Group.
3. In the New Logical Group window, enter a name for this diskless logical group.
4. Check the diskless box.
NOTE: If the diskless box is not available, then add CMU_DISKLESS=true to /opt/cmu/etc/cmuserver.conf and restart the GUI.
5. In the Diskless toolkit drop-down box, select oneSIS.
PAGE 80
If you want to select the current running kernel as the kernel to boot diskless, then provide the CURRENT keyword (for example, -k CURRENT). If any errors occur during the creation of the golden image, the logical group will not be created in HP Insight CMU. Correct the errors and recreate the diskless logical group. 5.8.3.5.2 Customizing an HP Insight CMU oneSIS diskless image An HP Insight CMU oneSIS diskless image can be managed in two ways: manually and automatically.
PAGE 81
The onesis_pxeboot directory contains • The vmlinuz kernel. • The initrd.img initial ramdisk. • The pxelinux.0 PXE-boot loader. • The pxelinux.cfg/ directory where the PXE-boot files for each node will be installed. These components are used during the PXE-boot process to boot the compute nodes into a diskless environment. The onesis_pxeboot_template file is the PXE-boot template file.
PAGE 82
5.8.4 Scaling out an HP Insight CMU diskless solution with multiple NFS servers By default, the HP Insight CMU diskless support configures the HP Insight CMU management node as the NFS server that will serve the diskless image, regardless of the diskless implementation method. HP recommends no more than ~200 diskless clients per NFS server over a 1Gb Ethernet management network, and no more than ~400 diskless clients per NFS server over a 10Gb Ethernet management network.
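The per-server client limits quoted above can be turned into a rough estimate of how many NFS servers a diskless cluster needs. The following is a minimal sketch in Python and is not part of HP Insight CMU. The function name is hypothetical, and treating ~200 and ~400 as hard per-server limits is an assumption made only for illustration.

```python
import math

# Approximate diskless clients per NFS server, taken from the guidance above.
# Treating these values as hard limits is an assumption of this sketch.
CLIENTS_PER_SERVER = {"1Gb": 200, "10Gb": 400}


def nfs_servers_needed(diskless_clients: int, network: str = "1Gb") -> int:
    """Estimate how many NFS servers are needed for the given client count."""
    if diskless_clients < 0:
        raise ValueError("diskless_clients must be non-negative")
    per_server = CLIENTS_PER_SERVER[network]
    # Round up: a partially loaded server is still a whole server.
    return math.ceil(diskless_clients / per_server)


print(nfs_servers_needed(500))          # 1Gb management network
print(nfs_servers_needed(500, "10Gb"))  # 10Gb management network
```

Under these assumptions, 500 diskless clients need three NFS servers on a 1Gb management network and two on a 10Gb management network.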
PAGE 83
enables the scalable diskless support in HP Insight CMU. Edit this file and insert the NFS topology of the cluster. The syntax of this file supports HP Insight CMU node names and network entities. Remember that HP Insight CMU network entities represent groups of nodes on a common network switch. The acceptable formats of this file are: ... ... ...
PAGE 84
5.8.4.1 Comments on High Availability (HA) Configuring an HA solution for the additional NFS servers is beyond the scope of the procedure described in “Scaling out an HP Insight CMU diskless solution with multiple NFS servers” (page 82). If HA NFS servers are needed, then configure the HA solution on the NFS servers during step #2 in “Scaling out an HP Insight CMU diskless solution with multiple NFS servers” (page 82) so the servers are ready for use by step #6.
PAGE 85
6 Monitoring a cluster with HP Insight CMU NOTE: Monitoring support is not available for Windows cartridges. However, users can gather metrics for Windows cartridges from external sources and scripts, then use HP Insight CMU extended metrics features to feed those metrics to the HP Insight CMU monitoring engine and GUI display. 6.1 Installing the HP Insight CMU monitoring client You must install the HP Insight CMU monitoring client to properly monitor your cluster. 1.
PAGE 86
NOTE: If you are upgrading from an older version of HP Insight CMU, then you must reinstall the new HP Insight CMU monitoring agents from the HP Insight CMU v7.2 rpm or HP Insight CMU monitoring will not start. 6.3 Monitoring the cluster Launch the HP Insight CMU GUI. Figure 30 Main window In Figure 30 (page 86), the left frame lists the resources, such as Network Entities, Logical Groups, Nodes Definitions, etc. The '+' sign expands a resource.
PAGE 87
Figure 31 Node status The status of this node is okay. Node values are correctly reported to the main monitoring daemon. The node is pinging properly, and the monitoring is working properly, but an alert is currently reported for this node. One of the thresholds defined by you has been exceeded. Click the node in the tree to view the detail of this alert. The status of this node is "No Ping". This node is not pinging at all. User action is required to identify the problem.
PAGE 88
In the central frame, the following tabs are available: • Instant View • Table View • Time View • Details • Alerts For a single node view, the following tabs are available: • Monitoring • Details • Alerts 6.3.3 Global cluster view in the central frame By default, the central frame displays the monitoring values of the whole cluster. You can return to this view at any time by clicking CMU Cluster at the root of the node tree.
PAGE 89
6.3.4 Resource view in the central frame Monitoring values can be visualized by: • Global cluster • A specific logical group • A specific network entity • A specific user group Click the desired resource in the left-frame tree and the title of the central frame displays the name of the selected resource. NOTE: Resource or node specific monitoring metrics and alerts can be displayed in CLI mode using /opt/cmu/bin/cmu_monstat. For details, see the cmu_monstat manpage. 6.3.4.
PAGE 90
Figure 34 Alert messages 6.3.4.2 Detail mode in resource view To display a table with sensor values, select the Instant View tab in the central frame. • The cell is green when the value is below 33% of the maximum value. • The cell is orange when the value is between 33% and 66% of the maximum value. • The cell is red when the value is above 66% of the maximum value. Figure 35 Resource view details 6.3.5 Gauge widget The middle of the pie shows average values for a sensor.
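The cell coloring rule for the sensor table described above can be sketched as a small function. This is an illustration in Python, not HP Insight CMU code. The function name is hypothetical, and the handling of the exact 33% and 66% boundaries (both treated as orange) is an assumption, because the guide does not say which color applies at those values.

```python
def cell_color(value: float, maximum: float) -> str:
    """Map a sensor value to the cell color used in the sensor table."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    percent = 100.0 * value / maximum
    if percent < 33:
        return "green"   # below 33% of the maximum value
    if percent <= 66:
        return "orange"  # between 33% and 66% (boundaries assumed orange)
    return "red"         # above 66% of the maximum value


for sample in (10, 50, 90):
    print(sample, cell_color(sample, 100))
```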
PAGE 91
Figure 36 Memory used summary The widget also displays marks for average, maximum, and minimum values during the last two minutes for a given metric. 6.3.6 Node view in the central frame To display the details of a node, select that node in the tree. The following tabs are available in the central frame: • Monitoring — Shows monitoring metric values for that node. • Details — Shows static data for the node. Some of the values are filled during the initial node discovery (scan node).
PAGE 92
Figure 37 Node details The central frame title displays the name of the node. The title is colored according to the state of the node. The following tables appear: • The Node Properties table contains the static information from HP Insight CMU monitoring (contained in the /opt/cmu/etc/cmu.conf.complementary file). • The Information Retrieved table contains the current values of the sensors retrieved for this node. • The Alerts Raised table contains the alerts currently raised for this node. 6.3.
PAGE 93
6.3.7.2 Adaptive stacking Adaptive stacking is an efficient way to monitor your cluster over a long period of time. Adaptive stacking provides 42 minutes of data, without sacrificing the finest 5 second granularity provided by the monitoring engine. The first 24 rings (representing 2 minutes of data, with a 5 second granularity) progressively slide and consolidate into an intermediate ring, making room for newest data. The intermediate ring is full when six rings are stacked in it, representing 30 seconds.
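The ring arithmetic above can be checked with a few lines of Python. This is only a worked calculation based on the numbers quoted in this section. The helper name is hypothetical, and the sketch does not model how HP Insight CMU consolidates the data inside a ring.

```python
SAMPLE_SECONDS = 5  # finest granularity provided by the monitoring engine


def ring_span_seconds(rings: int, sample_seconds: int = SAMPLE_SECONDS) -> int:
    """Return the time span covered by a number of 5-second rings."""
    return rings * sample_seconds


fine_window = ring_span_seconds(24)          # the first 24 rings
intermediate_ring = ring_span_seconds(6)     # six rings stacked into one
unstacked_rings = (42 * 60) // SAMPLE_SECONDS  # rings needed without stacking

print(fine_window, intermediate_ring, unstacked_rings)
```

The first 24 rings cover 120 seconds (2 minutes) and a full intermediate ring covers 30 seconds, as stated above. Without stacking, 42 minutes at 5 second granularity would require 504 separate rings.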
PAGE 94
◦ Press the mousewheel and drag – Zoom 6.3.7.3.2 Keyboard control Keyboard shortcuts are available for some Time View options. All of the following shortcuts are also available in Options→Properties.
PAGE 95
6.3.7.5 Troubleshooting Problems can occur with Time View running on Windows Vista. To disable Time View from the GUI, click the second link on the cluster webstart page, which launches HP Insight CMU without Time View. In the CLI, set the -Detrunk=false argument. If Time View prints an "OutOfMemory [...]" error, try increasing the maximum heap memory usage of the GUI. To specify the memory consumption allowed for the JVM, set the -Xmx JVM argument when starting the CLI.
PAGE 96
Figure 40 Archived user groups NOTE: User groups can also be archived using the cmu_del_user_group command. For details, see the cmu_del_user_group manpage. 6.3.8.1 Visualizing history data When selecting an archived user group in the left-frame tree, a static Time View picture displays in the central frame. The picture shows the activity view of the user group during its existence. All options available with Time View are also available when visualizing archived user groups. 6.3.8.
PAGE 97
#This is a CMU action and alerts description file
#=============================================================
#
# ACTIONS
#
#
#
#-------------KERNEL VERSION, RELEASE, BIOS VERSIONS---------#
kernel_version "kernel version" 9999999 string Instantaneous release uname -r
#-------------CPU--------------------------------------------#
#
#- Native
cpuload "% cpu load (raw)" 1 numerical MeanOverTime 100 % awk '/cpu / {printf"%d\n",$2+$3+$4}' /proc/stat
#- Collectl
#cpuload "% cpu load (normalized)" 1 numerical
PAGE 98
Description A quote-contained string to describe in a few words what the sensor is. This appears in the GUI. Time multiple An integer value that determines when the sensors are monitored. If the monitoring has a default timer of 5 seconds: • A time multiple of 1 means the value is monitored every 5 seconds. • A time multiple of 2 means the value is monitored every 10 seconds. Data type This can be numerical or a string. A string sensor cannot be displayed in the pies by the interface.
PAGE 99
Operator The comparison operator between the sensor and the threshold. Only > is available. Unit The unit of the sensor. The GUI uses this measurement. Command The command to be executed by the script. This can be an executable or a shell command. The executable and the shell command must be available on compute nodes. 6.5.4 Alert reactions Each alert reaction contains the following fields: Name(s) The names of one or more alerts from the ALERTS section. The reaction is associated with each of the alerts.
PAGE 100
CMU_ALERT_SEQUENCE_FILE The path of the HP Insight CMU “sequence” file containing the alerts and alert values from the monitoring pass that triggered the reaction. Analyze this file with the /opt/cmu/bin/cmu_monstat command. NOTE: To protect the management node from large numbers of concurrent reactions, a reaction will only launch on behalf of compute nodes that do not have previous instances of the reaction still running.
PAGE 101
IMPORTANT: For HP Insight CMU diskless configurations, only use the DaemonCommands options provided in the example above. Do not use any option which causes disk I/O.
4. Start collectl:
# /etc/init.d/collectl start
Starting collectl: [ OK ]
5. Configure collectl to start automatically:
# chkconfig --add collectl
collectl 0:off 1:off 2:on 3:on 4:on 5:on 6:off
6.5.6.2 Modifying the ActionAndAlerts.txt file
The ActionAndAlerts.txt file contains definitions for using collectl monitoring.
PAGE 102
cpuinfo.irq.cpu0 0 cpuinfo.soft.cpu0 0 cpuinfo.steal.cpu0 0 cpuinfo.idle.cpu0 100 cpuinfo.intrpt.cpu0 0 cpuinfo.user.cpu1 0 cpuinfo.nice.cpu1 0 cpuinfo.sys.cpu1 0 cpuinfo.wait.cpu1 11 cpuinfo.irq.cpu1 0 cpuinfo.soft.cpu1 0 cpuinfo.steal.cpu1 0 cpuinfo.idle.cpu1 89 cpuinfo.intrpt.cpu1 0 cpuinfo.user.cpu2 4 cpuinfo.nice.cpu2 0 cpuinfo.sys.cpu2 2 cpuinfo.wait.cpu2 0 Create the monitoring lines by using these variables. Native HP Insight CMU lines and collectl lines can be mixed in the ActionAndAlertFile.
PAGE 103
• For SUSE: # cp -a /srv/www/htdocs/colplot /opt/cmu/www/colplot 6. Change the default colplot plot directory to point to the common collectl directory: # vi /etc/colplot.conf #PlotDir = /opt/hp/collectl/plotfiles PlotDir = /var/log/collectl 7. If not already done, install the collectl rpm on compute nodes: # mount /dev/cdrom /mnt # cd /mnt/tools/collectl # rpm -ivh collectl-3.x.x-x.noarch.rpm Preparing... 1:collectl 8.
PAGE 104
Figure 41 ColPlot window Select plotting options, then click Generate Plot.
PAGE 105
Figure 42 ColPlot results 6.5.7 Monitoring GPUs and coprocessors 6.5.7.1 Monitoring NVIDIA GPUs If your client nodes contain NVIDIA GPUs and are running version 270.xx.xx or newer of the NVIDIA GPU driver, you can monitor your GPUs with HP Insight CMU. If you haven’t done so already, install the NVIDIA GPU driver version 270.xx.xx or newer on your client nodes. This can be done two ways: 1.
PAGE 106
Running /opt/cmu/bin/cmu_config_nvidia adds a list of predefined GPU metrics to ActionAndAlertsFile.txt. To monitor these metrics using the GUI, select the desired metrics from the Monitoring sensors list as described in Figure 32 (page 88). NOTE: Not all metrics are supported by all NVIDIA GPUs and some lesser used metrics may be commented out within ActionAndAlertsFile.txt.
PAGE 107
6.5.7.3 Monitoring Intel coprocessors If your client nodes contain Intel coprocessors, you can monitor the coprocessors with HP Insight CMU. IMPORTANT: If you currently monitor Intel coprocessors using HP Insight CMU, you must deploy an updated set of images. To deploy the images: 1. Redeploy the HP Insight CMU monitoring client to all nodes. It contains a new binary for collecting coprocessor metrics. 2. Remove the existing coprocessor metrics from the /opt/cmu/etc/ActionsAndAlertsFile.
PAGE 108
e. For SUSE Enterprise Linux Server:
# sudo zypper --no-gpg-checks install *.rpm
f. If this is the first time installing the driver, initialize the MIC cards:
# sudo micctrl initdefaults
g. Restart the driver:
# sudo micctrl -r
h. Verify the driver is loaded and the coprocessors are initialized and ready:
# sudo micctrl status
mic0: ready
mic1: ready
i. Start up the coprocessors:
# service mpss start
j. Verify the cards are seen by the OS and are working:
# /opt/intel/mic/bin/micinfo
k. l.
PAGE 109
If you use HP SIM, then you can create an environment to monitor HP Insight CMU alerts using SIM. This can be accomplished in many ways. This section offers one possible model. You can use this example as an outline for creating a model that works for your environment. Alerts in HP Insight CMU are similar to events in SIM. However, alerts and events are defined and responded to differently in each product. You define alerts in HP Insight CMU in the ActionAndAlertsFile.txt file.
PAGE 110
[root@cmumaster ~]# sinfo -t alloc -o "%N" -h
node[10-12,14,20-21,33-39,41-48,50-55]
[root@cmumaster ~]#
To use an HP Insight CMU tool for expanding names to create a space-separated list of allocated nodes:
[root@cmumaster ~]# sinfo -t alloc -o "%N" -h | /opt/cmu/tools/cmu_expand_names -s " "
node10 node11 node12 node14 node20 node21 node33 node34 node35 node36 node37 node38 node39 node41 node42 node43 node44 node45 node46 node47 node48 node50 node51 node52 node53 node54 node55
[root@cmumaster ~]#
To apply this
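The name expansion performed in the example above can be illustrated with a short Python function. This is a simplified sketch of the idea and is not the cmu_expand_names tool. The function name is hypothetical, and the sketch assumes a single bracket group at the end of the name, which is all the sinfo output above requires.

```python
import re


def expand_names(expr: str) -> list:
    """Expand 'node[10-12,14]' into ['node10', 'node11', 'node12', 'node14']."""
    expr = expr.strip()
    match = re.fullmatch(r"([^\[\]]+)\[([^\[\]]+)\]", expr)
    if match is None:
        return [expr]  # plain node name, nothing to expand
    prefix, body = match.groups()
    names = []
    for part in body.split(","):
        low, _, high = part.partition("-")
        high = high or low  # a single number is a range of one
        width = len(low)    # preserve zero padding such as 001-010
        for number in range(int(low), int(high) + 1):
            names.append(f"{prefix}{number:0{width}d}")
    return names


print(" ".join(expand_names("node[10-12,14,20-21]")))
```

Joining the result with a space mirrors the -s " " separator used in the command above.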
PAGE 111
echo "BEGIN_NODE $free_nodes" >> $file
echo "allocated 0" >> $file
$CMU_SUBMIT -f $file
[root@cmumaster ~]#
The script above obtains and submits the "allocated" metric to HP Insight CMU. The last step is to configure this new metric in the HP Insight CMU ActionAndAlertsFile.txt file:
allocated "nodes allocated to users" 2 numerical Instantaneous 1 alloc EXTENDED /root/allocated_nodes.
PAGE 112
Then divide the running time by 5 to get the time multiple. In this example: 7/5=1.4, which rounds up to a time multiple of 2. Note that your data gathering script may obtain, parse, and submit more than one metric to HP Insight CMU. A typical example of this is gathering multiple temperature readings from a single source, such as through IPMI or from the Onboard Administrator of an HP Blade enclosure. In this case, you only need to configure the script to run with one metric in the ActionAndAlertsFile.txt file.
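The time multiple calculation described above can be sketched in Python as follows. This is an illustration only. The function names are hypothetical, the 5 second default timer is the one described in this guide, and rounding up to the next whole multiple is an assumption that is consistent with the 7 second example.

```python
import math

DEFAULT_TIMER_SECONDS = 5  # default monitoring timer described in this guide


def time_multiple(script_runtime_seconds: float,
                  timer: int = DEFAULT_TIMER_SECONDS) -> int:
    """Return the smallest whole time multiple that covers the script runtime."""
    if script_runtime_seconds <= 0:
        raise ValueError("script_runtime_seconds must be positive")
    return max(1, math.ceil(script_runtime_seconds / timer))


def sampling_interval(multiple: int, timer: int = DEFAULT_TIMER_SECONDS) -> int:
    """Return how often, in seconds, a sensor with this multiple is monitored."""
    return multiple * timer


multiple = time_multiple(7)  # the 7 second script from the example
print(multiple, sampling_interval(multiple))
```

For the 7 second script, the sketch yields a time multiple of 2, so the script would be run every 10 seconds.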
PAGE 113
Figure 45 Verify AMS submenu There are three components to the HP Insight CMU AMS with the HP iLO: 1. Configuring the iLO on each server with a public SNMP read-only port and enabling AMS. 2. Requesting and displaying a full data report of all available HP iLO data. 3. Configuring HP iLO SNMP data as metrics to be monitored by HP Insight CMU. NOTE: You must configure the HP iLO with AMS to get the data report and configure the monitoring. 6.5.9.1.
PAGE 114
Figure 47 Configure iLO finished To test which iLOs are configured with AMS, select the nodes. Then select AMS→Test iLO Config. 6.5.9.1.2 Accessing and viewing the HP iLO data via SNMP Enabling the AMS functionality in the HP iLO makes it possible to query the HP iLO for data via an SNMP query: Figure 48 SNMP query The published HP MIBs, which define the SNMP strings, are available on the internet. HP Insight CMU includes a subset of these MIBs in /opt/cmu/snmp_mibs/.
PAGE 115
Figure 49 Get/Refresh SNMP data Now you can view the data by selecting the nodes in the HP Insight CMU GUI and selecting AMS→View/Compare SNMP Data. Figure 50 View/Compare SNMP data The data is piped through the CMU_Diff filter before being displayed, in case you selected more than one node to compare the data. The SNMP OID string is the first column, followed by the HP MIB definition for that SNMP OID string, and finally the value of the SNMP OID string.
PAGE 116
pre-configured SNMP OID strings and submit them to HP Insight CMU for display via the GUI. The pre-configured SNMP OID strings and their corresponding metric name in HP Insight CMU are in /opt/cmu/etc/cmu_ams_metrics: Figure 51 cmu_ams_metrics If you want to configure additional SNMP OID strings for monitoring, you can add them to this file with their corresponding metric name in HP Insight CMU. The HP Insight CMU command to gather this data and submit it to CMU is /opt/cmu/bin/cmu_get_ams_metrics.
PAGE 117
The last step is to configure the SNMP metrics in HP Insight CMU. Add the following lines to the /opt/cmu/etc/ActionAndAlertsFile.
PAGE 118
7 Managing a cluster with HP Insight CMU Cluster management tasks can be performed on one or more nodes with HP Insight CMU. These tasks depend on your privileges and the number of selected nodes. 7.1 Unprivileged user menu When the HP Insight CMU GUI is in normal mode, you can only monitor node status and visualize static data. You cannot perform any other action on the cluster nodes because of potentially destructive actions. 7.
PAGE 119
To select a terminal emulator other than the default: 1. Edit /opt/cmu/etc/cmuserver.conf. 2. Six blocks of variable names begin with CMU_REMOTE_TERMINAL. Uncomment the full block of variables for the preferred terminal emulator. 3. Verify all variables for other terminal emulators are commented out. 4. Restart cmuserver: # /etc/init.d/cmu restart 7.4 Management card connection This menu is only available when one node is selected.
PAGE 120
Figure 56 Power off dialog box 7.8 Boot When one or more nodes are selected, this task enables you to boot a collection of nodes on their own local disk or over the network. You must select nodes to be booted prior to running this command. The boot procedure uses the management card of each node. The password for the management card must be entered. Nodes to be booted must have the same management card password. IMPORTANT: If the nodes are booted, the boot procedure attempts a proper shutdown.
PAGE 121
7.11 Multiple windows broadcast This task is available when one or more nodes are selected. The following connections are available for multiple windows broadcast: • A secure shell connection through the network, when the network is up on selected nodes. • Connection through the management card, if selected nodes have a management card. The multiple windows broadcast command launches a master console window and concurrent mirrored secure shell sessions embedded in an xterm on all selected nodes.
PAGE 122
Figure 60 pdsh window You can toggle the two filters on and off using dshbak or cmudiff. These two filters are mutually exclusive, so you can: • Filter with cmudiff • Filter with dshbak • Use no filter 7.12.1 cmudiff examples Example 1 date command The cmudiff output is two fields separated by dotted lines. The header displays: • The number of responses, 4 in this example (This amount means a response has been received from 4 compute nodes.
PAGE 123
The output line appears below the header. In this example, the output is only 1 line: • The “m”, which appears on the left, indicates that the output from some compute node differs from the reference node. • Some details about output processing results, which are provided on the right. Characters that differ from the reference node are highlighted in red. In this example, the time drift in the “seconds” field differs.
PAGE 124
Narrow the search of the failing nodes with the -d option to display node populations: cmu_pdsh> cmudiff –d cmudiff filter is , with parameters cmu_pdsh> -d cmu_pdsh> dmidecode The comment now shows “(2 populations) o185i[040,042] are 83% similar”. This comment suggests that those two compute nodes have a different BIOS release date than all other nodes.
PAGE 125
NOTE: A nonresponsive node in the node selection for single window pdsh causes the answer from other nodes to be delayed until a timeout occurs from the nonresponsive node. You can reduce this delay by setting the ConnectTimeout variable in .ssh/config. For example:
# vi /root/.ssh/config
Host *
StrictHostKeyChecking no
ConnectTimeout 1
7.13 Parallel distributed copy (pdcp)
The pdcp task enables you to copy a file from the HP Insight CMU administration server to multiple nodes simultaneously.
PAGE 126
Figure 62 User group management
Select any number of nodes from the list of “Nodes in Cluster” on the left and use the arrows to move the nodes to the list of “Nodes in User Group” on the right.
7.14.2 Deleting user groups
1. In the User Group Management window, select the user group to delete.
2. Click Delete.
3. Click OK.
7.14.3 Renaming user groups
1. In the User Group Management window, select the user group to rename.
2. Click Rename.
3. Enter the new name.
4. Click OK.
7.
PAGE 127
HP Insight CMU provides the latest conrep kit available at release time. If a different or newer version of conrep is required for the servers in your cluster, you can configure the full path and file name of the correct conrep binary by editing the CMU_BIOS_SETTINGS_TOOL variable in /opt/cmu/etc/cmuserver.conf. The conrep tool also requires an XML file containing the information necessary to interpret the BIOS flash memory data on your server into human-readable text.
PAGE 128
1. In the /opt/cmu/etc/cmu_custom_menu file, uncomment the following line: SERVER;audit|dmidecode;/opt/cmu/bin/cmu_dsh -f CMU_TEMP_NODE_FILE -c "dmidecode" -e "-b -n -v0 -R0" 2. 3. Run the CLI. cmu> custom_run Title Command -------------------|------audit|dmidecode /opt/cmu/bin/cmu_dsh -f CMU_TEMP_NODE_FILE -c "dmidecode" -e "-b -n -v0 -R0" cmu> The available custom commands are displayed. 4. Run the dmidecode command on node10 from the CLI. cmu> custom_run "audit|dmidecode" node10 7.16.
PAGE 129
Help commands To get help during a CLI session, use the help command. This command displays all available commands of HP Insight CMU CLI.
PAGE 130
halt halt nodes of logical group group_1 except node_exp delay "mesg" all group_1 group_2 halt nodes of group_1 and group_2 cmu> Displaying logical groups of a cluster The groups command displays the list of the logical groups. cmu> groups list of group(s) with active nodes : debian default nodevmap pfmon sfs2 list of available group(s) for backup and cloning : default sfs2 suse10 pfmon testrh3u4 debian nathclontest nodevmap cmu> You can also call this command followed by a group name.
PAGE 131
Executing a command on a list of nodes To execute a command on multiple nodes, you must specify the names of nodes. cmu> boot o185i222 o185i233 o185i243 active node list selected: cmu> o185i222 o185i233 o185i243 Executing a command on a range of nodes To execute a command on a range of nodes, you must specify the range using their attributes. Commands are executed on all nodes within the range.
PAGE 132
Executing a command on specific nodes of a logical group You can use the but option to exclude active nodes of a group from the selection. Nodes to exclude can be specified with any combination of regular expressions. cmu> boot all default but o185i222 - o185i252 active node list selected: cmu> o185i194 o185i202 o185i216 o185i253 o185i254 7.17.4 Administration and cloning commands Booting a set of nodes You can boot any number of nodes in the cluster.
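The effect of the but option in the example above can be sketched in Python. This is an illustration and not the HP Insight CMU CLI implementation. The function names and the group membership list are hypothetical, and the sketch assumes a range is matched on the trailing number of each node name.

```python
import re


def node_index(name: str) -> int:
    """Return the trailing number of a node name, for example 222 for 'o185i222'."""
    match = re.search(r"(\d+)$", name)
    if match is None:
        raise ValueError(f"no trailing number in node name: {name}")
    return int(match.group(1))


def all_but(group_nodes, first, last):
    """Select every node of the group except those in the range first - last."""
    low, high = sorted((node_index(first), node_index(last)))
    return [n for n in group_nodes if not low <= node_index(n) <= high]


# Hypothetical list of active nodes in the "default" logical group.
default_group = ["o185i194", "o185i202", "o185i216", "o185i222",
                 "o185i233", "o185i243", "o185i252", "o185i253", "o185i254"]

print(" ".join(all_but(default_group, "o185i222", "o185i252")))
```

With this hypothetical group, the selection matches the node list shown in the CLI example: o185i194 o185i202 o185i216 o185i253 o185i254.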
PAGE 133
To broadcast on all nodes of the cluster: cmu> broadcast all selected o185i202 o185i214 o185i226 o185i238 o185i250 nodes: o185i192 o185i193 o185i194 o185i195 o185i196 o185i197 o185i198 o185i199 o185i200 o185i201 o185i203 o185i204 o185i205 o185i206 o185i207 o185i208 o185i209 o185i210 o185i211 o185i212 o185i213 o185i215 o185i216 o185i217 o185i218 o185i219 o185i220 o185i221 o185i222 o185i223 o185i224 o185i225 o185i227 o185i228 o185i229 o185i230 o185i231 o185i232 o185i233 o185i234 o185i235 o185i236 o185i237 o1
PAGE 134
active node list selected: o185i192 Please read /opt/cmu/log/PowerOff.log for errors. cmu> Setting the locator LED on or off Sets the locator LED of any number of nodes on or off. You can use the regular expressions previously described.
PAGE 135
Total | 1 | 0 | 0 Detailed logs are in /opt/cmu/log/cmucerbere.log and /opt/cmu/log/cmucerbere-*.log [INFO] CMU does not seem to be running /opt/cmu/tmp/GUI/config.txt was rewritten cmu> Adding a new logical group The add_logical_group command creates a new logical group. Parameters are specified on one line: cmu> add_logical_group image_name "device" For example: cmu> add_logical_group my_logical_group "cciss/c0d0" processing 1 logical group ...
PAGE 136
[16:15:13] OSTYPE:Linux-CMU [16:15:13] [DollyClient] Starting to get fstab files [16:15:13] [DollyClient] Getting "/opt/cmu/tmp/fstab.txt" [16:15:14] [DollyClient] fstab of /dev/sda1 received and stored into /opt/cmu/tmp/fstab.txt [16:15:14] [DollyClient] Executing: /bin/grep "LABEL" /opt/cmu/tmp/fstab.txt | /usr/bin/wc -l >/opt/cmu/tmp/number_of_label [16:15:14] [DollyClient] No label in /opt/cmu/tmp/fstab.
PAGE 137
[16:25:06] [DollyClient] Device is sda [16:25:06] [DollyClient] Asking for partition table of "/dev/sda" [16:25:06] [DollyClient] Getting /opt/cmu/image/test_julien/parttbl-sda.txt [16:25:07] [DollyClient] Getting /opt/cmu/image/test_julien/parttbl-sda.raw [16:25:07] [DollyClient] Getting /opt/cmu/image/test_julien/partarchi-sda1.tgz [16:25:17] [DollyClient] Getting /opt/cmu/image/test_julien/partarchi-sda5.tgz [16:25:38] [DollyClient] Getting /opt/cmu/image/test_julien/partarchi-sda6.
PAGE 138
7.17.5 Administration utilities pdcp and pdsh HP Insight CMU includes the open source software pdcp and pdsh. Usage example of pdcp: # /opt/cmu/bin/pdcp -w cn0001,cn0002 source /tmp/dest where: source is a file on the management node. dest is the name of the destination file copied to compute nodes cn0001 and cn0002. Usage example of pdsh: # /opt/cmu/bin/pdsh -w cn0001,cn0002 ls cn0001: cn0001: cn0002: cn0002: cn0002: cn0002: bin inst-sys anaconda-ks.cfg CMU_CLONING_INFO install.log.syslog install.
PAGE 139
8 Advanced topics 8.1 Accessing the GUI for non-root users HP Insight CMU allows non-root users to log into the GUI and access some or all of the privileged HP Insight CMU functionality available through the GUI. The GUI supports non-root user accounts that exist either as local accounts or as NIS accounts on the HP Insight CMU management node. The high-level operational goal of this support is to make the HP Insight CMU GUI a graphical extension of logging into the head node.
PAGE 140
Table 4 Operational HP Insight CMU GUI features available by default for non-root users (continued)
Cloning (Deploy Image)                    user (requires sudo)
Autoinstall (kickstart|autoyast|preseed)  user (requires sudo)
Update→Get Nodes Static Info              user (requires sudo)
Update→Install CMU Monitoring Client      user (requires sudo)
Update→Rescan MAC                         root
Insight→Show BIOS Settings                user (requires sudo)
Insight→Show BIOS Version                 user (requires sudo)
Insight→Upgrade Firmware                  user (requires sudo)
Any configured HP
PAGE 141
Table 5 HP Insight CMU GUI features and their corresponding commands
HP Insight CMU GUI feature (right-click node selection)  HP Insight CMU management node command
Management Card Connection                               /opt/cmu/bin/cmu_console
Shutdown                                                 /opt/cmu/tools/halt.exp
Power Off                                                /opt/cmu/bin/cmu_power
Boot                                                     /opt/cmu/tools/boot.exp
Reboot                                                   /opt/cmu/tools/reboot.
PAGE 142
In this context, the term "diskless" refers to any OS image that can be created and prepared locally on the HP Insight CMU management server and then served over the network to a PXE-booted set of compute nodes. A few different implementations of "diskless" OS images are: • stateful NFS-root — All reads and writes from the target compute nodes occur on the central NFS server. • stateless NFS-root — Reads occur from the central NFS server, but writes occur in memory (in a tmpfs filesystem).
PAGE 143
8.2.2 Delete diskless image The delete_image program is called when an HP Insight CMU diskless logical group of type is deleted. This program is called with the following argument: -l The name of the logical group to delete. The delete_image program is expected to delete everything related to the diskless OS in /opt/cmu/image//. 8.2.
PAGE 144
-i The IP address of the target node to unconfigure. The unconfigure_node program is expected to unconfigure and/or remove anything related to the given node. This may include calling /opt/cmu/tools/cmu_remove_node_from_dhcp to prevent the node from PXE-booting, and to delete any PXE-boot file related to the given compute node. 8.2.
PAGE 145
8.3 HP Insight CMU remote hardware control API As of version 5.0, HP Insight CMU supports a remote hardware control API. This hardware API makes it possible to integrate HP Insight CMU power and UID control with any computer that has remote power control capability. The /opt/cmu/bin/cmu_power command interacts with this API to provide remote power and UID control for HP Insight CMU. The existing hardware APIs are: ILO The most common method of interacting with HP BL/DL/SL servers.
PAGE 146
/opt/cmu/hardware/FOO/cmu_FOO_power_osoff
/opt/cmu/hardware/FOO/cmu_FOO_power_uid_off
/opt/cmu/hardware/FOO/cmu_FOO_power_uid_on
All of these programs are invoked with the following arguments: -n -i -e
Where:
-n The host name of the target server.
-i The IP address of the management card for the target server.
-e The name of the file to log any errors.
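A program that plugs into this API has to accept the -n, -i, and -e arguments listed above. The skeleton below is a sketch under that assumption only: the function name parse_power_args, the variable names, and the sample values are hypothetical, and the real power or UID action is replaced by an echo.

```shell
# Hypothetical skeleton for a program such as cmu_FOO_power_uid_on.
# parse_power_args sets NODE_NAME, MGMT_IP and ERR_LOG from -n, -i and -e.
parse_power_args() {
    NODE_NAME="" MGMT_IP="" ERR_LOG=""
    OPTIND=1
    while getopts "n:i:e:" opt "$@"; do
        case "$opt" in
            n) NODE_NAME=$OPTARG ;;
            i) MGMT_IP=$OPTARG ;;
            e) ERR_LOG=$OPTARG ;;
            *) return 2 ;;
        esac
    done
    # All three arguments are treated as mandatory in this sketch.
    [ -n "$NODE_NAME" ] && [ -n "$MGMT_IP" ] && [ -n "$ERR_LOG" ]
}

# Example call with made-up values; the real action would replace the echo.
if parse_power_args -n cn0001 -i 10.0.0.1 -e /tmp/cmu_power_err.log; then
    echo "would act on $NODE_NAME via $MGMT_IP, logging errors to $ERR_LOG"
fi
```

Parsing in one function keeps the argument handling identical across the set of per-action programs, which could then differ only in the command sent to the management card.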
PAGE 147
An HP Insight CMU compute node (login1) is connected to the lab network (eth0) and the private cluster network (eth1). Depending on the hardware configuration and wiring for that node, the kernel might send the DHCP IP request over the lab network (eth0), causing the kernel to hang. To avoid this, the system administrator can: 1. Copy /opt/cmu/etc/bootopts/default to /opt/cmu/etc/bootopts/login1. 2. Edit the new login1 file by changing the existing ip=::::::bootp to ip=:::::eth1:bootp.
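The edit in step 2 can be scripted. The sketch below assumes that the bootopts file contains the literal string ip=::::::bootp, as stated above; set_boot_device is a hypothetical helper name and not an HP Insight CMU tool. In the kernel ip= option the network device is the sixth colon-separated field, which is why eth1 goes before the final colon.

```shell
# set_boot_device FILE IFACE: rewrite the kernel ip= option in FILE so that
# the DHCP/BOOTP request is sent on IFACE (the sixth colon-separated field).
set_boot_device() {
    sed -e "s/ip=::::::bootp/ip=:::::$2:bootp/" "$1" > "$1.tmp" &&
        mv -f "$1.tmp" "$1"
}

# Hypothetical usage on the management node:
# cp /opt/cmu/etc/bootopts/default /opt/cmu/etc/bootopts/login1
# set_boot_device /opt/cmu/etc/bootopts/login1 eth1
```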
PAGE 148
8.5 Support for ScaleMP HP Insight CMU can be integrated to work with ScaleMP. To enable support for ScaleMP, add the following variable and setting to the /opt/cmu/etc/cmuserver.conf file: CMU_vSMP_PREFIX=vSMP_ This setting configures the prefix used to identify HP Insight CMU logical group nodes that can be PXE-booted into the virtual SMP environment.
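The setting can be added by hand or with a small script. The following sketch appends the line only when no CMU_vSMP_PREFIX entry exists yet, so it can be run more than once. The helper name enable_vsmp_prefix is hypothetical, and whether HP Insight CMU must be restarted afterward is not covered here.

```shell
# enable_vsmp_prefix CONF: add CMU_vSMP_PREFIX=vSMP_ to CONF unless a
# CMU_vSMP_PREFIX line is already present.
enable_vsmp_prefix() {
    if ! grep -q '^CMU_vSMP_PREFIX=' "$1" 2>/dev/null; then
        echo 'CMU_vSMP_PREFIX=vSMP_' >> "$1"
    fi
}

# Hypothetical usage: enable_vsmp_prefix /opt/cmu/etc/cmuserver.conf
```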
PAGE 149
then starts to transfer the image to a group member, while the image server uploads a third one. This process is called the tree propagation algorithm. After a node has received a completed image, it attempts to upload to another node within the entity. This mechanism speeds up the propagation process and takes advantage of the available network bandwidth. Each time a node receives a clone image, the node uncompresses the image on the local disk. This is designed to speed up the cloning process.
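The benefit of tree propagation can be illustrated with a simplified model. The sketch below assumes that the initial source and every node that already holds the image each upload to exactly one new node per round, so the number of holders doubles each round. This is an idealization for illustration only: it ignores network entities, bandwidth limits, and decompression time, and propagation_rounds is a hypothetical name.

```shell
# propagation_rounds N: number of rounds needed to deliver an image to N
# nodes when the initial source and every node that already holds the image
# each upload to one new node per round (holders double every round).
propagation_rounds() {
    holders=1   # the initial source
    rounds=0
    while [ $((holders - 1)) -lt "$1" ]; do
        holders=$((holders * 2))
        rounds=$((rounds + 1))
    done
    echo "$rounds"
}

for n in 1 7 64; do
    echo "$n nodes: $(propagation_rounds "$n") rounds"
done
```

Under this model the round count grows with the logarithm of the node count, whereas a single server uploading to one node at a time would need one round per node.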
PAGE 150
8.7 Support for Intel Xeon Phi cards HP Insight CMU can be configured to support cloning the OS image when Intel Xeon Phi cards are present in the compute nodes and the Intel Xeon Phi software is installed. HP Insight CMU also supports booting a oneSIS diskless OS image to all compute nodes that have Intel Xeon Phi cards installed and the Intel Xeon Phi software installed. However, in both cases HP Insight CMU is only providing the OS file system.
PAGE 151
• A network bridge must be configured with the IP address of the local compute node (/etc/sysconfig/network-scripts/ifcfg-br0).
• The network device must be associated with the bridge (/etc/sysconfig/network-scripts/ifcfg-ethX).
• Each host-side Xeon Phi card Ethernet device must be associated with the bridge (/etc/sysconfig/network-scripts/ifcfg-micX).
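The three requirements above can be sketched as two small helpers that generate the files. This is an illustration under stated assumptions, not the configuration shipped by HP or Intel: the function names are hypothetical, the addresses are made up, and DEVICE, TYPE, ONBOOT, BOOTPROTO, and BRIDGE are standard Red Hat network-scripts keys assumed here rather than taken from this guide.

```shell
# write_bridge_conf FILE IP NETMASK: write a minimal bridge definition.
# DEVICE, TYPE, ONBOOT and BOOTPROTO are assumptions of this sketch.
write_bridge_conf() {
    {
        echo "DEVICE=br0"
        echo "TYPE=Bridge"
        echo "ONBOOT=yes"
        echo "BOOTPROTO=static"
        echo "DELAY=0"
        echo "IPADDR=$2"
        echo "NETMASK=$3"
        echo "NM_CONTROLLED=no"
    } > "$1"
}

# attach_to_bridge FILE: add BRIDGE=br0 to an ifcfg-ethX or ifcfg-micX file
# unless the file already names a bridge.
attach_to_bridge() {
    grep -q '^BRIDGE=' "$1" 2>/dev/null || echo "BRIDGE=br0" >> "$1"
}

# Hypothetical usage on a compute node:
# write_bridge_conf /etc/sysconfig/network-scripts/ifcfg-br0 192.168.1.10 255.255.0.0
# attach_to_bridge /etc/sysconfig/network-scripts/ifcfg-mic0
```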
PAGE 152
8.7.2 Cloning an image with Intel Xeon Phi cards configured with independent IP addresses The Intel Xeon Phi host names and IP addresses must be determined and configured in the /etc/hosts file on the HP Insight CMU management node for a successful post-cloning process. For more details on this configuration, see Section 8.7.1 (page 151).
PAGE 153
echo DELAY=0 >> $BRIDGE_CONF
echo IPADDR=${CMU_RCFG_IP} >> $BRIDGE_CONF
echo NETMASK=${CMU_RCFG_NTMSK} >> $BRIDGE_CONF
echo NM_CONTROLLED=no >> $BRIDGE_CONF
echo MTU=${NETWORK_MTU} >> $BRIDGE_CONF
fi
# this for loop expects all Intel Xeon Phi cards to be present in the output
# of 'lspci'. If this is unreliable then change this to look for existing
# ${CMU_RCFG_PATH}/etc/sysconfig/mic/micX.
PAGE 154
rm -f ${CMU_RCFG_PATH}/var/log/spooler*
rm -f ${CMU_RCFG_PATH}/var/log/lastlog*
rm -f ${CMU_RCFG_PATH}/var/log/cron*
if [ -d ${CMU_RCFG_PATH}/var/log/collectl ]; then
rm -f ${CMU_RCFG_PATH}/var/log/collectl/*
fi
rm -f ${CMU_RCFG_PATH}/opt/cmu/log/*
exit 0
8.7.3 HP Insight CMU oneSIS diskless file system support for independent addressing of Intel Xeon Phi cards
HP Insight CMU oneSIS diskless support provides two scripts for modifying the oneSIS single system image. The reconf-onesis-image.
PAGE 155
compute"
#
#
# When you are finished editing this file, make sure you run the 'mk-sysimage'
# command with the path to the diskless image, like this:
#
#
# /sbin/mk-sysimage ${CMU_RCFG_PATH}
#
#
#-- custom code starts here --
#
# Xeon Phi support
#
INTEL_MIC_FS=/opt/intel/mic/filesystem
# sync hosts file
cp /etc/hosts ${CMU_RCFG_PATH}/etc/
cp /etc/hosts ${CMU_RCFG_PATH}/${INTEL_MIC_FS}/mic0/etc/
cp /etc/hosts ${CMU_RCFG_PATH}/${INTEL_MIC_FS}/mic1/etc/
## configure a NODECLASS_REGEXP so that we can configur
PAGE 156
/ram${INTEL_MIC_FS}/mic0/etc/sysconfig/hostname" >> ${file}
echo "mv -f /ram${INTEL_MIC_FS}/mic1/etc/sysconfig/.hostname.\$host /ram${INTEL_MIC_FS}/mic1/etc/sysconfig/hostname" >> ${file}
echo "mv -f /ram${INTEL_MIC_FS}/mic0/etc/sysconfig/network/.ifcfg-mic0.\$host /ram${INTEL_MIC_FS}/mic0/etc/sysconfig/network/ifcfg-mic0" >> ${file}
echo "mv -f /ram${INTEL_MIC_FS}/mic1/etc/sysconfig/network/.ifcfg-mic0.
PAGE 157
#-- custom code starts here --
#
# Xeon Phi support
#
# add the 'BRIDGEIPMASK' setting to the PXE-boot file
# to enable the bridge network for the MICs
iphex=`/opt/cmu/tools/cmu_dl_ip_to_hexa -i ${CMU_RCFG_IP}`
imageDir="/opt/cmu/image/${CMU_RCFG_IMAGENAME}"
pxeFile="${imageDir}/onesis_pxeboot/pxelinux.cfg/${iphex}"
sed -e "s|ONESIS=true$|ONESIS=true BRIDGEIPMASK=${CMU_RCFG_IP}/16|" < $pxeFile > ${pxeFile}-tmp
mv -f ${pxeFile}-tmp ${pxeFile}
#
# REMEMBER: all per-node files are "hidden" and end with .
PAGE 158
#
INTEL_MIC_FS=${CMU_RCFG_PATH}/opt/intel/mic/filesystem.default
# configure micX/etc/sysconfig/hostname files
MIC0_HOSTFILE=${INTEL_MIC_FS}/mic0/etc/sysconfig/.hostname.${CMU_RCFG_HOSTNAME}
MIC1_HOSTFILE=${INTEL_MIC_FS}/mic1/etc/sysconfig/.hostname.
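The per-node script in Section 8.7.3 locates the PXE-boot file of a node by converting the node IP address with /opt/cmu/tools/cmu_dl_ip_to_hexa. PXELINUX names per-host configuration files after the client IP address written as eight uppercase hexadecimal digits, and the sketch below shows that conversion. The ip_to_hex function is a hypothetical stand-in for illustration; that cmu_dl_ip_to_hexa prints exactly this form is an assumption, not something this guide states.

```shell
# ip_to_hex A.B.C.D: print the address as eight uppercase hex digits,
# the form PXELINUX uses for per-host files under pxelinux.cfg/.
ip_to_hex() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the dotted quad into four positional values
    IFS=$old_ifs
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
}

ip_to_hex 192.168.0.1
```

Such a helper can be useful for checking by hand which file under pxelinux.cfg belongs to a given compute node.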
PAGE 159
9 Support and other resources 9.1 Contacting HP 9.1.1 Before you contact HP Be sure to have the following information available before you contact HP: • Technical support registration number (if applicable) • Product serial number • Product identification number • Applicable error message • Add-on boards or hardware • Third-party hardware or software • Operating system type and revision level 9.1.
PAGE 160
• Installation and user guides for your specific operating system. 9.3 Typographic conventions This document uses the following typographical conventions: %, $, or # A percent sign represents the C shell system prompt. A dollar sign represents the system prompt for the Bourne, Korn, and POSIX shells. A number sign represents the superuser prompt. audit(5) A manpage. The manpage name is audit, and it is located in Section 5. Command A command name or qualified command phrase.
PAGE 161
CAUTION A caution calls attention to important information that if not understood or followed will result in data loss, data corruption, or damage to hardware or software. IMPORTANT This alert provides essential information to explain a concept or to complete a task. NOTE A note contains additional information to emphasize or supplement important points of the main text. 9.
PAGE 162
A Troubleshooting Issues encountered while using HP Insight CMU can be classified as: • Network boot issues which affect cloning and backup • Backup specific issues • Cloning specific issues • Administration command issues • GUI specific issues A.1 HP Insight CMU logs Every HP Insight CMU command logs information in a dedicated log file. All log files are available in /opt/cmu/log. A.1.
PAGE 163
• An incorrect MAC address in the HP Insight CMU database • The HP Insight CMU configuration on the management node is lost. Troubleshooting switch issues 1. Verify that the management node pings the iLO and the nodes. 2. Verify that broadcast is enabled and is redirected to the switch. 3. Verify that the spanning tree is disabled on all ports connected to a node. 4. Verify that « multicast IGMP snoop loop » is disabled on the switch.
PAGE 164
A.4 Cloning issues If only one node cannot be cloned: 1. Verify that you can boot in network mode. 2. Verify that the node has the same hardware as other nodes. 3. Verify that the node does not have a hardware problem. 4. Power off manually, then relaunch cloning. If no nodes in a network entity can be cloned: 1. Clone all nodes except the first node in the network entity again. 2. Verify that you can boot in network mode. If no node in the cluster can be cloned: 1. Verify that you can boot in network mode.
PAGE 165
3. Verify that rsh or ssh is enabled between all nodes of the cluster and the management node. All nodes must be able to execute commands as root on any other node without needing a password. 4. Verify that the HP Insight CMU rpm is properly installed on all nodes. If the HP Insight CMU GUI is unable to start, with the message "Failed to validate certificate": Figure 63 Certificate error The detailed Java exception is: java.security.cert.CertPathValidatorException: java.security.
PAGE 166
On Windows, go to System Preferences→Other→Java→Advanced→Enable online certificate validation. On Linux, run javaws -viewer in a shell, click the Advanced tab, then Enable online certificate validation. TIP: If you still encounter problems, try toggling the setting.
PAGE 167
HP Insight CMU manpages 167
PAGE 168
cmu_boot(8) NAME cmu_boot -- Boot nodes. SYNOPSIS # /opt/cmu/bin/cmu_boot -a | -n | -f | -g [-d ] DESCRIPTION Boot HP Insight CMU nodes. OPTIONS -a Boot all nodes. Nodes that are active in a diskless logical group boot diskless by default. All other nodes boot to disk by default. -n Boot the given node or nodelist expression. cmu_expand_nodes must be able to parse the expression. -f A file containing the list of nodes to boot.
PAGE 169
cmu_show_nodes(8) NAME cmu_show_nodes -- Display a list of nodes and node attributes. SYNOPSIS # /opt/cmu/bin/cmu_show_nodes [-a | -n ] [-i] [-d] [-f ] [-o ] DESCRIPTION Display a list of HP Insight CMU nodes and node attributes.
PAGE 170
%c (ILOCM only) cartridge number %N (ILOCM only) node number %p Platform %s Serial Port %S Serial Port Speed %v Vendor Args %d Cloning Block Device EXAMPLES Default behavior: # /opt/cmu/bin/cmu_show_nodes cn0004 n01 n02 n03 n04 To show details for a specific node: In this example, the string "default" is added into the output in the position of the logical group.
PAGE 171
cmu_show_logical_groups(8) NAME cmu_show_logical_groups -- Show nodes belonging to a logical group. SYNOPSIS # /opt/cmu/bin/cmu_show_logical_groups <-h | [logical_group_name]> DESCRIPTION Show nodes belonging to an HP Insight CMU logical group.
PAGE 172
cmu_show_network_entities(8) NAME cmu_show_network_entities -- Show network entities. SYNOPSIS # /opt/cmu/bin/cmu_show_network_entities <-h | [network_entity]> DESCRIPTION Show network entities.
PAGE 173
cmu_show_user_groups(8) NAME cmu_show_user_groups -- Show user groups. SYNOPSIS # /opt/cmu/bin/cmu_show_user_groups <-h | [user_group]> DESCRIPTION Show user groups.
PAGE 174
cmu_show_archived_user_groups(8) NAME cmu_show_archived_user_groups -- Show archived user groups. SYNOPSIS # /opt/cmu/bin/cmu_show_archived_user_groups [-h] | [-p] [-H] [-c] [-s separator] [-f] [-w width] DESCRIPTION Show archived user groups.
PAGE 175
cmu_add_node(8) NAME cmu_add_node -- Add node(s) to the HP Insight CMU database.
PAGE 176
-e|--serial-port serial_port node serial port -E|--serial-port-speed serial_port_speed node serial port speed -V|--vendor-args vendor_args node vendor args -D|--cloning-block-device cloning_block_device node cloning block device -C|--cartridge num (ILOCM only) cartridge number within chassis -N|--node-number num (ILOCM only) node number within the cartridge EXAMPLES Command-line mode: # /opt/cmu/bin/cmu_add_node -H cn0006 -I 16.16.184.116 -M 255.255.254.0 -A 00-02-A5-52-EB-F8 -L default -G 192.168.0.
PAGE 177
cmu_add_network_entity(8) NAME cmu_add_network_entity -- Add network entities. SYNOPSIS # /opt/cmu/bin/cmu_add_network_entity <-f filename | -h> # /opt/cmu/bin/cmu_add_network_entity DESCRIPTION Add HP Insight CMU network entities.
PAGE 178
cmu_add_logical_group(8) NAME cmu_add_logical_group -- Add logical groups. SYNOPSIS # /opt/cmu/bin/cmu_add_logical_group <-n | -i | -f filename | -s> # /opt/cmu/bin/cmu_add_logical_group <-n name -d devicename> # /opt/cmu/bin/cmu_add_logical_group <-n name -d diskless -I golden_node_ip -k kernel_version> DESCRIPTION Add HP Insight CMU logical groups.
PAGE 179
cmu_add_to_logical_group_candidates(8) NAME cmu_add_to_logical_group_candidates -- Add nodes as candidates for logical groups. SYNOPSIS # /opt/cmu/bin/cmu_add_to_logical_group_candidates <-h | -t logical_group nodename> # /opt/cmu/bin/cmu_add_to_logical_group_candidates <-t logical_group nodename -f nodenamefile> DESCRIPTION Add nodes as candidates for an HP Insight CMU logical group.
PAGE 180
cmu_add_user_group(8) NAME cmu_add_user_group -- Add user groups. SYNOPSIS # /opt/cmu/bin/cmu_add_user_group <-f filename | -h> # /opt/cmu/bin/cmu_add_user_group DESCRIPTION Add user groups.
PAGE 181
cmu_add_to_user_group(8) NAME cmu_add_to_user_group -- Add nodes to user groups. SYNOPSIS # /opt/cmu/bin/cmu_add_to_user_group <-h | -t user_group nodename> # /opt/cmu/bin/cmu_add_to_user_group <-t user_group nodename -f nodenamefile> DESCRIPTION Add nodes to user groups.
PAGE 182
cmu_change_active_logical_group(8) NAME cmu_change_active_logical_group -- Change the active logical group for a node. SYNOPSIS # /opt/cmu/bin/cmu_change_active_logical_group <-h | -t logical_group nodename1 [nodename2] [...]> # /opt/cmu/bin/cmu_change_active_logical_group <-t logical_group nodename -f nodenamefile> DESCRIPTION Change the active logical group for a node or a group of nodes.
PAGE 183
cmu_change_network_entity(8) NAME cmu_change_network_entity -- Change the network entity for a node. SYNOPSIS # /opt/cmu/bin/cmu_change_network_entity <-h | -t network_entity nodename1 [nodename2] [...]> DESCRIPTION Change the network entity for a node. A node can belong to only one network entity. A newly added node does not belong to any network entity.
PAGE 184
cmu_del_from_logical_group_candidates(8) NAME cmu_del_from_logical_group_candidates -- Delete nodes from logical groups. SYNOPSIS # /opt/cmu/bin/cmu_del_from_logical_group_candidates <-h | -t logical_group nodename1 [nodename2] [...]> # /opt/cmu/bin/cmu_del_from_logical_group_candidates <-t logical_group nodename -f nodenamefile> DESCRIPTION Delete one or more nodes from a logical group.
PAGE 185
cmu_del_from_network_entity(8) NAME cmu_del_from_network_entity -- Delete nodes from network entities. SYNOPSIS # /opt/cmu/bin/cmu_del_from_network_entity <-h | -t network_entity nodename1 [nodename2] [...]> # /opt/cmu/bin/cmu_del_from_network_entity <-t network_entity nodename -f nodenamefile> DESCRIPTION Delete one or more nodes from a network entity.
PAGE 186
cmu_del_archived_user_groups(8) NAME cmu_del_archived_user_groups -- Delete an archived user group.
PAGE 187
cmu_del_from_user_group(8) NAME cmu_del_from_user_group -- Delete one or more nodes from a user group. SYNOPSIS # /opt/cmu/bin/cmu_del_from_user_group <-h | -t user_group nodename1 [nodename2] [...]> # /opt/cmu/bin/cmu_del_from_user_group <-t user_group nodename -f nodenamefile> DESCRIPTION Delete one or more nodes from a user group.
PAGE 188
cmu_del_logical_group(8) NAME cmu_del_logical_group -- Delete a logical group. SYNOPSIS # /opt/cmu/bin/cmu_del_logical_group <-f filename | -h> # /opt/cmu/bin/cmu_del_logical_group DESCRIPTION Delete a logical group.
PAGE 189
cmu_del_network_entity(8) NAME cmu_del_network_entity -- Delete a network entity. SYNOPSIS # /opt/cmu/bin/cmu_del_network_entity <-f filename | -h> # /opt/cmu/bin/cmu_del_network_entity DESCRIPTION Delete a network entity.
PAGE 190
cmu_del_node(8) NAME cmu_del_node -- Delete a node. SYNOPSIS # /opt/cmu/bin/cmu_del_node <-f filename | -h> # /opt/cmu/bin/cmu_del_node DESCRIPTION Delete a node.
PAGE 191
cmu_del_snapshots(8) NAME cmu_del_snapshots -- Delete monitoring snapshots from the history database. SYNOPSIS # /opt/cmu/bin/cmu_del_snapshots [-h] | <-a timestamp | -b timestamp | -z> [-v verbose] [-d dryrun] [-r] DESCRIPTION Delete monitoring snapshots from the history database.
PAGE 192
cmu_del_user_group(8) NAME cmu_del_user_group -- Delete a user group. SYNOPSIS # /opt/cmu/bin/cmu_del_user_group <-f filename | -h> [-a[name] | --archive[=name]] [-m] # /opt/cmu/bin/cmu_del_user_group DESCRIPTION Delete a user group. OPTIONS -h|--help show help -f|--filename inputfile delete user group(s) listed in inputfile -a[name]|--archive[=name] Archive the user group. If [name] is specified, [name] is used instead of the actual user group name.
PAGE 193
cmu_console(8) NAME cmu_console -- Connect to compute node management ports. SYNOPSIS # /opt/cmu/bin/cmu_console DESCRIPTION Invoke directly from the operating system shell to connect to compute node management ports (iLO/lo100i). EXAMPLES # /opt/cmu/bin/cmu_console contacting ilo_ip_address... Warning: Permanently added 'ilo_ip_address' (RSA) to the list of known hosts. cmu@x.x.x.
PAGE 194
cmu_power(8) NAME cmu_power -- Perform power actions on compute nodes. SYNOPSIS # /opt/cmu/bin/cmu_power <-h | -p action -n nodename1 [nodename2] [nodename3] | -a | -l logical_group_name | -u user_group_name | -f nodefile [ -e error_log ]> DESCRIPTION Perform iLO actions such as power on, power off, emulate power button, get power status, and UID on/off. OPTIONS -h show help -p action specifies the action to perform; valid actions are: OFF Power off.
PAGE 195
EXAMPLES To power off one node: # /opt/cmu/bin/cmu_power -p OFF -n cn0001 To power off nodes belonging to user group user1: # /opt/cmu/bin/cmu_power -p OFF -u user1 To boot nodes belonging to logical group rh6u0_x86_64: # /opt/cmu/bin/cmu_power -p BOOT -l rh6u0_x86_64 To turn on the UID LED on nodes belonging to user group user2: .
PAGE 196
cmu_custom_run(8) NAME cmu_custom_run -- A CLI to HP Insight CMU custom menu options. SYNOPSIS # /opt/cmu/bin/cmu_custom_run <-h | -l | -t command_title [-f nodefile]> DESCRIPTION Perform custom defined commands on a group of nodes or all nodes. The same custom defined commands are also available from the GUI.
PAGE 197
cmu_clone(8) NAME cmu_clone -- Clone nodes in a logical group. SYNOPSIS # /opt/cmu/bin/cmu_clone <-n | -f nodelistfile> <-i imagename> [-s summarylog] [-b] [-p] [-r] DESCRIPTION Clone the specified node or nodelist in the specified logical group.
PAGE 198
cmu_backup(8) NAME cmu_backup -- Issue backup commands directly from the Linux shell. SYNOPSIS # /opt/cmu/bin/cmu_backup <-h> | <-l logical_group -n compute_nodename -p "partition_list" | -r root_partition_number> [-e log_file] DESCRIPTION Create a backup image.
PAGE 199
cmu_scan_macs(8) NAME cmu_scan_macs -- Scan IP addresses and create HP Insight CMU node definitions. SYNOPSIS # /opt/cmu/bin/cmu_scan_macs -h [-p ] -i -m -t [-b [-n ] | -b ] [-f ] [-N ] [-s ] [-S ] [-o ] If no options are specified, then they are gathered through an interactive session.
PAGE 200
values to be generated for %xi and the IP since intervening slots without cartridges won't affect their values. -p hostname_prefix If this option is specified, the host name specified in -h must be a fixed string and have a numeric suffix. For example, 'n01', 'node_01', 'zeus001'. The suffix is incremented to create subsequent host names. The hostname_prefix is the string of leading characters intended to be common to all compute node names.
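The suffix increment described for -p can be sketched as follows. This is an illustration of the naming pattern only, not the code cmu_scan_macs uses: gen_hostnames is a hypothetical name, and keeping the zero padding of the first host name is an assumption drawn from the examples 'n01' and 'zeus001'.

```shell
# gen_hostnames FIRST COUNT: print COUNT host names starting at FIRST,
# incrementing the numeric suffix and keeping its zero padding.
gen_hostnames() {
    prefix=$(printf '%s' "$1" | sed 's/[0-9]*$//')
    suffix=${1#"$prefix"}
    [ -n "$suffix" ] || return 1      # a numeric suffix is required
    width=${#suffix}
    # Strip leading zeros so the shell does not read the suffix as octal.
    start=$(printf '%s' "$suffix" | sed 's/^0*//')
    start=${start:-0}
    fmt="%s%0${width}d\n"
    i=0
    while [ "$i" -lt "$2" ]; do
        printf "$fmt" "$prefix" "$((start + i))"
        i=$((i + 1))
    done
}

gen_hostnames zeus001 3
```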
PAGE 201
# /opt/cmu/bin/cmu_scan_macs -h node%i -i 1.2.3.4 -m 255.255.0.0 -t ILO -b 3.4.5.6 -n 128 node1 1.2.3.4 255.255.0.0 00-1C-C4-AB-06-56 default 3.4.5.6 ILO x86_64 -1 -1 node2 1.2.3.5 255.255.0.0 00-1F-29-66-4C-F2 default 3.4.5.7 ILO x86_64 -1 -1 . . Example 2 In this ILOCM example, each node in the chassis is connected to the HP Insight CMU management network through NIC2. Three ILOCM addresses are scanned and a node definition is created for every discovered node using the MAC address of NIC2.
PAGE 202
. . Example 4 Adding the option -S 0 to the command line of Example 3 forces %i and the IP generation to effectively ignore slots without cartridges, resulting in sequential values for %i and the IP addresses. # /opt/cmu/bin/cmu_scan_macs -h n%i_C%c_N%n -i 1.2.3.1 -m 255.255.0.0 -t ILOCM \ -b 10.84.202.42 -S 0 -o nodes.dat n01_C01_N1 n02_C01_N2 n03_C01_N3 n04_C01_N4 n05_C03_N1 n06_C03_N2 n07_C03_N3 n08_C03_N4 . . 1.2.3.1 1.2.3.2 1.2.3.3 1.2.3.4 1.2.3.5 1.2.3.6 1.2.3.7 1.2.3.8 255.255.0.0 255.255.0.
PAGE 203
cmu_rescan_mac(8) NAME cmu_rescan_mac -- Rescan the MAC address of a node. SYNOPSIS # /opt/cmu/tools/cmu_rescan_mac -n nodename [-N NIC_num] [-h] DESCRIPTION Use this command if you replace a failing node. After node replacement, you can add the new MAC address of the node into the HP Insight CMU database using /opt/cmu/tools/cmu_rescan_mac. OPTIONS -n nodename the node name in the HP Insight CMU database -N NIC_num (ILOCM only) Indicates which of the node's NICs is attached to the admin network.
PAGE 204
cmu_mod_node(8) NAME cmu_mod_node -- Modify node(s) in the HP Insight CMU database.
PAGE 205
-E|--serial-port-speed serial_port_speed node serial port speed -V|--vendor-args vendor_args node vendor args -D|--cloning-block-device cloning_block_device node cloning block device -C|--cartridge num (ILOCM only) cartridge number within chassis -N|--node-number num (ILOCM only) node number within the cartridge EXAMPLES Command line mode: # /opt/cmu/bin/cmu_mod_node -H cn0006 -I 16.16.184.116 -M 255.255.254.0 -A 00-02-A5-52-EB-F8 -L default -G 192.168.0.1 -R x86_64 -P generic processing 1 node ...
PAGE 206
processing 4 nodes...
PAGE 207
cmu_monstat(8) NAME cmu_monstat -- Use monitoring to list sensors and alerts.
PAGE 208
--all-lg Select all logical groups. --all-ne Select all network entities --all-ug Select all user groups --lg=lg1,lg2,... Specify the logical group(s) names or range. --ne=ne1,ne2,... Specify the network entity names or range. --nodes=node1,node2,... Specify the node(s) names or range. --ug=ug1,ug2,... Specify the user group(s) names or range.
PAGE 209
cmu_image_open(8) NAME cmu_image_open -- Open an existing backup image for modification. SYNOPSIS # /opt/cmu/bin/cmu_image_open <-h | -i imagename> DESCRIPTION Open an existing HP Insight CMU backup image for modification.
PAGE 210
cmu_image_commit(8) NAME cmu_image_commit -- Save a backup image previously expanded with cmu_image_open. SYNOPSIS # /opt/cmu/bin/cmu_image_commit <-h | -i imagename [-n new_image_name]> DESCRIPTION Saves an HP Insight CMU backup image that was previously expanded with the cmu_image_open command.
PAGE 211
cmu_config_nvidia(8) NAME cmu_config_nvidia -- Configure NVIDIA GPU monitoring. SYNOPSIS # /opt/cmu/bin/cmu_config_nvidia <-h | -r | -n numGPUs> Where numGPUs specifies the number of GPUs in each client. DESCRIPTION This command configures NVIDIA GPU monitoring metrics in the HP Insight CMU /opt/cmu/etc/ActionAndAlertsFile.txt file. Restart HP Insight CMU monitoring after using this command.
PAGE 212
cmu_config_amd(8) NAME cmu_config_amd -- Configure AMD GPU monitoring. SYNOPSIS # /opt/cmu/bin/cmu_config_amd <-h | -n numGPUs> Where numGPUs specifies the number of GPUs in each client. DESCRIPTION This command configures AMD GPU monitoring metrics in the HP Insight CMU /opt/cmu/etc/ActionAndAlertsFile.txt file. Restart HP Insight CMU monitoring after using this command.
PAGE 213
cmu_config_intel(8) NAME cmu_config_intel -- Configure Intel coprocessor monitoring. SYNOPSIS # /opt/cmu/bin/cmu_config_intel <-h | -r | -n> DESCRIPTION This command configures Intel coprocessor monitoring metrics in the HP Insight CMU /opt/cmu/etc/ActionAndAlertsFile.txt file. These metrics can subsequently be removed using the -r option. Restart HP Insight CMU monitoring after using this command.
PAGE 214
cmu_mgt_config(8) NAME cmu_mgt_config -- Configure or test a set of Linux components required by HP Insight CMU. SYNOPSIS # /opt/cmu/bin/cmu_mgt_config [-c] [-t] [-d] [ IP | eth [:num2] | bond] [-h] [-i] [-n num] [-s step,...] DESCRIPTION cmu_mgt_config attempts to configure (-c) or test (-t) a collection of Linux components required by HP Insight CMU. cmu_mgt_config can be run repeatedly without adversely affecting already configured components.
PAGE 215
rpms|packages Check for all rpms/packages required by HP Insight CMU. sshd Check and configure sshd. ssh_key Check for existence of the root ssh key or create one. firewall Check and optionally disable the firewall. tftp Check and configure tftp. nfs Check and configure NFS. dhcp Check and configure DHCP listening interface. samba Check and configure Samba. (Requires HP Insight CMU Microsoft Windows support.) java Check for required Java configuration. license Check for a valid HP Insight CMU license.
PAGE 216
cmu_firmware_mgmt(8) NAME cmu_firmware_mgmt -- Verify and execute firmware SYNOPSIS # /opt/cmu/bin/cmu_firmware_mgmt [-h] [-d -f [-o"cmudiff_parameters"]] | [-c -f ] | [-v -f ] | [-u -f ] DESCRIPTION cmu_firmware_mgmt performs the following operations: • Display BIOS settings for the specified nodes • Display BIOS version for the specified nodes • Execute the firmware executable on the specified nodes OPTIONS -h Print this help text
PAGE 217
cmu_monitoring_dump(8) NAME cmu_monitoring_dump -- Dump archived monitoring data. SYNOPSIS # /opt/cmu/bin/cmu_monitoring_dump -n -m -t0 -t1 [-f ] | -a DESCRIPTION Dump archived monitoring data for a given set of nodes, metrics, and a time interval. OPTIONS -h Display this help text. -n | --nodes= List of node names. Supports condensed form 'node[42-84]'. -m | --metrics= A comma-separated list of metric names.
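The condensed form 'node[42-84]' stands for the consecutive node names node42 through node84. The sketch below expands such an expression to show what it denotes. It is not the parser used by HP Insight CMU: expand_nodes is a hypothetical name, and the sketch handles only a single range of unpadded decimal numbers such as the example above.

```shell
# expand_nodes 'prefix[first-last]': print one node name per line.
# A name without brackets is printed unchanged.
expand_nodes() {
    case "$1" in
        *\[*-*\])
            prefix=${1%%\[*}      # text before the opening bracket
            range=${1#*\[}        # drop everything up to the bracket
            range=${range%\]}     # drop the closing bracket
            first=${range%-*}
            last=${range#*-}
            i=$first
            while [ "$i" -le "$last" ]; do
                echo "$prefix$i"
                i=$((i + 1))
            done
            ;;
        *) echo "$1" ;;
    esac
}

expand_nodes 'node[42-45]'
```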
PAGE 218
cmu_rename_archived_user_group(8) NAME cmu_rename_archived_user_group -- Set the name of an archived user group. SYNOPSIS # /opt/cmu/bin/cmu_rename_archived_user_group [-h] | -i group_id -n group_name [-t timeout] DESCRIPTION Set the name of the archived user group identified by group_id. Use "cmu_show_archived_user_groups -f" to find the group ID. OPTIONS -h Display this help text. -i ID of the archived user group to rename. To find the ID, use "cmu_show_archived_user_groups -f".
PAGE 219
Glossary administration disk The disk located on the image server on which HP Insight CMU is installed. A dedicated space can be allocated to the cloned images. administration network The private network within the system that is used for administrative operations. clone image The compressed image of the installation from the master disk. One clone image is needed for each logical group.
PAGE 220
2. A software package that is capable of being installed or removed with the RPM software package management. secondary server A dedicated node in a network entity where the cloned image is temporarily stored. The cloned image is propagated only to the other nodes that are defined inside the entity. target disk The hard drive on a target node where the cloned image is installed. target node A compute node that will receive the cloned image from a secondary server.
PAGE 221
Index A action files, 97 actionsandalerts.
PAGE 222
E firmware installing, 127 upgrading, 127 firmware management, 126 firmware requirements, 15 GUI client, 42 Linux API, 138 log files, 162 logical group management, 51 logical groups deleting, 52 modifying, 52 renaming, 52 login privileges, 22 G M glossary, 219 group status, 87 GUI architecture, 40 customizing menu, 127 monitoring, 86 GUI client installing, 41 Linux, 42 starting, 40 GUI menu, 40 GUI problems, 164 management card connection, 119 management cards configure, 15 modifying alert reactions,
PAGE 223
P parameters examples, 16 pdcp, 125, 138 pdsh, 121, 138 power off, 119 preconfiguration, 62 provisioning, 51 R RAID configuration, 15 reactions, 96 reboot, 120 reconfiguration, 63 related information, 159 remote hardware control API, 145 renaming logical groups, 52 renaming user groups, 126 rescan MAC, 65 restore database, 39 restoring configuration, 36 RHEL autoinstall customization, 58 deleting, 126 renaming, 126 V virtual serial port connection, 119 W Windows backup and cloning limitations, 20 clonin