HP-UX Workload Manager User’s Guide Version A.03.02.02 Manufacturing Part Number: B8844-90014 January 2007 © Copyright 2000-2007 Hewlett-Packard Development Company, L.P.
Publication history Tenth edition January 2007 B8844-90014 HP-UX 11i v3 Ninth edition September 2006 B8844-90012 HP-UX 11i v1, HP-UX 11i v2, and HP-UX 11i v3 Eighth edition March 2006 B8844-90010 HP-UX 11i v1 and HP-UX 11i v2 Seventh edition May 2005 B8844-90008 HP-UX 11i v1 and HP-UX 11i v2 Sixth edition March 2004 B8844-90006 HP-UX 11.0, HP-UX 11i v1, and HP-UX 11i v2 Fifth edition June 2003 B8844-90005 HP-UX 11.0, HP-UX 11i v1, and HP-UX 11i v2 Fourth edition June 2002 B8844-90003 HP-UX 11.
First edition April 2000 B8844-90001 HP-UX 11.
Notice Copyright 2000-2006 Hewlett-Packard Development Company, L.P. All Rights Reserved. Reproduction, adaptation, or translation without prior written permission is prohibited, except as allowed under the copyright laws. The information contained in this document is subject to change without notice. Hewlett-Packard makes no warranty of any kind with regard to this material, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose.
or profits, whether in an action of contract, negligence or other tortious action, arising out of or in connection with the use or performance of this software.
Contents

Preface
    System platform
    Associated software
    Document Structure
    New in this edition

    Using WLM on multiple servers  58
    Using WLM with HP Integrity Virtual Machines (Integrity VM)  58
    WLM and Process Resource Manager (PRM)  59
    WLM product directory structure  60

    Optimize use of Temporary Instant Capacity (TiCAP)
    Integrate with various third-party products
    Status information WLM provides
    Monitoring WLM
        ps

    Controlling system resources  142
    Specifying the WLM parser version  144
    Notification of ‘Instant Capacity needed’ / Pay per use optimization  144
    System-wide settings  147
    Defining the PRM components (optional)

    Specifying when the SLO is active (optional)  205
    Tuning the metrics and the SLOs  210
    Tuning WLM using the WLM GUI  212
    Specifying a data collector (optional)  215
    Specifying the WLM interval (optional)

6. Auditing and billing
    Example wlmaudit report  250
    Audit data files  253
    Enabling auditing at reboot  254

7. Managing SLOs across partitions
    Overview

    metric_condition.wlm
    par_manual_allocation.wlm
    par_manual_allocation.wlmpar
    par_usage_goal.wlm
    par_usage_goal.wlmpar

    wlmgui
    wlminfo
    wlmpard
    wlmrcvdc

    Install WLM Configuration
    Restart WLM
    Retrieve WLM Configuration
    Rotate WLM Log Files
    Rotate Statistics Log Files

    For more SAPTK information
    Integrating with SAS software
        Why use SASTK?
        Tools in SASTK
        How do I get started with SASTK?

    Specifying a shares-per-metric allocation request (optional)
        Providing CPU resources in proportion to a metric
    Specifying a data collector
    Capturing your collectors’ stderr (optional)
    Smoothing metric values (optional)
Tables

Table 1-1. Performance and resource utilization monitoring methods  34
Table 1-2. Performance controlling methods  36
Table 1-3. WLM directories and files  60
Table 2-1. WLM installation directories  75
Table 2-2. Example WLM configurations
Figures

Figure 3-1. WLM overview  115
Figure 3-2. CPU allocation: the rising tide model  123
Figure 3-3. Server without WLM  124
Figure 3-4. Server with WLM  125
Figure 5-1. HP-UX WLM Configuration Wizard
Preface This document describes the Version A.03.02.02 release of HP-UX Workload Manager (HP-UX WLM). The intended audience for this document is system administrators.

System platform

HP-UX WLM A.03.02 runs under the following HP-UX operating systems and hardware:

Operating Systems                        Hardware
HP-UX 11i v1 (B.11.11)                   HP 9000 servers
HP-UX 11i v2 (B.11.23)                   HP 9000 servers and Integrity servers
HP-UX 11i v1 (B.11.11) and HP-UX 11i v2 (B.11.
If you plan to use configuration files based on Process Resource Manager (PRM), ensure that version C.03.00 or later of PRM is installed. To take advantage of the latest updates to WLM, use the latest version of PRM (C.03.02 or later). PRM is necessary for managing processor sets (PSETs) or Fair Share Scheduler (FSS) groups, which are confined within a specific instance of HP-UX. If you plan to use WLM to manage host-based configurations only, PRM is not necessary.
• Chapter 5 explains how to configure WLM. It describes the WLM configuration file and the general syntactic conventions to use, and it explains how to define the various WLM components. This chapter also explains how to use the WLM graphical user interface to configure WLM and how to activate the configuration file so that WLM manages the system. • Chapter 6 describes WLM auditing and billing. It explains how to enable WLM to audit data and how you can display that data.
• Appendix G explains how to convert PRM configuration files to WLM configuration files if you want to migrate from PRM to WLM. • Appendix H explains advanced WLM usage pertaining to metric goals and collecting performance data. • The Glossary defines key terms used in this document.
New in this edition This section lists the new or changed functionality for WLM A.03.02 and WLM A.03.02.02. WLM A.03.02 supports HP-UX 11i v1 (B.11.11) and HP-UX 11i v2 (B.11.23). WLM A.03.02.02 supports HP-UX 11i v3 (B.11.31). • WLM A.03.02.02 supports the logical CPU (Hyper-Threading) feature, which is available starting with HP-UX 11i v3 (B.11.31) for processors designed to support the feature and that have the appropriate firmware installed.
With Hyper-Threading disabled, each core is seen as a CPU. With Hyper-Threading enabled, each core can be seen as multiple logical CPUs.

NOTE

• The wlminfo par and wlminfo host commands now explicitly display core statistics; for example, the wlminfo par display includes columns such as Hostname, Intended Cores, Cores Used, and Interval.

• The wlminfo group command now displays memory utilization of all groups in the currently deployed configuration.
• Temporary Instant Capacity (TiCAP) activates capacity in a temporary “calling-card fashion,” such as in 20-day or 30-day increments (where a day equals 24 hours for one core). With Temporary Instant Capacity on the system, any number of Instant Capacity cores can be activated as long as your prepaid temporary capacity time has not expired. By default, if 15 or fewer processing days are available, WLM stops activating Temporary Instant Capacity.
Support and patch policies The following Web site provides information on WLM’s support policy and patch policy: http://www.hp.com/go/wlm These policies indicate the time periods for which this version of WLM is supported and patched. Training HP offers a course in HP-UX resource management using WLM. For information, including a course outline, visit: http://www.hp.com/education/courses/u5447s.html Notational conventions This section describes notational conventions used in this book.
Curly brackets ({}), Pipe (|) In command syntax diagrams, text surrounded by curly brackets indicates a choice. The choices available are shown inside the curly brackets, separated by the pipe sign (|). The following command example indicates that you can enter either a or b: command {a | b} Horizontal ellipses (...) In command examples, horizontal ellipses show repetition of the previous items. wlminfo(1M) A manpage. The manpage name is wlminfo, and it is located in Section 1M.
• Managing MC/ServiceGuard manual • GlancePlus User’s Guide manual (available through the Help in the gpm interface to GlancePlus) • Managing Systems and Workgroups: A Guide for HP-UX System Administrators manual These manuals, along with many other Hewlett-Packard manuals, are available at the following Web site: http://docs.hp.com NOTE WLM manpages are also available at the following Web site: http://www.hp.
1 Introduction This chapter introduces the basic concepts of performance management, workload management, and HP-UX Workload Manager (WLM).
Introduction Performance and resource management overview Performance management is necessary to keep users satisfied and to ensure that business-critical applications and transactions have the resources they need. Resource management is necessary to help companies use computing resources more efficiently and effectively, and to reduce administration costs.
Introduction Performance and resource management overview

Table 1-1 Performance and resource utilization monitoring methods

Method: Predict performance by monitoring a particular application's resource usage
Advantages:
• Detects faulty applications that are over-consuming resources
• Usage trends can be used to predict future loads

Method: Directly monitor through performance metrics (for example, response time or throughput) obtained from the application
Advantages:
• Exact performance metrics
• Can be used pro-actively
Introduction Performance and resource management overview After determining which methods you want to use for monitoring performance and resource utilization, decide how to control performance and resource utilization. Table 1-2 examines the advantages and disadvantages of various control methods.
Introduction Performance and resource management overview

Table 1-2 Performance controlling methods (Continued)

Method: Multiple workloads: Variable resource allocations based on usage
Example: PRM without capping enabled
Advantages:
• Increased level of work is handled automatically
• Excess resources can easily be tracked with PRM and GlancePlus tools

Method: Multiple workloads: Variable resource allocations based on actual, reported performance
Advantages:
• Consistent performance levels are maintained automatically
Introduction Performance and resource management overview

Table 1-2 Performance controlling methods (Continued)

Method: Multiple workloads: Variable resource allocations based on CPU utilization per workload
Advantages:
• Optimal resource utilization is maintained automatically
• Workloads can be prioritized to ensure that high-priority workloads are guaranteed CPU resources as needed
• Excess resources can easily be tracked with PRM and GlancePlus tools enabling reserve capacity to be deployed to sa
Introduction What is workload management? What is workload management? System management typically focuses on monitoring the availability of systems. While system availability is certainly important, it often neglects the complexity of modern systems and computing environments, such as partitioning capabilities, utility pricing resources, and clustering.
Introduction What is workload management? Workload management ensures a service’s availability through service-level management. This type of management is based on the following components:

IT service management
    Strategy for defining, controlling, and maintaining required levels of IT (Information Technology) service to the end user.

Service-level agreements (SLAs)
    SLAs define the service levels IT is expected to deliver to the end user.

Service-level objectives (SLOs)
    Derived from the SLAs.

Goals
Introduction What is HP-UX Workload Manager? The following steps outline how to define a service-level agreement: 1. Establish an inventory of system resources (people, CPU resources, disk space, and so forth) that serve the end users. 2. Determine how much of the inventory of system resources is currently being consumed to support the present set of applications. 3. Determine what the end users require to maintain the status quo if they are already receiving acceptable service. 4.
Introduction What is HP-UX Workload Manager? Environment (VSE), WLM integrates virtualization techniques—including partitioning, resource management, clustering, and utility pricing resources—and links them to the SLOs and business priorities. WLM enables a virtual HP-UX server to grow and shrink automatically based on the demands and SLOs for each application it hosts.
Introduction What is HP-UX Workload Manager? feature for PSET-based groups. WLM automatically sets the Hyper-Threading state for the default PSET to optimize performance. (The default PSET, also known as PSET 0, is where all FSS groups reside.)
Introduction What is HP-UX Workload Manager? over SLOs with a lower priority. Once configured, WLM then automatically manages CPU resources to satisfy the SLOs for the workloads. In addition, you can integrate WLM with HP Serviceguard to allocate resources in a failover situation according to defined priorities (for more information on integrating with HP Serviceguard, see “Integrating with Serviceguard” on page 414).
Introduction What is HP-UX Workload Manager? You can assign secure compartments to workload groups, creating the secure compartments with the HP-UX feature Security Containment. Secure compartments isolate files and processes. WLM can then automatically allocate resources for these secure compartments. When you configure WLM, you define one or more SLOs for each workload group and prioritize them.
Introduction Why use Workload Manager? • Disk bandwidth (within single HP-UX instances only) Ensures that each workload group is granted at least its share of disk bandwidth. • Memory (within single HP-UX instances only) Ensures that each workload group is granted at least its minimum, but (optionally) no more than its capped amount of real memory.
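Within a single HP-UX instance, these disk-bandwidth and memory minimums are expressed in the prm structure of the WLM configuration file. The following is a hedged sketch only: the group name, volume group path, and share values are illustrative assumptions, and the exact record formats are defined in the keyword descriptions later in this guide.

```
prm {
    groups = OTHERS : 1, sales : 2;

    # Minimum disk-bandwidth share for the group on a volume group
    # (volume group path and percentage are assumptions):
    disks = sales : /dev/vg01 50;

    # Minimum and capped real-memory shares for the group:
    gminmem = sales : 20;
    gmaxmem = sales : 40;
}
```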
Introduction Why use Workload Manager? • Run multiple workloads on a single system and maintain performance of each workload • Prioritize workloads on a single system, adjusting the CPU allocations based on each workload’s goals • Ensure that critical workloads have sufficient resources to perform at desired levels • Manage by service-level objectives (SLOs) within and across virtual partitions or nPartitions • Adjust resource allocations by automatically enabling or disabling SLOs based on time of
Introduction Why use Workload Manager? page 135. Information about configuring WLM to manage resources and application performance across partitions is provided in Chapter 7, “Managing SLOs across partitions,” on page 255. Service-level objectives (SLOs) A key reason for using WLM is its ability to manage service-level objectives. After defining a workload, you can specify one or more SLOs for each workload.
Introduction Why use Workload Manager? • Optional conditions, such as time of day or a particular event • Optional lower and/or upper bounds for CPU resources A shares-based SLO consists of the same elements except it does not include a goal but rather a shares allocation. For more information comparing shares-based and goal-based SLOs, see “Shares-based SLOs vs goal-based SLOs” on page 118. Prioritized SLOs Another important reason for using WLM is that it allows you to prioritize the SLOs.
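The SLO elements described above map directly onto the slo structure of a WLM configuration file. A hedged sketch of a goal-based SLO follows; the workload group name, metric name, and CPU bounds are illustrative assumptions, not values from this guide's examples.

```
# Priority 2, goal-based SLO for an assumed workload group "orders":
slo orders_response {
    pri = 2;                             # SLO priority
    entity = PRM group orders;           # workload the SLO applies to
    mincpu = 20;                         # optional lower bound on CPU shares
    maxcpu = 60;                         # optional upper bound on CPU shares
    goal = metric order_resp_time < 2.0; # performance goal on a reported metric
}
```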
Introduction WLM and partitions WLM and partitions The HP Partitioning Continuum offers several forms of partitioning: • Hard partitions These partitions are electronically isolated through hardware separation. One such partition is a complete server, which can be clustered in an HP Serviceguard high availability cluster. The other type of hard partition, called nPartition, is a portion of a single server.
Introduction What is the ideal environment for WLM? instance of HP-UX and consolidate multiple workloads within that instance. PRM can be used within, but not across, hard partitions and virtual partitions. You can use WLM to manage resource partitions (WLM creates and manages its own PRM configuration, but PRM must be installed on the same system).
Introduction Examples of solutions that WLM provides • You run Serviceguard and need to ensure proper prioritization of workloads after a failover. • You want more control over resource allocation than PRM provides. Examples of solutions that WLM provides The following sections provide examples of how WLM SLOs can provide a wide variety of business solutions. The SLOs are outlined without including the necessary configuration file syntax.
Introduction Examples of solutions that WLM provides Priority. 1 CPU shares. 800 shares Condition. 15th or 28th Reserving CPU resources based on a condition or event This SLO is enabled only part time and conditionally. Rather than allocating CPU resources on a specific date or time, they are allocated when a specified condition is met, in this case when the system accounting program is running.
Introduction Examples of solutions that WLM provides SLOs that dynamically allocate resources based on usage goals The solutions in this section illustrate SLOs based on usage goals. In each case, resources are allocated dynamically, based on current demand or utilization. When the demand is high enough, more resources are allocated for the workload. When the demand falls below a certain point, unused resources can be made available for other workloads.
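Usage goals like these are written with WLM's goal = usage _CPU form. The sketch below assumes a PRM workload group named development and illustrative share bounds; it shows the shape of a usage-goal SLO rather than a configuration from this guide.

```
# Priority 1 usage goal: match the group's CPU allocation to its
# actual consumption, staying between 15 and 30 shares.
slo dev_usage {
    pri = 1;
    entity = PRM group development;
    mincpu = 15;
    maxcpu = 30;
    goal = usage _CPU;
}
```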
Introduction Examples of solutions that WLM provides When the Development workload is busy, the following priority 1 SLO ensures that the workload gets 30% of the server’s CPU resources (the 30 shares funded by Development). When the workload is not busy, excess resources are available for sharing as long as the workload is getting at least 15% of the resources. Workload. Development Priority. 1 Usage goal. Match CPU allocation to consumption Min CPU. 15 shares Max CPU.
Introduction Examples of solutions that WLM provides Condition. Time is between 10pm and 4am Automatically resizing virtual partitions WLM enables you to automate the resizing of partitions. You can adjust the partition size by having cores dynamically added and removed in response to varying demands. Consider a system with two virtual partitions. The SLO for the Apps workload in partition 1 has a higher priority than the SLO for the Dev workload in partition 0.
Introduction Examples of solutions that WLM provides Workload. Production (nPartition 0) Priority. 1 Goal. Match CPU allocation to consumption The SLO for the Test workload in nPartition 1 is outlined as follows: Workload. Test (partition 1) Priority. 2 Usage goal.
Introduction Using WLM on multiple servers also that by default WLM does not activate Temporary Instant Capacity when 15 or fewer processing days of temporary capacity are available. You can change this default by setting the WLM global arbiter utility_reserve_threshold keyword. For more information on the utilitypri and utility_reserve_threshold keywords, see “Setting up your WLM global arbiter configuration file” on page 265 or see wlmparconf(4).
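In the global arbiter (wlmpard) configuration file, these settings appear roughly as sketched below. This is an assumption-laden outline only: the structure name and keyword placement should be checked against wlmparconf(4) before use.

```
# Global arbiter configuration sketch (values are illustrative):
par {
    # Allow SLOs at this priority or better to trigger activation of
    # Temporary Instant Capacity / Pay per use resources:
    utilitypri = 1;

    # Stop activating Temporary Instant Capacity when 15 or fewer
    # processing days remain (the documented default):
    utility_reserve_threshold = 15;
}
```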
Introduction WLM and Process Resource Manager (PRM) WLM and Process Resource Manager (PRM) When managing PSETs or FSS groups, the Workload Manager (WLM) and Process Resource Manager (PRM) products both work by modifying and enabling a PRM configuration. WLM uses PRM when a prm structure is included in the WLM configuration. With such configurations, you can use PRM’s informational and monitoring commands such as prmlist and prmmonitor. You can also use the prmrun and prmmove commands, among others.
Introduction WLM product directory structure WLM product directory structure The WLM directories and files included in the product installation are described in Table 1-3. The table is not a full listing, but presents the most important files. NOTE In addition to the WLM-specific files, the WLM installation ensures that the HP-UX WLM Toolkits (WLMTK) A.01.10.xx are installed. For information on the WLMTK files, see the HP-UX Workload Manager Toolkits User’s Guide (/opt/wlm/toolkits/doc/WLMTKug.pdf).
Introduction WLM product directory structure Table 1-3 WLM directories and files (Continued) Directory/file Description /opt/wlm/bin/wlmsend Command-line/scripting interface for sending metric data to WLM /opt/wlm/bin/wlmcomd Communications daemon /opt/wlm/lbin/coll/glance_app Command used with wlmrcvdc to extract GlancePlus application metrics /opt/wlm/lbin/coll/glance_gbl Command used with wlmrcvdc to extract GlancePlus global system metrics /opt/wlm/lbin/coll/glance_prm Command used with wlmr
Introduction WLM product directory structure Table 1-3 WLM directories and files (Continued) Directory/file Description /sbin/init.d/wlm Start/stop script /etc/rc.config.d/wlm File to set environment variables used by the start/stop script /opt/wlm/lib/libwlm.sl WLM library that provides the API for data collectors to send data to WLM /opt/wlm/include/wlm.h Header file for data collectors /opt/wlm/share/man/man1m.
Introduction WLM product directory structure Table 1-3 WLM directories and files (Continued) Directory/file /opt/wlm/share/doc/howto/ Description Contains white papers on how to perform WLM tasks, including: • perfmon.html Writing a Better WLM data collector • config.html Configuring HP-UX Workload Manager A.02.00 (although this paper was written for an older version of WLM, it may still be relevant for certain environments) • tuning.html Tuning HP-UX Workload Manager • flexcap.
Introduction WLM product directory structure Table 1-3 WLM directories and files (Continued) Directory/file Description /opt/wlm/config/tunables Master tunables file (read-only: do not edit) /opt/wlm/examples/wlmconf/README Information text file for the wlmconf directory /opt/wlm/examples/wlmconf/ Example WLM configuration files /opt/wlm/examples/dsi/wlmdstats/README Information text file for the DSI-wlmdstats example directory; the /opt/wlm/examples/dsi/wlmdstats directory provides information
WLM quick start: the essentials for using WLM 2 WLM quick start: the essentials for using WLM This chapter presents an overview of the techniques and tools available for using WLM. It serves as a condensed version of this entire user’s guide, exposing you to the essentials and allowing you to quickly get started using WLM. The terminology of WLM as well as WLM concepts and the syntax of the WLM configuration files are not covered here. That information is discussed in the remainder of the book.
WLM quick start: the essentials for using WLM Network operating environment Network operating environment WLM’s network interfaces are designed to operate correctly to defend against attacks in a moderate to high threat environment, such as a DMZ. You may use network protections, such as firewalls, to provide an additional level of defense and to give you additional time to react when a security loophole is found. NOTE As of A.03.
WLM quick start: the essentials for using WLM WLM shown in action WLM shown in action This section provides a quick overview of various commands associated with using WLM in a PRM-based configuration (using FSS or PSET-based workload groups and confined to a single instance of HP-UX). For information on using WLM in host-based configurations designed for moving cores across virtual partitions and nPartitions, see Chapter 7, “Managing SLOs across partitions,” on page 255.
WLM quick start: the essentials for using WLM WLM shown in action Step 2. Start WLM with an example configuration file: # /opt/wlm/bin/wlmd -a /opt/wlm/examples/userguide/multiple_groups.wlm The example configuration file is multiple_groups.wlm, which does the following: 1. Defines two workload groups: g2 and g3. 2. Assigns applications (in this case, perl programs) to the groups. (With shell/perl programs, give the full path of the shell or perl followed by the name of the program.) The two programs loop2.
WLM quick start: the essentials for using WLM WLM shown in action 5. Defines a priority 1 SLO for g3 that requests 20 CPU shares.

# Name: multiple_groups.wlm
#
# Version information: $Revision: 1.10 $
#
# Dependencies: This example was designed to run with HP-UX WLM
#   version A.01.02 or later. It uses the cpushares keyword
#   introduced in A.01.02 and is, consequently, incompatible with
#   earlier versions of HP-UX WLM.
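The header comments above are followed in the real file by the group, application, and SLO definitions. The sketch below shows what that body plausibly looks like: the group names and share counts come from the surrounding steps, but the perl path and script names are assumptions, not the verbatim file shipped with WLM.

```
prm {
    groups = g2 : 2, g3 : 3;

    # Shell/perl programs are listed as the full interpreter path
    # followed by the script name (paths and names assumed):
    apps = g2 : /usr/bin/perl loop2.pl,
           g3 : /usr/bin/perl loop3.pl;
}

# Priority 1 SLO granting g2 a fixed 15 CPU shares
slo g2_fixed {
    pri = 1;
    entity = PRM group g2;
    cpushares = 15 total;
}

# Priority 1 SLO granting g3 a fixed 20 CPU shares
slo g3_fixed {
    pri = 1;
    entity = PRM group g3;
    cpushares = 20 total;
}
```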
WLM quick start: the essentials for using WLM WLM shown in action Step 3. See what messages a WLM startup produces. Start another session to view the WLM message log: # tail -f /var/opt/wlm/msglog 08/29/06 08:35:23 08/29/06 08:35:23 08/29/06 08:35:23 08/29/06 08:35:23 ups.wlm 08/29/06 08:35:23 hpux_11.00 08/29/06 08:35:23 08/29/06 08:35:23 e_groups.
WLM quick start: the essentials for using WLM WLM shown in action # /opt/prm/bin/prmlist PRM configured from file: File last modified: /var/opt/wlm/tmp/wmprmBAAa06335 Thu Aug 24 08:35:23 2006 PRM Group PRMID CPU Upper LCPU Entitlement Bound Attr ------------------------------------------------------------------OTHERS 1 65.00% g2 2 15.00% g3 3 20.
WLM quick start: the essentials for using WLM WLM shown in action a. WLM checks the files /etc/shells and /opt/prm/shells to ensure one of them lists each shell or interpreter, including perl, used in a script. If the shell or interpreter is not in either of those files, WLM ignores its application record (the workload group assignment in an apps statement).
WLM quick start: the essentials for using WLM WLM shown in action Use the PID for loop.pl from the last step to move loop.pl to the group g3: # /opt/prm/bin/prmmove g3 -p loop.pl_PID In this case, loop.pl_PID is 6793. Step 7.
WLM quick start: the essentials for using WLM WLM shown in action This output shows that both groups are using CPU resources (cores) up to their allocations. If the allocations were increased, the groups’ usage would probably increase to match the new allocations. Step 9. Stop WLM: # /opt/wlm/bin/wlmd -k Step 10.
WLM quick start: the essentials for using WLM Where WLM is installed Where WLM is installed The following table shows where WLM and some of its components are installed.
WLM quick start: the essentials for using WLM Seeing how WLM will perform without actually affecting your system For example, with passive mode, you can determine: • How does a cpushares statement work? • How do goals work? Is my goal set up correctly? • How might a particular cntl_convergence_rate value or the values of other tunables affect allocation change? • How does a usage goal work? • Is my global configuration file set up as I wanted? If I used global arbitration on my production system, w
WLM quick start: the essentials for using WLM Starting WLM Starting WLM Before starting WLM (activating a configuration), you may want to try the configuration in passive mode, discussed in “Seeing how WLM will perform without actually affecting your system” on page 75. Otherwise, you can activate your configuration by logging in as root and running the following command, substituting your configuration file’s name for config.wlm.: # /opt/wlm/bin/wlmd -a config.
WLM quick start: the essentials for using WLM Creating a configuration file Creating a configuration file The WLM configuration file is simply a text file. To create your own WLM configuration file, use one or more of the following techniques: • Determine which example configurations can be useful in your environment and modify them appropriately. For information on example configurations, see “Where to find example WLM configurations” on page 79.
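Whichever technique you start from, the result is a plain-text file built from a small number of structures. A minimal skeleton is sketched below; the group and SLO names are placeholders, and the structure keywords (prm, slo, tune) are the ones this guide describes.

```
# prm structure: define the workload groups
prm {
    groups = OTHERS : 1, finance : 2;
}

# slo structure: one or more SLOs per workload
slo finance_slo {
    pri = 1;
    entity = PRM group finance;
    cpushares = 25 total;
}

# tune structure: global and per-metric tunables
tune {
    wlm_interval = 60;    # seconds between WLM allocation passes
}
```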
WLM quick start: the essentials for using WLM The easiest way to configure WLM The easiest way to configure WLM The easiest and quickest method to configure WLM is to use the WLM configuration wizard. NOTE Set your DISPLAY environment variable before starting the wizard. Usage of the wizard requires that the appropriate version of PRM is installed on your system. To start the wizard, run the following command: # /opt/wlm/bin/wlmcw The wizard provides an easy way to create initial WLM configurations.
Table 2-2 Example WLM configurations (Continued)

For                                      See example WLM configurations in the directory
Using WLM with Apache web servers        /opt/wlm/toolkits/apache/config/
Using WLM to manage job duration         /opt/wlm/toolkits/duration/config/
Using WLM with Oracle databases          /opt/wlm/toolkits/oracle/config/
Using WLM with SAP software              /opt/wlm/toolkits/sap/config/
Using WLM with SAS software              /opt/wlm/toolkits/sas/config/
How to put an application under WLM control

— Whole-core: HP-UX processor sets (PSETs)
— Sub-core: Fair Share Scheduler (FSS) groups

To have resources migrated among workloads as needed, you create one or more SLOs for each workload. (In the case of nPartitions, which represent hardware, the core movement is simulated using Instant Capacity to deactivate one or more cores in one nPartition and then activate cores in another nPartition.)
group. However, you can change the workload group in which a particular user’s processes run by adding user records to the WLM configuration file. You can add Unix group records to the configuration file so that the processes running in a specified Unix group are placed in a specific workload group.
Application records: Workload separation by binary name

One mechanism for separating workloads is the apps statement. This statement simply names a particular application binary and the group in which it should be placed. You can specify multiple binary-workload group combinations, separated by commas, in a single apps statement.
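For example, a single apps statement might name two binaries and their groups. The group names and binary paths here are hypothetical, shown only to illustrate the comma-separated form:

    prm {
        groups = sales : 2, batch : 3;
        apps = sales : /opt/sales/bin/sales_monitor,
               batch : /opt/batch/bin/nightly_report;
    }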
        curly : coders surfers,
        larry : testers surfers;
}

Besides the default OTHERS group, this example has three groups of users: testers, coders, and surfers. The user records cause processes started by users moe and curly to be run in group coders by default, and user larry’s processes to be run in group testers by default.
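Reconstructed in full, a prm structure matching that description might look like the following sketch (the group IDs are illustrative):

    prm {
        groups = testers : 2, coders : 3, surfers : 4;
        users = moe : coders surfers,
                curly : coders surfers,
                larry : testers surfers;
    }

Each user’s first group is the initial group for that user’s processes; any groups listed after it are alternate groups the processes can be moved to.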
Secure compartments: Workload separation by secure compartment

You can place processes in workload groups according to the secure compartments the processes run in. The HP-UX feature Security Containment, available starting with HP-UX 11i v2, allows you to create secure compartments. You specify your mapping between secure compartments and workload groups in the scomp statement.
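A minimal sketch of an scomp record follows; the compartment and group names are hypothetical, and you should confirm the exact record syntax in wlmconf(4):

    prm {
        groups = sales : 2;
        scomp = sales_comp : sales;
    }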
In the prm structure that follows, the procmap statement causes the PRM application manager to place in the sales group any processes gathered by the ps command that have PIDs matching the application pid_app. The application manager places in the mrktg group any processes gathered by the external script pidsbyapp that have PIDs matching the application mrketpid_app.
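Such a prm structure might look like the following sketch, using the application and script names from the description above (the paths are illustrative):

    prm {
        groups = sales : 2, mrktg : 3;
        procmap = sales : /bin/env UNIX95= /bin/ps -C pid_app -o pid=,
                  mrktg : /scratch/pidsbyapp mrketpid_app;
    }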
# /opt/prm/bin/prmmove surfers -p 4065

Default: Inheriting workload group of parent process

If a process is not named in an apps statement, a users statement, a uxgrp statement, or an scomp statement, or if it is not identified by a procmap statement, or has not been started with prmrun or moved with prmmove, it simply starts and runs in the same group as its parent process.
How to determine a goal for your workload

Using this configuration, you can directly set an entitlement (allocation) for a workload using the wlmsend command. By gradually increasing the workload’s allocation with a series of wlmsend calls, you can determine how various amounts of CPU resources affect the workload and its performance with respect to some metric that you may want to use in an SLO for the workload.
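For example, assuming your configuration has a cpushares statement driven by a metric named sales_cpu (a hypothetical name), you might step up the entitlement like this:

# /opt/wlm/bin/wlmsend sales_cpu 20
# /opt/wlm/bin/wlmsend sales_cpu 30
# /opt/wlm/bin/wlmsend sales_cpu 40

After each call, observe the workload’s performance to see how it responds to the larger allocation.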
Some common WLM tasks

WLM is a powerful tool that allows you to manage your systems in numerous ways. The following sections explain some of the more common tasks that WLM can do for you.

Migrating cores across partitions

WLM can manage SLOs across virtual partitions and nPartitions. You must use Instant Capacity cores (formerly known as iCOD CPUs) on the nPartitions for WLM management.
Step 2. Create a WLM configuration file for each partition. Each partition on the system must have the WLM daemon wlmd running. Create a WLM configuration file for each partition, ensuring each configuration uses the primary_host keyword to reference the partition where the global arbiter is running. For information on the primary_host syntax, see “Setting up your WLM configuration file” on page 264.

Step 3.
WLM quick start: the essentials for using WLM Some common WLM tasks file. You can set up and run the global arbiter configuration on a system that is not managed by WLM if needed for the creation of fault-tolerant environments or Serviceguard environments.) This global arbiter configuration file is required.
WLM quick start: the essentials for using WLM Some common WLM tasks Activate the global arbiter configuration file configfile in passive mode as follows: # wlmpard -p -a configfile Again, to see approximately how the configuration would affect your system, use the WLM utility wlminfo. Step 7. Activate the global arbiter.
Within a single HP-UX instance, WLM allows you to allocate a fixed amount of CPU resources using:

• portions of processors (FSS groups)
• whole processors (PSETs)

You can also allocate a fixed amount of CPU resources to virtual partitions and nPartitions. HP recommends omitting from WLM management any partitions that should have a constant size. In such cases, WLM’s capability of migrating cores across partitions is not needed.
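Within a single instance, a fixed allocation is ultimately expressed with a cpushares statement in an slo structure. The following is a minimal sketch only; the group name and share count are hypothetical:

    slo fixed_sales {
        pri = 1;
        entity = PRM group sales;
        cpushares = 20 total;
    }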
WLM quick start: the essentials for using WLM Some common WLM tasks NOTE PRM must be installed on your system for WLM to be able to manage FSS groups. Step 1. Define the workload group and assign a workload to it. In your WLM configuration file, define your workload group in a prm structure using the groups keyword. Assign a workload to the group using the apps keyword. The following example defines a group named sales.
Whole processors (PSETs)

Another method for providing a workload group with a fixed amount of CPU resources is to define the group based on a PSET. The PSETs feature is available for HP-UX 11i v1 (B.11.11) as a free software download: go to http://www.hp.com/go/wlm, select the Patches/support link, and search for “processor sets”. PSETs are included in HP-UX 11i v2 (B.11.23) and later.
WLM quick start: the essentials for using WLM Some common WLM tasks NOTE When WLM is managing PSETs, do not change PSET settings by using the psrset command. Only use WLM to control PSETs. Step 2. Activate the configuration as in the following example, substituting your configuration file’s name for config.wlm: # /opt/wlm/bin/wlmd -a config.
[Figure labels: “SLO is enabled” / “SLO is disabled; workload gets its gmincpu”]

To provide a workload group with CPU resources on a schedule:

NOTE This procedure applies only to PRM-based configurations (confined within a single instance of HP-UX). PRM must be installed on your system for WLM to be able to manage PRM-based workloads.
WLM quick start: the essentials for using WLM Some common WLM tasks prm { groups = sales : 2; apps = sales : /opt/sales/bin/sales_monitor; } Step 2. Define the SLO. The SLO in your WLM configuration file must specify: • A priority (pri) for the SLO • The workload group to which the SLO applies (entity) • Either a cpushares statement or a goal statement so that WLM grants the SLO’s workload group some CPU resources The condition keyword determines when the SLO is enabled or disabled.
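Putting these elements together, an slo structure might look like the following sketch. The values, including the time range in the condition statement, are purely illustrative:

    slo sales_slo {
        pri = 1;
        entity = PRM group sales;
        cpushares = 20 total;
        condition = 08:00 - 16:59;
    }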
WLM quick start: the essentials for using WLM Some common WLM tasks For information on the syntax for the condition keyword, see wlmconf(4). Step 3. Activate the configuration as in the following example, substituting your configuration file’s name for config.wlm: # /opt/wlm/bin/wlmd -a config.wlm Providing CPU resources as needed To ensure a workload gets the CPU resources it needs—without preventing other workloads access to unused CPU resources—WLM allows you to define usage goals.
Here, WLM adjusts the allocation so that the workload always uses at least 60% but never more than 90% of it.

With a usage goal, you indicate how much of its CPU allocation a workload should use. If a workload is not consuming enough of its current allocation, the workload’s CPU allocation is reduced, allowing other workloads to consume more CPU resources if needed.
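A usage goal matching the 60%-to-90% utilization band described above might be written as the following sketch (the group name and the mincpu/maxcpu bounds are hypothetical):

    slo sales_usage {
        pri = 1;
        entity = PRM group sales;
        mincpu = 5;
        maxcpu = 50;
        goal = usage _CPU 60 90;
    }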
WLM quick start: the essentials for using WLM Some common WLM tasks In your WLM configuration file, define your workload group in a prm structure using the groups keyword. Assign a workload to the group using the apps keyword. The following example shows the prm structure for the sales group. prm { groups = sales : 2; apps = sales : /opt/sales/bin/sales_monitor; } Step 2. Define the SLO.
    tune {
        wlm_interval = 5;
    }

Step 4. Activate the configuration as in the following example, substituting your configuration file’s name for config.wlm:

# /opt/wlm/bin/wlmd -a config.wlm

Other functions WLM provides

Run in passive mode to verify operation

WLM provides a passive mode that allows you to see approximately how WLM will respond to a given configuration—without putting WLM in charge of your system’s resources.
WLM quick start: the essentials for using WLM Status information WLM provides If you have WLM on a Temporary Instant Capacity system (using v6 or later), you can configure WLM to minimize the costs of using these resources. You do so by optimizing the amount of time the resources are used to meet the needs of your workloads. For more information on this feature, see “Integrating with Temporary Instant Capacity (TiCAP)/ Pay per use (PPU)” on page 410 and wlmparconf(4) and wlmpard(1M).
WLM prints errors and warnings about configuration file syntax on stderr. For messages about ongoing WLM operations, WLM logs error and informational messages to /var/opt/wlm/msglog, as shown in the following example:

05/07/02 14:12:44 [I] (p13931) "m_apache_access_2min"
05/07/02 14:12:44 [I] (p13931)
05/07/02 14:12:44 [I] (p13932) "m_list.
WLM quick start: the essentials for using WLM Monitoring WLM # /opt/wlm/bin/wlmpard -a config.wlm -l vpar For information on the -l option, see wlmpard(1M). Monitoring WLM Several methods available for monitoring WLM are described in this section. ps The following ps command has options specific to PRM that WLM uses to define workload groups when dividing resources within a single HP-UX instance: ps [-P] [-R workload_group] • -P Adds the column PRMID, showing the workload group for each process.
WLM quick start: the essentials for using WLM Monitoring WLM wlminfo The wlminfo command, available in /opt/wlm/bin/, displays information about SLOs, metrics, workload groups, virtual partitions or nPartitions, and the current host. To display information about workload groups, specify the group keyword as in the following example. Note that as of WLM A.03.
# /opt/wlm/bin/wlminfo host
Hostname    localhost
Cores       2
Cores Used  1.7
Interval    6

For more information on the use of the wlminfo command, see Appendix A, “WLM command reference,” on page 363 and wlminfo(1M).

wlmgui

The wlmgui command, available in /opt/wlm/bin/, graphically displays information about SLOs, metrics, workload groups, partitions, and the current host.
WLM quick start: the essentials for using WLM Monitoring WLM prmlist The prmlist command, available in /opt/prm/bin, displays current CPU allocations plus user and application configuration information. The ‘Upper Bound’ column indicates the per-group consumption cap; the column is blank for each group because the CPU consumption cap is not available with WLM.
WLM quick start: the essentials for using WLM Monitoring WLM Status and message logs WLM provides the following logs: • /var/opt/wlm/msglog • /var/opt/wlm/wlmdstats • /var/opt/wlm/wlmpardstats For information on these logs, including sample output, see the section “Status information WLM provides” on page 103. Event Monitoring Service (EMS) EMS (Event Monitoring Service) polls various system resources and sends messages when events occur.
3 How WLM manages workloads

This chapter discusses how workloads can be managed through service-level objectives. It also discusses two types of WLM SLOs: shares-based and goal-based.
How WLM works

The following tasks outline how WLM works:

1. Sets initial resource allocations. WLM sets resource allocations for your workloads based on metrics for the workloads. When you first start WLM, however, there are no metrics for the workloads.
How WLM manages workloads How WLM works 5. Arbitrates between workloads when CPU resources are insufficient to meet the needs of all workloads. When CPU resources are not sufficient, certain workloads necessarily will not be able to reach their desired performance levels. In these cases, WLM allocates resources to the associated workloads based on their SLOs’ assigned priorities—allowing the higher-priority SLOs to better meet their goals at the expense of the lower-priority SLOs not meeting their goals. 6.
How WLM manages workloads How WLM works For more information on the components shown in the figure, see the following sections: • Types of SLOs “Shares-based SLOs vs goal-based SLOs” on page 118 • Metric data collectors “Supplying data to WLM” on page 482 • WLM daemon “wlmd” on page 374 • WLM configuration file Chapter 5, “Configuring WLM,” on page 135 • WLM global arbiter Chapter 7, “Managing SLOs across partitions,” on page 255 • PRM Appendix F, “Understanding how PRM manages resources,” on pag
Figure 3-1 WLM overview

[Figure: the WLM configuration file, in which you define workloads and SLOs, drives the WLM daemon (wlmd). Usage data collectors and controllers supply input for each workload group (for example, group A with a metric goal, group B with a usage goal, and groups C and D with no goal) to the arbiter. wlmd sends SLO stats (pass/fail) to the Event Monitoring Service (EMS) and the WLM monitoring tools (wlminfo or wlmgui), and writes to the message log /var/opt/wlm/msglog, the optional statistics log /var/opt/wlm/wlmdstats, and the audit data.]
How WLM manages workloads How WLM works Referring to Figure 3-1 on page 115, the main WLM functional flow is as follows: 1. The WLM configuration file specifies the goal-based or shares-based SLOs for each workload. This file also provides the pathnames for data collectors. WLM reads the configuration file and starts the data collectors. 2. For each application with a usage goal, WLM creates a controller (an internal component of WLM).
How WLM manages workloads How WLM works 8. For managing resources within a single HP-UX instance, WLM then creates a new PRM configuration applying the new CPU shares and optional memory shares for the various workload groups. 9. For managing CPU resources (cores) across partitions, the WLM instance on each partition regularly requests from the WLM global arbiter a certain number of cores for its partition.
• With the WLM global arbiter configuration file activated using the -l option to wlmpard, the global arbiter adds data to the /var/opt/wlm/wlmpardstats statistics log file.

Shares-based SLOs vs goal-based SLOs

WLM supports two types of SLOs:

• Shares-based SLOs
A shares-based SLO allows you to specify either a fixed allocation of shares or a shares-per-metric allocation for a workload.
How WLM manages workloads How WLM gets application data How WLM gets application data You use a data collector for each workload that has a performance goal. You can use one of the WLM-provided data collectors or you can make your own. These collectors report their data to WLM on a regular basis. This data updates the values of metrics used in the WLM configuration.
How WLM manages workloads How a workload is managed (controllers) How a workload is managed (controllers) When a configuration is activated, WLM instantiates a controller for each SLO that has a performance goal or a usage goal. For SLOs with usage goals, WLM internally tracks the workload’s actual CPU usage versus its CPU allocation. With performance goals, controllers receive metric updates in the form of performance data from data collectors.
How WLM manages workloads How a workload is managed (controllers) seconds varies in the wrong direction. Similarly, for a goal to have greater than 100 transactions/minute, a reported performance of 80 transactions/minute varies in the wrong direction. Regardless of the direction of underperformance or overperformance, WLM adjusts CPU allocations to more closely match the SLO’s goal. In the case of an SLO violation, however, WLM also sets EMS resources to alert persons monitoring the system.
Allocating CPU resources: The rising tide model

If all workloads’ demand for CPU resources can be met with current resources, WLM satisfies that demand. If, however, demand exceeds supply, WLM uses the “rising tide” model to allocate CPU resources: at a given priority, WLM attempts to raise the allocation of the workload with the lowest CPU allocation to the level of the next lowest allocation.
How WLM manages workloads Allocating CPU resources: The rising tide model Figure 3-2 illustrates the rising tide model. Moving from left to right within a single priority shows how WLM grants additional CPU resources to the workloads.
How WLM manages workloads Example of WLM in use Example of WLM in use Consider a server that runs two workloads: • Accounts payable • Accounts receivable The accounts payable and accounts receivable workloads run constantly. Without WLM, the performance of these workloads varies greatly throughout the day, based mainly on the amount of work competing workloads have at any given time.
Figure 3-4 Server with WLM

[Figure: plots response time in seconds against number of transactions for the accounts payable (AP) and accounts receivable (AR) workloads, marking the AP goal and the AR goal.]

In this example, the accounts receivable workload has priority 1 and a response time goal of less than 1 second. The accounts payable workload has priority 2 and a response time goal of less than 2.5 seconds.
4 How do I use WLM?

This chapter describes the basic steps needed to use WLM. The remaining chapters explain these steps in detail. The WLM configuration allows you to:

• Treat an entire vPar or nPar as a workload
• Create workloads based on PSETs or FSS groups to share a system or partition among several workloads

You then assign shares-based or goal-based SLOs to the workloads. Optionally, you indicate when each SLO is active.
How do I use WLM? Steps for using WLM Evaluate the system and workload performance under this configuration from one to seven days and fine-tune the configuration. Step 2. Implement a configuration with metric goals for the workloads. Again, evaluate the system and workload performance under the configuration from one to seven days and fine-tune the configuration. Step 3. If applicable, implement a time-based configuration with goals.
How do I use WLM? Steps for using WLM NOTE Running the wizard requires Java Runtime Environment version 1.4.2 or later and, for PRM-based configurations, PRM C.03.00 or later. (To take advantage of the latest updates to WLM, use the latest version of PRM available.) The WLM GUI, at /opt/wlm/bin/wlmgui, also allows you to configure WLM without directly editing a configuration file; however, you do need to be familiar with the configuration file syntax.
How do I use WLM? Steps for using WLM For information on the types of SLOs, see “Shares-based SLOs vs goal-based SLOs” on page 118. Step 4. For workloads with CPU usage goals, add goal-based SLOs to your configuration. For information on usage goals, see “Specifying a goal (optional)” on page 199. Step 5. For workloads with performance goals, add goal-based SLOs to your configuration, as explained in “Configuring WLM for metric-based SLOs” on page 467. Step 6. (Optional) Tune the controllers’ behavior.
How do I use WLM? Steps for using WLM NOTE When you start WLM by using the /sbin/init.d/wlm script, WLM runs in secure mode by default. However, if you are upgrading WLM and the /etc/rc.config.d/wlm script had been modified prior to the upgrade, ensure that the secure mode variables discussed in “Securing WLM communications” on page 244 are enabled. You also must have set up security certificates and distributed them to all systems or partitions being managed by the same WLM global arbiter (wlmpard).
How do I use WLM? Steps for using WLM Alternatively, configure EMS monitoring requests that notify you on the death of a data collector. The SLO’s EMS resource: /applications/wlm/slo_status/SLONAME changes to: WLM_SLO_COLLECTOR_DIED (5) Use the EMS configuration interface (available in the SAM or SMH “Resource Management” application group) to set up monitoring requests to watch for this situation. For information about using SMH, see “Configuring EMS notification” on page 360. Step 12.
How do I use WLM? Reconfiguring WLM Edit the /etc/rc.config.d/wlm file as explained in the sections “Setting WLM to start automatically at reboot” on page 242 and “Setting WLM global arbitration to start automatically at reboot” on page 242. You can also set variables in /etc/rc.config.d/wlm to start logging statistics and generating audit data automatically at reboot. Reconfiguring WLM To fine-tune an existing configuration, follow these steps: Step 1. Edit the WLM configuration file.
Disabling WLM and its global arbiter

If you want to temporarily return control of your system to the regular HP-UX resource scheduling, enter the following command to kill the WLM daemon:

# wlmd -k

After deactivation, you can restart WLM using the last active configuration with the command:

# wlmd -A

To prevent WLM from starting automatically at reboot, set the WLM_ENABLE variable in the file /etc/rc.config.d/wlm.
5 Configuring WLM This chapter introduces the WLM configuration file and how to activate the configuration so that WLM manages the system. It covers the creation and activation of the WLM configuration file. The WLM configuration file is the main user interface for controlling WLM. This file is an ASCII file that you can edit with a text editor. WLM does not have a default configuration file. NOTE WLM and PRM have separate configuration files, each with its own syntax.
Configuring WLM Configuration file syntactic conventions For information on these items, see the following sections: • “Specifying the WLM parser version” on page 144 • “Defining the PRM components (optional)” on page 149 • “Defining SLOs” on page 186 • “Tuning the metrics and the SLOs” on page 210 Configuration file syntactic conventions The following syntactic conventions are used in the WLM configuration file: 136 • SLOs, tunables, and PRM information are represented as structures.
Configuring WLM Configuration file syntactic conventions • The names you supply can consist of: — Uppercase letters (A-Z) — Lowercase letters (a-z) — Digits (0-9) — The underscore character (_) Do not use an underscore (_) to start the name of a workload group, an slo structure, or a metric. — The plus character (+) — The hyphen character (-) — The forward slash (/) Do not use slashes in the names of slo structures or metrics. — The period (.
Configuring WLM Using the WLM configuration wizard Using the WLM configuration wizard If you prefer not to work directly with a configuration file, use the WLM Configuration Wizard. Invoke the wizard using the following command: # /opt/wlm/bin/wlmcw The wizard does not provide all the functionality available through a configuration file, but it does greatly simplify the process of creating a configuration.
Configuring WLM Using the WLM configuration wizard Figure 5-1 Chapter 5 HP-UX WLM Configuration Wizard 139
Figure 5-1 HP-UX WLM Configuration Wizard
Configuring WLM Using the WLM GUI Tips on using the WLM GUI’s tabs Here are some tips to get you started using the various tabs.
Configuring WLM Using the WLM GUI Controlling system resources Using the WLM GUI, configure WLM to control resource allocation as follows: 1. Select the Modify tab. 2. Import an existing configuration (one that you had been monitoring), open a configuration file from the disk, or create a new one. 3. Make any desired changes. 4. Select the [Commit changes] button if you have made any modifications to the configuration. 5. Select the [Validate] button. 6. Select the Deploy tab. 7.
Configuring WLM Using the WLM GUI Chapter 5 • Start WLM • Stop WLM • Try WLM without it taking control of my system 143
Configuring WLM Specifying the WLM parser version Specifying the WLM parser version Use the optional version keyword at the beginning of your configuration file to specify which version of the WLM configuration file parser to use with a particular configuration file. This keyword is useful when future versions of the parser are not able to maintain backward-compatibility. However, WLM will be able to parse all configuration files properly once the correct parser version is known.
Configuring WLM Notification of ‘Instant Capacity needed’ / Pay per use optimization WLM does not enable Instant Capacity or Pay per use (PPU) reserves. It merely informs you that the reserves could help you meet your SLOs; manually activate the reserves if you feel it is appropriate. (Independent of this keyword though, you can use wlmpard to automate activation of these reserves.) To enable notification, use the icod_thresh_pri keyword in your WLM configuration.
Configuring WLM Notification of ‘Instant Capacity needed’ / Pay per use optimization When Instant Capacity reserves exist and are needed, the EMS resource is set to: ICOD_NEEDED_TRUE (1) In this case, it may be possible to reduce SLO failures by activating some Instant Capacity reserves. When using icod_thresh_pri, you can also use icod_filter_intervals, another global keyword that must be outside all prm, slo, and tune structures.
Configuring WLM System-wide settings System-wide settings Use one or more of the following keywords outside all structures in your configuration to set values for the host (typically a virtual partition or nPartition). To specify the minimum number of CPU shares a host receives, use the hmincpu keyword and the following syntax: hmincpu = min; where hmincpu Is an optional keyword. You cannot specify this keyword in a configuration that includes a prm structure.
Configuring WLM System-wide settings NOTE For information on the effect of the hmincpu and hmaxcpu keywords in passive mode, see “Passive mode versus actual WLM management” on page 238. The larger a host CPU weight value you assign a host, the more CPU resources it receives when not enough CPU resources are available to satisfy all requests at a given priority level.
Configuring WLM Defining the PRM components (optional) Defining the PRM components (optional) Use a prm structure to define Process Resource Manager, or PRM, components—excluding the CPU allocations. The CPU allocations are controlled by WLM, as determined by the entries in the slo structures. NOTE If you plan on managing only virtual partitions or nPartitions—with no FSS groups or PSETs inside them, you need not specify a prm structure. You can go immediately to the section, “Defining SLOs” on page 186.
Configuring WLM Defining the PRM components (optional) • Specifying a group’s minimum CPU resources (optional) • Specifying a group’s maximum CPU resources (optional) • Weighting a group so it gets more CPU resources (optional) • Specifying a group’s minimum memory (optional) • Specifying a group’s maximum memory (optional) • Weighting a group so it gets more memory (optional) A prm structure takes the following form: prm { groups = { FSS_group_def | PSET_group_def [: LCPU = {ON | OFF} ][, ...
Configuring WLM Defining the PRM components (optional) Here is an example prm structure: prm { groups = finance : 2, sales : 3, marketing : PSET : LCPU = ON; users = jdoe : finance, pdoe : sales, admin : finance sales marketing; apps = finance : /bin/dbase “dbase*Finance”, sales : /bin/dbase “dbase*Sales”; procmap = finance : /bin/env/ UNIX95= /bin/ps -C pid_app -o pid=, sales : /scratch/pidsbyapp salespid_app, marketing : /scratch/pidsbyapp mrketpid_app; gmincpu = finance : 20, sales : 10; gmaxcpu = sales
Configuring WLM Defining the PRM components (optional) # /opt/wlm/bin/wlmgui Step 3. Select the Modify tab. Step 4. Select the [New] button to start a new configuration. The left and right panes change as shown in the next step.
Configuring WLM Defining the PRM components (optional) Step 5. Select the [Add] button. A new host name appears in the right pane and in the left pane.
Configuring WLM Defining the PRM components (optional) Step 6. Select the name of the new host, wlmhost0 in this case, in the left pane. The right pane changes, allowing you to set the configuration elements you would like to see in the new configuration.
Configuring WLM Defining the PRM components (optional) Step 7. Click “PRM” in the left pane. This changes the right pane to show a “Workload groups” item. Check the “Workload groups” box.
Configuring WLM Defining the PRM components (optional) Step 8. Select the “Workload groups” item in the left pane. The right pane changes, allowing you to add a group.
Configuring WLM Defining the PRM components (optional) Step 9. Select the [Add group] button. A new dialog appears. Accept the default (FSS workload group) by selecting the [OK] button.
Configuring WLM Defining the PRM components (optional) Step 10. The right pane changes again, with the GUI starting to define a workload group for you. The GUI fills in the required fields for you, although you can change the values if you like. Fill in other fields if you like. Step 11. Select the [Commit changes] button to save the current configuration in memory. To save changes to disk, go to the Deploy tab.
Configuring WLM Defining the PRM components (optional) Specifying workload groups (optional) A workload group can be one of two types: FSS or PSET. An FSS group is allocated CPU resources by the Fair Share Scheduler (FSS) in the HP-UX kernel. WLM can automatically adjust the CPU allocation for an FSS group based on the group’s progress toward an SLO. You can define multiple SLOs for a single FSS workload group. A PSET group is based on a processor set.
Configuring WLM Defining the PRM components (optional) Use PRM_SYS only as needed. Do not load processes in PRM_SYS indiscriminately. NOTE The OTHERS group is the default group for users who are not assigned to groups. It is also the default group for applications that are not assigned to groups or are started by users who do not have assigned groups. You can specify OTHERS as an FSS group (group_ID must be 1), but not as a PSET group.
Configuring WLM Defining the PRM components (optional) where group Is the workload group name. Use names that are less than eight characters long for proper display by the ps -P command. Do not start the name with an underscore (_). group cannot be either the PRM_SYS or OTHERS group.
Configuring WLM Defining the PRM components (optional) reside.) When new PSETs are created, they inherit the Hyper-Threading state that the system had before WLM was activated (inheritance is based on the system state prior to WLM activation because WLM may change the Hyper-Threading setting for the default PSET to optimize performance). Cores can be moved from one PSET to another and will take on the Hyper-Threading state of their destination PSET.
Configuring WLM Defining the PRM components (optional) must install PSET (PROCSETS) software to obtain PSET functionality; see the HP-UX WLM Release Notes. PSET functionality comes with HP-UX 11i v2 (B.11.23) and later. NOTE It is possible to create a WLM configuration that has more PSET-based workload groups than the underlying system has cores. However, if all the groups are active at the same time, some groups will necessarily not have a core assigned, and their processes will be placed in OTHERS.
Configuring WLM Defining the PRM components (optional) processors, applications, and users to those groups. Once processors are assigned to a PSET workload group, they cannot be used by another group until a new configuration is loaded. Applications and users that are assigned to a PSET group have dedicated CPU resources from the CPU resources assigned to the group. Competition for CPU resources within the processor set is handled by the HP-UX time-share scheduler.
Configuring WLM Defining the PRM components (optional) init_group Is the name of the initial workload group for the user or netgroup. This is the group login chooses when launching the user’s login shell, and the group cron chooses when scheduling jobs for that user. For information on a user’s group placement when the user belongs to multiple netgroups, see the Process Resource Manager User’s Guide. alt_groupX (Optional) Is the name of one of the alternate workload groups for the user or netgroup.
Configuring WLM Defining the PRM components (optional) Place Unix groups in workload groups by defining the uxgrp keyword in the prm structure. The uxgrp keyword can appear, at most, once in a configuration. To assign Unix groups to workload groups, use the following syntax: uxgrp = uxgrp_name : group [, ... ]; where uxgrp_name Is the name of a Unix group. group Is the name of the workload group in which the Unix group should be placed. A Unix group can be assigned to one workload group only.
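Using the syntax above, the following sketch places members of a hypothetical Unix group fin_users in a workload group named finance (both names are invented for illustration):

```
prm {
    groups = finance : 2;
    # Processes started by members of Unix group fin_users
    # run in workload group finance
    uxgrp = fin_users : finance;
}
```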
Configuring WLM Defining the PRM components (optional) NOTE The system is polled every 30 seconds to ensure that processes are running in the appropriate workload groups. If a process forks child processes and immediately exits, the polling will likely miss the parent process. As a result, the parent process is never placed in its workload group. When the child processes are found during polling, WLM will not be able to determine the parent of the processes.
Configuring WLM Defining the PRM components (optional) If you specify an application file name using wildcard characters, all valid executables—without explicit application records—that match the pattern assume the group for the application. For information on how to specify scripts in an apps statement, see the “Script example” on page 169. alt_nameX (Optional) Is an alternate name the application is assigned when executed.
Configuring WLM Defining the PRM components (optional) If alt_nameX is not specified for an application, that application’s group assignment is used for all processes with a file ID that matches the file ID of application. (The file ID is based on the file system device and the inode number.) For an example showing how to specify scripts in an apps statement, see the “Script example” on page 169.
Configuring WLM Defining the PRM components (optional) NOTE Because the full pathname is not required for the script, a rogue user can get access to workload groups—that would otherwise not be accessible—by using the name of the script for new scripts or wrappers. Step 3. Ensure the shell or interpreter is listed in either /etc/shells or /opt/prm/shells. For example, for a perl script named myscript.
Configuring WLM Defining the PRM components (optional) Assigning secure compartments to workload groups (optional) The HP-UX feature Security Containment, available starting with HP-UX 11i v2, allows you to create secure compartments, which provide file and process isolation. You can place one or more secure compartments in a single workload group. After creating your secure compartments, you can place them in workload groups using the scomp keyword.
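A sketch of the idea, assuming scomp records take the same name : group form as the uxgrp records shown earlier (the compartment and group names are invented):

```
prm {
    groups = finance : 2;
    # Processes in secure compartment fincomp run in group finance
    scomp = fincomp : finance;
}
```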
Configuring WLM Defining the PRM components (optional) Specifying process maps to define your own criteria for workload separation (optional) You can define process maps that specify your own criteria for placing application processes in workload groups. Define process maps with the procmap keyword. Criteria defined in this manner supersede WLM’s default criteria (defined by the users, uxgrp, apps, and scomp keywords). The procmap keyword can appear, at most, once in a configuration.
Configuring WLM Defining the PRM components (optional) You can specify more than one PID_finder for a group, but specify each PID_finder in a separate group : PID_finder statement. Wildcards specified in the string are not expanded. NOTE The PID_finder is a single command. Pipes are not directly supported unless embedded in a shell command. See the third example in “PID finder examples” on page 173.
Configuring WLM Defining the PRM components (optional)

    procmap = sales : /bin/env UNIX95= /bin/ps -C pid_app -o pid=;

2. Gather PIDs inside an external script named pidsbyapp. This is less desirable than the previous example because it masks the functionality of what is being run; however, you might find this method more useful because it facilitates specifying multiple or complex PID selection criteria.

    procmap = sales : /scratch/pidsbyapp pid_app;

3. Run a ps command with a pipe within a shell.
Configuring WLM Defining the PRM components (optional) where group Is the workload group name. You cannot specify a PSET group in a disks statement. volume Names a logical volume group. volume must begin with /dev/v to be recognized. shares Is an integer value greater than or equal to 0. shares translates to a percentage of the disk bandwidth when dividing it by the number of disk bandwidth shares assigned to all the workload groups for the given volume group.
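Putting the elements above together, a disks statement might look like this sketch (the group names, volume group, and share values are illustrative):

```
prm {
    groups = finance : 2, sales : 3;
    # finance gets 60 of the 100 shares (60%) of /dev/vg01
    # disk bandwidth; sales gets the remaining 40
    disks = finance : /dev/vg01 : 60,
            sales   : /dev/vg01 : 40;
}
```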
Configuring WLM Defining the PRM components (optional) min Is group’s minimum number of CPU shares. The value must be an integer between 0 and the group’s gmaxcpu value, inclusive. min is out of the total CPU resources, which is 100 multiplied by the number of cores (if you set the tunable absolute_cpu_units to 1 or it is implied by other elements in your configuration)—or just 100 by default.
Configuring WLM Defining the PRM components (optional) With extended_shares enabled in this scenario, the minimum allocation value would be 6.4 or 7.2, which are the two nearest multiples of 0.8. For more information on absolute CPU units, see the section “Using absolute CPU units” on page 217. If the sum of all the gmincpu values is greater than the system’s total CPU resources, the values are treated as CPU resource requests that are to be met before any other requests are considered.
Configuring WLM Defining the PRM components (optional) max Is group’s maximum number of CPU shares. The value must be an integer greater than or equal to the group’s gmincpu value. max is out of the total CPU resources, which is 100 multiplied by the number of cores (if you set the tunable absolute_cpu_units to 1 or it is implied by other elements in your configuration)—or just 100 by default. If you specify a max greater than the total CPU resources, WLM treats it as equal to the total CPU resources.
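For example, the following sketch bounds a group's allocation with gmincpu and gmaxcpu (the values are illustrative and use the default relative units, where 100 represents the total CPU resources):

```
prm {
    groups = finance : 2;
    gmincpu = finance : 20;   # hard floor: 20 CPU shares
    gmaxcpu = finance : 80;   # hard ceiling: 80 CPU shares
}
```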
Configuring WLM Defining the PRM components (optional) extra CPU resources to go to your workload groups, set the distribute_excess tunable in your configuration file. This tunable is described in the section “Distributing excess CPU resources to your workloads (optional)” on page 218. If you do not set this tunable, all the excess CPU resources go to the default workload group OTHERS.
Configuring WLM Defining the PRM components (optional) between their respective weights (in other words, the ratio of group A’s allocation to group B’s allocation equals the ratio of group A’s weight to group B’s weight). NOTE WLM uses this same policy for using weight to determine CPU allocations across partitions. For more information on how WLM manages CPU allocations across partitions, see Chapter 7, “Managing SLOs across partitions,” on page 255. Consider the following example.
Configuring WLM Defining the PRM components (optional) Recall that WLM attempts to equally satisfy all SLOs at a given priority by allocating CPU in the same weight-to-allocation ratio. With B satisfied, we focus on groups A, C, and OTHERS. The weight-to-allocation ratio for A and C is 5/10, or 1/2. The ratio for OTHERS is undefined because it currently has no allocation. Consequently, WLM first allocates shares to OTHERS to bring its weight-to-allocation ratio in line with the ratios of A and C.
Configuring WLM Defining the PRM components (optional) larger than the requests, the requests are used first. Thus, A, B, and C all receive a CPU allocation of 20. That leaves 40% of the CPU resources unallocated. With distribute_excess not set, all 40% goes to OTHERS.
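Group weights such as those in the example above are expressed with the weight keyword in the prm structure. A sketch (group names, IDs, and weight values are illustrative):

```
prm {
    groups = A : 2, B : 3, C : 4;
    # When CPU is divided by weight, A and C each carry
    # five times the weight of B
    weight = A : 5, B : 1, C : 5;
}
```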
Configuring WLM Defining the PRM components (optional) Specifying a group’s minimum memory (optional) You can assign workload groups a minimum percentage of the system’s memory. This minimum is a hard lower limit; the workload group receives less than this minimum only if it has no active SLOs, is not associated with a process map, and transient_groups is set to 1.
Configuring WLM Defining the PRM components (optional) Specifying a group’s maximum memory (optional) You can assign workload groups a maximum percentage of memory. This maximum is a hard upper limit, except for the OTHERS group. (The OTHERS group may receive more than its gmaxmem if all other groups have received their gmaxmem and there is still memory left.
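For example, the following sketch combines the two memory bounds (the values are illustrative; both keywords take percentages of the system's memory):

```
prm {
    groups = finance : 2;
    gminmem = finance : 15;   # at least 15% of memory
    gmaxmem = finance : 40;   # at most 40% of memory
}
```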
Configuring WLM Defining the PRM components (optional) group Is the workload group name. You cannot specify the PRM_SYS group in a memweight statement. weight Is group’s memory weight. The value must be an integer greater than or equal to 1. If you do not specify a memory weight, WLM uses the group’s CPU weight, which defaults to 1 if not specified.
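A sketch of a memweight statement (the group names and weight values are illustrative):

```
prm {
    groups = finance : 2, sales : 3;
    # When the groups compete for memory, finance is favored 3-to-1
    memweight = finance : 3, sales : 1;
}
```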
Configuring WLM Defining SLOs Defining SLOs Use slo structures to specify your service-level objectives and assign priorities to those objectives. You define slo structures for a host partition or based on the workload groups you defined in the prm structure. For information on specifying the prm structure, see “Defining the PRM components (optional)” on page 149.
Configuring WLM Defining SLOs

    [ mincpu = lower_bound_request; ]
    [ maxcpu = upper_bound_request; ]
    cpushares = value { more | total } [ per metric met [ plus offset ] ];
    [ condition = condition_expression; ]
    [ exception = exception_expression; ]
    }

Here are several example slo structures:

    slo buying {
        pri = 1;
        mincpu = 50;
        maxcpu = 200;
        goal = metric stock_price_1 < 50;
        condition = 09:00 - 16:00;
        exception = Sat - Sun;
    }

    slo selling {
        pri = 1;
        mincpu = 50;
        maxcpu = 300;
        goal = metric stock_price_2 > 75;
        cond
Configuring WLM Defining SLOs Select the “Service-level Objectives (SLOs)” box in the right pane.
Configuring WLM Defining SLOs Step 2. Select the “Service-level objectives” item in the left pane.
Configuring WLM Defining SLOs Step 3. Select the [Add SLO] button. A new dialog appears. The default option is “Varies with the group’s usage”. This option is the usage goal. Select the [OK] button.
Configuring WLM Defining SLOs Step 4. The right pane changes, allowing you to define the SLO. The goal field defines the usage goal.
Configuring WLM Defining SLOs Step 5. Fill in the fields as desired. Step 6. Select the [Commit changes] button to save the current configuration in memory. To save changes to disk, go to the Deploy tab. (The file is saved to the file specified in the file name field in the Modify tab when the system is selected in the left pane.) The previous steps highlight only one use of the WLM GUI. For additional information, see the GUI’s online help.
Configuring WLM Defining SLOs • The underscore character (_), as long as it is not the first character • The hyphen character (-) • The period (.) • Quoted characters Any characters not listed here (except the double quote) can be used as long as they are enclosed in double quotes. The slash character (/) is not allowed, even when quoted. The SLO name cannot exceed 216 characters.
Configuring WLM Defining SLOs

Table 5-5 Capturing remaining CPU resources with stretch goals

    Workload    SLO priority
    Finance     2
    Finance     3
    Sales       4

Sales and Finance are the only workloads in the configuration. After the priority 1 and 2 SLOs are met, the remaining CPU resources go to Finance. If there are any CPU resources left at that point, they go to Sales.
Configuring WLM Defining SLOs Now add a third workload named Payroll with an SLO at the same priority as those for the other workloads. This workload’s SLO is to calculate each employee’s overtime, commissions, bonuses, taxes, and total pay in less than 0.5 seconds. The system is now overloaded and cannot meet each SLO. Because each SLO has the same priority, WLM cannot allocate enough resources to any one workload to help it achieve its SLO. Consequently, no SLO is met, as shown in Figure 5-4.
Configuring WLM Defining SLOs NOTE If you plan on managing only virtual partitions or nPartitions (with no FSS groups or PSETs inside), the entity keyword is not needed. Specify the workload group to which an SLO applies using the entity keyword. The entity keyword is required. Use the following syntax: entity = PRM group group_name; where group_name Is a workload group name. NOTE You cannot specify PRM_SYS in an slo structure.
Configuring WLM Defining SLOs NOTE For information on the effect of these keywords in passive mode, see “Passive mode versus actual WLM management” on page 238. Specify the lower and upper bounds on CPU resources for an SLO using the following syntax: mincpu = lower_bound_request; maxcpu = upper_bound_request; where lower_bound_request Is an integer from 0 to upper_bound_request, inclusive. lower_bound_request is the minimum number of CPU shares the SLO’s controller can request.
Configuring WLM Defining SLOs When configuring WLM on a partition, be sure to select a value for lower_bound_request that makes sense in terms of the limits placed on the partition when it was created—namely its minimum and maximum number of CPU resources. NOTE The lower_bound_request value is not a hard limit: Higher priority SLOs may consume all CPU resources before all SLOs are granted CPU resources. However, the associated workload group’s gmincpu value is a hard limit.
Configuring WLM Defining SLOs When configuring WLM on a partition, be sure to select a value for upper_bound_request that makes sense in terms of the limits placed on the partition when it was created—namely its minimum and maximum number of CPU resources. An upper_bound_request may be ignored if the associated workload’s CPU resources are already limited by the group’s gmaxcpu value.
Configuring WLM Defining SLOs You cannot specify both a goal statement and a cpushares statement in the same SLO. Similarly, you cannot have a workload with one SLO that has a goal statement and another SLO that has a cpushares statement that includes more. Specify the goal of an SLO using the following syntax: goal = goal_expression; where goal_expression Indicates either a usage goal or a metric goal (performance goal).
Configuring WLM Defining SLOs low_util_bound Is an integer less than or equal to high_util_bound, but greater than or equal to 0. WLM attempts to keep the utilization percentage above this threshold value, which is 50 by default. It does so by reducing the requested allocation whenever utilization drops below this value. NOTE Setting low_util_bound at or near 0 can result in an SLO that never reduces its requested share allocation. This results in the SLO not giving up its unused shares.
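Pulling these bounds together, a usage goal might be written as in this sketch (the group name, priority, and CPU bounds are invented for illustration):

```
slo finance_usage {
    pri = 2;
    mincpu = 10;
    maxcpu = 80;
    entity = PRM group finance;
    # Keep (CPU used)/(CPU allocated) between 50% and 75%
    goal = usage _CPU 50 75;
}
```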
Configuring WLM Defining SLOs

Figure 5-5 Usage goal conceptually
Goal: Keep utilization, (CPU used) / (CPU allocation), between 50% and 75%

➊ With utilization above 75%, the workload is pretty busy and is using most of its allocation. WLM increases its allocation to ensure it gets enough CPU. This moves the utilization to less than 75%.

➋ In this range, the workload is using a reasonable amount of its allocation.
Configuring WLM Defining SLOs Goals vs stretch goals You can specify one or more goals for a single workload. The highest priority goal should represent the minimum acceptable service level. All other goals should indicate stretch goals—goals that may be hard to achieve, but are desired if possible. Assign stretch goals lower priorities than the primary goal. Think of goals as “needs” and stretch goals as “wants”. Needs (goals) are required to complete the task at the desired performance level.
Configuring WLM Defining SLOs Specifying a fixed or additive allocation of CPU shares (optional) An SLO can directly express an allocation request using the cpushares keyword. This keyword allows you to make fixed or additive allocation requests, where a fixed allocation is a specific number of CPU shares, while an additive allocation provides shares that are added to the allocation from SLOs of higher or equal priority.
Configuring WLM Defining SLOs If the workload group can get a larger allocation from an SLO with an absolute allocation request at that priority, it does so. This absolute request can come from an SLO that uses cpushares with total or from an SLO that uses only the mincpu and maxcpu keywords. total Makes absolute allocation requests starting from 0. The request is exactly equal to value, within the bounds formed by the SLO’s mincpu and maxcpu values, if specified.
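The following sketch contrasts the two forms (the group and metric names are invented for illustration):

```
# Fixed request: exactly 15 shares, within any mincpu/maxcpu bounds
slo sales_fixed {
    pri = 2;
    entity = PRM group sales;
    cpushares = 15 total;
}

# Additive request: 2 shares per active user, added to whatever
# SLOs of higher or equal priority have already granted the group
slo sales_per_user {
    pri = 3;
    entity = PRM group sales;
    cpushares = 2 more per metric number_of_users;
}
```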
Configuring WLM Defining SLOs If a workload has no active SLOs, it receives the following allocation of CPU resources at minimum, unless the gmincpu value is greater, in which case the workload receives the allocation specified by gmincpu: • For an FSS group, at least 1% (or 0.
Configuring WLM Defining SLOs Is an expression that must be true for the SLO to be active. Do not create a condition statement that attempts to detect processes in a transient group using tools such as glance or ps. Whenever the group is deleted (FSS group) or assigned zero CPU resources (PSET-based group), it is impossible for the system to place processes in the group. The condition will then never detect the processes it is looking for.
Configuring WLM Defining SLOs weekday Is Mon, Tue, Wed, Thu, Fri, Sat, or Sun. mm/dd/ccyy Takes mm values 1-12, dd values 1-31, and ccyy values as four-digit years. Use an asterisk (*) for a component to indicate that all valid values are accepted. hh:mm Is hours and minutes on a 24-hour clock. Use an asterisk (*) for hh and/ or mm to indicate that all valid values are accepted.
Configuring WLM Defining SLOs

    exception = metric metricA;        # SLO is active only when
                                       # metricA is 0

    condition = (metric metricA) && !(metric metricB);
                                       # SLO is active when metricA is
                                       # nonzero, and metricB is zero

    condition = */01/2003;             # SLO is active on the 1st of
                                       # every month in the year 2003
Configuring WLM Tuning the metrics and the SLOs Tuning the metrics and the SLOs You can tune metrics and SLOs using tune structures. Among other features, these structures allow you to specify the data collector and its command-line arguments, the frequency at which WLM checks for new performance data and adjusts CPU allocations, and the controllers’ variables. There are three types of tune structures: • Global structure This tune structure applies to all metrics and SLOs.
Configuring WLM Tuning the metrics and the SLOs • Metric-specific structure A metric-specific tune structure applies to all SLOs using the given metric. This structure can be specified, at most, once per metric.
Configuring WLM Tuning the metrics and the SLOs Defining a tune structure consists of the following tasks:

• Specifying a data collector (optional)
• Specifying the WLM interval (optional)
• Using absolute CPU units
• Distributing excess CPU resources to your workloads (optional)
• Refining granularity of CPU (and memory) allocation by increasing shares per core (optional)
• Temporarily removing groups with inactive SLOs (optional)
• Capturing your collectors’ stderr (optional)
• Smoothing metric values (optional)
Configuring WLM Tuning the metrics and the SLOs Step 2. Select the “Global tunables” item in the left pane. The right pane changes, allowing you to set various tunables. Set any tunables as desired. Step 3. Select the [Commit changes] button to save the current configuration in memory. To save changes to disk, go to the Deploy tab. (The file is saved to the file specified in the file name field in the Modify tab when the system is selected in the left pane.)
Configuring WLM Tuning the metrics and the SLOs Specifying a data collector (optional) Whenever you use a metric in your WLM configuration, you need to supply a value for that metric. The coll_argv statement launches a data collector to gather values for a metric. You can specify a data collector in a metric-specific tune structure to gather values for that one metric.
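For instance, the example configuration later in this chapter starts the collector /opt/fin_app/finance_collector for the metric fin_app.query.resp_time; written as a metric-specific tune structure, that pairing looks like this sketch:

```
tune fin_app.query.resp_time {
    coll_argv = /opt/fin_app/finance_collector;
}
```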
Configuring WLM Tuning the metrics and the SLOs The shorter the interval, the more often WLM checks for new performance data and alters the CPU allocation if workloads are not meeting their SLOs. If there is no new data, WLM does nothing. With longer intervals, WLM is more likely to receive new performance data; however, a workload is also more likely to not achieve its SLO for longer periods. The wlm_interval tunable is optional.
Configuring WLM Tuning the metrics and the SLOs • Minimize the effect of changes in the number of available CPU resources (due to WLM management of vPar, Instant Capacity, Temporary Instant Capacity, and Pay per use resources)

For example, the following global tune structure sets the interval to 15 seconds:

    tune {
        wlm_interval = 15;
    }

Using absolute CPU units

With the absolute_cpu_units tunable, you can control whether the CPU units used by the following keywords/tunables are absolute or relative: • min
Configuring WLM Tuning the metrics and the SLOs With relative CPU units (absolute_cpu_units = 0, the default), the units you specify represent a percentage of the system’s total CPU resources and are consequently relative to the number of active cores. For example, the following statement: mincpu = 50; is 50% of the system’s CPU resources, which is 50% of one core on a system with only one active core, but is eight cores on a system with 16 active cores.
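To switch to absolute units, set the tunable in a global tune structure. With this sketch in place, mincpu = 50 always means half of one core, regardless of how many cores are active:

```
tune {
    absolute_cpu_units = 1;
}
```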
Configuring WLM Tuning the metrics and the SLOs

    tune {
        distribute_excess = 1;
    }

Only workloads with active SLOs receive the excess CPU resources. This distribution is based on balancing the weight-to-allocation ratios for the workloads. These ratios are discussed in “Weighting a group so it gets more CPU resources (optional)” on page 178. The distribution is subject to the group CPU maximum values specified by the gmaxcpu keyword.
Configuring WLM Tuning the metrics and the SLOs extended_shares enabled, WLM can support 256 groups starting with HP-UX 11i v2 Update 2 (the 64-group limit is still in effect on HP-UX 11i v1). The default value for extended_shares is 0, where 1 core consists of 100 shares. Enable extended_shares if:

• You want finer granularity of allocation to FSS groups. This might be preferred when the number of FSS groups or CPU resources is large.
• You want to use more than 63 FSS workload groups.
Configuring WLM Tuning the metrics and the SLOs

    tune {
        transient_groups = 1;
    }

NOTE Setting transient_groups equal to 1 in a configuration that does not have a prm structure results in an invalid configuration. With the transient_groups keyword set to 1: FSS groups with no active SLOs are deleted and therefore use no resources; the minimum CPU allocation for PSET groups becomes 0 (or the minimum specified by gmincpu, if resources are available).
Configuring WLM Tuning the metrics and the SLOs Placement of processes for inactive FSS groups With transient_groups=1, if an FSS workload group, say mygrp, has no active SLOs, but does have processes assigned to it by user records, Unix group records, application records, or compartment records, WLM moves its processes to a temporary group named _IDLE_. This group has only the minimum CPU and memory resources and can greatly restrict the progress of a process.
Configuring WLM Tuning the metrics and the SLOs attempted to place in the group with prmrun or prmmove are moved to the group. (For information on how WLM determines which group assignment takes precedence when the same process is identified by multiple records, see “How the application manager affects workload group assignments” on page 459.) NOTE When WLM is managing PSETs, do not change PSET settings by using the psrset command. Only use WLM to control PSETs.
Configuring WLM Tuning the metrics and the SLOs Trimming the statistics log file automatically (optional) You can automatically trim the statistics log file /var/opt/wlm/wlmdstats. This file is created when you use the -l option to wlmd. Enable automatic trimming of the file by using the wlmdstats_size_limit tunable. The syntax is: wlmdstats_size_limit = number_of_megabytes; where wlmdstats_size_limit Is an optional tunable. Specify this tunable in a global tune structure.
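For example, this global tune structure caps the log at roughly 2 MB (the value is illustrative):

```
tune {
    # Trim /var/opt/wlm/wlmdstats when it approaches 2 megabytes
    wlmdstats_size_limit = 2;
}
```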
Configuring WLM Tuning the metrics and the SLOs WLM starts a performance controller for each goal-based SLO. It also starts a usage controller for each usage goal. Each controller calculates new CPU shares requests if the reported data indicates the workload is overachieving or underachieving its goal. The controller then requests the new number of CPU shares.
Configuring WLM Tuning the metrics and the SLOs

    P = 3 - 2 = 1

Performance goal with met > value:

    P = value - met

Usage goal:

    P = Actual utilization - Target utilization

Target utilization is either low_util_bound or high_util_bound.
Configuring WLM Tuning the metrics and the SLOs Use the optional cntl_kp tunable to tune convergence. It can be used in global, metric-specific, and metric/SLO-specific tune structures. Its syntax is: cntl_kp = proportional_term; where proportional_term Is a floating-point value between 0 and 1,000,000 (inclusive) used on the proportional term in the controller’s expression. The default value is 1.
Configuring WLM Tuning the metrics and the SLOs NOTE A proportional_term equal to 0, when applied to the formula for determining new allocations

    New CPU allocation = (Allocation last interval) + cntl_kp * P

results in the formula

    New CPU allocation = (Allocation last interval)

which leads to no change in an SLO’s allocation request. Also, using large values for proportional_term can produce unstable behavior, causing service levels and allocations to oscillate drastically.
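As a sketch, the following metric-specific tune structure halves the default gain for one metric's controllers, making convergence gentler (the metric name is taken from the example configuration later in this chapter; the value is illustrative):

```
tune fin_app.query.resp_time {
    cntl_kp = 0.5;   # default is 1; smaller values converge more slowly
}
```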
Configuring WLM Tuning the metrics and the SLOs Tuning a workload’s SLO convergence: cntl_convergence_rate (optional) NOTE You can also tune convergence with the cntl_kp tunable discussed in the section “Tuning a workload’s SLO convergence: cntl_kp (optional)” on page 224. (If cntl_convergence_rate is not zero, it is used instead of cntl_kp.)
Configuring WLM Tuning the metrics and the SLOs To determine the new CPU shares allocation, each controller effectively executes an algorithm that, when plugging in cntl_convergence_rate (as opposed to cntl_kp), is represented as follows: New CPU allocation = (Allocation last interval) + (cntl_convergence_rate / 0.
Configuring WLM Tuning the metrics and the SLOs smaller than number_of_shares. Similarly, when the difference is greater than 10%, the adjustment is proportionally larger than number_of_shares. The larger number_of_shares is, the larger the adjustments WLM makes to the workload’s CPU allocation. This generally produces faster convergence on the SLO goal. If WLM changes allocations too rapidly, resulting in instability, decrease number_of_shares.
Configuring WLM Tuning the metrics and the SLOs WLM uses this value to re-target the goal it is trying to converge on. For example, with a margin_value of 0.1 and a goal of X, the re-targeted goal is X - (0.1*X). The controller adjusts the goal up or down (based on whether the goal is > or <) by this percentage to create a buffer zone. This adjustment prevents small fluctuations in the service level from causing SLO violations. Consider Figure 5-6.
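For example, with the margin_value of 0.1 discussed above, a goal of "less than 2.0" is internally re-targeted to 1.8. A sketch (the metric name is from the example configuration later in this chapter):

```
tune fin_app.query.resp_time {
    # A goal of "< 2.0" is re-targeted to 2.0 - (0.1 * 2.0) = 1.8,
    # creating a buffer against small fluctuations
    cntl_margin = 0.1;
}
```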
Configuring WLM Example configuration For more information on how to use the cntl_margin tunable, see the white paper “Tuning HP-UX Workload Manager” at /opt/wlm/share/doc/howto/tuning.html. Releasing cores properly (optional) By default, the cntl_base_previous_req tunable is set to 1, which can be beneficial when you are using the WLM global arbiter (wlmpard) to manage virtual partitions or nPartitions or when your WLM configuration has at least one PSET-based workload group with an SLO.
Configuring WLM Example configuration

    }

    # This is a stretch goal for the finance query group. If all other CPU
    # requests of higher priority SLOs have been met, apply more CPU to
    # group finance, so its application runs faster.
    slo finance_query_stretch {
        pri = 5;
        mincpu = 20;
        maxcpu = 80;
        entity = PRM group finance;
        goal = metric fin_app.query.resp_time < 1.
Configuring WLM Example configuration This configuration file specifies four SLOs. The finance_query SLO is active Monday through Friday. Its goal is that the metric fin_app.query.resp_time, which is provided by the executable /opt/fin_app/finance_collector, must always be less than 2.0. This goal is priority 1, the highest priority.
Configuring WLM Trying a configuration without affecting the system SLOs that use this metric. The value of the proportional constant (cntl_kp) for those controllers is 2.0. All other tunables are set according to the default values in the master tunables file. For more example files, see Chapter 9, “Example configuration files,” on page 283.
Configuring WLM Trying a configuration without affecting the system Activate your configuration in passive mode, then start the wlminfo utility. Use wlmsend to manipulate the metric used in the cpushares statement. What is the resulting allocation shown in the wlminfo output? • How do goals work? Is my goal set up correctly? Activate your configuration and monitor the WLM behavior in the wlminfo output.
Configuring WLM Trying a configuration without affecting the system • When an application is run, which workload group does it run in? • Can I run an application in a particular workload group? • Are the alternate names for an application set up correctly? Furthermore, using metrics collected with glance_prm, passive mode can be useful for capacity planning and trend analysis. For more information, see glance_prm(1M).
Figure 5-8 Passive WLM operation

Thus, in passive mode, WLM takes in data on the workloads. It even forms a CPU request for each workload based on the data received. However, it does not change the CPU allocations for the workloads on the system.
rely on prmlist or prmmonitor to observe changes when using passive mode. These utilities display the configuration that WLM used to set up passive mode. However, you can use prmmonitor to gather CPU usage data.

The effect of passive mode on usage goals and metric goals

As noted previously, in passive mode, the WLM feedback loop is not in place. The lack of a feedback loop is most dramatic with usage goals.
Activating the configuration file

When activating a WLM configuration file, you can run WLM in passive mode—allowing you to verify a configuration before using it to control the system’s resources. To use passive mode, specify -p with the wlmd command:

# wlmd -p -a configfile

Then use the wlminfo utility to monitor how WLM would have allocated CPU resources to your workloads in configfile.
Setting WLM to start automatically at reboot

You must either activate a valid WLM configuration or specify one with the variable WLM_STARTUP_SLOFILE in the file /etc/rc.config.d/wlm before you set WLM to start automatically at reboot. For information on activating a configuration, see “Activating the configuration file” on page 241. You can set WLM to start automatically at reboot by setting the WLM_ENABLE variable in the file /etc/rc.config.
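Putting the two variables named above together, the relevant lines in /etc/rc.config.d/wlm might look like the following sketch (the configuration file path is a placeholder):

```
WLM_ENABLE=1
WLM_STARTUP_SLOFILE=/etc/myconfig.wlm
```

With these settings, the startup script activates the named configuration at each reboot.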
Configuring WLM Setting the WLM communications daemon to start automatically at reboot You can set the global arbiter to start automatically at reboot by setting the WLMPARD_ENABLE variable in the file /etc/rc.config.d/wlm to 1: WLMPARD_ENABLE=1 When started at reboot, the WLM global arbiter automatically uses the most recent configuration file that was activated.
Securing WLM communications

When you start WLM using the /sbin/init.d/wlm script, the script uses secure mode by default. (However, if you are upgrading WLM and the /etc/rc.config.d/wlm script had been modified prior to the upgrade, the default might not be secure mode. Ensure that the secure mode variables discussed below are enabled.
Enabling statistics logging at reboot

The following variables in /etc/rc.config.d/wlm allow you to log statistics starting at reboot:

• WLM_STATS_LOGGING

Set WLM to start logging statistics at reboot by setting the WLM_STATS_LOGGING variable in the following file: /etc/rc.config.
• WLMPARD_STATS_LOGGING

Set the WLM global arbiter to start logging statistics at reboot by setting the WLMPARD_STATS_LOGGING variable in /etc/rc.config.d/wlm as follows:

WLMPARD_STATS_LOGGING="vpar"

You can limit the size of /var/opt/wlm/wlmpardstats as explained in “Trimming the global arbiter statistics log automatically (optional)” on page 272.

Disabling statistics logging

To disable statistics logging by WLM, restart WLM without the -l option.
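Combining the two variables, a statistics-logging setup in /etc/rc.config.d/wlm might look like this sketch. The "vpar" value for WLMPARD_STATS_LOGGING is taken from the section above; the WLM_STATS_LOGGING value shown is a hypothetical list of items to log (check wlmd(1M) for the values your version accepts):

```
WLM_STATS_LOGGING="metric,slo"
WLMPARD_STATS_LOGGING="vpar"
```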
WLM and kernel parameters

WLM assumes your HP-UX kernel parameters are already set appropriately for the workloads you are running. Consequently, WLM does not change kernel parameters to meet the specified SLOs.
6 Auditing and billing

WLM produces audit information when you activate a configuration using the -t option with either the WLM daemon wlmd or the WLM global arbiter daemon wlmpard:

# wlmd -t -a configfile

or

# wlmpard -t -a configfile

Once you’ve activated a configuration using -t, use the wlmaudit command to display the audit data:

# wlmaudit

The wlmaudit command allows you to specify a date range for the data to display.
Example wlmaudit report

Here is a sample wlmaudit report. (Periods after January are removed for brevity.)

Date: 08/26/2003
Host: host1
Subject: WLM Audit Report

Summary for wlmd on host1 from 01/01/2003 to 08/26/2003:

                Duration   CPU Entitlement   CPU Usage
Entity          (hr)       (CPU-hr)          (CPU-hr)
------------------------------------------------------
OTHERS          3758.526   16638.380         1039.816
PRM_SYS         3758.526    1389.540          917.580
_IDLE_          2928.954     159.613            0.
Entity: PRM_SYS

Daily records:

Date         Duration   Entitlement   Usage
-------------------------------------------
01/24/2003    1.663       3.052       0.508
01/25/2003   24.000      29.685       4.938
01/26/2003   24.000      29.448       4.877
01/27/2003   24.000      33.425       5.507
01/28/2003   24.000      35.716       5.872
01/29/2003   24.000      36.154       5.960
01/30/2003   24.000      39.523       6.469
01/31/2003   24.000      40.943       6.
Entity: g_nightly

Daily records:

Date         Duration   Entitlement   Usage
-------------------------------------------
01/24/2003    1.663       0.826       0.000
01/25/2003   24.000       3.671       1.035
01/26/2003   24.000       3.401       0.984
01/27/2003   24.000       4.400       0.936
01/28/2003   24.000       3.696       1.170
01/29/2003   24.000       3.785       1.125
01/30/2003   24.000       3.711       1.085
01/31/2003   24.000       3.716       1.
Audit data files

wlmaudit takes the comma-separated data from audit files and displays it in an easily readable report. However, if you would like to use these files directly, they are located in the directory /var/opt/wlm/audit/. The files are named based on the daemon of origin and the date: wlmd.monyyyy and wlmpard.monyyyy, with monyyyy representing the month and year, as in nov2001.
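The daily records lend themselves to simple post-processing. The following sketch computes each day's usage as a percentage of its entitlement; the column layout is assumed from the sample report above, and the awk filter is illustrative rather than part of WLM:

```shell
# Compute usage as a percentage of entitlement from wlmaudit daily records.
# Assumed columns (per the sample report): Date Duration Entitlement Usage
awk 'NF == 4 && $1 ~ /^[0-9]/ {
    printf "%s %.1f%%\n", $1, ($3 > 0 ? 100 * $4 / $3 : 0)
}' <<'EOF'
Date Duration Entitlement Usage
01/25/2003 24.000 29.685 4.938
01/26/2003 24.000 29.448 4.877
EOF
```

Run against a saved report, this gives a quick per-day utilization summary; the same idea applies to the raw comma-separated files once the commas are converted to spaces.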
Enabling auditing at reboot

You can automatically set the WLM daemon, wlmd, and the WLM global arbiter daemon, wlmpard, to generate audit data by setting the following variables to 1 in the /etc/rc.config.d/wlm file:

• WLMD_AUDIT_ENABLE

• WLMPARD_AUDIT_ENABLE

Both variables are set to 0 (no audit data) by default.
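In /etc/rc.config.d/wlm, that amounts to the following two lines (both variable names as given above):

```
WLMD_AUDIT_ENABLE=1
WLMPARD_AUDIT_ENABLE=1
```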
7 Managing SLOs across partitions WLM can manage SLOs across virtual partitions and nPartitions. WLM provides a global arbiter, wlmpard, that can take input from the WLM instances on the individual partitions. The global arbiter then moves CPU resources (cores) between partitions, as needed to better achieve the SLOs specified in the WLM configuration files that are active in the partitions. These partitions can be nested—and can even contain FSS and PSET-based workload groups.
Figure 7-1 WLM managing partitions

(Diagram: a WLM configuration file defining workloads and SLOs feeds the WLM daemon (wlmd); each workload group has a data collector and, for metric or usage goals, a controller; the arbiter sets allocations; output goes to Event Monitoring Service (EMS), the WLM monitoring tools (wlminfo or wlmgui), the message log /var/opt/wlm/msglog, and the optional statistics log /var/opt/wlm/wlmdstats.)
Recommendations, requirements, and restrictions

To successfully manage WLM SLOs across partitions, observe the following:

• HP recommends running WLM global arbitration in secure mode. If you do not run WLM global arbitration in secure mode, a rogue user could manipulate the communications, resulting in one or more partitions being granted an incorrect number of cores.
• For managing nPartitions with WLM, you must use Instant Capacity cores (formerly known as iCOD CPUs). Use the Instant Capacity versions specified in the WLM Release Notes (/opt/wlm/share/doc/Rel_Notes).

• If you manage virtual partitions in combination with Instant Capacity, you must use vPars A.03.01 or later.

• Do not adjust any WLM-managed partition while wlmpard is running.
with virtual partitions (vPars), Instant Capacity, and Pay per use. For more information, see the WLM Release Notes (/opt/wlm/share/doc/Rel_Notes). WLM allocates cores to a partition based on the CPU limits of the partition (physical limits for nPartitions; logical limits for virtual partitions).
Setting up cross-partition management

The following steps give an overview of how to implement cross-partition management.

Step 1. (Optional) Set up secure WLM communications.

Follow the procedure HOW TO SECURE COMMUNICATIONS in wlmcert(1M)—skipping the step about starting/restarting the WLM daemons. You will do that later in this procedure.

Step 2. Create a WLM configuration file for each partition.
This configuration specifies a usage goal for its workload group. The file is included in this book in the section “par_usage_goal.wlm” on page 310. (Be sure to use the par_usage_goal.wlmpar file for the WLM global arbiter.
After verifying and fine-tuning each partition’s WLM configuration file configfile, activate it as follows:

# wlmd -a configfile

To use secure communications, activate the file using the -s option:

# wlmd -s -a configfile

The wlmd daemon runs in secure mode by default when you use the /sbin/init.d/wlm script to start WLM. (If you upgraded WLM, secure mode might not be the default.
capacity, stop wlmpard (using the -k option). For information about how to stop wlmpard, see Appendix A, “WLM command reference,” on page 363. You can change the 15-day default by setting the WLM global arbiter utility_reserve_threshold keyword. For more information, see “Specifying the reserve threshold that determines when WLM stops activating temporary capacity resources” on page 274 or see wlmparconf(4).
# wlmpard -s -a configfile

The global arbiter runs in secure mode by default when you use the /sbin/init.d/wlm script to start WLM. If you upgraded WLM, secure mode might not be the default. Ensure that the WLMPARD_SECURE_ENABLE variable in /etc/rc.config.d/wlm is enabled. For more information, see “Securing WLM communications” on page 244.
primary_host

Is an optional keyword that identifies the host name of the system running the global arbiter. The system can be any HP-UX system that has network connectivity to the partitions being managed by WLM.

hostname

Is the name or IP address of the host running the arbiter.

port_number

(Optional) Is a port number greater than 0 indicating the port that the global arbiter is to monitor.
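For example, each managed partition's WLM configuration file would carry the same line, as in this sketch (myserver is the placeholder host name also used in the par_usage_goal.wlm example later in this book):

```
primary_host = myserver;
```

Every partition names the same host: the one where wlmpard runs.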
The global arbiter configuration file allows you to control settings for the HP-UX WLM global arbiter, which governs cross-partition management and management of Temporary Instant Capacity (TiCAP) and Pay per use (PPU) resources. WLM can manage cores for virtual partitions and nPartitions. (The partitions can even be nested and contain FSS and PSET-based workload groups.
create or modify this configuration file. The following shows the syntax of the par structure, followed by an example structure. An explanation of the syntax for each keyword is provided in subsequent sections. If your configuration uses a previously supported structure type, such as vpar or npar_icod, WLM silently interprets it as a par structure.
Step 3. Select the Modify tab.

Step 4. Select the [New] button.
Step 5. Select the “Global arbiter configuration” box at the top of the right pane, as in the following example. (The [Add], [Copy], and [Remove] buttons are for setting up the WLM configurations for the partitions themselves. You select a partition in the left pane to modify the WLM configuration for that partition.
Step 6. Select the “Global Arbiter” item in the left pane. The right pane changes, allowing you to fill in the various fields for the global arbiter configuration keywords. The syntax for each of the keyword values you specify is described in the remaining sections of this chapter.
interval = number_of_seconds;

where

interval

Is an optional keyword.

number_of_seconds

Is an integer value greater than or equal to 1 indicating how often, in seconds, WLM should consider moving cores between partitions. The interval for your global arbiter should be larger than the largest WLM interval being used on the system.
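A minimal sketch of this keyword inside the global arbiter's par structure, assuming the partitions' WLM instances all run on intervals of 5 seconds or less:

```
par {
    interval = 10;   # seconds; larger than any partition's wlm_interval
}
```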
port number in the WLM global arbiter configuration file, specify the same port number in each partition’s WLM configuration file. If you do not specify a port, wlmpard searches the file /etc/services for the first line with the following format:

hp-wlmpar port_number/tcp

If such an entry is found, port_number is used as the port.
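For instance, an /etc/services entry of the form shown above might read as follows, where the port number is a placeholder you choose for your site:

```
hp-wlmpar 9692/tcp
```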
Specifying the priority at which to use Temporary Instant Capacity or Pay per use resources (optional)

While wlmpard has always managed migration of cores across partitions for WLM, it now also provides management of Temporary Instant Capacity (TiCAP) and Pay per use (PPU) resources for WLM. This management is available on standalone systems, as well as across a collection of partitions.
utilitypri = integer;

where

utilitypri

Is an optional keyword.

integer

Is an integer value greater than or equal to 1. If Temporary Instant Capacity or PPU resources exist, they are used whenever SLOs with a priority from 1 to integer (inclusive) are demanding more cores.
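As a sketch, inside the global arbiter's par structure (the priority value shown is a placeholder):

```
par {
    utilitypri = 2;   # use TiCAP/PPU resources only for priority 1 and 2 SLOs
}
```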
(or less than) integer, you must purchase additional resources. Before adding capacity, be sure to stop wlmpard (using the -k option) and all wlmd clients on the complex. You can set the keyword to 0 to cause the global arbiter to always activate temporary capacity resources as long as any temporary capacity resources are available.
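For example, to always activate temporary capacity while any remains, as described above, set the keyword to 0. The sketch below places it inside the par structure, as with the other global arbiter keywords in this chapter:

```
par {
    utility_reserve_threshold = 0;
}
```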
8 Management of nested nPars / vPars / workload groups

You can manage any combination of FSS or PSET-based workload groups inside virtual partitions inside nPartitions.

NOTE You can use FSS and PSET-based workload groups with partition management. Certain software restrictions apply to using PSET-based groups with virtual partitions (vPars), Instant Capacity, and Pay per use. For more information, see the WLM Release Notes (/opt/wlm/share/doc/Rel_Notes).
Figure 8-1 Nested partitions to be managed (with Instant Capacity)

(Diagram: nPar1 and nPar2 each contain virtual partitions vpar0, vpar1, and vpar2; nPar3 contains FSS groups FSS1 and FSS2 and a PSET-based group.)

Assume you want to share cores, as indicated by the arrows in the figure, among the:

• Virtual partitions within nPar1

• Virtual partitions within nPar2

• FSS and PSET-based workload groups within nPar3

• Three nPartitions

Instant Capacity (iCAP, formerly known a
NOTE Specifying this keyword ensures WLM maintains compliance with your Temporary Instant Capacity (TiCAP) usage rights. When your prepaid amount of temporary capacity expires, WLM no longer attempts to use the temporary resources. When 15 or fewer processing days (the default) of temporary capacity are available, WLM stops activating Temporary Instant Capacity; in this case, you must purchase extra capacity.
Managing FSS and PSET-based groups inside vPars inside nPars (Instant Capacity not available)

Consider the same complex as before, but without Instant Capacity. Movement of cores across the nPartitions cannot be simulated. Figure 8-2 shows this complex.
With Instant Capacity not available, you must have one wlmpard for each nPartition containing virtual partitions. In particular:

Step 1. Set up nPar1.

a. Set up and start a configuration for wlmpard on some system, say mysystem1. mysystem1 can be in the complex or another system on your network.

b.
Managing FSS and PSET-based groups inside vPars

If your system is not broken into nPartitions, you do not need to set up your configurations based on whether the complex has Instant Capacity. You just need to:

Step 1. Run one wlmpard to manage the system’s virtual partitions.

Each virtual partition will have a configuration for wlmd with a primary_host statement naming the system where wlmpard is running.
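Continuing the sketch from the nPar1 steps above, each virtual partition's wlmd configuration then needs only a primary_host line naming that system (mysystem1 is the example host name used earlier):

```
primary_host = mysystem1;
```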
9 Example configuration files This chapter presents the example configuration files available from the /opt/wlm/examples/wlmconf/ directory, as well as an example from the /opt/wlm/toolkits/weblogic/config/ directory. These examples show how to use the syntax discussed in Chapter 5, “Configuring WLM,” on page 135. NOTE Copy these examples to another directory before modifying them. Items kept under /opt/wlm/ may be replaced or altered by future WLM product updates.
• “manual_cpucount.wlm” on page 296

A configuration file to help a WLM user characterize the behavior of a workload. The goal is to determine how a workload, placed in a PSET-based workload group, responds to a series of changes in the number of CPU resources in the PSET. This example is located in the directory /opt/wlm/toolkits/weblogic/config/.

• “manual_entitlement.wlm” on page 298

A configuration file to help a WLM user characterize the behavior of a workload.
brackets ([, ]). Because of the presence of the square brackets, the sample file will not pass the syntax-checking mode of wlmd (wlmd -c template).

• “stretch_goal.wlm” on page 318

Example configuration file to demonstrate how to use multiple SLOs for the same workload (but at different priority levels) to specify a stretch goal for a workload.
NOTE For example configuration files that you can use with Oracle instances, see the section “How do I get started with ODBTK?” on page 429. For pointers to the example configurations for the various WLM toolkits, see the EXAMPLES section of wlmtk(5).

distribute_excess.wlm

This example features the distribute_excess and weight keywords.

# Name: distribute_excess.
Example configuration files distribute_excess.wlm # prm structure # # In this example we have two groups using the machine, Orders # and Sales. In the event there are additional resources available, # we’d like for Sales to get three shares for every share that Orders # receives. # # See wlmconf(4) for complete HP-UX WLM configuration information.
Example configuration files distribute_excess.wlm # tune structure # # We must define a data collector for every metric used in # SLO definitions. In this example, we add a global tune structure # where we enable distribution of excess shares to groups defined in # the prm structure above. This is accomplished by simply setting # the keyword distribute_excess to 1 (TRUE). Without this setting, # HP-UX WLM would allocate all excess shares to the OTHERS group.
Example configuration files enabling_event.wlm # # # tune { at least 1% of the CPU, the allocation for either the Sales group or Orders group would be reduced by 1%.
Example configuration files enabling_event.wlm # Purpose: # Demonstrate the use of HP-UX WLM to enable a service-level objective # (SLO) when a certain event occurs. # # Dependencies: # This example was designed to run with version HP-UX WLM A.01.02 # or later. # # # prm structure # # See wlmconf(4) for complete HP-UX WLM configuration information. # # Define workload groups in the prm structure. Individual users are # assigned to groups in this structure as well.
Example configuration files entitlement_per_process.wlm # tune structure # # Use of a metric in a SLO structure, whether it is a goal or # condition, requires that the metric have a tune structure associated # with it. The structure must specify the data collector for the # metric. In this example, the collector specified with the coll_argv # keyword is wlmrcvdc. This allows the metric backup_running to be set # with wlmsend from a command line.
Example configuration files entitlement_per_process.wlm # # # # # # # # # # # # # # # # # # # # # # # # # # # # Purpose: Demonstrate the use of a shares-per-metric goal. A machine hosts a group’s web servers. However, the group would like to use the rest of the CPU cycles without impacting the web servers, which are often not busy.
}

# Grant 5 shares to servers for every active httpd process.
# Never allow the allocation to fall below 10 shares, nor to
# rise above 90% of the CPU resources.
#
slo servers_proportional {
    pri = 1;
    mincpu = 10;
    maxcpu = 90;
    entity = PRM group servers;
    cpushares = 5 more per metric apache.active_procs;
}

# Any CPU resources that remain after satisfying the above SLO are
# given to the OTHERS group by default.
fixed_entitlement.wlm

The following example shows a fixed entitlement for a workload group.

# Name: fixed_entitlement.wlm
#
# Version information:
#   (C) Copyright 2001-2006 Hewlett-Packard Development Company, L.P.
#   $Revision: 1.
# commands. For more information, see prmmove(1) and prmrun(1).
#
# Note that the group OTHERS is a group created automatically.
# Applications run by users not referenced in the prm structure will
# execute in the OTHERS group. So, only bob and sally can execute
# applications in the Marketing group.
Example configuration files manual_cpucount.wlm manual_cpucount.wlm This example is similar to the previous one: It allows you to easily step through a number of different CPU allocations for a workload group. The difference is that this configuration uses a PSET-based workload group. When you change CPU allocations, you are changing the number of CPU resources assigned to the group’s PSET. The file is located at /opt/wlm/toolkits/weblogic/config/manual_cpucount.wlm.
Example configuration files manual_cpucount.wlm # Dependencies: # This example was designed to run with HP-UX WLM version A.02.01 or # later. It uses the CPU movement among PSET groups feature # introduced in WLM A.02.01. Consequently, it is incompatible with # earlier versions of HP-UX WLM. # # # prm structure # There is a single WebLogic instance, instA, being controlled # in a WLM workload group, wls1_grp.
Example configuration files manual_entitlement.wlm # Tell wlmrcvdc to watch for metrics coming in via command lines: # % /opt/wlm/bin/wlmsend wls1_grp.desired.cpucount 1 # or # % /opt/wlm/bin/wlmsend wls1_grp.desired.cpucount 2 # tune wls1_grp.desired.cpucount { coll_argv = wlmrcvdc ; } # # Check for new metrics every 5 seconds. # Also, turn on absolute CPU units, so resources on a 4-core box are # represented as 400 shares instead of 100 shares.
Example configuration files manual_entitlement.wlm # Caveats: # DO NOT MODIFY this file in its /opt/wlm/examples/wlmconf location! # Make modifications to a copy and place that copy outside the # /opt/wlm/ directory, as items below /opt/wlm will be replaced # or modified by future HP-UX WLM product updates. # # Purpose: # This example demonstrates the use of a parametric entitlement # (allocation) to characterize the behavior of a workload.
Example configuration files manual_entitlement.wlm # the specific allocation. # Components: # Uses the wlmsend and wlmrcvdc tools to relay a metric from # an outside user. # # Dependencies: # This example was designed to run with HP-UX WLM version A.01.02 or # later. It uses the cpushares keyword introduced in A.01.02, so # is incompatible with earlier versions of HP-UX WLM. # # # # prm structure # Create a workload group called grp1 and define the binaries that # make it up.
Example configuration files metric_condition.wlm # # # # Any CPU that remains after satisfying the above SLO is given to the OTHERS group by default. You can change this default using the distribute_excess keyword. For more information on this keyword, see the wlmconf(4) manpage. # # Relay a metric from the outside user. # tune myapp.desired.allocation { coll_argv = wlmrcvdc; } For information on the cpushares keyword, see “Specifying a shares-per-metric allocation request (optional)” on page 205.
Example configuration files metric_condition.wlm # # # # # # # # # # # # # # # Purpose: Demonstrate the use of HP-UX WLM to enable a service-level objective (SLO) when a measurable metric value is reached. Also make use of a glance data collector to provide a metric value. Components: The glance toolkit (included with HP-UX WLM) is used. See glance_prm(1M) for more information on the glance data collectors. Dependencies: This example was designed to run with version HP-UX WLM A.01.01 or later.
Example configuration files metric_condition.wlm } # tune structures # # These structures provide the data collector information # for the metrics used in the slo structure above. # # One data collector is a hypothetical application, written to # calculate and provide average job-completion time in minutes. # # The other metric is calculated using the glance toolkit and # a glance metric called APP_ALIVE_PROC.
Example configuration files par_manual_allocation.wlm par_manual_allocation.wlm This file, in combination with the global arbiter configuration file in the next section, can migrate cores across HP-UX Virtual Partitions (vPars) and/or nPartitions (nPars) based on the number of cores you request on the command line using wlmsend. The way WLM manages cores depends on the software enabled on the complex (such as Instant Capacity, Pay per use, and Virtual Partitions).
Example configuration files par_manual_allocation.wlm # # Dependencies: # This example was designed to run with HP-UX WLM version A.03.00 or # later. (A.03.00 was the first version to support strictly # host-based configurations.) # # To implement WLM’s dynamic partition resizing: # 1. Set the primary_host keyword in this file # # 2. Copy this WLM configuration to each partition in the system # # 3.
Example configuration files par_manual_allocation.wlm # num_cpus value to see how the workload behaves with various # numbers of cores. With small partitions, you should be able to # step through all the available cores and evaluate workload # response quickly. # # 8. Monitor each SLO’s request for CPU shares, using the following # wlminfo command: # # wlminfo slo -v [-l] # # The output will show the shares request (“Req” column) # change for the partition as you change the num_cpus value # using wlmsend.
Example configuration files par_manual_allocation.wlmpar # current partition’s applications relative to the importance of the # applications in all the other partitions. # # When managing partitions, WLM equates 1 core of CPU resources to 100 # shares. The cpushares statement causes the SLO to request 100 shares, or # 1 core, multiplied by the metric num_cpus. So, if num_cpus = 7, the SLO # requests 7 cores for the partition.
Example configuration files par_manual_allocation.wlmpar # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # Name: par_manual_allocation.wlmpar Version information: (C) Copyright 2003-2006 Hewlett-Packard Development Company, L.P. $Revision: 1.
Example configuration files par_manual_allocation.wlmpar # # # # # # # # # # # # # # # # # # # # # # # # # # # # # wlm_interval value you use in any of the partitions’ WLM configuration files.) Several additional par structure keywords are included below but commented out. Use the port keyword to specify the port that the global arbiter should monitor for input from the various WLM instances. Specify a port number greater than 0.
Example configuration files par_usage_goal.wlm par_usage_goal.wlm This file, in combination with the global arbiter configuration file in the next section, can migrate cores across HP-UX Virtual Partitions and/or nPartitions based on usage goals. The way WLM manages cores depends on the software enabled on the complex (such as Instant Capacity, Pay per use, and Virtual Partitions). Activate the WLM configuration file in each partition.
Example configuration files par_usage_goal.wlm # host-based configurations. It was also the first version that does # not require mincpu/maxcpu statements; thus, this configuration file # must be run on A.03.00 or later.) # # # To implement WLM’s dynamic partition resizing: # 1. Set the primary_host keyword in this file # # 2. Copy this WLM configuration to each partition in the system # # 3.
Example configuration files par_usage_goal.wlmpar # where WLM’s global arbiter will run. (This keyword has the same value # on each partition.) # # See wlmconf(4) for complete HP-UX WLM configuration information. # primary_host = myserver; # Change this value # # Set the interval on which WLM takes CPU requests and makes changes in CPU # allocations to 5 seconds. (The default interval is 60 seconds. Using a # smaller interval allows WLM to respond more quickly to changes in # workload performance.
Example configuration files par_usage_goal.wlmpar per use, and Virtual Partitions). This file also shows how several additional par structure keywords (including utilitypri) are used (these are commented out). Activate the WLM global arbiter’s configuration in only one partition. However, activate the WLM configuration file in each partition to be managed by WLM. # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # Name: par_usage_goal.
Example configuration files par_usage_goal.wlmpar # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # configure WLM for dynamic partition resizing. The WLM global arbiter’s interval is set to 10 seconds. Every interval, the arbiter takes CPU requests from the WLM instances running on the partitions and makes changes in the partitions’ CPU allocations.
Example configuration files performance_goal.template performance_goal.template The following file is a template showing how to use various components of the configuration file. The file shows how to use performance-based goals to determine whether an SLO is being met. Values that you must customize are shown in square brackets ([ ]). You must remove the brackets. # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # Name: performance_goal.
Example configuration files performance_goal.template ################## # PRM Components # ################## # # prm structure # # First, we define the workload groups. # # For this example, we will assume two basic workload groups: # finance and sales # Each group only has one application that we will monitor.
goal = metric [fin_app.query.resp_time < 2.0];
condition = [Mon - Fri];   # only active on weekdays
}
# On weekends, we do not expect any query transactions, but just in
# case, we will specify a nominal, fixed CPU allocation for this
# application for off-hours.
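The off-hours SLO the comment describes could be sketched as follows (the SLO name and share value are illustrative; the group name comes from the surrounding template):

```
# Hypothetical weekend fallback: a small fixed allocation for the finance
# group when no query transactions are expected. Values are illustrative.
slo finance_weekend {
    pri = 1;
    entity = PRM group finance;
    cpushares = 5 total;      # nominal fixed allocation
    condition = Sat - Sun;    # only active on weekends
}
```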
Example configuration files stretch_goal.wlm # finance SLOs (above). This theoretical application # (/opt/fin_app/finance_collector) is developed or otherwise provided # by the user. # For more information on how to develop a data collector (also known as # performance monitor), please see /opt/wlm/share/doc/howto/perfmon.html. # # This structure also specifies a constant (cntl_kp), which controls the # rate of service-level convergence toward its goal.
Example configuration files stretch_goal.wlm # $Revision: 1.7 $ # Caveats: # DO NOT MODIFY this file in its /opt/wlm/examples/wlmconf location! # Make modifications to a copy and place that copy outside the # /opt/wlm/ directory, as items below /opt/wlm will be replaced # or modified by future HP-UX WLM product updates.
maxcpu = 50;   # maximum CPU allocation (percentage)
entity = PRM group finance;
goal = metric fin_app.query.resp_time < 2.0;
condition = Mon - Fri;   # only active on weekdays
}
# This is a “stretch” goal for the finance query group. If all other
# goals of higher priority (lower “pri” integral values) have been met,
# apply more CPU to group finance, so its application runs faster
# during prime time (Monday through Friday between 9am and 5pm).
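A stretch SLO of the kind described might look like this sketch (the priority value and the tighter response-time target are illustrative; the time-of-day portion of the condition is omitted here):

```
# Illustrative stretch goal: lower priority (higher "pri" value) than the
# base SLO, with a more aggressive response-time target.
slo finance_query_stretch {
    pri = 2;                  # met only after higher-priority SLOs
    mincpu = 20;
    maxcpu = 80;
    entity = PRM group finance;
    goal = metric fin_app.query.resp_time < 1.0;
    condition = Mon - Fri;    # prime time; time-of-day terms omitted
}
```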
Example configuration files time_activated.wlm # tune fin_app.query.resp_time { coll_argv = /opt/fin_app/finance_collector -a 123 -v; } # # tune structure specifying similar information for the # sales_app.resp_time metric. # tune sales_app.resp_time { coll_argv = /opt/sales_app/monitor -threshold 2 -freq 30; } For information on the condition keyword, see “Specifying when the SLO is active (optional)” on page 205. Also see “Goals vs stretch goals” on page 203. time_activated.
Example configuration files time_activated.wlm # Dependencies: # This example was designed to run with version HP-UX WLM A.01.02 # or later. # # prm structure # # See wlmconf(4) for complete HP-UX WLM configuration information. # # Define all workload groups in the prm structure. Individual # users are assigned to particular groups in this structure as well. # # In this example configuration, the user don can execute # applications in either the Payroll or OTHERS workload groups.
Example configuration files transient_groups.wlm # # resources will be available to users in the Payroll group! slo Payroll_Processing { pri = 1; cpushares = 80 total; entity = PRM group Payroll; condition = */06/* || */21/*; } For information on the condition keyword, see “Specifying when the SLO is active (optional)” on page 205. transient_groups.wlm This example shows how you can reduce the resources used by groups with no active SLOs.
# By setting the transient_groups keyword to 1:
#   * Whenever an FSS group has no active SLOs, the group is removed from
#     the configuration and therefore consumes no resources
#   * Whenever a PSET-based group has no active SLOs, the group gets
#     0 CPU resources
# See the discussion of the transient_groups keyword in the wlmconf(4)
# manpage for information on where processes belonging to such groups
# are placed.
# Set up wlmrcvdc to pick up metrics indicating whether the Serviceguard
# packages are active; set transient_groups keyword.
#
# Have WLM modify allocations (if necessary) every 5 seconds
# because the configuration includes usage goals.
#
tune {
    coll_argv = wlmrcvdc sg_pkg_active;
    transient_groups = 1;
    wlm_interval = 5;
}
# The SLO Paying_slo applies to a PSET-based group, so absolute
# CPU units are in effect.
Example configuration files twice_weekly_boost.wlm } # Whenever the Paying Serviceguard package is active, this SLO’s associated # workload group is allocated 1 to 4 cores, based on a usage goal. When the # package is not active, the Paying workload group, which is based on a # PSET group, gets no CPU resources. This SLO is priority 3; it gets CPU only # after the Apache_slo and Billing_slo are satisfied.
Example configuration files twice_weekly_boost.wlm # Purpose: # Demonstrate a conditional allocation with a moderately complex # condition. # A baseball park’s server runs a number of different workloads # for two groups: the front office and the scouting staff. The # basic allocations are 30 shares for front_office, 30 shares for # scouting, and the remainder (40 in this case) for OTHERS.
Example configuration files twice_weekly_boost.wlm # administrator executes this command: # # % wlmsend scouting.boost_enable 0 # Manually requested boosts receive a higher priority than # the automatic date-based boosts. This is achieved with the ‘pri’ # keyword in the slo definitions. # # In the unusual case that the front office *and* the scouting # team manually boost their allocations, the front office takes # priority, and the scouting boost is disallowed.
Example configuration files twice_weekly_boost.wlm ########## # # Give the front office its basic allocation of 30 shares. # slo front_office_basic { pri = 3; entity = PRM group front_office; cpushares = 30 total; } # # When the day is correct, boost the front office to 60 shares, unless # scouting has requested a manual boost. # slo front_office_date_boost { pri = 2; entity = PRM group front_office; cpushares = 60 total; condition = (Tue || Thu); exception = (metric scouting.
# When the day is correct, boost scouting to 60 shares, unless
# scouting has requested a manual boost.
#
slo scouting_date_boost {
    pri = 2;
    entity = PRM group scouting;
    cpushares = 60 total;
    condition = (Mon || Wed);
    exception = (metric scouting.boost_enable > 0);
}
#
# If the system administrator requests it, boost scouting to 70 shares.
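The manually requested boost mentioned in that last comment could be expressed with a metric-based condition, along these lines (a sketch; the SLO name is hypothetical, the 70-share value and priority come from the comments above):

```
# Hypothetical manual boost: active only while the administrator has sent
# a nonzero scouting.boost_enable value (for example, via wlmsend).
slo scouting_manual_boost {
    pri = 1;                  # higher priority than the date-based boosts
    entity = PRM group scouting;
    cpushares = 70 total;
    condition = metric scouting.boost_enable > 0;
}
```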
usage_goal.wlm
This example shows a CPU usage goal, where WLM attempts to keep a workload’s CPU utilization, defined as (CPU used) / (CPU allocated), within a certain range.
# Name: usage_goal.wlm
# Version information: (C) Copyright 2001-2006 Hewlett-Packard Development Company, L.P. $Revision: 1.
# slo structures
#
# This SLO is defined with a CPU usage, or utilization, goal. This
# is a special goal in that WLM tracks the utilization for you.
# By default, WLM attempts to keep the group’s utilization of its
# allocated CPU between 50% and 75%. If utilization falls below 50%
# (due perhaps to fewer applications running), WLM reduces the Orders
# group’s allocation, making more CPU available to the OTHERS and
# Batch groups.
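Written out, an SLO with such a usage goal might look like the following sketch (the SLO name and the CPU bounds are illustrative; the Orders group comes from the comments above):

```
# Usage goal sketch: WLM itself tracks (CPU used)/(CPU allocated) for the
# group and sizes the allocation to keep utilization in its default
# 50%-75% band. Bounds are illustrative.
slo orders_usage {
    pri = 1;
    mincpu = 5;
    maxcpu = 50;
    entity = PRM group Orders;
    goal = usage _CPU;
}
```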
Example configuration files usage_goal.wlm # tune structure # # Have WLM modify allocations (if necessary) every 5 seconds # because the configuration includes usage goals. # tune { wlm_interval = 5; } For more information on usage goals, see “Specifying a goal (optional)” on page 199.
Example configuration files usage_stretch_goal.wlm usage_stretch_goal.wlm This example shows the use of stretch goals in addition to base CPU usage goals for several different workload groups. A priority 1 SLO specifies a base CPU usage goal for the OTHERS group to ensure that it receives between 1 and 100 CPU shares. Additional priority 1 SLOs are defined for three additional groups to ensure that they receive between 1 and 200 CPU shares.
Example configuration files usage_stretch_goal.wlm # Dependencies: # This example was designed to run with version HP-UX WLM A.02.00 # or later. # # prm structure # # In this example there are four groups defined. Each group, except # OTHERS, has application records. An application record # tells WLM to run an application in a certain group. (OTHERS does # not have an application record because it is the default group: # Any nonroot processes without application records or user records # run in OTHERS.
Example configuration files usage_stretch_goal.wlm # # There are seven SLOs in this example. # # This first SLO ensures that OTHERS receives between 0 and 100 # shares. (The 100 shares represent 1 core because absolute_cpu_units # is set to 1). The goal is a usage goal, attempting to ensure the # group uses between 50% and 75% of its allocated CPU shares (as # explained in the “Purpose” section above).
maxcpu = 200;
goal = usage _CPU;
}
# The following SLOs are for the groups with priority 1 SLOs above.
# The SLOs below are all priority 10 (pri = 10) and are stretch goals.
# These SLOs are met only if there are shares left over after the
# higher priority SLOs have been satisfied. Each SLO is for a
# different group and requests between 1 and 800 shares (or 8
# cores).
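One of these stretch SLOs might be written as follows (the group name g1 is a placeholder; the priority and bounds come from the comments above):

```
# Sketch of a priority 10 stretch SLO: granted CPU only after the
# priority 1 SLOs are satisfied; requests up to 800 shares (8 cores at
# 100 shares per core). Group name is a placeholder.
slo g1_stretch {
    pri = 10;
    mincpu = 1;
    maxcpu = 800;
    entity = PRM group g1;
    goal = usage _CPU;
}
```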
Example configuration files user_application_records.wlm user_application_records.wlm This example shows how to place applications in workload groups. It also shows that application records take precedence when both user records and application records are in effect for an application. # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # Name: user_application_records.wlm Version information: (C) Copyright 2001-2006 Hewlett-Packard Development Company, L.P. $Revision: 1.
# prm structure
#
# Create workload groups. Designate which workload binaries
# and users will be placed in each. We will be managing
# two workloads, apache and netscape, and two groups of users,
# testers and coders. Users not belonging to either group
# are placed in OTHERS.
#
# The users section places individuals into a workload group based
# on their login. Users not listed will run in OTHERS.
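A prm structure combining user records and application records of the kind described might be sketched like this (logins, group IDs, and binary paths are all placeholders; record syntax per wlmconf(4)):

```
# Sketch of a prm structure with user and application records.
# All logins, IDs, and paths below are placeholders.
prm {
    groups = OTHERS : 1, apache : 2, netscape : 3, coders : 4, testers : 5;
    users  = don : coders, amy : testers;        # placement by login
    apps   = apache : /opt/apache/bin/httpd,     # placement by binary
             netscape : /opt/netscape/netscape;
}
```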
Example configuration files user_application_records.wlm # # Grant 35 shares to coders. # slo coders_fixed { pri = 1; entity = PRM group coders; cpushares = 35 total; } # # Grant 35 shares to testers. # slo testers_fixed { pri = 1; entity = PRM group testers; cpushares = 35 total; } # # Grant 10 shares to servers. # slo servers_fixed { pri = 1; entity = PRM group servers; cpushares = 10 total; } # Grant 10 shares to surfers.
For more information on how the application manager works, see “How application processes are assigned to workload groups at start-up” on page 455 and “How the application manager affects workload group assignments” on page 459.
10 Monitoring SLO compliance and WLM WLM allows you to monitor SLO compliance and much more information through wlminfo, wlmgui, and EMS, as described in the following sections: • “Monitoring WLM with the wlminfo command” on page 343 • “Monitoring WLM with the wlmgui command” on page 347 • “Monitoring WLM with EMS” on page 354 Monitoring WLM with the wlminfo command The wlminfo command provides various WLM data, with reports focusing on SLOs, metrics, or workloads.
A few examples of wlminfo are shown below. In the first example, we focus on SLOs. Entering wlminfo slo -v, we get output that includes the SLOs’ goals, as well as the metrics that show how the workloads are performing relative to the goal. Also, we see from the ‘Concern’ column that two SLOs are Disabled, most likely due to a condition statement. This column helps highlight items that may need attention.
Note also that beginning with WLM A.03.
Tue Jun 11 16:06:45 2006
Metric Name             PID   State  Value
_CPU_g_nightly          2103  NEW    30.549558
m_nightly_on            2107  OLD    0.000000
m_nightly_procs         2108  OLD    6.600000
_CPU_g_team             2103  NEW    0.000000
_CPU_OTHERS             2103  NEW    65.095184
_CPU_g_nice             2103  NEW    17.218712
m_apache_access_10min   2109  NEW    7.000000
m_apache_access_2min    2110  NEW    0.000000
m_list.cgi_procs        2111  NEW    0.
Monitoring WLM with the wlmgui command
To display monitoring data graphically, use the wlmgui command. Running wlmgui requires Java Runtime Environment version 1.4.2 or later and, for PRM-based configurations, PRM C.03.00 or later. (To take advantage of the latest updates to WLM and the GUI, use the latest version of PRM available.)
Monitoring SLO compliance and WLM Monitoring WLM with the wlmgui command Monitoring the configuration You can view WLM configurations as well as WLM global arbiter configurations using the GUI. To see a configuration, in the Monitor tab, select the Configurations tab. Figure 10-1 shows parts of a configuration. Scroll bars are available to view the entire configuration.
Monitoring SLO compliance and WLM Monitoring WLM with the wlmgui command Monitoring the number of CPU resources The GUI allows you to monitor how the number of CPU resources has changed over time. This feature is useful when managing partitions. To see this graph, in the Monitor tab, select the CPU resources tab. Figure 10-2 shows the system has had a constant number of CPU resources (cores) active.
Monitoring SLO compliance and WLM Monitoring WLM with the wlmgui command Monitoring the workloads By default, the “Workload groups” tab allows you to monitor the CPU shares and CPU usage for one or more workloads/workload groups. The graph in Figure 10-3 shows the amount of CPU resources that are allocated to the OTHERS group as well as how much it is using.
Monitoring SLO compliance and WLM Monitoring WLM with the wlmgui command In addition to the default ‘CPU Shares’ and ‘CPU Usage’ values, you can graph the selected groups’ ‘Minimum CPU’ and ‘Maximum CPU’ values. These correspond to the gmincpu and gmaxcpu keywords in the configuration. To adjust what values are being graphed, right-click in the graph area to display the menu of options. Figure 10-4 shows this menu.
Monitoring SLO compliance and WLM Monitoring WLM with the wlmgui command Monitoring SLOs In the “Service-level objectives” tab, we can see graphs of metrics used in SLOs—along with the CPU allocations WLM granted in response to the changing metrics. This view provides a significant amount of other data. Figure 10-5 shows details for the SLO for the test group.
Monitoring items you define
The Custom tab allows you to pull together any graphable items you want. Figure 10-6 shows one item graphed.
Monitoring WLM with EMS
WLM provides an EMS monitor to track the workloads’ SLO compliance and WLM itself. EMS can also report the status and values of WLM metrics. You can configure EMS to send notification of items such as SLO performance. The monitor places this data in the standard EMS registrar for access from SAM, SMH, HP OpenView Operations for UNIX, and other EMS clients. Figure 10-7 illustrates the role of EMS in using WLM.
Table 10-1 Overview of WLM EMS resources

To determine                                    Check the following EMS resource
Whether the WLM daemon is up or down            /applications/wlm/daemon_status
When the current configuration was activated    /applications/wlm/config_modify
The priority for the SLO slo_name               /applications/wlm/slo_config/slo_name/priority
The metric that SLO slo_name uses               /applications/wlm/slo_config/slo_name/metric
The workload/workload group to which SLO slo_name
WLM status and time of last configuration
EMS resources regarding WLM status and the time of the last configuration include:
• /applications/wlm/daemon_status
  Indicates the state of the WLM daemon. Possible values are:
  WLM_DOWN   WLM daemon not running
  WLM_UP     Daemon is running; PRM configuration is actively being managed by the WLM daemon
• /applications/wlm/config_modify
  Indicates the time when the last WLM configuration was activated.
Monitoring SLO compliance and WLM Monitoring WLM with EMS SLO status updates EMS resources regarding the status of WLM SLOs include: • /applications/wlm/slo_status/ This class contains a resource instance for each SLO. That instance provides the status of that SLO. • /applications/wlm/slo_status/slo_name This provides the status for slo_name.
Monitoring SLO compliance and WLM Monitoring WLM with EMS Metric status updates EMS resources providing the status of WLM metrics include the following. A resource instance exists for each metric, where met_name is the name of the metric as specified in the WLM configuration file. • /applications/wlm/metric_config/met_name/coll_argv This identifies a specific metric’s data collector as specified with the coll_argv keyword in the tune structure.
Monitoring SLO compliance and WLM Monitoring WLM with EMS To demonstrate how these EMS resources function, consider the following WLM configuration: slo orders { pri = 1; entity = PRM group sales; cpushares = 1 total per metric more_1; } slo buying { pri = 1; entity = PRM group marktg; goal = usage_CPU; } tune { absolute_cpu_units = 1; wlm_interval = 1; } tune more_1 { coll_argv = /workforce/sales/data_collector division1; cntl_avg = 1; cntl_smooth = 0.
Monitoring SLO compliance and WLM Monitoring WLM with EMS Configuring EMS notification Use the HP System Management Homepage (SMH), the enhanced version of SAM, to configure how and when you should be notified of the values of WLM resources. Using an HP-UX 11i v3 (B.11.31) host, SMH enables you to perform system administration tasks on a system through a single Web interface.
Monitoring SLO compliance and WLM Monitoring WLM with EMS Step 10. Double-click the resource. Step 11. Specify the Monitoring Request Parameters to indicate how you want to receive notification of various WLM events. Step 12. Select the OK button.
A  WLM command reference

This appendix describes the following WLM commands:
• wlmaudit   WLM audit report generator
• wlmcert    WLM certificate manager
• wlmcomd    WLM communications daemon
• wlmcw      WLM configuration wizard
• wlmd       WLM daemon
• wlmgui     WLM graphical user interface
• wlminfo    WLM information monitor
• wlmpard    WLM global arbiter (cross-partition management)
• wlmrcvdc   WLM built-in data collector
• wlmsend    Command that makes your command-line or script data available to the wlmrcvdc
WLM command reference wlmaudit wlmaudit The wlmaudit command displays audit data generated by WLM and its global arbiter. Use the -t option with the WLM daemon wlmd or the global arbiter daemon wlmpard before using wlmaudit. wlmaudit uses /opt/perl/bin/perl to display data from audit files stored in the /var/opt/wlm/audit/ directory. For information on the structure of these files, see wlmaudit(1M).
WLM command reference wlmaudit -e end_date Instructs wlmaudit to display audit data up to end_date. The default is the date on the current system. Use the format mm/dd/yyyy when specifying end_date. -o html Displays audit data in a formatted HTML report. The default is text.
wlmcert
wlmcert allows you to manage your WLM security certificates. WLM uses secure communications by default when you use the /sbin/init.d/wlm script to start WLM. For information on using security certificates, see wlmcert(1M). The command syntax is:
wlmcert -h [cmd]
wlmcert -V
wlmcert reset
wlmcert install -c certificate
wlmcert delete -c certificate
wlmcert list
wlmcert extract [-d directory]
where
-h [cmd]   Displays usage information and exits.
WLM command reference wlmcert This operation is performed automatically when you install WLM. After running this operation: • The system trusts itself • You can use the wlmcert extract command to make a copy of the system’s certificate, which you can then add to other systems’ WLM certificate repositories (truststores) to enable secure communications between the current system and those systems install -c certificate Adds the named certificate to the WLM truststore on the current system.
WLM command reference wlmcert list Lists the certificates in the WLM truststore on the current system. The current system can communicate securely with any system for which it has a certificate in its truststore. When using WLM management of virtual partitions or nPartitions, each partition must have in its truststore the certificate for every other partition with which it is being managed. extract [-d directory] Extracts the WLM certificate for the current system, placing it in the named directory.
WLM command reference wlmcomd wlmcomd The wlmcomd communications daemon services requests from the HP-UX Workload Manager (WLM) graphical user interface, wlmgui, allowing local and remote access to the system. You must start wlmcomd to use wlmgui. Start wlmcomd on any system running a WLM daemon (wlmd) or a WLM global arbiter daemon (wlmpard) that you would like to interact with using wlmgui. (wlmpard is needed only if you are using WLM vPar management or its Instant Capacity-based nPartition management.
WLM command reference wlmcomd hp-wlmcom port_number/tcp If such an entry is found, port_number is used as the port. If such an entry is not found, the default port of 9692 is used, assuming it is not in /etc/services with protocol tcp. If it is, an error is issued. -s Causes WLM to run in secure mode if you have distributed security certificates to the systems or partitions being managed by the same WLM global arbiter (wlmpard). For more information on using security certificates, see wlmcert(1M).
WLM command reference wlmcomd connections could result in denial of service. You can restrict connections by deploying wlmcomd on systems behind a firewall that blocks access to the port being used.
WLM command reference wlmcw wlmcw The wlmcw command starts the WLM configuration wizard. The wizard greatly simplifies the creation of your initial WLM configuration. The wizard is only for creating new configurations. It cannot edit existing configurations. Also, it provides only a subset of the WLM functionality to simplify the initial configuration process. After you create a configuration, you can view it to gain a better understanding of how to create more complex configurations manually.
# wlmd -a configfile
wlmd
The wlmd (daemon) command controls the WLM daemon, allowing you to activate configurations as well as stop the daemon. You must log in as root to run wlmd, unless you are just using the -c option. The following are valid option combinations:
wlmd -h
wlmd -V
wlmd -C
wlmd [-p] [-s] [-t] [-W] [-i] -A [-l logoption[=n][,...]]
wlmd [-p] [-s] [-t] [-W] [-i] -a configfile [-l logoption[=n][,...]]
wlmd [-W] -c configfile
wlmd -k
where:
-h   Displays usage information and exits.
WLM command reference wlmd -i Initializes workload group assignments, ensuring a new configuration’s user, Unix group, compartment, and application records are used when the same workload groups exist in the active and new WLM configurations.
WLM command reference wlmd WLM runs in secure mode by default when you use the /sbin/init.d/wlm script to start WLM. (If you upgraded WLM, secure mode might not be the default. Ensure that the appropriate secure mode variables in /etc/rc.config.d/wlm are set correctly. You can change the default by editing the values for these variables. For more information on these variables, see “Securing WLM communications” on page 244.) -t Generates comma-separated audit data files.
WLM command reference wlmd all Logs group, host, metric, and SLO statistics every WLM interval. all=n Logs group, host, metric, and SLO statistics every n WLM intervals. group Logs group statistics every WLM interval. group=n Logs group statistics every n WLM intervals. host Logs host statistics every WLM interval. host=n Logs host statistics every n WLM intervals. metric Logs metric statistics every WLM interval. metric=n Logs metric statistics every n WLM intervals.
% wlminfo slo -o
In place of slo, you can also use group, host, or metric. For more information on wlminfo, see wlminfo(1M). You can enable automatic trimming of the wlmdstats file by using the wlmdstats_size_limit tunable in your WLM configuration. For more information, see wlmconf(4).
-k   Stops (kills) wlmd.
NOTE: Do not use prmconfig -r while wlmd is active. Use wlmd -k to stop WLM.
WLM command reference wlmgui wlmgui The wlmgui command invokes the WLM graphical user interface. It allows you to create, modify, and deploy WLM configurations both locally and remotely. In addition, it provides monitoring capabilities. Valid option combinations are: wlmgui wlmgui -h wlmgui -V where: -h Displays usage information and exits. This option overrides all other options. -V Displays version information and exits. This option overrides all options other than -h.
WLM command reference wlmgui You must start wlmcomd on each system that has a WLM (wlmd) or a WLM global arbiter (wlmpard) that you want to manage using wlmgui. (wlmpard is needed only if you are using WLM vPar management or its Instant Capacity-based nPartition management.) As a security measure, wlmcomd must be explicitly started: /opt/wlm/bin/wlmcomd You can also start wlmcomd at boot time by editing the following file: /etc/rc.config.
wlminfo
The wlminfo command provides various WLM data. You indicate the type of data to display by specifying a command with wlminfo. Commands include slo, metric, group, host, and par. Each command has its own options. The following are valid option combinations:
wlminfo -h [cmd]
wlminfo -V
wlminfo -i
wlminfo slo [-l] [-o] [-v] [-b { 0 | 1 }] [-q] [-c]
wlminfo slo -s slo1 [-s slo2 ...] [-l] [-o] [-v] [-b { 0 | 1 }] [-q] [-c]
wlminfo slo -g grp1 [-g grp2 ...
-h [cmd]   Displays usage information and exits. If you specify cmd, the usage information is limited to cmd data. This option overrides all other options and commands.
-V         Displays version information and exits. This option overrides all commands and any options other than -h.
-i         Launches wlminfo in interactive mode, displaying a graphical user interface (GUI). This option overrides all commands.
Displays data about the most active processes. For a description of the wlminfo output, see wlminfo(1M).
WLM command reference wlmpard wlmpard The wlmpard command controls the HP-UX WLM global arbiter, which governs cross-partition management as well as management of Temporary Instant Capacity (TiCAP) and Pay per use (PPU) resources. Every global arbiter interval (120 seconds by default), the WLM global arbiter checks for CPU resource requests from the partitions using that global arbiter.
WLM command reference wlmpard -V Displays version information and exits. This option overrides all options other than -h. -C Displays the most recent global arbiter configuration, appending two commented lines that indicate the origin of the configuration. -n Prevents the global arbiter from running in daemon mode (that is, forces it to run in the foreground). -p Causes the global arbiter to run in passive mode.
WLM command reference wlmpard variable. For more information on this and other secure mode variables, see “Securing WLM communications” on page 244.) -t Generates comma-separated audit data files. These files are placed in the directory /var/opt/wlm/audit/ and are named wlmpard.monyyyy, with monyyyy representing the month and year the data was gathered. You can access these files directly or through the wlmaudit command. For information on wlmaudit or on the format of the data files, see wlmaudit(1M).
WLM command reference wlmpard -l par Logs statistics in the file /var/opt/wlm/wlmpardstats. You must use -A or -a configfile with -l par. When using -l, specifying: par Logs statistics every global arbiter interval. par=n Logs statistics every n global arbiter intervals. Change the interval as explained in “Specifying the global arbiter interval (optional)” on page 270. For information on setting logging as a default, see “Enabling statistics logging at reboot” on page 245.
WLM command reference wlmrcvdc wlmrcvdc The wlmrcvdc utility collects data and forwards it to the WLM daemon. It can collect this data from either of the following rendezvous points: • Named pipe (FIFO) You send data to named pipes using wlmsend, discussed in the section “wlmsend” on page 393. wlmrcvdc creates the named pipe, using access permissions of 0600. • A command’s standard output For examples showing how to get data to WLM, see “What methods exist for sending data to WLM?” on page 493.
WLM command reference wlmrcvdc This utility simplifies the process of providing WLM the data it needs to gauge SLO performance, set shares-per-metric allocations, and enable/disable SLOs. By using wlmrcvdc, you can avoid using the WLM API, which is discussed in “Sending data from a collector written in C” on page 499. On the command line, the following options are available: -h Displays usage information and exits. This option overrides all other options. -V Displays version information and exits.
WLM command reference wlmrcvdc -g group Sets the Unix group of the FIFO file to group (symbolic or numeric) and changes permissions on the rendezvous point to 0620. If command is specified, this option also sets the Unix group of the command process to group. By default, group is bin. command [args...] Instructs wlmrcvdc to start command with the specified args arguments in the background and use its standard output as the rendezvous point.
WLM command reference wlmrcvdc glance_prm_byvg Retrieves PRM data regarding logical volumes glance_tt/glance_tt+ Retrieve ARM transaction data for applications registered through the ARM API function arm_init() sg_pkg_active Checks on the status of a Serviceguard package time_url_fetch Measures the response time for fetching a URL using the Apache ab tool.
WLM command reference wlmrcvdc wlmoradc Produces an SQL value or an execution time (walltime) that results from executing SQL statements against an Oracle instance wlmwlsdc Gathers WebLogic Server Management Bean (MBean) information to track how busy WebLogic instances are NOTE The WLM daemon discards the stderr for command; however, using the coll_stderr tunable, you can redirect stderr to a specified file.
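Within a WLM configuration, wlmrcvdc is named as a metric's data collector through the tune structure's coll_argv keyword. A minimal sketch (the metric name is taken from an earlier example; with no command argument, wlmrcvdc creates a named pipe and waits for values):

```
# Sketch: feed the sales_app.resp_time metric from outside WLM.
# wlmrcvdc creates the rendezvous point and waits for values sent
# with wlmsend.
tune sales_app.resp_time {
    coll_argv = wlmrcvdc;
}
```

A script could then report values with, for example, wlmsend sales_app.resp_time 1.5.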
WLM command reference wlmsend wlmsend The wlmsend utility sends data to a rendezvous point for the wlmrcvdc utility to collect. Use wlmsend on the command line, in a shell script, or in a perl program. For examples showing how to use wlmsend to get data to WLM, see “What methods exist for sending data to WLM?” on page 493. The syntax is: wlmsend [-h] [-V] [-w wait_time] metric [value] -h Displays usage information then exits. This option overrides all other options.
NOTE: Be careful of I/O buffering when feeding data to wlmsend.
B WLM configuration file syntax overview This appendix provides a quick reference for the WLM configuration file syntax, indicating the required and optional components. Optional components are enclosed in square brackets ([]). Pointers to detailed syntax information are also included. Also included is an example WLM configuration.
}
slo slo2_name {
    pri = priority;
    [ entity = PRM group group_name; ]
    cpushares = value { more | total } [ per metric met [ plus offset ] ];
    [ mincpu = lower_bound_request; ]
    [ maxcpu = upper_bound_request; ]
    [ condition = condition_expression; ]
    [ exception = exception_expression; ]
}
tune [ metric [ slo_name ] ] {
    [ coll_argv = data_collector; ]
    [ wlm_interval = number_of_seconds; ]
    [ absolute_cpu_units = {0 | 1}; ]
    [ di
•   “Defining the PRM components (optional)” on page 149

•   “Defining SLOs” on page 186

This section explains the two different forms of the slo structure shown previously.
Configuration file example

Use the following example to better understand the syntax. For an explanation of the file’s components, see Chapter 5, “Configuring WLM,” on page 135.
    # This is a stretch goal for the finance query group. If all other CPU
    # requests of higher priority SLOs have been met, apply more CPU to
    # group finance, so its application runs faster.
    slo finance_query_stretch {
        pri = 5;
        mincpu = 20;
        maxcpu = 80;
        entity = PRM group finance;
        goal = metric fin_app.query.resp_time < 1.
C   HP-UX command and system call support

Several HP-UX commands and system calls support WLM in assigning users and applications to the proper workload groups. Other commands have options that allow you to use WLM more efficiently. These are standard HP-UX commands and system calls; they are not shipped as part of the WLM or PRM products. Table C-1 lists HP-UX commands and system calls that support workload groups.
Table C-2 describes HP-UX commands that have options for WLM.

Table C-2   WLM options in HP-UX commands

Command     Option      Description

acctcom     -P          Displays the workload group ID (PRMID) of each process.

acctcom     -R group    Displays only processes belonging to the workload group given by group, which is specified by workload group name or workload group ID.

id          -P          Displays the workload group ID (PRMID) and name of the invoking user’s initial group.
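For instance, you can combine these options on the command line; the workload group name used with acctcom below is a hypothetical example, not from the manual.

```
# Display the invoking user's workload group ID (PRMID) and name
id -P

# Display accounting records only for processes in a given
# workload group ("finance" is an illustrative group name)
acctcom -R finance
```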
D   Integration with other products

WLM integrates with various other products to provide greater functionality.
Integrating with Process Resource Manager (PRM)

You can use WLM to control resources that are managed by PRM. WLM uses PRM when a prm structure is included in the WLM configuration. With such configurations, you can use PRM’s informational and monitoring commands, such as prmlist and prmmonitor. You can also use the prmrun and prmmove commands, among others.
Integrating with processor sets (PSETs)

PSETs allow you to group processors together, dedicating those CPU resources to certain applications. WLM can automatically adjust the number of CPU resources in a PSET-based workload group in response to SLO performance.
NOTE: On HP-UX 11i v1 (B.11.11) systems, you must install PSET (PROCSETS) software; see the HP-UX WLM Release Notes. PSET functionality comes with HP-UX 11i v2 (B.11.23) and later.

Certain software restrictions apply to using PSET-based groups with virtual partitions (vPars), Instant Capacity, and Pay per use. For more information, see the WLM Release Notes (/opt/wlm/share/doc/Rel_Notes).
Integrating with nPartitions (nPars)

You can run WLM within and across nPartitions. For systems with partitions using Instant Capacity cores, WLM provides a global arbiter, wlmpard, that can take input from the WLM instances on the individual partitions. The global arbiter then “moves” cores across partitions, if needed, to better achieve the SLOs specified in the WLM configuration files that are active in the partitions.
Integrating with HP Integrity Virtual Machines (Integrity VM)

HP Integrity Virtual Machines is a robust soft partitioning and virtualization technology that provides operating system isolation, shared CPU resources (with sub-core granularity), shared I/O, and automatic, dynamic resource allocation. It is available for HP-UX 11i v2 running on HP Integrity servers.
Running WLM on an Integrity VM Host

To run WLM on the Integrity VM Host, you must use a strictly host-based configuration—a WLM configuration designed exclusively for moving cores across HP-UX Virtual Partitions or nPartitions, or for activating Temporary Instant Capacity (TiCAP) or Pay per use (PPU) cores. (WLM will not run with FSS groups or PSETs on Integrity VM Hosts where guests are running.)
Integrating with Temporary Instant Capacity (TiCAP) / Pay per use (PPU)

This section discusses the use of WLM with Temporary Instant Capacity (v6 or later) or Pay per use (v4, v7, or later). (Instant Capacity was formerly known as iCOD.) In particular, with WLM managing the use of these CPU resources, you can ensure your workloads use only the amount of resources needed for them to meet their SLOs.
To take advantage of this optimization, use the utilitypri keyword in your global arbiter configuration as explained in wlmparconf(4) and wlmpard(1M).

NOTE: Specifying this priority ensures WLM maintains compliance with your Temporary Instant Capacity usage rights.
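As a sketch only: the structure shown below is an assumption based on wlmparconf(4), and the priority value is illustrative. Consult that manpage for the authoritative global arbiter syntax.

```
# Hypothetical wlmpard (global arbiter) configuration: make
# temporary-capacity resources available only to SLOs at
# priority 2 or better (value illustrative).
par {
    utilitypri = 2;
}
```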
Integrating with Security Containment (to form Secure Resource Partitions)

The HP-UX feature Security Containment provides file and process isolation and is available starting with HP-UX 11i v2. Combining that isolation with WLM workload groups, you can form Secure Resource Partitions, which give your workload groups both isolation and automatic resource allocation.
Integrating with OpenView Performance Agent (OVPA) / OpenView Performance Manager (OVPM)

You can treat your workload groups as applications and then track their application metrics in OpenView Performance Agent for UNIX as well as in OpenView Performance Manager for UNIX.
http://openview.hp.com. Now all the application metrics will be in terms of workload (PRM) groups. That is, your workload groups will be “applications” for the purposes of tracking metrics.

Integrating with Serviceguard

This section discusses how you can better use WLM with Serviceguard. The optional HP product Serviceguard provides users and applications with a high availability environment.
Figure D-2   WLM integration with Serviceguard

[Figure: packages PackageA, PackageB, and PackageC distributed across Server1 and Server2 before failover; after Server1 fails, all three packages run on Server2.]
Steps for integration

To integrate WLM with Serviceguard:

Step 1. Install WLM on each node in your Serviceguard cluster.

Step 2. Edit a single WLM configuration file to handle all the Serviceguard packages in the cluster. The following example assumes there are two packages: pkgA and pkgB. This configuration file must:

a. Place all the packages’ applications in workload groups in the prm structure:

    prm {
        ...
c. Set up slo structures for each status metric, with the SLO being active when the package is active:

    slo pkgA_slo {
        ...
        condition = metric pkgA_active;
        ...
    }

    slo pkgB_slo {
        ...
        condition = metric pkgB_active;
        ...
    }

Recall that the condition statement governs when an SLO is active: If an SLO’s condition expression is true, the SLO is active.
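The package-status metrics referenced in these condition statements can be supplied by the sg_pkg_active collector listed in the wlmrcvdc reference. The tune structures below are a sketch, reusing the pkgA/pkgB names from the example above.

```
# Report 1 when the named Serviceguard package is active on
# this node and 0 otherwise, under the metric named by each
# tune structure (package names follow the example above).
tune pkgA_active {
    coll_argv = wlmrcvdc sg_pkg_active pkgA;
}

tune pkgB_active {
    coll_argv = wlmrcvdc sg_pkg_active pkgB;
}
```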
workload groups with inactive SLOs receive a minimum of 0.2% of the total CPU resources (with incremental allocations of 0.1%). Similarly, if you are using WLM memory management, the workload group with no active SLOs receives 1% of the memory (0.2% if extended_shares is set, with incremental allocations of 0.1%), unless the group has a gminmem value requesting more.
Step 5. Activate WLM with your configuration file on all the nodes:

    # /opt/wlm/bin/wlmd -a configfile

NOTE: If you are using WLM’s secure communications, be sure to distribute the security certificates as explained in the section HOW TO SECURE COMMUNICATIONS in wlmcert(1M).

For information on Serviceguard, see the manual Managing MC/ServiceGuard, available at http://docs.hp.com.
Integrating with HP Systems Insight Manager (SIM) and Servicecontrol Manager (SCM)

This section discusses how you can use WLM with the HP products Systems Insight Manager and Servicecontrol Manager. These products both provide a single point of administration for multiple HP-UX systems. Systems Insight Manager is the newer product.
Activate WLM Configuration
    Gracefully shuts down WLM on the selected nodes, then restarts it, activating the WLM configuration file at /var/opt/wlm/SCM-managed.wlm. Run this tool after placing a configuration on the node using the tool Install WLM Configuration.

Enable WLM
    Sets the variable WLM_ENABLE to 1 in the file /etc/rc.config.d/wlm on the selected nodes.
Retrieve WLM Configuration
    Prompts you for a destination directory on the CMS, then places the currently activated configuration files from the selected nodes in the specified directory. These files are named $HOST.wlm, with $HOST replaced by the names of the nodes the files were retrieved from.
Checking the syntax after distributing the files allows you to verify that the applications, data collectors, and any users in the configuration file actually exist on the selected nodes. However, if you have other types of syntax issues, you will have to fix the issues in each of the distributed files—or fix them once in the CMS file and redistribute.
Accessing the WLM tools

NOTE: Using WLM with SIM / SCM requires that you install the fileset CMSConfig.WLMB-CMS-Tools on the SIM / SCM CMS. This fileset is available from the depot /var/opt/mx/depot11 on the host where WLM has been installed.
For more SCM information

For SCM documentation, visit: http://docs.hp.
Integrating with Oracle® databases

WLM allows you to place Oracle database instances and other applications in their own WLM workloads. With the instances and applications separated in this manner, WLM can then manage the performance of each instance and application through prioritized SLOs.
Why use Oracle database metrics with WLM?

The key benefit of using Oracle database metrics with WLM is that you can use the database metrics to manage the performance of your instances. You specify SLOs for the instances based on the metrics.
Tools in the HP-UX WLM Oracle Database Toolkit (ODBTK)

This toolkit includes two tools:

wlmoradc
    wlmoradc is a data collector for Workload Manager and is designed to provide an easy building block for Oracle instance management with wlmd.
What metrics are available?

The following types of database metrics are available:

•   Time elapsed while SQL code executes

•   Value returned by executed SQL code

    This value can be information from Oracle V$ tables. These tables provide dynamic performance data for Oracle instances and allow the Oracle database administrator to see current performance information.
Table D-1   ODBTK’s example WLM configuration files described

WLM configuration file    Purpose

timed_sys_table.wlm       Demonstrate a response-time goal using V$ Oracle system tables

user_cnt_boost.wlm        Demonstrate a conditional allocation, with a new allocation enforced when more than a set number of users connect

Table D-2 describes the wlmoradc configuration files.
ODBTK examples

As noted in the previous section, ODBTK comes with many useful examples in the directory /opt/wlm/toolkits/oracle/config/. We will look at parts of those files here.

The first example draws from /opt/wlm/toolkits/oracle/config/shares_per_user.wlm. The slo structure gives workload group Ora_grp_1 three CPU shares for each user connected to the associated Oracle instance.
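The slo structure itself is not reproduced on this page. As a sketch of the form it takes (reconstructed from the cpushares syntax in Appendix B, using the group and metric names that appear in the next example; see the actual shares_per_user.wlm file for the authoritative version):

```
# Grant Ora_grp_1 three CPU shares per connected user
# (structure reconstructed for illustration only)
slo Ora_1_slo {
    pri = 1;
    entity = PRM group Ora_grp_1;
    cpushares = 3 total per metric oracle.instance1.user_cnt;
}
```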
The next example is from /opt/wlm/toolkits/oracle/config/user_cnt_boost.wlm. The second SLO, Ora_1_slo, provides a minimum 20-share allocation. The first SLO, Ora_1_slo_boost, becomes active and boosts the allocation to 40 shares if the metric oracle.instance1.user_cnt is 11 or more.

    slo Ora_1_slo_boost {
        pri = 1;
        cpushares = 40 total;
        entity = PRM group Ora_grp_1;
        condition = metric oracle.instance1.
For more ODBTK information

If you would like to learn more about ODBTK, see:

•   wlmtk(5)

•   wlmoradc(1M)

•   smooth(1M)

•   HP-UX Workload Manager Toolkits User’s Guide (/opt/wlm/toolkits/doc/WLMTKug.pdf)

•   HP-UX Workload Manager Toolkits A.01.10.
Integrating with Apache

WLM can help you manage and prioritize Apache-based workloads through the use of the WLM Apache Toolkit (ApacheTK), which is part of the freely available product WLM Toolkits (WLMTK). WLM can be used with Apache processes, Tomcat, CGI scripts, and related tools using the HP-UX Apache-based Web Server.
How do I get started with ApacheTK?

The best way to use ApacheTK is to read the white paper Using HP-UX Workload Manager with Apache, available from /opt/wlm/toolkits/apache/doc/apache_wlm_howto.html and at the following Web site: http://www.hp.
Integrating with BEA WebLogic Server

WLM can help you manage and prioritize WebLogic-based workloads through the use of the WLM BEA WebLogic Toolkit (WebLogicTK), which is part of the freely available product WLM Toolkits (WLMTK).

Why use WebLogicTK?

Using WLM with WebLogic, you can move CPU resources to or from WebLogic Server instances as needed to maintain acceptable performance.
For more WebLogicTK information

If you would like to learn more about WebLogicTK, see:

•   /opt/wlm/toolkits/weblogic/doc/weblogic_wlm_howto.html

•   wlmtk(5)

•   wlmwlsdc(1M)

•   expsmooth(1M)

•   HP-UX Workload Manager Toolkits User’s Guide (/opt/wlm/toolkits/doc/WLMTKug.pdf)

•   HP-UX Workload Manager Toolkits A.01.10.
Integrating with SAP software

In conjunction with HP Serviceguard Extension for SAP (SGeSAP), WLM provides integration with SAP applications through the WLM SAP Toolkit (SAPTK). This toolkit is part of the freely available WLM Toolkits, or WLMTK. SAP has several different types of processes, such as dialog (interactive), batch, update, and spool processes. Each type of process might have greater importance at unique times of the month.
wlmsapmap

wlmsapmap is the SAP process ID collector for WLM. It returns a list of process IDs that are of a specified process type. SAP has several process types, including dialog (DIA), batch (BTC), and update (UPD).

How do I get started with SAPTK?

The best way to get started with SAPTK is to read the white papers Using HP-UX Workload Manager with SAP and SAP and HP-UX Workload Manager: Potential use cases, available from: http://www.hp.
Integrating with SAS software

WLM provides integration with SAS software through the WLM Toolkit for Base SAS Software (SASTK). This toolkit, which is part of the freely available WLM Toolkits (WLMTK) product, relies upon the WLM Duration Management Toolkit, DMTK.
•   Examples that show how express lanes can be used to quickly complete urgent jobs

DMTK does not reduce the amount of CPU time an application must have to complete; it merely attempts to regulate the application’s access to CPU resources. For example, if an application takes one hour to complete when using 100% of the CPU resources, DMTK cannot make its duration less than one hour.
Integrating with the HP-UX SNMP agent

WLM provides integration with the HP-UX SNMP agent through the WLM SNMP Toolkit (SNMPTK). This toolkit is part of the freely available WLM Toolkits, or WLMTK. SNMPTK provides a WLM data collector called snmpdc, which fetches values from an SNMP agent so you can use them as metrics in your WLM configuration.
For more SNMPTK information

If you would like to learn more about SNMPTK, see:

•   wlmtk(5)

•   snmpdc(1M)

•   HP-UX Workload Manager Toolkits A.01.10.
E   Useful PRM utilities

This appendix highlights PRM utilities that WLM users may find helpful.

NOTE: These PRM utilities are helpful only when the current WLM configuration includes a prm structure.

For additional information, see the manpages or the Process Resource Manager User’s Guide (available in /opt/prm/doc/).

Useful PRM utilities are:

•   prmanalyze (with any options)

    Allows you to analyze resource usage and contention to help plan configurations.
•   prmmonitor (with any options)

    Monitors current PRM configuration and resource usage by workload group.

•   prmmove (with any options)

    Moves processes or groups of processes to another workload group.

•   prmrun (with any options)

    Runs an application in its assigned group or in a specified group.

Do not use the prmloadconf and prmrecover tools with WLM.
F   Understanding how PRM manages resources

One of the ways in which WLM performs resource management is through HP Process Resource Manager (PRM). This appendix focuses on management using PRM. The appendix is for background information and is included here only for completeness, as WLM uses some of these concepts and tools. Table F-1 provides an overview of resource management.
Management of CPU resources

When you configure WLM, you specify minimum and maximum requests for CPU shares and/or a shares-per-metric request for each SLO. Based on the SLO’s priority, the workload’s performance, and the system’s available CPU resources, WLM manages each workload group’s number of CPU shares, increasing or decreasing that number automatically.
Management of real memory

NOTE: WLM manages memory based on your use of the keywords gminmem, gmaxmem, and memweight in your WLM configuration.

A portion of real memory is always reserved for the kernel (/stand/vmunix) and its data structures, which are dynamically allocated. The amount of real memory not reserved for the kernel and its data structures is termed available memory—available memory is not the total memory on the system.
Capping memory use

You can optionally specify a memory cap (upper bound) for a workload group using the gmaxmem keyword in your WLM configuration, as explained in “Specifying a group’s maximum memory (optional)” on page 184. Typically, you might choose to assign a memory cap to a workload group of relatively low priority, so that it does not place excessive memory demands on the system.
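For example (the group name and the percentage below are illustrative, not taken from an actual configuration), a cap in the prm structure might look like:

```
# Cap a low-priority batch group at 25% of available memory
# (hypothetical group name and value)
prm {
    groups = batch : 2;
    gmaxmem = batch : 25;
}
```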
Figure F-1   Locked memory distribution by the prm2d memory manager

[Figure: 200 Mbytes of available memory (including 170 Mbytes of lockable memory) distributed among GroupA, GroupB, and GroupC.]

PRM does not suppress a process that uses locked memory once the process has the memory, because suppressing the process will not cause it to give back memory pages. However, the memory resources that such a process consumes are still counted against its PRM group.
Management of disk bandwidth

PRM manages disk bandwidth at the logical volume group level. As such, your disks must be mounted and under the control of Logical Volume Manager (LVM) to take advantage of PRM disk bandwidth management. LVM divides the disk in much the same way as the hard partitions implemented under previous versions of HP-UX for the Series 800 systems.
Multiple users accessing raw devices (raw logical volumes) will tend to spend most of their time seeking. The overall throughput on this group will tend to be very low. This degradation is not due to PRM’s disk bandwidth management. When performing file system accesses, you need approximately six disk bandwidth consumers in each workload group before I/O scheduling becomes noticeable. With two users, you just take turns.
To use system resources in the most efficient way, monitor typical resource use in workload groups and adjust shares accordingly. You can monitor resource use with the wlminfo command, the prmanalyze command, the prmmonitor command, or the optional HP product GlancePlus. For wlminfo information, see wlminfo(1M). For prmanalyze information, see prmanalyze(1). For more information on prmmonitor, see prmmonitor(1).
These rules may not apply to processes that bypass login. For more information, see the Process Resource Manager User’s Guide.

How application processes are assigned to workload groups at start-up

Table F-2 describes what workload groups an application process is started in, based on how the application is started.
Table F-2   Group assignments at process start-up (Continued)

Process initiated               Process runs in workload group as follows

By prmmove {targetgrp | -i}     Process runs in the workload group specified by targetgrp or in the user’s initial group. The application manager cannot move a process started in this manner to another group.
However, the next record uses a wildcard in the directory name and is not valid:

    apps = PRM_SYS : “/opt/wl?/bin/wlmd”;   # INVALID

NOTE: Be sure to quote strings that include wildcard characters.
    db02_payroll
    db03_payroll
    db04_payroll
    dbsmon_payroll
    dbwr_payroll
    dbreco_payroll

To make sure all payroll processes are put in the same workload group, use pattern matching in the alternate names field of the application record, as shown in the following example:

    apps = business_apps : “/usr/bin/database db*payroll”;

For alternate names and pattern matching to work, the processes must share the same file ID.
How the application manager affects workload group assignments

The PRM application manager checks that applications are running in the correct workload groups every interval seconds. The default interval is 30 seconds; however, you can change it with “prmconfig -I interval APPL” as explained in prmconfig(1).
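Following the form quoted above (the value 10 is illustrative; see prmconfig(1) for the authoritative syntax), shortening the polling interval might look like:

```
# Check application placement every 10 seconds instead of the
# default 30 (interval value is illustrative)
prmconfig -I 10 APPL
```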
2. Is the process running in a secure compartment that is mapped to a workload group using a compartment record? (Use the HP-UX feature Security Containment to create secure compartments.)

    If yes, move the process to the group indicated in the compartment record.

    If no, continue with the checklist.

3.
    If yes, move the process to the PRM_SYS group.

    If no, the application manager continues with the checklist.

5. Is the process in the PRM_SYS group, or does it have a user ID different from its parent?

    If yes, continue with the checklist.

    If no, leave the process where it is.

6. Is the process run by a user associated with a user record?

    If yes, move the process to the initial group indicated in the user record.
In this case, the application manager determines that the application has been moved manually and leaves it as is, in GroupA.

Next, assume the user launches the bar application, which also has an application record:

    % bar

The application starts in the invoking user’s initial group. However, the application manager will soon place the application in the group specified in the application record, GroupF.
    % calendar

The second and fourth records both seem to match the calendar command. The expressions are expanded in the order they appear in the configuration file. So, the second record is expanded first and is used for the calendar process, placing it in GroupB.
G   Migrating from PRM to WLM

This appendix provides information on converting PRM configuration files to WLM configuration files. If your PRM configuration places users in PRM groups, these users can be grouped similarly in WLM workload groups. If you are migrating from PRM to WLM, you can quickly convert your PRM configuration files to WLM configuration files with the wlmprmconf utility.
The wlmprmconf utility expects that, if specified in the PRM configuration file, groups PRM_SYS and OTHERS have IDs 0 and 1, respectively. Conversely, if IDs 0 and 1 are specified, the corresponding group names must be PRM_SYS and OTHERS, respectively. When the PRM configuration file does not follow these criteria, wlmprmconf generates a WLM configuration file with comments indicating the inconsistency.
H   Advanced WLM usage: Using performance metrics

This appendix includes details about configuring WLM SLOs based on metric goals and supplying performance data to WLM.

Configuring WLM for metric-based SLOs

This section includes details about configuring WLM SLOs to satisfy metric goals.

Overview

To use WLM with metric goals, follow these basic steps:

Step 1. Identify the workloads to run on a given system. Each workload can consist of one or more applications and multiple users.

Step 2.
NOTE:

•   The easiest data collector to set up is wlmrcvdc using the sg_pkg_active command, the wlmoradc command, one of the glance_* commands, or one of the other commands given in the wlmrcvdc section of Appendix A, “WLM command reference,” on page 363.

•   You can also set up wlmrcvdc to forward the stdout of a data-collecting command to WLM.
For information on passive mode, including its limitations, see “Passive mode versus actual WLM management” on page 238.

Activate the WLM file configfile in passive mode as follows:

    # wlmd -p -a configfile

To see approximately how the configuration would affect your system, use the WLM utility wlminfo.

Step 6. Activate the configuration.
When using wlminfo slo, there are two columns that can indicate the death of a data collector process: State and Concern. For more information on these columns, see wlminfo(1M). Alternatively, configure EMS monitoring requests that notify you on the death of a data collector.
The goal keyword is optional. If neither the goal nor cpushares keyword is specified, the SLO is allocated CPU resources according to its mincpu setting. For information on setting mincpu, see “Specifying the lower and upper bound requests on CPU resources (optional)” on page 196. You cannot specify both a goal statement and a cpushares statement in the same SLO.
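For reference, a goal-based slo structure follows the pattern of the finance example in Appendix B. The group and metric names below are borrowed from that example; the priority, CPU bounds, and threshold are illustrative.

```
# SLO with a performance goal: keep the application's query
# response-time metric under the stated threshold
slo finance_query {
    pri = 1;
    mincpu = 20;
    maxcpu = 50;
    entity = PRM group finance;
    goal = metric fin_app.query.resp_time < 2.0;
}
```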
Specifying a shares-per-metric allocation request (optional)

An SLO can directly express an allocation request using the cpushares keyword. This keyword allows you to make allocation requests of the form “x shares of the CPU resources for each metric y”. For example, you could give an Oracle instance n CPU shares for each process in the instance.
1% of the system’s total CPU resources by default (if the tunable extended_shares is set to 1, the gmincpu value is 0.2% by default). request_value is bounded by the SLO’s mincpu and maxcpu values, if specified. If the workload group can get a larger allocation from an SLO with an absolute allocation request at that priority, it does so.
Consider using absolute_cpu_units (discussed in “Using absolute CPU units” on page 217) to minimize the effects of a system’s variable number of CPU resources on your offset value.
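Combining these pieces, a sketch of an offset-based request with absolute CPU units follows. The syntax matches the cpushares form shown in Appendix B; the group, metric, and all values are hypothetical.

```
# Use absolute CPU units so the offset keeps a stable meaning
# as CPU resources are added or removed
tune {
    absolute_cpu_units = 1;
}

# Hypothetical SLO: 5 shares per connected user, plus a
# 20-share base allocation (names and values illustrative)
slo db_conn_slo {
    pri = 2;
    entity = PRM group db_grp;
    cpushares = 5 total per metric db.user_cnt plus 20;
}
```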
Providing CPU resources in proportion to a metric

This section explains how to set up a workload group with an allocation that varies in proportion to a metric. To adjust a workload’s CPU allocation in step with a given metric, such as number of processes in the workload, users connected to a database, or any other metric, use the cpushares keyword with per metric in an slo structure.
In this example, the focus is the sales group, which will get a varying amount of CPU resources based on a metric:

    prm {
        groups = sales : 2;
        apps = sales : /opt/sales/bin/sales_monitor;
    }

Step 2. Define the SLO. The SLO in your WLM configuration file must specify a priority (pri) for the SLO, the workload group to which the SLO applies (entity), and the cpushares statement to request CPU resources in proportion to a metric.
    tune application_procs {
        coll_argv = wlmrcvdc glance_prm APP_ACTIVE_PROC sales;
    }

As a result of this structure, each time the glance_prm command prints a value on standard out, wlmrcvdc sends the value to WLM for use in changing the CPU allocation for the sales group. For information on other ways to use wlmrcvdc, see wlmrcvdc(1M).

Step 4. Activate the configuration.

    # /opt/wlm/bin/wlmd -a config.
data_collector_and_arguments
Is the full path to a data collector, plus any arguments. Separate arguments with white space. Use double quotes to form single arguments for an option and when using characters used in the syntax, such as semicolons, pound characters (#), and curly brackets. This string cannot exceed 240 characters.
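As a sketch of the quoting rule, assume a hypothetical collector that takes an argument containing a space and a pound character; the double quotes keep it a single argument and prevent the # from being read as syntax:

```
tune order_transaction_time {
    coll_argv = /home/orders/data_collector "region #1";
}
```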
tune order_transaction_time {
    coll_argv = /home/orders/data_collector;
}

# Value of metric analysis_application_running given to WLM by wlmsend
# through global wlmrcvdc
slo analysis {
    ...
    condition = metric analysis_application_running;
    ...
}

# Value of metric job_count given to WLM by wlmsend through
# global wlmrcvdc
slo batch_processing {
    ...
    cpushares = 2 more per metric job_count;
    ...
coll_stderr
Is an optional tunable with a default value of /dev/null. Specify this tunable in global or metric-specific tune structures.

file
Is either syslog (which corresponds to syslog on the system through the logger command, typically /var/adm/syslog/syslog.log) or the full path to a file. For example, you can specify file in either of the following ways:

tune num_users {
    ...
    coll_stderr = syslog;
    ...
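The second of the two ways, using a full file path (the path shown here is only a hypothetical example):

```
tune num_users {
    ...
    coll_stderr = /var/opt/wlm/num_users.err;
    ...
}
```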
cntl_smooth
Is a floating-point value ranging from 0 to 0.999. The default value is 0, resulting in no smoothing. Values closer to 1 result in more smoothing.

NOTE Do not use cntl_smooth for metrics that are expected to equal zero in SLO condition expressions. For example, do not use smoothing globally if your configuration uses the sg_pkg_active collector to indicate a Serviceguard package is active (1) or inactive (0).
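For intuition about why smoothing a 0/1 metric is risky in condition expressions, the sketch below applies one common form of exponential smoothing, smoothed = s * previous + (1 - s) * raw, to an alternating series. This formula is an assumption for illustration; the manual does not state WLM’s internal computation:

```shell
s=0.9          # smoothing parameter, as cntl_smooth might be set
prev=0
series=""
for raw in 100 0 100 0 100; do
  prev=$(awk -v s="$s" -v p="$prev" -v r="$raw" \
         'BEGIN { printf "%.2f", s * p + (1 - s) * r }')
  series="$series $prev"
done
echo "smoothed:$series"
```

With these inputs the smoothed series is 10.00 9.00 18.10 16.29 24.66: the raw zeros never appear as zero, which is exactly the hazard described in the NOTE for metrics such as sg_pkg_active.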
Supplying data to WLM

You supply data to WLM so it can manage SLOs that have performance goals, shares-per-metric allocations, or condition/exception statements with metric expressions. You gather this data with collectors provided by WLM or ones you develop yourself. Data collectors are specified in the WLM configuration file. WLM spawns the collectors when the configuration is activated.
NOTE A white paper on data collectors (“Writing a Better WLM Data Collector”) is available at the following location: /opt/wlm/share/doc/howto/perfmon.html
This paper provides additional information on implementing your data collectors.

How applications can make metrics available to WLM

Consider the following overview of how to get metrics to WLM.
Independent collectors

Independent collectors use the wlmsend command to communicate a metric value to WLM. For more information on this command, see “What methods exist for sending data to WLM?” on page 493. They are called “independent” because they are not started by the WLM daemon wlmd, and they are not required to run continuously.
Because they are started by a daemon process (wlmd), stream collectors do not have a stderr on which to communicate errors. However, WLM provides the coll_stderr tunable that allows you to log each collector’s stderr. In addition, a stream data collector can communicate using either syslog(3C) or logger(1) with the daemon facility.
tune metricNC {
    coll_argv = collector_path collector_args;
}

wlmrcvdc is an example of a native collector.

What happens when there is no new data?

If an entire WLM interval passes and WLM does not receive any new data for a metric, all controllers using that metric simply request the same allocations for their associated workload groups as they did the previous interval.
GlancePlus is an optional HP product. Install it if you wish to use the HP ARM implementation or to collect any of the GlancePlus data listed previously. For purchase information, contact your HP sales representative. Table H-1 gives an overview of the types of data you can collect and the methods for transporting that data to WLM.
ARM transaction data

For information on:

• How to get transaction data for an application that is already instrumented with ARM API calls to WLM
• How to instrument an application with ARM API calls

see “Sending ARM transaction data from a modified C program” on page 504. For information on how to simulate transactions, see “Sending ARM transaction data from a script with simulated transactions” on page 509.
The percentage of the total CPU time devoted to processes in this group during the interval. This indicates the relative CPU load placed on the system by processes in this group.

APP_MEM_UTIL
The approximate percentage of the system’s physical memory used as resident memory by processes in this group during the interval.
Percentage of time the CPU resources were active during the interval.

GBL_NUM_USER
The number of users logged into the system.

For a full list of the available data, see the GlancePlus online help, available through gpm. To extract global data (GBL_*), use wlmrcvdc with the glance_gbl command in the configuration file. For example:

tune num_users {
    ...
    coll_argv = wlmrcvdc glance_gbl GBL_NUM_USER; # Metric to monitor
    ...
tune active_procs {
    ...
    coll_argv = wlmrcvdc glance_prm
        APP_ACTIVE_PROC  # Metric to monitor
        Grp3;            # Name of workload (PRM) group
    ...
}

tune mem_util {
    ...
    coll_argv = wlmrcvdc glance_prm
        APP_PRM_MEM_UTIL # Metric to monitor
        Grp3;            # Name of workload (PRM) group
    ...
}

Use glance_prm to collect metrics of the form APP_* and APP_PRM_*.
To extract PRM by volume group data (PRM_BYVG_* metrics), use wlmrcvdc with the glance_prm_byvg command in the configuration file. For example:

tune vg_util {
    ...
    coll_argv = wlmrcvdc glance_prm_byvg
        PRM_BYVG_GROUP_UTIL # Metric to monitor
        Grp17 /dev/vg03;    # Name of workload (PRM) group
                            # and logical volume group
    ...
}

For wlmrcvdc conceptual information, see “Sending data with wlmsend and wlmrcvdc: How it works” on page 509.
For wlmrcvdc syntax information, see “wlmrcvdc” on page 388.
Table H-2 provides an overview of how to send your data to WLM. For more information on the transport methods, read the sections following the table.

Table H-2 Available transport methods

For sending data from: Command-line/shell script
The transport method is: On the command line or in a loop in a script:

    wlmsend metric value

or

    cmnd1 | cmnd2 | [...
Table H-2 Available transport methods (Continued)

For sending data from: stdout of command
The transport method is: A tune structure in the configuration file:

    tune metric {
        ...
        coll_argv = wlmrcvdc command;
        ...
    }

For sending data from: program that uses WLM API (libwlm.sl)
The transport method is: A tune structure in the configuration file:

    tune metric {
        ...
        coll_argv = program;
        ...
# Set up the SLO
slo data_cruncher {
    pri = 3;
    mincpu = 15;
    maxcpu = 25;
    entity = PRM group crunch;
    goal = metric job_time < 2.0;
}

# Set up wlmrcvdc
tune job_time {
    coll_argv = wlmrcvdc;
}

On the command line that follows, some program is updating logfile.
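A sketch of what such a command line looks like, assuming each line the program appends to logfile is a single numeric job_time value (the use of tail here is illustrative):

```
tail -f logfile | /opt/wlm/bin/wlmsend job_time
```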
}

# Set up wlmrcvdc
tune job_time {
    coll_argv = wlmrcvdc;
}

Next, set up wlmsend in a loop in a shell script. Here, the function get_value provides the metrics to feed wlmsend. Alternatively, the script itself could interact with the workload to gather performance data. Add a sleep command to slow down how often metrics are retrieved.
    mincpu = 15;
    maxcpu = 25;
    entity = PRM group crunch;
    goal = metric job_time < 2.0;
}

# Set up wlmrcvdc
tune job_time {
    coll_argv = wlmrcvdc;
}

Next, set up wlmsend in a loop in a perl program. Here, the function get_value provides the metrics to feed wlmsend. Alternatively, the program itself could interact with the workload to gather performance data. Add a sleep command to slow down how often metrics are retrieved.
Sending data via stdout

You can use wlmrcvdc to forward the stdout of a command to WLM. If your data collector program prints its metrics to stdout, you can run the program by invoking it with wlmrcvdc in the WLM configuration file. wlmrcvdc automatically forwards stdout to WLM.

NOTE If the command exits with status zero, you can use wlmsend to continue feeding data to the rendezvous point for wlmrcvdc.
• Develop a program that writes its data to stdout and use it with wlmrcvdc. For information on this approach, see “Sending data via stdout” on page 499.
• Use the WLM API. This section discusses the API.

NOTE You can also write a program that invokes the wlmsend command. However, if you are writing a program, it is recommended that you use wlmrcvdc to capture stdout or you use the API.
For information on signal handling, see “Handling signals in data collectors” on page 515. Run your data collection program in the group PRM_SYS to ensure it receives the proper resources. Each data collector should send data to WLM once per WLM interval, if possible. (For information on this interval, see “Specifying the WLM interval (optional)” on page 215.)
NOTE This function is not thread-safe: Multiple threads cannot call the function at the same time.

The syntax is:

int wlm_mon_attach(char *metric);

where

metric
Is the name of the metric being passed on this connection. This string must exactly match the metric string used in the configuration file, in the specification of the goal.
/opt/wlm/include/wlm.h

NOTE This function is not thread-safe: Multiple threads cannot call the function at the same time.

The syntax is:

ssize_t wlm_mon_write(int handle_id, void *buf, size_t numbytes);

where

handle_id
Is the handle identifier returned by the wlm_mon_attach() call.

buf
Is a pointer to the double-precision floating-point value to be transferred to the data queue.

numbytes
Is the number of bytes to write to the data queue.
Closing communications with wlm_mon_detach()

The data collector uses the wlm_mon_detach() function to close (detach from) a communication channel with the WLM daemon. To use this function, you must reference the following include file:

/opt/wlm/include/wlm.h

NOTE This function is not thread-safe: Multiple threads cannot call the function at the same time.
If you are using the HP ARM implementation, you can send transaction data to WLM using wlmrcvdc with the glance_tt command.

Figure H-1 presents an overview of how an application that is instrumented with ARM API calls works with WLM. First, the application invokes the ARM API calls, made available through libarm. libarm then feeds the data from the ARM calls to an implementation-specific storage facility.
• The glance_tt(5) manpage, which lists some of the more frequently used metrics (manpage also available at http://www.hp.com/go/wlm)

The ARM API consists of six routines, which are described in the following table. For complete usage information on these routines, see arm(3), which is installed when you install GlancePlus. (To see this manpage, be sure /opt/perf/man is part of your MANPATH environment variable.)
    (process the data)
    arm_stop(process_data, status)
    arm_start(report_data)
    (generate report)
    arm_stop(report_data, status)
    arm_stop(total_time, status)
end loop
arm_end

To use ARM transaction data with WLM:

Step 1. Modify your program to use the ARM API calls described previously. To use these calls, you must reference the include file /opt/perf/include/arm.h.
Step 3. Ensure that ttd is configured and running. For more information, see ttd(1).

Step 4. Start the program you instrumented with ARM calls. The start and stop times for your transactions will now be made available through the HP ARM implementation. For more information, see ttd(1) and midaemon(1).

Step 5. Edit your WLM configuration to pick up the transaction data.
Sending ARM transaction data from a script with simulated transactions

Using transactions that simulate the key transactions of your workload can simplify data collecting. These transactions may not provide the fine granularity that you can achieve by placing ARM API calls in the source code. However, you do avoid modifying the application with instrumentation. Set up simulated transactions as follows:

Step 1. Create a simulation.
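The manual does not show the simulation itself; a minimal sketch of the idea is to time a stand-in operation the way the real key transaction would be timed (the sleep below is only a placeholder for that transaction):

```shell
# Time one simulated transaction; the sleep stands in for the
# workload's real key transaction.
start=$(date +%s)
sleep 1
end=$(date +%s)
elapsed=$((end - start))
echo "simulated transaction took ${elapsed}s"
```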
• Use wlmsend on the command line, in a shell script, or in a perl program.
• Use wlmrcvdc in the WLM configuration file.

NOTE Avoid the process overhead of using wlmsend in a compiled data collector; let wlmd invoke your data collector through the configuration file if possible. Use the API described in “Sending data from a collector written in C” on page 499 to write such a data collector.
Figure H-2 illustrates the command-pipe mode of operation for wlmrcvdc. In this mode, wlmrcvdc starts a command, reads its stdout, and forwards that data to the WLM daemon, wlmd.
Figure H-3 wlmrcvdc: FIFO-file mode (wlmrcvdc)
➊ wlmrcvdc creates a rendezvous point based on the metric name
➋ wlmsend feeds data into the rendezvous point
➌ wlmrcvdc collects the data and forwards it to wlmd

In this next example, when the WLM configuration file is activated, wlmrcvdc creates a rendezvous point, with a name based on the metric name resp_time.
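The two halves of FIFO-file mode might be sketched as follows; the metric name resp_time comes from the example, while the value 1.5 is only an illustrative sample:

```
# In the WLM configuration file, wlmrcvdc with no command
# creates the rendezvous point for resp_time:
tune resp_time {
    coll_argv = wlmrcvdc;
}

# From a shell script or the command line, wlmsend feeds
# a value into that rendezvous point:
/opt/wlm/bin/wlmsend resp_time 1.5
```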
Figure H-4 wlmsend: command-output mode (command | wlmsend metric)
➊ wlmsend continuously feeds its stdin to the rendezvous point
➋ wlmrcvdc collects the data and forwards it to wlmd

This example of wlmsend’s command-output mode shows how to set up the WLM configuration file and wlmsend to work together to forward the data to WLM.
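In this mode the data-producing command is piped directly to wlmsend, matching the command | wlmsend metric form in Table H-2. A sketch, assuming a hypothetical monitor_cmd that prints one numeric value per line on stdout:

```
monitor_cmd | /opt/wlm/bin/wlmsend resp_time
```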
Figure H-5 illustrates the single-value mode of operation for wlmsend. In this mode, wlmsend is repeatedly invoked with a single metric value. However, it is invoked in a loop in a shell script or a perl program that updates that metric value.
In a shell script, the loop might look like the following:

while true
do
    value=`get_value`
    /opt/wlm/bin/wlmsend job_time $value
    sleep 60
done

In a perl program, the loop might look like one of the following:

while (1) {
    $value = &get_value;
    system "/opt/wlm/bin/wlmsend job_time $value";
    sleep(60);
}

Using a print statement instead of a system statement:

open(WLM, "| /opt/wlm/bin/wlmsend $metric");
# make the new file descriptor unbuffered
select((select(WLM), $|=1)[0]);
while (1) {
    $value = &get_value;
    print WLM "$value\n";
    sleep(60);
}
Glossary

absolute CPU units
In WLM configurations, absolute CPU units are specified in terms of 100, where 100 represents one core. For example, 50 absolute CPU units represents half of a core; 200 absolute CPU units represents 2 cores. Absolute CPU units are useful when the number of active cores changes due to WLM management of Instant Capacity (iCAP), Temporary Instant Capacity (TiCAP), Pay per use (PPU), or virtual partition resources.
deactivated processor
A processor that either has not yet been activated or that has been turned off by the Instant Capacity software (formerly known as iCOD software) and returned to the pool of inactive processors. These processors are available for activation.

default PSET
In configurations managing PSET-based and FSS workload groups, the default PSET, when created at system initialization, consists of all of your system’s processors.
launches run in the user’s initial group—assuming those applications are not specified in application records. This is the group prmconfig, prmmove -i, login, at, and cron use to determine where to place user processes. If a user does not have a user record or is not in a netgroup that has a user record, the user default group OTHERS becomes the user’s initial group. The initial group is shown as init_group in the following example: users = user : init_group [alt_group1 alt_group2 ...
OTHERS group
The reserved workload group OTHERS with ID 1. WLM uses this group as the initial group for any user who does not have a user record in the WLM configuration file.

creator. When you create a job, the shell assigns all the processes in the job to the same process group. Signals can propagate to all processes in a process group; this is a principal advantage of job control.

partitions
See the definition for nPartitions and virtual partitions.
system’s CPU resources, which is half of 1 core on a system with only 1 active core, but 8 cores on a system with 16 active cores. See also absolute CPU units.

rendezvous point
A designated location for holding performance data. The wlmsend utility forwards data for a given metric to a rendezvous point. The wlmrcvdc utility monitors the rendezvous point, receiving the data and sending it to the WLM daemon.

secure compartment
See the definition for Secure Resource Partition.
Temporary Instant Capacity (TiCAP)
An HP product option included with Instant Capacity (iCAP) that enables you to purchase prepaid processor/core activation rights for a specified (temporary) period of time. Temporary capacity is sold in increments such as 20-day or 30-day increments, where a day equals 24 hours for a core. TiCAP was formerly referred to as TiCOD. See also Instant Capacity (iCAP), Pay per use (PPU).
WLM
Workload Manager. HP-UX WLM provides automatic resource allocation and application performance management through the use of prioritized service-level objectives (SLOs).

WLM daemon
See wlmd.

workload group
You assign critical applications to workload groups. WLM then manages the performance of FSS workload groups by adjusting their CPU resources, while assigning PSET workload groups whole cores for their dedicated use. See also active workload group.
Index

A
absolute CPU units described, 54 enabling, 217 absolute_cpu_units keyword, 217 and Serviceguard, 417 acctcom command support for WLM, 402 allocating CPU resources fixed amount, 92 for given time, 96 granularity of, 219 rising tide model, 122 alternate name pattern matching and, 457 PRM group assignments based on, 460 ApacheTK, 434 API for sending performance data, 493 wlm_mon_attach() function, 501 wlm_mon_detach() function, 504 wlm_mon_write() function, 502 application records defining, 166 workloa
login support for WLM, 401 ps examples, 73 support for WLM, 402 smooth, 428 wlmgui, 347 communications securing automatically at reboot, 244 compartment records defining, 171 workload group placement precedence, 459 condition keyword, 206 configuration file activating global arbiter (wlmpard -a filename), 265, 384 WLM (wlmd -a filename), 241, 376 characters allowed, 136 configuration wizard (wlmcw command), 138 conventions, 136 creating, 136 getting started, 78 examples, 233, 283 location, 79 global
how to write, 482 independent, 484 monitoring to ensure it is running, 131, 469 native, 485 Oracle data, 492 sending existing metrics to WLM, 493 SIGTERM signal, 515 specifying in the configuration file, 215, 477 stderr (capturing), 223, 479 white paper, 483 data collector stream, 484 de-allocating CPU resources when not needed, 233 disk bandwidth manager and swap partitions, 452 how PRM manages disk bandwidth, 452 disk bandwidth shares, specifying, 174 disks keyword, 174 distribute_excess keyword, 17
global tune structure, 210 gmaxcpu keyword, 177 default value (total CPU resources), 178 gmaxmem keyword, 184 default value (100%), 184 gmincpu keyword, 175 default value (1% of system CPU resources), 176 gminmem keyword, 183 default value (1% of system memory), 183 goal compared to stretch goal, 203 determining for workload, 87 keyword, 200 not using with cpushares and ’more’, 200, 471 specifying for an SLO, 199 types performance, 200 usage, 200 goal-based SLOs, 118 groups keyword, 82, 159 H hmaxcpu
cntl_smooth, 223, 480 coll_argv, 215, 477 coll_stderr, 223, 479 condition, 206 cpushares, 204, 205, 472 not using with goal keyword, 200, 471 disks, 174 distribute_excess, 179, 218 default value (0—do not distribute), 219 interaction with weight keyword, 182 entity, 196 exception, 206 extended_shares, 219 gmaxcpu, 177 default value (total CPU resources), 178 gmaxmem, 184 default value (100%), 184 gmincpu, 175 default value (1% of system CPU resources), 176 gminmem, 183 default value (1% of system memo
capping, 450 lockable, 450 manager how PRM manages memory, 449 specifying minimum for workload group, 183 memweight keyword, 184 default value (1), 185 metric goals, 471 configuring, 467 making available to WLM, 483 smoothing, 223, 480 specifying in a condition statement, 207 metric/SLO-specific tune structure, 210 metric-based SLOs, 467 metric-specific tune structure, 210 migrating to WLM from PRM, 465 mincpu keyword, 197 minimum CPU allocation for FSS groups, 176 for PSET-based groups, 176 monitori
data sending to WLM, 500 goals, 471 management, overview, 34 monitoring methods, 34 perl interface to send data to WLM, 497 ports used by WLM, 66 pri keyword, 193 primary_host keyword in WLM configuration file, 264 PRM integrating with WLM, 404 migrating from PRM to WLM, 465 not using with WLM, 59 utilities, 445 prm structure configuring, 151 PRM_SYS group as a reserved workload group, 163 prmmove, 86 prmrun, 86 process ID (PID) finder and process maps, 172 and SAP processes, 438 process map, 44 defin
rendezvous point and wlmsend, 393 creation, 510 reserved workload groups, 163 rising tide model of CPU allocation, 122 rtprio interaction with PRM, 448 rtsched interaction with PRM, 448 S SAPTK, 438 SASTK, 440 SCM integrating with WLM, 420 scomp keyword, 85, 171 secure compartments, 44 assigning to workload groups, 85, 171 secure mode default, 244 Secure Resource Partitions, 412 creating, 171 Security Containment, 171, 412 integrating with WLM, 412 service management, 40 Servicecontrol Manager integra
command and system call, 401 policies, 30 swap partitions, not placing under PRM control, 452 system calls exec support for WLM, 401 fork support for WLM, 401 pstat support for WLM, 401 System Management Homepage (SMH), 360 Systems Insight Manager integrating with WLM, 420 T Temporary Instant Capacity (TiCAP) integrating with WLM, 410 optimizing, 57, 102 global arbiter configuration, 265 specifying priority for resource usage, 273 specifying reserve threshold, 274 temporary_reserve_threshold keyword,
tuning WLM, 228 WebLogic (weblogic_wlm_howto.
default value (0—do not trim file), 224 wlmgui monitoring WLM, 347 wlmgui command syntax, 379 wlminfo command syntax, 381 display examples, 106 monitoring data collectors, 131, 469 monitoring SLO violations, 121 monitoring WLM, 343 wlmoradc (integrating with Oracle), 428 wlmpard configuration file, 265 disabling, 134 starting, 242 stopping, 134 wlmpard command syntax, 384 wlmpard statistics enabling logging at reboot, 246 wlmpardstats file, 63, 113 wlmpardstats_size_limit keyword, 272 default value (0
Index 536