HP Caliper User's Guide

For control-dominated code or for workloads that seldom miss the internal caches, this value will be very small. For data-flow-type workloads that employ extensive prefetching, this value can be quite high, up to a maximum of 16, which is the Itanium 2 bus limit.

The reported average latency value will be incorrect on Itanium 2 steppings earlier than B2.

• CPU

The CPU transaction component is the percentage of all bus transactions that are generated by the CPUs on a shared front-side bus (FSB).

• I/O

The I/O transaction component is the percentage of all bus transactions that are initiated by any I/O agent on a shared FSB.

• Util Adrs

Average address bus utilization estimates the total address bus utilization resulting from all bus transactions, including cache misses, I/O port reads/writes, interprocessor interrupts, writebacks, cache line invalidates (FC instruction, store hit on shared line), and clean castouts (if enabled). The utilization is computed as follows:

ADRS UTIL = 100.0 * (total transactions/sec * 3.0) / (bus cycles/sec)

The constant value (3.0) is the number of address cycles needed for each bus transaction.
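As a minimal sketch, the address bus formula above can be written out in Python. The function name, parameter names, and the sample rates are hypothetical and chosen only for illustration; they are not Caliper identifiers or measured values.

```python
# Illustrative sketch of the ADRS UTIL formula; names are hypothetical.

ADDRESS_CYCLES_PER_TRANSACTION = 3.0  # address cycles per bus transaction


def address_bus_utilization(transactions_per_sec, bus_cycles_per_sec):
    """Return address bus utilization as a percentage."""
    if bus_cycles_per_sec <= 0:
        raise ValueError("bus_cycles_per_sec must be positive")
    address_cycles_per_sec = transactions_per_sec * ADDRESS_CYCLES_PER_TRANSACTION
    return 100.0 * address_cycles_per_sec / bus_cycles_per_sec


# Assumed example: 10 million transactions/sec on a bus that runs at
# 200 million cycles/sec (both numbers are made up for the example).
print(address_bus_utilization(10_000_000, 200_000_000))  # 15.0
```

With these assumed inputs, 30 million of the 200 million bus cycles per second carry address traffic, which gives 15 percent utilization.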

• Util Data

Data bus utilization gives a lower-bound estimate of the total data bus utilization resulting from bus transactions that transfer data, that is, BRL, BRIL, BWL, and nonzero byte BRP/BWP transactions. The lower bound is computed as follows:

DATA BUS CYCLES/SEC = ((BRL + BRIL + BWL + IMPLICIT WB)/sec * 4.0) +
                      ((nonzero byte BRP's/BWP's)/sec * 1.0)

DATA UTIL = 100 * (DATA BUS CYCLES/SEC) / (BUS CYCLES/SEC)

The constants (4.0 and 1.0) represent the number of cycles that the data bus is occupied to perform the requisite data transfer. All cache line transfers (BRL, BRIL, BWL) require four cycles. The nonzero BRP's/BWP's require one or two cycles (16, 32, 64 bytes). Because most of the nonzero BRP's/BWP's are to I/O ports and semaphores, the computation assumes a single-cycle transfer. Thus, there is a small possibility of undercounting cycles.
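The data bus computation above can be sketched the same way. This is an illustration under stated assumptions: the function name, parameter names, and sample rates are hypothetical, and every nonzero byte BRP/BWP is charged a single data cycle, as the text describes, so the result is a lower bound.

```python
# Illustrative sketch of the DATA UTIL formula; names are hypothetical.

CACHE_LINE_DATA_CYCLES = 4.0  # data cycles per full cache line transfer
PARTIAL_DATA_CYCLES = 1.0     # assumed single cycle per nonzero BRP/BWP


def data_bus_utilization(brl, bril, bwl, implicit_wb, nonzero_brp_bwp,
                         bus_cycles_per_sec):
    """Return a lower-bound data bus utilization as a percentage.

    All transaction arguments are rates in transactions per second.
    """
    if bus_cycles_per_sec <= 0:
        raise ValueError("bus_cycles_per_sec must be positive")
    line_cycles = (brl + bril + bwl + implicit_wb) * CACHE_LINE_DATA_CYCLES
    partial_cycles = nonzero_brp_bwp * PARTIAL_DATA_CYCLES
    return 100.0 * (line_cycles + partial_cycles) / bus_cycles_per_sec


# Assumed example rates (made up for the example), on a bus that runs at
# 200 million cycles/sec.
print(data_bus_utilization(
    brl=5_000_000, bril=1_000_000, bwl=2_000_000, implicit_wb=500_000,
    nonzero_brp_bwp=1_000_000, bus_cycles_per_sec=200_000_000))  # 17.5
```

In this example the cache line transfers account for 34 million data cycles per second and the partial transfers for 1 million, so the lower bound is 17.5 percent. If some partial transfers actually take two cycles, the true utilization would be slightly higher.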

• BRL

Bus Read Line is the transaction used to read cache lines, due either to an instruction cache miss or to a load data miss.

342 Event Set Descriptions for CPU Metrics