HP Caliper User's Guide

of instructions executing varies from 1 to 6, which is the maximum dispatch for
the Itanium 2 processor. Taken branches, non-double-bundle aligned branch
targets, and explicit stop bits are the primary determinants of code-based execution
limitations. You can obtain some idea of this from the dispersal event set.
BE Flush
This counts the number of stall cycles resulting from a pipeline flush caused by a
branch misprediction, an exception, an ALAT flush, or a serialization flush.
Scoreboard
This counts stall cycles due to dependencies on integer or floating-point operations,
floating-point flushes, and control or application register reads or writes.
L1Dtlb
This counts the number of cycles stalled due to a level 1 data TLB miss that hits
in the level 2 data TLB. This is sometimes called an L1DTLB transfer stall. If the
level 2 TLB misses, the hardware page walker (HPW) is invoked to insert the
required entry into the level 2 TLB; that entry is then forwarded to the level 1
data TLB.
L2Dtlb
This counts the number of cycles stalled due to a level 2 data TLB miss during the
time the HPW is actively attempting to resolve the requested TLB entry. If the
entry is not in the cache, the HPW will terminate and initiate a trap to software to
provide the required TLB entry. This component counts only the stall cycles that
occur while the HPW is providing the required TLB entry. Time spent in the
software trap handler is not counted in this component.
Dcache
This counts the number of cycles stalled due to data cache misses at any level of
the cache hierarchy (L1, L2, L3). Due to event limitations, it is not possible to
distinguish between freg-freg and freg-load dependencies. Consequently, either
scoreboard cycles are counted as data cache cycles, or data access cycles are
counted as scoreboard cycles. This implementation allocates all floating-point
stalls to the data cache category. As a result, some floating-point register
dependency stalls that belong in the scoreboard category are instead allocated
to the data cache category.
RSE Active
This counts the number of cycles that the pipeline is stalled while the Register
Stack Engine (RSE) spills registers to memory or fills them from memory.
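
The stall components described above are usually easiest to compare as shares of the total stall cycles. The following is a minimal sketch of that arithmetic, not HP Caliper output or an HP Caliper interface: the function names and the cycle counts are hypothetical, and the component names are simply copied from the descriptions above.

```python
# Hypothetical stall-cycle counts per component. The numbers are invented
# for illustration and do not come from any real measurement.
SAMPLE_COUNTS = {
    "BE Flush": 120,
    "Scoreboard": 80,
    "L1Dtlb": 40,
    "L2Dtlb": 10,
    "Dcache": 700,
    "RSE Active": 50,
}


def stall_breakdown(counts):
    """Return each component's share of the total stall cycles, in percent."""
    total = sum(counts.values())
    if total == 0:
        # No stall cycles recorded: every share is zero.
        return {name: 0.0 for name in counts}
    return {name: 100.0 * cycles / total for name, cycles in counts.items()}


def dominant_component(counts):
    """Return the name of the component with the most stall cycles."""
    return max(counts, key=counts.get)


if __name__ == "__main__":
    shares = stall_breakdown(SAMPLE_COUNTS)
    # Print components from largest to smallest share.
    for name, pct in sorted(shares.items(), key=lambda item: -item[1]):
        print(f"{name:<12}{pct:6.1f}%")
    print("Dominant component:", dominant_component(SAMPLE_COUNTS))
```

When reading such a breakdown, keep in mind that the Dcache share can include floating-point register dependency stalls that belong to the Scoreboard category, as noted in the Dcache description.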
340 Event Set Descriptions for CPU Metrics