HP Caliper User Guide Release 5.5 (5900-2351, August 2012)

CYC_BE_EXE_REPLAY.GR_LOAD_WAW This is the number of cycles lost (stall cycles) in replay due
to a WAW hazard in an instruction's GR load.
CYC_BE_DET_REPLAY.GR_LOAD This is the number of cycles lost (stall cycles) in replay due
to memory loads of single cycle GR load instructions. The
loads do not hit the FLD (first level data cache) and must be
obtained from lower level caches or memory, leading to
extra cycles.
DATA_REF.ANY The number of data memory references issued into the memory
pipeline. Includes check loads, non-uncacheable accesses,
RSE operations, semaphores, and floating-point memory
references. The count includes wrong path operations but
excludes predicated off operations. This event does not
include VHPT memory references.
FLD_LOAD.ANY The number of requests issued to the first level data cache
(also called L1 data cache).
FLD_LOAD_MISS.ANY The number of requests which were issued to the first level
data cache but missed the FLD.
MLD_REF.ANY The number of requests which were issued to the middle
level data cache (also called MLD, L2D or L2 data cache).
MLD_REF.MISS The number of requests which were issued to the middle
level data cache but missed it.
MLD_REF.PRIMARY The number of requests which were issued to the middle level
data cache due to a primary miss in the FLD but also missed
the MLD.
IA64_INST_RETIRED Number of retired IA-64 instructions. The count includes
predicated on and predicated off instructions and nops, but
excludes hardware-inserted RSE operations.
% Unstalled execution (higher is better)
Percentage of cycles without any stalls.
% of Cycles lost due to GR/load penalties (lower is better)
Percentage of cycles lost due to GR load dependency stalls.
L1 data cache miss percentage
Percentage of L1 data cache reads that are misses.
Percent of data references accessing L1 data cache
Percentage of data references that access the L1 data cache.
L2 data cache miss percentage
Percentage of L2 data cache reads that are misses.
L1 data cache misses per 1000 instructions retired
Number of L1 data cache misses per 1000 instructions retired.
L2 data cache misses per 1000 instructions retired
Number of L2 data cache misses per 1000 instructions retired.
Instructions retired per L1 data cache access
Number of instructions retired per L1 data cache access.
Instructions retired per L2 data cache access
Number of instructions retired per L2 data cache access.
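The cache-related metrics above are ratios of the raw event counts listed earlier in the table. The following Python sketch shows one plausible way to derive them. The guide does not state the exact formulas, so the mapping from events to metrics (for example, FLD_LOAD_MISS.ANY divided by FLD_LOAD.ANY for the L1 miss percentage) is an assumption inferred from the descriptions. The function names, the dictionary keys of the result, and the sample counts are hypothetical and are not part of HP Caliper.

```python
def ratio(numerator, denominator, scale=1.0):
    """Return scale * numerator / denominator, or 0.0 if the denominator is zero."""
    return scale * numerator / denominator if denominator else 0.0


def dcache_metrics(counts):
    """Derive cache metrics from raw event counts.

    The formulas are assumptions inferred from the metric descriptions,
    not formulas documented by HP Caliper.
    """
    data_refs = counts["DATA_REF.ANY"]
    l1_access = counts["FLD_LOAD.ANY"]
    l1_miss = counts["FLD_LOAD_MISS.ANY"]
    l2_access = counts["MLD_REF.ANY"]
    l2_miss = counts["MLD_REF.MISS"]
    retired = counts["IA64_INST_RETIRED"]
    return {
        "l1_miss_pct": ratio(l1_miss, l1_access, 100.0),
        "l1_ref_pct": ratio(l1_access, data_refs, 100.0),
        "l2_miss_pct": ratio(l2_miss, l2_access, 100.0),
        "l1_miss_per_1000_inst": ratio(l1_miss, retired, 1000.0),
        "l2_miss_per_1000_inst": ratio(l2_miss, retired, 1000.0),
        "inst_per_l1_access": ratio(retired, l1_access),
        "inst_per_l2_access": ratio(retired, l2_access),
    }


# Made-up counts, for illustration only (not measured data).
example_counts = {
    "DATA_REF.ANY": 500_000,
    "FLD_LOAD.ANY": 400_000,
    "FLD_LOAD_MISS.ANY": 20_000,
    "MLD_REF.ANY": 25_000,
    "MLD_REF.MISS": 5_000,
    "IA64_INST_RETIRED": 2_000_000,
}

metrics = dcache_metrics(example_counts)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

The two stall-cycle percentages are omitted from the sketch because they need a total cycle count, which is not among the events described in this table.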
dcache Measurement Report Metrics
See Table 21 (page 195).
In this table, “program object” refers to any of the following:
Thread
Load module
Function