HP Caliper User's Guide

Although executing those instructions does not cause an application to crash in
the absence of HP Caliper, it still affects performance. Executing a break
instruction causes a trap to the breakpoint handler in the kernel.
The presence of trigger macros may also disable some optimizations that the
compiler could otherwise perform.
The trigger instructions are defined so that code will not be moved around them.
This is done to ensure that code seen in the source between two sample points will
not be executed before or after those samples are taken.
This prevents the compiler from reordering statements across a sample point while
optimizing code, so the measured program may perform worse than it otherwise
would. For example, with sample points inside a loop, this could mean that
loop-invariant promotion or other loop transformations become illegal or less
effective. For sample points placed at the entrance and exit of functions, this
could affect performance if the function is inlined.
Unfortunately, the only way to check for such issues is to compare the code
generated by the compiler with and without those macros, and estimate whether the
program measurements are significantly affected.
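As a rough illustration of why a sample point inside a loop can block loop
transformations, the sketch below uses a hypothetical SAMPLE_POINT() macro as a
stand-in for a trigger macro. It is not HP Caliper's definition; it is assumed
here to expand to a GCC/Clang-style compiler barrier, which likewise prevents
memory accesses from being moved across it. The function name scaled_sum and the
WITH_SAMPLE_POINTS switch are also assumptions made for this example.

```c
#include <stddef.h>

/* Hypothetical stand-in for a trigger macro (NOT the HP Caliper definition).
 * With WITH_SAMPLE_POINTS defined, it expands to a compiler barrier
 * (GCC/Clang syntax) that memory accesses may not be moved across. */
#ifdef WITH_SAMPLE_POINTS
#define SAMPLE_POINT() __asm__ __volatile__("" ::: "memory")
#else
#define SAMPLE_POINT() ((void)0)
#endif

/* Sums the elements of data, each multiplied by *factor.
 * Without the barrier, the compiler is free to load *factor once before
 * the loop. With it, the load generally has to be repeated in every
 * iteration, because memory is assumed to change at each barrier. */
long scaled_sum(const long *data, size_t n, const long *factor)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        SAMPLE_POINT();
        sum += data[i] * (*factor);
    }
    return sum;
}
```

Building the file once with and once without -DWITH_SAMPLE_POINTS and comparing
the generated assembly is one way to see whether the loop body has changed.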
Restricting PMU Measurements to Specific Code Regions
By default, HP Caliper measures PMU events for your entire program. You can,
however, restrict measurements to performance-sensitive regions of code. This feature
is enabled with the CALIPER_PMU_ENABLE and CALIPER_PMU_DISABLE macros and
the --user-regions option.
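A minimal sketch of how the two macros might bracket a performance-sensitive
region follows. The header that defines the macros and their exact invocation
form are not given in this section, so the example assumes they are used as
function-like macros without arguments and supplies no-op fallbacks so that it
compiles without HP Caliper. The function dot_product is hypothetical and serves
only as an illustration.

```c
#include <stddef.h>

/* Assumption: in a real build the HP Caliper header defines these macros.
 * The no-op fallbacks below are placeholders so the sketch compiles alone. */
#ifndef CALIPER_PMU_ENABLE
#define CALIPER_PMU_ENABLE() ((void)0)
#endif
#ifndef CALIPER_PMU_DISABLE
#define CALIPER_PMU_DISABLE() ((void)0)
#endif

/* Hypothetical hot function: only the loop is bracketed by the macros. */
double dot_product(const double *a, const double *b, size_t n)
{
    double sum = 0.0;

    CALIPER_PMU_ENABLE();       /* start of the region of interest */
    for (size_t i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    CALIPER_PMU_DISABLE();      /* end of the region of interest */

    return sum;
}
```

With the real macros in place, the intent is that a run using the --user-regions
option counts PMU events only for the bracketed loop rather than for the entire
program.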
You can use this feature with these measurements:
alat
branch
dcache
dtlb
ecount
fprof
icache
itlb
pmu_trace
scgprof
While you can also use this feature with the cgprof measurement, it might lead to
inconsistent results. This is because the time statistics are collected using the PMU,
while the call graph and function counts are collected using dynamic instrumentation.