HP Caliper User's Guide
dcache Measurement Report Description
With the dcache measurement, produced by the dcache measurement configuration
file, HP Caliper measures and reports on data cache metrics. This measurement is
similar to the icache measurement.
The report shows two levels of information:
• Exact counts of data cache metrics summed across the entire run of an application
• Sampled data cache metrics that are associated with particular locations in the
application
The sampled metrics also provide detailed latency information by dividing the
misses into eight latency buckets based on latency cycles. Each bucket reports the
percentage of misses whose latency falls within that bucket's range.
A latency bucket is a grouping of latency data associated with data accesses serviced
by particular levels of CPU cache and system memory. A latency bucket can be one
of the following: L2 cache access, L3 cache access, or memory access. On
cell-based systems, the following additional buckets are provided: cell local memory
access, 1-hop memory access, 2-hop memory access, and cache-to-cache (C2C) access.
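The per-bucket percentage is simply each bucket's share of the total sampled misses. The following is a minimal sketch of that arithmetic, not HP Caliper output: the function name bucket_percentages and the miss counts are hypothetical and chosen only for illustration.

```shell
# Hypothetical helper: reads "bucket-name miss-count" pairs on stdin and
# prints each bucket's share of the total misses as a percentage.
bucket_percentages() {
    awk '{ name[NR] = $1; count[NR] = $2; total += $2 }
         END {
             for (i = 1; i <= NR; i++)
                 printf "%-8s %5.1f%%\n", name[i], 100 * count[i] / total
         }'
}

# Made-up miss counts for the three default buckets (not measured data).
printf '%s\n' 'L2 600' 'L3 300' 'memory 100' | bucket_percentages
```

With these made-up counts, the L2 bucket accounts for 60 percent of the misses, the L3 bucket for 30 percent, and the memory bucket for 10 percent.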
The latency bucket information is particularly useful for understanding the data cache
access behavior of large enterprise multithreaded, multiprocess applications and for
fine-tuning those applications. For example, if a large percentage of data cache misses
are due to 1-hop or 2-hop C2C accesses, this could indicate that the processes are sharing
data and running on CPUs in two different cells. You might be able to improve performance
significantly by scheduling those processes to run on CPUs within the same cell.
You can turn off the latency bucket information by using the --latency-buckets
False option.
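For instance, the example command line from this section could be extended as sketched below. This assumes that the option is placed after the measurement name like the -o option, that caliper is on your PATH, and that ./matmul is your application. The report file name is hypothetical.

```shell
# Hypothetical report file name; any writable path works.
report=reports/dcachem_nobuckets.txt

# Build the command line: dcache measurement with latency buckets turned off.
# Option placement (after the measurement name) is an assumption.
set -- caliper dcache --latency-buckets False -o "$report" ./matmul

# Run the command only if caliper is installed; otherwise just print it.
if command -v caliper >/dev/null 2>&1; then
    "$@"
else
    echo "caliper not found; would run: $*"
fi
```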
On HP-UX, HP Caliper uses the model command to determine the CPU type
and CPU frequency.
On Linux, you must use the --system-model option to help HP Caliper determine
the CPU type and CPU frequency. If you do not use this option, HP Caliper divides
the misses into the following three buckets by default: L2 cache access, L3 cache
access, and memory access.
The report shows measured data by thread, load module, function, statement, and
instruction.
Command-line options let you control the amount of data reported, how the data is
sorted, and the number of statements and instructions reported for each sampled
program location.
You can use the --dcache-data-profile option to get Data Summary output with
a report. See “Using the --dcache-data-profile Option to Produce a Data Summary”
(page 256).
Example Command Line for Text Report
$ caliper dcache -o reports/dcachem.txt ./matmul