HP Caliper Ktrace Features Guide

_swtch() which calls swtch_to_thread(). And function real_sleep() calls
_swtch(). When pm_swtch.c is optimized, real_sleep() sometimes calls
swtch_to_thread() directly, and in those cases the calls to _swtch() are missing.
Performance
The performance cost of running ktracer depends mainly on how often the traced
functions are called. It also depends on the processor architecture and speed, the
cache hit rate, and the memory speed. As a rough estimate, each trace point that is
encountered costs about 0.1 microseconds.
The time to capture a trace is typically 90 cycles. It is faster to trace hundreds of functions
that are infrequently called than to trace a single function called intensively.
For example, if clock_int is called 100 times per second on a 1.6 GHz system and is
traced for 1 second, the cost is:
(100 traces) * (90 cycles/trace) / (1.6 G cycles/second) = 5.6 microseconds
which is 0.00056% overhead during the 1 second trace time. This is a very low overhead.
If intr_strobe_clear_idle() is called 500 times per millisecond on a 1.6 GHz
system and is traced for 1 millisecond, the cost is:
(500 traces) * (90 cycles/trace) / (1.6 G cycles/second) = 28 microseconds
which is 2.8% overhead during the 1 millisecond trace time. This is a perceptible
overhead.
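The same arithmetic can be applied to other call rates and clock speeds. The following Python sketch is not part of ktracer. The function name trace_overhead and its parameters are illustrative assumptions, and the 90 cycles per trace is the typical figure quoted above rather than a measured value for any particular system.

```python
# Illustrative estimate only; not a ktracer interface.
CYCLES_PER_TRACE = 90  # typical capture cost quoted in this section


def trace_overhead(calls, window_seconds, clock_hz,
                   cycles_per_trace=CYCLES_PER_TRACE):
    """Return (seconds spent capturing traces, fraction of the trace window)."""
    cost_seconds = calls * cycles_per_trace / clock_hz
    return cost_seconds, cost_seconds / window_seconds


# clock_int: 100 calls during a 1 second trace on a 1.6 GHz system
clock_cost, clock_frac = trace_overhead(100, 1.0, 1.6e9)

# intr_strobe_clear_idle(): 500 calls during a 1 millisecond trace
strobe_cost, strobe_frac = trace_overhead(500, 1e-3, 1.6e9)

for name, cost, frac in [
    ("clock_int", clock_cost, clock_frac),
    ("intr_strobe_clear_idle", strobe_cost, strobe_frac),
]:
    print(f"{name}: {cost * 1e6:.1f} microseconds, {frac * 100:.5f}% overhead")
```

Because the cost scales with the number of trace points hit, not with the number of functions selected, the call count is the value to estimate before tracing a frequently called function.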