HP Caliper Ktrace Features Guide

_swtch() which calls swtch_to_thread(). The function real_sleep() also calls _swtch(). When pm_swtch.c is optimized, real_sleep() sometimes calls swtch_to_thread() directly, and in those cases the calls to _swtch() are missing.

Performance

The performance cost of running ktracer depends on how frequently the traced functions are called. It also depends on the processor architecture and speed, the cache hit rate, and the memory speed. As a rough estimate, the cost is 0.1 microseconds each time a trace point is encountered.

Capturing a trace typically takes 90 cycles. It costs less to trace hundreds of infrequently called functions than to trace a single function that is called intensively.

For example, if clock_int is called 100 times per second on a 1.6 GHz system and is traced for 1 second, the cost is:

(100 traces) * (90 cycles/trace) / (1.6 G cycles/second) = 5.6 microseconds

which is 0.00056% overhead during the 1 second trace time. This is a very low overhead.

If intr_strobe_clear_idle() is called 500 times per millisecond on a 1.6 GHz system and is traced for 1 millisecond, the cost is:

(500 traces) * (90 cycles/trace) / (1.6 G cycles/second) = 28 microseconds

which is 2.8% overhead during the 1 millisecond trace time. This is a perceptible overhead.
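
The two estimates above follow the same arithmetic, which can be sketched as a small calculation. This is only an illustration and is not part of ktracer: the function name trace_overhead and its parameters are hypothetical, and the defaults assume the typical 90 cycles per trace and the 1.6 GHz clock used in the examples.

```python
def trace_overhead(calls, trace_seconds, cycles_per_trace=90, clock_hz=1.6e9):
    """Estimate tracing cost for one traced interval.

    calls            -- number of trace points hit during the interval
    trace_seconds    -- length of the traced interval, in seconds
    cycles_per_trace -- assumed cost of capturing one trace (typical: 90)
    clock_hz         -- assumed processor clock rate (example: 1.6 GHz)

    Returns (cost in seconds, overhead as a fraction of the interval).
    """
    cost_seconds = calls * cycles_per_trace / clock_hz
    return cost_seconds, cost_seconds / trace_seconds


# clock_int: 100 calls during a 1 second trace
clock_cost, clock_frac = trace_overhead(calls=100, trace_seconds=1.0)

# intr_strobe_clear_idle(): 500 calls during a 1 millisecond trace
strobe_cost, strobe_frac = trace_overhead(calls=500, trace_seconds=1e-3)

print(f"clock_int: {clock_cost * 1e6:.1f} us, {clock_frac * 100:.5f}% overhead")
print(f"intr_strobe_clear_idle: {strobe_cost * 1e6:.1f} us, "
      f"{strobe_frac * 100:.1f}% overhead")
```

Substituting a measured call rate and the clock rate of the target system gives a rough estimate of whether tracing a particular function will be perceptible before the trace is started.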
