Chapter 6: Compiling and Debugging Parallel Fortran
----------------------------------------------------------
* -p[rocedures] using basic-block counts; sorted in *
* descending order by the number of cycles executed in *
* each procedure; unexecuted procedures are excluded *
----------------------------------------------------------
13302554 cycles
    cycles %cycles  cum %  cycles  bytes procedure (file)
                            /call  /line
  12479754   93.81  93.81  594274     25 calc_ (/tmp/ctmpa00857)
    282980    2.13  95.94   14149     58 move_ (/tmp/ctmpa00837)
    155721    1.17  97.11      43     29 _flsbuf (flsbuf.c)
The single-processor execution time has increased by about 30 percent.
Look at an execution profile of the master thread in a parallel run and
compare it with these single-process profiles:
% prof -pixie -quit 1% try1.mp try1.mp.Addrs try1.mp.Counts00421
----------------------------------------------------------
* -p[rocedures] using basic-block counts; sorted in *
* descending order by the number of cycles executed in *
* each procedure; unexecuted procedures are excluded *
----------------------------------------------------------
12735722 cycles
    cycles %cycles  cum %  cycles  bytes procedure (file)
                            /call  /line
   6903896   54.21  54.21  328767     37 calc_ (/tmp/ctmpa00869)
   3034166   23.82  78.03  137917     16 mp_waitmaster (mp_simple_sched.s)
   1812468   14.23  92.26   86308     19 _calc_88_aaaa (/tmp/fMPcalc_)
    294820    2.31  94.57  294820     13 mp_create (mp_utils.c)
    282980    2.22  96.79   14149     58 move_ (/tmp/ctmpa00837)
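
One way to make the comparison concrete is to total the cycles by role. The short Python sketch below is an illustration only: the cycle counts are copied from the two listings above, but the grouping is an assumption. It treats _calc_88_aaaa as compiler-generated code for a parallelized loop in calc_, and treats mp_waitmaster and mp_create as multiprocessing overhead. The procedure names suggest this, but the listings themselves do not state it. The dictionary names and the share() helper are hypothetical.

```python
# Cycle counts copied from the single-process listing above.
single = {"total": 13302554, "calc_": 12479754, "move_": 282980}

# Cycle counts copied from the master-thread listing of the parallel run.
master = {
    "total": 12735722,
    "calc_": 6903896,
    "mp_waitmaster": 3034166,
    "_calc_88_aaaa": 1812468,
    "mp_create": 294820,
    "move_": 282980,
}


def share(profile, *procedures):
    """Return the percentage of a profile's total cycles spent in the named procedures."""
    return 100.0 * sum(profile[name] for name in procedures) / profile["total"]


# Assumption: _calc_88_aaaa is loop code split out of calc_ by the compiler.
calc_work = share(master, "calc_", "_calc_88_aaaa")

# Assumption: these two routines are multiprocessing overhead, not useful work.
waiting = share(master, "mp_waitmaster")
overhead = share(master, "mp_waitmaster", "mp_create")

# Cycles the master thread spends on calc work, relative to the single-process run.
calc_ratio = (master["calc_"] + master["_calc_88_aaaa"]) / single["calc_"]

print(f"calc work in master thread: {calc_work:.2f}% of its cycles")
print(f"waiting in mp_waitmaster:   {waiting:.2f}% of its cycles")
print(f"waiting plus mp_create:     {overhead:.2f}% of its cycles")
print(f"master calc cycles / single-process calc cycles: {calc_ratio:.2f}")
```

Grouping the rows this way separates the cycles the master thread spends on the computation from the cycles it spends waiting, which the per-procedure listing shows only row by row.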










