User's manual

6-4 Optimizing DSP56300/DSP56600 Applications MOTOROLA
Pipeline Interlocks
Data ALU Pipeline Interlocks
;A=next a,PUT d’
move x:(r4)+,A A,y:(r1) ;PUT b’, A=next a
move B,y:(r7)
move y:(r0)+,B ;B=next c
The parallel source moves that caused the pipeline interlocks were
shifted to the following instructions. This example illustrates the
importance of ordering the arithmetic instructions and the parallel
read operations. Taking this approach when writing a program can
shorten the execution time by preventing unnecessary pipeline
interlocks.
6.1.2.2 Loop Unrolling
Using two accumulators can avoid arithmetic pipeline interlocks
when combined with loop unrolling techniques. The following two
examples demonstrate possible applications of this method.
6.1.2.2.1 Loop Unrolling in N Array Scale routine
The following code segment is used for scaling an array of N
positive numbers:
clr A x:(r0)+,B
rep #N
max B,A x:(r0)+,A ;Largest value of N numbers
clb A,B ;Count leading bits of the
;largest number
move x:(r1)+,A
do #N,_end
normf B1,A ;Scaling block of N numbers
move x:(r1)+,A A,y:(r4)+
_end
The read of accumulator A in the eighth instruction immediately
follows the normf instruction that writes A. This causes an
arithmetic pipeline interlock in the critical loop, so the loop
executes in 3N cycles instead of 2N. Using two accumulators
prevents the interlock, as demonstrated in the modified code:
clr A x:(r0)+,B
rep #N
max B,A x:(r0)+,A ;Largest value of N numbers
clb A,B ;Count leading bits of the
;largest number
move x:(r1)+,A
move x:(r1)+,B B,y0
do #N/2,_end
normf y0,A ;Scaling block of N numbers
normf y0,B
move x:(r1)+,A A,y:(r4)+
move x:(r1)+,B B,y:(r4)+
_end
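The 3N and 2N cycle counts quoted above can be checked with a small
model. The Python sketch below is an illustration only, not a
description of the actual DSP56300 pipeline. It assumes that every
instruction takes one cycle and that a move reading an accumulator
costs one extra stall cycle when the immediately preceding instruction
wrote that accumulator in the Data ALU. The function loop_cycles, the
tuple encoding of each instruction, and the block length N = 64 are
hypothetical names and values chosen for this example.

```python
# Simplified cycle model (an assumption, not the real DSP56300 pipeline):
# each instruction costs one cycle, plus one stall cycle when its parallel
# move reads an accumulator that the immediately preceding instruction
# wrote in the Data ALU.

def loop_cycles(body, iterations):
    """Count cycles for `iterations` passes over the loop `body`.

    `body` is a list of (alu_dest, move_src) pairs, one per instruction:
    alu_dest is the accumulator written by the arithmetic part (or None),
    move_src is the accumulator read by the parallel move (or None).
    """
    cycles = 0
    prev_alu_dest = None
    for _ in range(iterations):
        for alu_dest, move_src in body:
            cycles += 1
            if move_src is not None and move_src == prev_alu_dest:
                cycles += 1  # modelled interlock stall
            prev_alu_dest = alu_dest
    return cycles


N = 64  # hypothetical block length; must be even for the unrolled loop

# Original loop body: normf writes A, then the move stores A.
original = [("A", None), (None, "A")]

# Unrolled loop body: two normf instructions (A, then B), then the move
# that stores A, then the move that stores B.
unrolled = [("A", None), ("B", None), (None, "A"), (None, "B")]

original_cycles = loop_cycles(original, N)       # N iterations
unrolled_cycles = loop_cycles(unrolled, N // 2)  # N/2 iterations
```

Under these assumptions the original loop costs three cycles per
element, while the unrolled loop costs four cycles per pair of
elements, because an instruction that uses the other accumulator now
sits between each normf and the move that reads its result.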