Owner's manual
AN253
Figure 3: IIR Optimization Example
To accomplish this optimization (see Figure 3):
1.) Use the additional registers in the MaverickCrunch coprocessor to load the filter coefficients and state variables once, before the inner loop.
2.) Shuffle the state variables between registers during the inner loop.
3.) Store the state variables back to memory after the inner loop.
This removes nine loads and four stores from the inner loop. Also note that the copy instructions used to shuffle the state
// Floating-point Biquad IIR
// (Basic: Non-optimized)
main_loop
    cfldrd  temp2, [fcoef]
    cfldrd  bqd1k_s0, [bdq1k]
    cfmuld  acc, temp2, bqd1k_s0
    cfnegd  acc, acc
    cfldrd  temp2, [fcoef, 8]
    cfldrd  bqd1k_s1, [bdq1k, 8]
    cfmuld  temp, temp2, bqd1k_s1
    cfsubd  acc, acc, temp
    cfldrd  temp2, [fcoef, 16]
    cfldrd  temp4, [data]
    cfmuld  temp, temp2, temp4
    cfaddd  acc, acc, temp
    cfldrd  temp2, [fcoef, 24]
    cfldrd  bqd1k_s2, [bdq1k, 16]
    cfmuld  temp, temp2, bqd1k_s2
    cfaddd  acc, acc, temp
    cfldrd  temp2, [fcoef, 32]
    cfldrd  bqd1k_s3, [bdq1k, 24]
    cfmuld  temp, temp2, bqd1k_s3
    cfaddd  acc, acc, temp
    cfstrd  acc, [data], 8
    cfstrd  acc, [bdq1k]
    cfstrd  bqd1k_s0, [bdq1k, 8]
    cfstrd  temp4, [bdq1k, 16]
    cfstrd  bqd1k_s2, [bdq1k, 24]
    subs    nn, nn, 1
    bgt     main_loop

// Floating-point Biquad IIR
// (Optimized)
    // 1.) Load the state variables and coefficients once, before the inner loop
    cfldrd  bqd1k_s0, [bdq1k]
    cfldrd  bqd1k_s1, [bdq1k, 8]
    cfldrd  bqd1k_s2, [bdq1k, 16]
    cfldrd  bqd1k_s3, [bdq1k, 24]
    cfldrd  outp, [fcoef]
    cfldrd  temp1, [fcoef, 8]
    cfldrd  temp2, [fcoef, 16]
    cfldrd  temp3, [fcoef, 24]
    cfldrd  temp4, [fcoef, 32]
    cfnegd  outp, outp
    cfnegd  temp1, temp1
main_loop
    cfldrd  inp, [data]
    cfmuld  acc, outp, bqd1k_s0
    cfmuld  temp, temp1, bqd1k_s1
    // 2.) Shuffle the state variables in registers (cfcpyd) during the inner loop
    cfcpyd  bqd1k_s1, bqd1k_s0
    cfaddd  acc, acc, temp
    cfmuld  temp, temp2, inp
    cfaddd  acc, acc, temp
    cfmuld  temp, temp3, bqd1k_s2
    cfaddd  acc, acc, temp
    cfmuld  temp, temp4, bqd1k_s3
    cfcpyd  bqd1k_s3, bqd1k_s2
    cfcpyd  bqd1k_s2, inp
    cfaddd  acc, acc, temp
    cfcpyd  bqd1k_s0, acc
    subs    nn, nn, 1
    bgt     main_loop
    // 3.) Store the state variables after the inner loop
    ldr     temp1, =bqd1k_dCrstates
    cfstrd  bqd1k_s0, [temp1]
    cfstrd  bqd1k_s1, [temp1, 8]
    cfstrd  bqd1k_s2, [temp1, 16]
    cfstrd  bqd1k_s3, [temp1, 24]
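The same transformation can be modeled in C, which may make the data flow easier to follow than the two assembly listings. This is only an illustrative sketch and not code from this application note: the function names `biquad_basic` and `biquad_optimized`, the `coef[5]` layout (mirroring the `fcoef` offsets 0, 8, 16, 24, 32) and the `state[4]` layout (mirroring the `bdq1k` offsets 0 to 24) are assumptions made for the example. The model writes each output sample back to `data` in both versions, as the basic listing does.

```c
#include <stddef.h>

/* Assumed layout (hypothetical, mirrors the figure's offsets):
 *   coef[0], coef[1]  feedback coefficients (negated before use)
 *   coef[2..4]        feedforward coefficients
 *   state[0], [1]     previous two outputs; state[2], [3] previous two inputs
 */

/* Basic form: coefficients and state are read from and written to
 * memory on every sample, like the non-optimized listing. */
void biquad_basic(double *data, size_t n, const double coef[5], double state[4])
{
    for (size_t i = 0; i < n; ++i) {
        double x = data[i];
        double acc = -(coef[0] * state[0]);
        acc -= coef[1] * state[1];
        acc += coef[2] * x;
        acc += coef[3] * state[2];
        acc += coef[4] * state[3];
        data[i] = acc;                 /* output replaces the input sample */

        double old_s0 = state[0];
        double old_s2 = state[2];
        state[0] = acc;                /* newest output */
        state[1] = old_s0;             /* previous output */
        state[2] = x;                  /* newest input */
        state[3] = old_s2;             /* previous input */
    }
}

/* Optimized form: step 1 loads everything into locals (registers),
 * step 2 shuffles the state with plain copies inside the loop,
 * step 3 stores the state once after the loop. */
void biquad_optimized(double *data, size_t n, const double coef[5], double state[4])
{
    /* 1.) load state and coefficients once; negate the feedback terms up front */
    double s0 = state[0], s1 = state[1], s2 = state[2], s3 = state[3];
    const double na1 = -coef[0], na2 = -coef[1];
    const double b0 = coef[2], b1 = coef[3], b2 = coef[4];

    for (size_t i = 0; i < n; ++i) {
        double x = data[i];            /* the only load left in the loop */
        double acc = na1 * s0;
        acc += na2 * s1;
        acc += b0 * x;
        acc += b1 * s2;
        acc += b2 * s3;

        /* 2.) shuffle the state in registers instead of through memory */
        s1 = s0;
        s3 = s2;
        s2 = x;
        s0 = acc;

        data[i] = acc;
    }

    /* 3.) write the state back once */
    state[0] = s0;
    state[1] = s1;
    state[2] = s2;
    state[3] = s3;
}
```

Calling both functions on copies of the same buffer, with the same coefficients and zeroed state, should produce matching outputs and matching final state. Whether a C compiler actually keeps the locals in registers depends on the target and optimization level, so the sketch illustrates the data flow rather than guaranteeing the instruction counts discussed above.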