Order Number: AN2009/D
Rev. 1, 9/2001

MOTOROLA
Semiconductor Products Sector
Application Note

Introduction to the StarCore SC140 Tools: An Approach in Nine Exercises

Emmanuel Roy and David Crawford

This document presents a quick, comprehensive hands-on introduction to the StarCore SC140 DSP core using programming examples and exercises. The goal is to help the software developer start writing high-level language applications in C.
The following StarCore software development tools were used in the development of the SC140 exercises. Later versions of the SC140 tools should generate similar or better results:
• Version 1.0 StarCore 100 C Compiler. Produces highly optimized code. Compiler features include ANSI C-standard compliance, fixed-point optimization, global optimization, and a standard C library.
• Version 6.3.44 StarCore 100 Assembler. Translates assembly language files into machine-readable object files.
[Figure: tool flow diagram. C files (.c, .h) and IR library files (.lib) enter the C compiler (ccsc100): C compiler front end, IR files (.obj) [IR = Intermediate Representation], optimizer, icode assembler, assembly files (.sl). The assembler (asmsc100) takes assembly files (.asm, .sl) and produces listing files (.lst) and object files (.eln). The linker (sc100-ld) takes object files and object library files (.elb) and produces linker map files (.map) and absolute files (.eld). Absolute files run on the "run-time" simulator runsc100 (executes the program to completion, C file I/O capability) or on the interactive simulator simsc100 (DOS based).]
Figure 2.
1 File I/O Exercise

The file I/O exercise shows how to use standard ANSI C I/O features within the current tools suite.

Hands On
1. Create a new text file called io.c.
2. Within the io.c file, write code using the ANSI C printf function to display Welcome to StarCore SC140 Tools on the screen (remember to include the header file stdio.h).
3. Compile the file using ccsc100 io.c -o io.eld. The -o option specifies the output file name (for example, io.eld).
2 Integer and Fractional Arithmetic Exercise

One of the strengths of both the StarCore architecture and the StarCore compiler is the ability to perform both fractional and integer arithmetic. This exercise presents a reminder about integer and fractional arithmetic representation and then shows how to use the StarCore compiler fractional intrinsics.

Values stored in memory or registers are interpreted differently depending on the operation performed.
2.2 Compiler Support on StarCore

The StarCore compiler implements fractional arithmetic using built-in intrinsic functions based on integer data types. Any fractional values or constants must therefore be defined using their integer equivalent.
5. Open the generated assembly file Ex2.sl and look at the integer instructions within the loop.
6. In the box provided here, write down the integer C code and the generated assembly instructions for the loop. Notice that the first data load is automatically pipelined in software.

Integer Arithmetic
C code                    Generated Assembly code

Fractional Arithmetic
7. For fractional arithmetic, copy and paste the loop of Ex2.c.
11. Compare the fractional assembly instructions generated to the integer assembly instructions.
12. Recompile the code without the -S option to produce an executable file.
13. Run the code using runsc100. The variables "res" and "fres" should print to the screen. What is the algebraic relationship between these two variables?

Congratulations, you have completed Exercise 2.

Good To Know
To perform fractional operations:
• Intrinsics are used.
3 Local Versus Global Optimization Exercise

The local versus global optimization exercise shows the difference between two C compiler options: local optimization (the default) and global optimization. Local optimization compiles each file of the project individually as represented in Figure 4. Global optimization acts as a global binder that links all the intermediate representation (IR) files into one file before optimizing the application.
[Figure: StarCore C compiler flow. Each group of C files (.c, .h) passes through a C compiler front end to produce IR files (.obj). With global optimization, the IR files are combined before the optimizer and the icode assembler; the assembler (asmsc100) output is passed to the linker together with object library files (.elb).]
Figure 5.
Ex3_main.c:
    ...
    main()
    {
        ...
        res=Prod(&array1[0],&array2[0]);
        ...
    }

Ex3_prod.c:
    ...
    long Prod(short a1[], short a2[])
    {
        ...
    }

Figure 6. Files for the Local Versus Global Optimization Exercise

1. Open the two files and understand their functionality.

Local Optimization
2. Compile the two files: ccsc100 -Ot2 Ex3_main.c Ex3_prod.c -o Ex3.eld
3. Run the code: runsc100 -t Ex3.eld. The -t option for runsc100 enables the cycle count generation.
To understand how global optimization makes best use of available information, perform these steps:
6. Recompile the application with the -S option (Stop After Compilation) and with local optimization: ccsc100 -Ot2 Ex3_main.c Ex3_prod.c -S.
7. Rename the .sl files as Ex3_main1.sl and Ex3_prod1.sl.
8. Open the files to see what the compiler has produced.
9. Enable global optimization: ccsc100 -Ot2 -Og Ex3_main.c Ex3_prod.c -S.
10. Open Ex3_main.
The following instructions bring more than one byte at a time to the data register:
move.w (Rx), Dn
move.f (Rx), Dn
move.2w (Rx), Dh
move.2f (Rx), Dh
move.4w (Rx), Dk
move.4f (Rx), Dk
move.
The following instructions require data to be aligned on the specified boundaries:

move.w   (r0),d0             2-byte boundary
move.f   (r0),d0             2-byte boundary
move.2w  (r0), d0:d1         4-byte boundary
move.2f  (r0), d0:d1         4-byte boundary
move.4w  (r0),d0:d1:d2:d3    8-byte boundary
move.4f  (r0),d0:d1:d2:d3    8-byte boundary
move.l   (r0),d0             8-byte boundary
move.2l  (r0),d0:d1          8-byte boundary

Hands On
1. Open the Ex4.
data: 0x01 0x23 0x45 0x67 0x89 0xAB 0xCD 0xEF 0xAA 0xBB 0xCC 0xDD 0xEE 0xFF 0x11 0x22

Instruction                 Register    Simulator    Expected
move #data+2,r0             r0          00 0000      0102
move.w (r0),d0              d0
move.2w (r0),d0:d1          d0
                            d1
move.2f (r0),d2:d3          d2
                            d3
move.4w (r0),d4:d5:d6:d7    d4
                            d5
                            d6
                            d7
move.2l (r0),d8:d9          d8
                            d9

Second Code Section
4. Compile the Ex4.c file: ccsc100 -be Ex4.c -o Ex4.eld.
12. Type next to step through the code.
13. Look at the register contents in the session window and write the values in the Simulator column boxes above for both sections.

Congratulations, you have completed Exercise 4.

Good To Know
• Unaligned data accesses lead to erroneous results. You must consider these issues when developing assembly code.
Hands On
1. Open the Ex5.c file.
2. Build the code with -Ot2, then run it and note the output.
3. Split the current implementation of the loop (that is, res = L_mac(res, x[i], x[i]);) into four independent equations as represented in Figure 9. "Independent" means that the four equations are accumulated into different variables. Therefore, create four variables, one for each product. Tip: Watch your index increment.
4. Recompile the file and run it.
Good To Know
• Using four variables removes the accumulation dependency; removing this dependency is required for parallelism.
• Bit-exact considerations must be understood if this technique is used: overflow/saturation characteristics may change during split summation.

6 Multi-Sample Exercise

The multi-sample exercise demonstrates the multi-sample technique.
Hands On
1. Open the Ex6.c file.
2. Compile Ex6.c using the -Ot2 option. Run the code and verify that the output is correct. See the comments in Ex6.c for the correct values of y[].
3. Recompile Ex6.c using the -Ot2 and -S options. Examine the assembly language file Ex6.sl to see how the inner loop is compiled.

Intermediate Version: Compromise Between Memory and Speed
4. Save Ex6.c as Ex6_1.c.
5. Change the C code of Ex6_1.c according to the following steps:
a.
C Code                    Generated Assembly Code

Further Speed Optimization

The register-to-register transfers can be eliminated by expanding the inner loop so that each group of four MAC instructions uses the data registers already containing the required data values. This yields faster code, but code size is greater.

9. Save Ex6_1.c as Ex6_2.c.
10. In Ex6_2.
Table 3. Inner Loop Characteristics of Multi-sample and Single-sample Techniques. (Continued)

Characteristic                        Single-sample Algorithm    Multi-sample Algorithm
Number of memory moves (bandwidth)    2N                         N/2
Code size                             Small                      Large

7 Control Code: The True Bit Exercise

The True bit exercise shows how the compiler uses the True bit and how you can help the compiler improve performance. The True bit is set or cleared by compare or test instructions.
Hands On
1. Open the example Ex7.c file.
2. Understand the conditional test in the code.
3. Compile the project with the -Ot2 and -S options.
4. Open the generated assembly file Ex7.sl, and look at the conditional instructions within the loop.
5. In the box provided here, write down how many execution sets are within the loop:

Optimized for Time        Optimized for Space

6. Recompile using the compiler optimization option for code size (-Os option).
7.
8 Calling an Assembly Routine From C Exercise

Practical DSP applications commonly use a mixture of C and assembly language. This exercise shows how an assembly language function can be called from C code. The code for this exercise is contained in two files: Ex8.c and addvecs.asm. The C code in Ex8.c calls the assembly language function, addvecs(), in file addvecs.asm, to add two vectors together and return the sum of all the elements of the resultant vector.
[Figure: stack layout between high address and low address, showing the position of the stack pointer (SP) at six stages: (1) prior to function call, (2) on entry to function, (3) during function execution, (4) prior to exit from function, (5) on return from function, (6) calling function deallocates parameters on stack. From low to high address, the frame holds parameters 3, 4, 5, ..., the return address, saved registers, and local variables (if any), with the current SP at the top.]
Figure 11.
SP on function entry ->
    Status Register    4 bytes   } Pushed on stack by jsr/bsr instruction
    Return Address     4 bytes   }
SP prior to function call ->
    &z[0]              4 bytes   } Parameters pushed onto stack prior to jsr/bsr
    M                  2 bytes   }
                       2 bytes

Figure 12. Stack Contents on Entry to addvecs()

4. In the box provided here, write what you think the offsets should be:

Z_OFFSET
M_OFFSET

5. Modify the addvecs.asm file to incorporate your offset values.
6. Build the code.
7. Run the code: runsc100 Ex8.eld.
8.
13. Are the offsets used in Ex8.c the same as the offsets used in addvecs.asm? If not, can you explain why?

Congratulations, you have completed Exercise 8.

Good To Know
The stack pointer must always be a multiple of 8. It is illegal to increment it by a non-multiple of 8.

9 The Challenge

This section presents you with a challenge involving an example that implements a complex scalar product. The objective of this session is to optimize the code from Ex9.
10 Solutions to Exercises

Exercise 1:
/*****************************************************************************
* MOTOROLA INC.
* SEMICONDUCTOR PRODUCTS SECTOR
* COPYRIGHT 1999 MOTOROLA INC.
*******************************************************************************
* INTRODUCTION TO THE SC140 TOOLS
* Developed by MOTOROLA SPS/NCSG/WISD
*******************************************************************************/
#include
    }
    printf("The integer result is: %d (0x%x)\n",res,res);
    printf("The fractional result is: %d (0x%x)\n",fres,fres);
}

Exercise 3:
No code modification is required.

Exercise 4:
data: 0x01 0x23 0x45 0x67 0x89 0xAB 0xCD 0xEF 0xAA 0xBB 0xCC 0xDD 0xEE 0xFF 0x11 0x22

Expected
move #data,r0
move.w (r0),d0
move.2w (r0),d0:d1
move.2f (r0),d2:d3
move.4w (r0),d4:d5:d6:d7
move.
data: 0x01 0x23 0x45 0x67 0x89 0xAB 0xCD 0xEF 0xAA 0xBB 0xCC 0xDD 0xEE 0xFF 0x11 0x22

Instruction                 Register    Simulator    Expected
move #data+2,r0             r0
move.w (r0),d0              d0          00
move.2w (r0),d0:d1
move.2f (r0),d2:d3
move.4w (r0),d4:d5:d6:d7
move.
Exercise 5:
/*****************************************************************************
* MOTOROLA INC.
* SEMICONDUCTOR PRODUCTS SECTOR
* COPYRIGHT 1999 MOTOROLA INC.
*******************************************************************************
* INTRODUCTION TO THE SC140 TOOLS
* Developed by MOTOROLA SPS/NCSG/WISD
*******************************************************************************/
/* Split Summation Technique Exercise */
#include
Exercise 6:
Intermediate version: Compromise between Memory and Speed
/*****************************************************************************
* MOTOROLA INC.
* SEMICONDUCTOR PRODUCTS SECTOR
* COPYRIGHT 1999 MOTOROLA INC.
* y[18] = 0x0DC0 *
* y[19] = 0x0D80 *
* y[20] = 0x0D40 *
* y[21] = 0x0D00 *
* y[22] = 0x0CC0 *
* y[23] = 0x0C80 *
* y[24] = 0x0C40 *
* y[25] = 0x0C00 *
* y[26] = 0x0BC0 *
* y[27] = 0x0B80 *
* y[28] = 0x0B40 *
* y[29] = 0x0B00 *
* y[30] = 0x0AC0 *
* y[31] = 0x0A80 *
**********************************************************************/
main()
{
    long res0, res1, res2, res3;
    short var0, var1, var2, var3;
    short n, i, *x_ptr;

    x_ptr = &input[14];
    for(n=0; {
        res0 = res1 = res2 = res3
            var0 = *x_ptr--; /* var0 = x[n-i-1] */
        }
        /*** Truncate results and store in y[] ***/
        y[n] = extract_h(res0);
        y[n+1] = extract_h(res1);
        y[n+2] = extract_h(res2);
        y[n+3] = extract_h(res3);
        x_ptr += 20; /* Increment pointer by 20 to point to x[n+7] for next iteration */
    }
    /*** Print results, y[] ***/
    for (n=0; n<32; n++)
    {
        printf ("y[%d] = 0x%04hX\n", n, y[n]);
    }
}

Further Optimizing the Speed
/*****************************************************************************
* MO
* y[0] = 0x0020 *
* y[1] = 0x0080 *
* y[2] = 0x0140 *
* y[3] = 0x0280 *
* y[4] = 0x0460 *
* y[5] = 0x0700 *
* y[6] = 0x0A80 *
* y[7] = 0x0D00 *
* y[8] = 0x0EA0 *
* y[9] = 0x0F80 *
* y[10] = 0x0FC0 *
* y[11] = 0x0F80 *
* y[12] = 0x0F40 *
* y[13] = 0x0F00 *
* y[14] = 0x0EC0 *
* y[15] = 0x0E80 *
* y[16] = 0x0E40 *
* y[17] = 0x0E00 *
* y[18] = 0x0DC0 *
* y[19] = 0x0D80 *
* y[20] = 0x0D40 *
* y[21] = 0x0D00 *
* y[22] = 0x0CC0 *
* y[23] = 0x0C80 *
* y[24] = 0x0C40 *
* y[25] = 0x0C00 *
        var3 = *x_ptr--; /* var3 = x[n+3] */
        var2 = *x_ptr--; /* var2 = x[n+2] */
        var1 = *x_ptr--; /* var1 = x[n+1] */
        var0 = *x_ptr--; /* var0 = x[n] */
        /*** x_ptr now points to x[n-1] ***/
        for(i=0; i<12; i+=4)
        {
            res0 = L_mac(res0, a[i], var0);
            res1 = L_mac(res1, a[i], var1);
            res2 = L_mac(res2, a[i], var2);
            res3 = L_mac(res3, a[i], var3);
            var3 = *x_ptr--; /* var3 = x[n-i-1] */
            res0 = L_mac(res0, a[i+1],
            res1 = L_mac(res1, a[i+1],
            res2 = L_mac(res2, a[i+1],
            res3 = L_ma
            var2 =
}

Exercise 7:
/*****************************************************************************
* MOTOROLA INC.
* SEMICONDUCTOR PRODUCTS SECTOR
* COPYRIGHT 1999 MOTOROLA INC.
Exercise 8:
Z_OFFSET equ -12
M_OFFSET equ -14

Exercise 9:
/*****************************************************************************
* MOTOROLA INC.
* SEMICONDUCTOR PRODUCTS SECTOR
* COPYRIGHT 1999 MOTOROLA INC.
*******************************************************************************
* INTRODUCTION TO THE SC140 TOOLS
* Developed by MOTOROLA SPS/NCSG/WISD
*******************************************************************************/
#include
EOnCE is a registered trademark of Motorola, Inc. StarCore, PowerQUICC II, Motorola, and the Motorola logo are trademarks of Motorola, Inc. The PowerPC name is a trademark of International Business Machines Corporation used by Motorola under license from International Business Machines Corporation. Motorola reserves the right to make changes without further notice to any products herein.