Platform Developer’s Kit PDK Tutorial Manual
PDK Tutorial Manual Celoxica, the Celoxica logo and Handel-C are trademarks of Celoxica Limited. All other products or services mentioned herein may be trademarks of their respective owners. Neither the whole nor any part of the information contained in, or the product described in, this document may be adapted or reproduced in any material form except with the prior written permission of the copyright holder. The product described in this document is subject to continuous development and improvement.
Contents
1 PAL TUTORIAL ... 7
1.1 RUNNING THE PAL TUTORIAL IN SIMULATION ... 7
1.2 RUNNING THE PAL TUTORIAL IN HARDWARE ... 7
1.3 PAL TUTORIAL PART 1 ... 8
1.3.1 Compile-time configuration
5.3 ADDING MOUSE INPUT ... 68
6 TUTORIAL: HANDEL-C CODE OPTIMIZATION ... 71
6.1 TIMING AND AREA EFFICIENT CODE ... 71
6.1.1 Complex statements ... 71
6.1.
Conventions
The following conventions are used in this document.
Warning Message. These messages warn you that actions may damage your hardware.
Handy Note. These messages draw your attention to crucial pieces of information.
Hexadecimal numbers appear throughout this document. They are prefixed with '0x', in common with standard C syntax.
PDK Tutorial Manual Assumptions & Omissions This manual assumes that you: • have used Handel-C or have the Handel-C Language Reference Manual • are familiar with common programming terms (e.g. functions) • are familiar with your operating system (Linux or MS Windows) This manual does not include: • instruction in VHDL or Verilog • instruction in the use of place and route tools • tutorial example programs. These are provided in the Handel-C User Manual Page 6 www.celoxica.
PAL tutorial
1 PAL tutorial
The PAL tutorial shows an experienced Handel-C programmer how to implement platform-independent hardware using the Handel-C language, DK and the PAL API. The application implemented in the tutorial is a simple program that displays a square bouncing around the screen. The tutorial workspace can be accessed from the Start menu; by default it is under Celoxica>Platform Developer’s Kit>PAL>PAL Tutorial Workspace.
PAL tutorial 5. Place and route the files (using Xilinx or Altera tools as appropriate). 6. Download the resulting .bit file onto the Spartan II FPGA on the RC100. 1.3 PAL Tutorial Part 1 The Part1 project in the PAL tutorial describes how to use PAL resources. The application created bounces a square around a VGA screen. 1.3.
PAL tutorial Getting compile-time information from the resource There are a number of API calls that allow you to get information from PAL resources at compile time. They can be recognized by the abbreviation CT appended to the name of the macro. For this application, the number of visible lines and columns in the video scan is used in order to be able to test where the current scan position is in a frame.
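The test of the scan position against the visible area can be sketched in C as a small software model. The function and parameter names are hypothetical and are not part of the PAL API; the visible dimensions are passed as plain arguments, standing in for the values a PAL resource would report at compile time.

```c
#include <assert.h>
#include <stdbool.h>

/* Software model only: returns true when the current scan position lies
 * inside the visible part of the frame. All names are illustrative. */
static bool scan_is_visible(unsigned scan_x, unsigned scan_y,
                            unsigned visible_columns, unsigned visible_lines)
{
    assert(visible_columns > 0 && visible_lines > 0);
    return scan_x < visible_columns && scan_y < visible_lines;
}
```

For a 640 by 480 visible area, position (639, 479) is the last visible pixel and position (640, 0) falls in the blanking period.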
PAL tutorial PalVideoOutWrite (VideoOut, {24-bit expression}); Getting run-time information from the resource Some PAL resources return run-time information to the user about their current state. The PAL methods for accessing this information are of the form PalXGetY (PalHandle), where X is the type of resource, Y is the attribute to query and PalHandle is the handle to the PAL resource to be queried.
macro expr UsingButtons = PalSwitchCount () > 1;
macro expr UsingMouse = !UsingButtons && (PalPS2PortCount () > 0);
The first expression, UsingButtons, evaluates to one if there are two or more buttons available on the target board. If fewer than two buttons are available, the code needs to check whether a mouse is available instead: if the expression UsingButtons evaluates to false and there is at least one PS2 port available, then the macro expression UsingMouse evaluates to true.
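The same selection logic can be modelled in C to check which input method results from a given board. This is a sketch only: the enumeration and function names are hypothetical, and the counts are ordinary arguments rather than PAL compile-time queries.

```c
#include <assert.h>

/* Hypothetical labels for the input method chosen at build time. */
enum input_method { INPUT_NONE, INPUT_BUTTONS, INPUT_MOUSE };

/* Mirrors the two macro expressions: buttons win when at least two
 * switches exist; otherwise a mouse is used if a PS2 port exists. */
static enum input_method select_input(int switch_count, int ps2_port_count)
{
    assert(switch_count >= 0 && ps2_port_count >= 0);
    int using_buttons = switch_count > 1;
    int using_mouse = !using_buttons && (ps2_port_count > 0);
    if (using_buttons)
        return INPUT_BUTTONS;
    if (using_mouse)
        return INPUT_MOUSE;
    return INPUT_NONE;
}
```

A board with one switch and one PS2 port therefore selects the mouse, because a single switch does not satisfy the two-button requirement.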
PAL tutorial 1.5 PAL Tutorial Part 3 The Part3 project in the PAL tutorial describes how to use an external RAM. The RAM is initialized and run from the main() function. The GenerateData() macro no longer displays the square directly to the screen but draws the square into RAM. Every clock cycle during the visible period of the scan, the display process reads pixels out of the RAM and displays them on the screen. 1.5.
PAL tutorial par { PalVideoOutRun (VideoOut); PalFastRAMRun (FastRAM); // main program here } Enabling the RAM resource Once the RAM resource is being run, it needs to be enabled before it can be accessed. This is done using the PalFastRAMEnable() macro. In this application, it is done at the same time as the enabling of the video resource: par { PalFastRAMEnable (FastRAM); PalVideoOutEnable (VideoOut); } Writing data to the RAM This application uses a single-bank of RAM as a frame-buffer.
2 DSM tutorials
There are two Data Stream Manager tutorials:
• Pattern matching tutorial: a simple example, targeting the DSM Simulation Virtual platform
• FIR filter tutorial: a more complex example, running on the DSM Sim platform, the RC200 and the Memec Virtex-II Pro development board.
The tutorials show you how to implement platform-independent hardware-software co-designs between a processor and an FPGA using DK and the DSM API. There are also a number of DSM example programs.
PAL tutorial 6. Open the tutorial MSVC workspace: Start>Programs> Celoxica>Platform Developer's Kit>DSM>DSM Tutorial Workspace [VC++]. 7. Choose Part1, Part2 or Part3 of the tutorial by selecting Project>Set Active Project. 8. Compile the project by pressing F7. 9. Execute the simulation by pressing F5. The output will depend on which part of the tutorial you have downloaded, but will describe any patterns matched and the time taken to do so. 2.1.
PAL tutorial DsmInstance *Instance; DsmPortS2H *DataOutPort; DsmPortH2S *MatchInPort; int DsmTutorial (DsmInterface Interface, void *InterfaceData) { DsmWord Data[MAX_DATA_LENGTH_WORDS]; DsmWord Pattern; int i, DataLengthWords; DsmSetDefaultErrorHandler (); DsmInit (Interface, InterfaceData, H2S_COUNT, S2H_COUNT, &Instance); DsmPortS2HOpen (Instance, DATA_S2H_PORT, &DataOutPort); DsmPortH2SOpen (Instance, MATCH_H2S_PORT, &MatchInPort); // Do tutorial algorithm here DsmPortS2HClose (DataOutPort); DsmPortH2S
DsmInstance *Instance;
DsmPortS2H  *DataOutPort;
DsmPortS2H  *PatternOutPort;
DsmPortH2S  *MatchInPort;

int DsmTutorial (DsmInterface Interface, void *InterfaceData)
{
    DsmWord Data[MAX_DATA_LENGTH_WORDS];
    DsmWord Pattern;
    int i, DataLengthWords;

    DsmSetDefaultErrorHandler ();
    DsmInit (Interface, InterfaceData, H2S_COUNT, S2H_COUNT, &Instance);
    DsmPortS2HOpen (Instance, DATA_S2H_PORT, &DataOutPort);
    DsmPortS2HOpen (Instance, PATTERN_S2H_PORT, &PatternOutPort);
    DsmPortH2SOpen (Instance, MATCH_H2S_
PAL tutorial 2.2 DSM FIR filter tutorial 2.2.1 Introduction The DSM FIR filter tutorial connects a FIR filter to a processor using DSM. The application sends a set of input samples stored in RAM to the FIR filter and reads the filtered data back. The input and output waveform is then displayed on the screen for DSM Virtual Simulation, MicroBlaze and Virtex-II Pro PowerPC platforms. The connection to the video display is also based on a DSM layer.
PAL tutorial • RS-232 Serial cable Optionally from MathWorks for MV2P Target: • MATLAB 7.0.1 (Release 14). Other versions might work, but have not been tested. 2.2.3 System design The DSM FIR filter tutorial demonstrates a method for hardware/software co-design using DK and DSM. 1. Create a system-level design. 2. Translate the hardware side of the design into Handel-C. 3. Translate the software side of the design into ANSI-C.
PAL tutorial FIRFilter (FIRPortH2S, FIRPortS2H); DsmVideo (VideoPortS2H, VideoPortH2S, VideoPL1RAM, PAL_ACTUAL_CLOCK_RATE); } DSM FIR filter tutorial: hardware side The Handel-C code for the DSM FIR filter tutorial can be opened from Start>Programs>Celoxica>Platform Developer's Kit>DSM>DSM Examples Workspace [DK]. FIR Filter implementation The main task of the filter is to take input data and operate on it, and to provide results from operations on earlier input data.
(unsigned) adjs (Output, width(DsmWord))); DsmFlush (PortH2S); } } } }
Video output implementation
The DSM video driver is used to display processed data in visual form on a monitor screen. To use the video driver on the hardware side, include dsm_video.hch and link your application with the dsm_vide.hcl library. This is done at the beginning of the source file dsm_fir.hcc. The file dsm_fir.h, which is shared between the hardware and software sides, defines the DSM H2S and S2H ports for video.
PAL tutorial printf ("Output = %d\n", Output); #if defined WIN32 || defined __MICROBLAZE__ if (i != 0) { SetColor (LIGHTGREEN); /* draw input by green */ Line (i - 1, Input[i-1] + HEIGHT/2, i,Input[i] + HEIGHT/2); SetColor (LIGHTRED); /* draw output by red */ Line (i - 1, OldOutput + HEIGHT/2, i, Output + HEIGHT/2); } OldOutput = Output; #endif /* Flush remaining writes */ DsmFlush (PlotPortS2H); /* Shutdown */ DsmPortS2HClose (FirPortS2H); DsmPortH2SClose (FirPortH2S); DsmPortS2HClose (VideoPortS2H); Ds
PAL tutorial 6. Open the MSVC Examples workspace from the start menu: Start>Programs> Celoxica>Platform Developer's Kit>DSM>DSM Examples Workspace [VC++]. 7. Right click the DsmFIR project in the left pane and select Set Active Project. 8. Compile the project by pressing F7. 9. Execute the simulation by pressing F5. When you run the simulation, the PALSim application and the DSM Sim Monitor will appear.
PAL tutorial DSM SIM MONITOR CONTENTS 2.2.5 Running the tutorial in hardware The DSM FIR filter tutorial workspace is configured to automatically run Xilinx EDK and Place and Route tools in a custom build step when you target the MicroBlaze processor on the RC200, RC200E or Memec Design platform. You must have the Xilinx software installed for this to work. You can run the application using a lowpass filter or a highpass filter (either low frequency or high frequency waves are let through).
PAL tutorial Building the hardware side 1. Make sure that the board is connected to your PC with a parallel cable before you build the hardware. 2. Open the DSM Examples Workspace in DK by clicking on Start>Programs>Celoxica>Platform Developer's Kit>DSM>DSM Examples Workspace [DK]. 3. Choose the DsmFIR project and set it as the active project. 4. Choose the MB_RC200 platform in Active Build Configuration. 5. Click on the build icon, or press F7 to start the compilation.
Building the software side
The software is built before the BIT file is generated. You must run the terminal program before the BIT file is downloaded onto the board.
1. Select Start>Programs>Celoxica>Platform Developer's Kit> PowerPC Hyperterminal.
2. If you have changed the program code and need to recompile it, click the build button in DK. The program will be recompiled and downloaded onto the board with the BIT file.
Running the application 1.
6. You can compare the results generated by the board with the results generated in MATLAB by running the dsm_fir_ref.m script.
MATLAB OUTPUT FOR THE VIRTEX-II PRO
DSM tutorials 3 Platform Support Library tutorial 3.1 Introduction A Platform Support Library (PSL) is a Handel-C library containing functions for communicating with peripheral devices on an FPGA/PLD platform. A collection of functions for a particular device is referred to as a device driver. The PSL tutorial guide describes techniques and considerations for implementing device drivers in Handel-C, and thereby creating a PSL.
DSM tutorials 3.3 Creating a PSL To create a PSL you compile the device drivers that match the peripherals on your target platform into a Handel-C library and header file. Each of the drivers should be configured with interfaces that match the pin allocations on your platform. The organization of a PSL with respect to device drivers and application code is illustrated here: ORGANIZATION OF PLATFORM SUPPORT LIBRARIES 3.
• How fast does it need to run?
• Can it function independently of the system clock frequency?
• Can you perform multiple instantiations of the device driver?
The size and speed of your device driver will be related to its complexity. If you require a device driver that does a lot of work translating API functions to device commands, you will have to trade this off against hardware size or speed.
DSM tutorials Step 4: Implement procedures for the device interface Wrap communication with the Handel-C interfaces inside macro procedures. You should implement macros that do simple device operations such as writing a value to an input to the device. If the device uses some handshaking mechanism to input or output data you should also capture this inside the macros.
DSM tutorials Now define a Handel-C interface to attach this pin to a Handel-C variable. Use the Handel-C bus_out interface as the pin is an output from the device driver. Define a variable to serve as the expression output on the interface: static unsigned 1 LedValue = 0; interface bus_out () Led0 (unsigned 1 data = LedValue) with {data = LedPin}; Note that making the LedValue variable static prevents it from being visible outside the file it is defined in.
DSM tutorials Here are macro expressions for the RAM pins: static macro expr RAMAddrBus = {"A1", "A2","A3", "A4", "A5", "A6", "A7", "A8", "A9","A10","A11","A12","A13","A14","A15","A16"}; static macro expr RAMDataBus = {"D1","D2","D3", "D4","D5","D6","D7","D8"}; static macro expr RAMCSPin = {"CS"}; static macro expr RAMWEPin = {"WE"}; static macro expr RAMOEPin = {"OE"}; The Cypress CY7C1049B-25 has an access time of 25ns, which corresponds to a maximum clock frequency of 40MHz.
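The relationship between access time and clock frequency is simple arithmetic: a 25ns access time gives 1 / 25ns = 40MHz. The C sketch below works this through and also converts a delay into a whole number of clock cycles. It is an illustration under stated assumptions: the function names are hypothetical, delays are given in nanoseconds, the cycle count is rounded up, and at least one cycle is always returned. It is not the manual's own macro.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_SECOND 1000000000ULL

/* Highest clock rate (Hz) whose period is no shorter than the access time. */
static uint64_t max_clock_hz(uint64_t access_time_ns)
{
    assert(access_time_ns > 0);
    return NS_PER_SECOND / access_time_ns;
}

/* Number of clock cycles needed to cover delay_ns at clock_hz.
 * Assumption: round up, and never return fewer than one cycle. */
static uint64_t cycles_for_delay(uint64_t delay_ns, uint64_t clock_hz)
{
    assert(clock_hz > 0);
    uint64_t cycles = (delay_ns * clock_hz + NS_PER_SECOND - 1) / NS_PER_SECOND;
    return cycles > 0 ? cycles : 1;
}
```

At 40MHz a 25ns delay fits in exactly one cycle; at 80MHz the same delay needs two cycles.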
DSM tutorials in this example. The DK online help contains more information about timing constraints. To locate this information select the Index tab from the online help navigation window and enter timing as a keyword.
DSM tutorials You can capture this equation in a Handel-C macro expression and use it to evaluate the required number of clock cycles at compile time. /* Return the number of clock cycles needed for a time delay.
HANDEL-C RAM READ
The timing for a write operation given in the data sheet corresponds to this diagram:
RAM WRITE OPERATION
The address and data must be stable when the write enable is active. Handel-C has a synchronous timing model and there is no facility for designing asynchronous systems. You can only guarantee that the value of a changing expression is stable at the end of a clock cycle (at the rising edge of the clock).
To guarantee that the write enable is active only when the data and address are valid, the operation must be performed over three clock cycles, as illustrated in the following timing diagram:
HANDEL-C RAM WRITE
This example implements the write operation over three clock cycles to achieve complete flexibility over the system clock frequency.
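The duration of such a write can be estimated with a small C model. The sketch assumes one set-up cycle, a write-enable phase of a given number of cycles, and one hold cycle, running in parallel with a minimum-duration delay, as in the macro that follows; the parallel block finishes when its longer branch finishes. The function name and arguments are hypothetical.

```c
#include <assert.h>

/* Software model only: total clock cycles taken by the write.
 * we_cycles     - cycles for which write enable is held active
 * access_cycles - cycles of the parallel minimum-duration delay */
static unsigned ram_write_cycles(unsigned we_cycles, unsigned access_cycles)
{
    assert(we_cycles >= 1 && access_cycles >= 1);
    unsigned sequence = 1 + we_cycles + 1;  /* set-up, write enable, hold */
    return sequence > access_cycles ? sequence : access_cycles;
}
```

With one cycle for each phase the write takes three clock cycles, which matches the timing diagram; at higher clock rates the write-enable phase or the parallel delay stretches the total.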
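macro proc RAMWrite (Address, Data)
{
    par
    {
        seq
        {
            /* Set up address and data before write enable is asserted */
            par { RAMAddress = Address; RAMDataOut = Data; }
            /* Hold write enable active for Time2Cycles (15) cycles */
            seq (i = 0; i != Time2Cycles (15); i++)
            {
                par { RAMAddress = Address; RAMDataOut = Data; NRAMWE = 0; }
            }
            /* Keep address and data stable after write enable is released */
            par { RAMAddress = Address; RAMDataOut = Data; }
        }
        /* Make sure the whole operation lasts at least Time2Cycles (25) cycles */
        seq (i = 0; i != Time2Cycles (25); i++)
        {
            delay;
        }
    }
}
Flash memory device driver
The operation of flash memory is more complicated than asynchronous RAM. It is organized into blocks of data.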
• write enable pin (input)
• status pin (output)
• byte enable pin (input)
The device can operate in 16 bit data or 8 bit data mode. You select the mode using the byte enable input. In 16 bit mode the Least Significant Bit (LSB) of the address bus is discarded. This example uses the device in 16 bit mode, so the byte enable is deactivated by wiring it high (it is active low) and only the 22 most significant bits of the address bus are used.
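Discarding the LSB of the address is equivalent to a right shift by one. The C sketch below models this; it assumes a 23 bit byte address (the 22 bits that are used plus the discarded LSB), and the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 22 used address bits plus one discarded LSB. */
enum { FLASH_BYTE_ADDRESS_BITS = 23 };

/* Convert a byte address into the word address used in 16 bit mode. */
static uint32_t flash_word_address(uint32_t byte_address)
{
    assert(byte_address < (1u << FLASH_BYTE_ADDRESS_BITS));
    return byte_address >> 1;  /* drop the least significant bit */
}
```

Byte addresses 2 and 3 both map to word address 1, since each 16 bit word spans two bytes.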
DSM tutorials The structure that contains variables shared between the server and API functions also contains expressions for the interfaces to the device. The advantage of this is that the same API functions and server code can be used to control multiple 28F640J3A flash memory devices at the same time. A different copy of the structure is created for each device and then the server is run multiple times in parallel with the application, once for each device.
DSM tutorials DataBus Connects the server to the input expression of the flash data bus interface. StatusBus Connects the server to the input expression of the flash status bus interface. CEn Connects the server to the output expression on the flash chip enable pin. WEn Connects the server to the output expression on the flash write enable pin. OEn Connects the server to the output expression on the flash output enable pin.
DSM tutorials macro proc FlashRun (FlashPtr, ClockRate) { // Initialization sequence unsigned 3 Command; do { FlashPtr->APICommand ? Command; switch (Command) { case FlashAPICmdRead: // Read sequence goes here FlashPtr->APICommand ! 0; break; case FlashAPICmdWrite: // Write sequence goes here break; case FlashAPICmdErase: // Erase sequence goes here break; default: delay; break; } } while (1); } The full implementation of the server can be found in the accompanying source code.
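The client-server pattern used by the driver can be modelled in plain C: the API side posts a command into a per-device structure and the server services one request at a time. This is a behavioural sketch only. All type, field and function names are hypothetical, the device is far smaller than a real part, and the memory behaviour is the general NOR flash rule that erased words read as all ones and programming can only clear bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical command codes and sizes for a tiny modelled device. */
enum flash_command { FLASH_CMD_NONE, FLASH_CMD_READ, FLASH_CMD_WRITE, FLASH_CMD_ERASE };
enum { FLASH_BLOCKS = 4, FLASH_BLOCK_WORDS = 16,
       FLASH_WORDS = FLASH_BLOCKS * FLASH_BLOCK_WORDS };

/* One copy of this structure exists per modelled device. */
struct flash_model {
    enum flash_command command;   /* request posted by the API side */
    uint32_t address;             /* word address for read and write */
    uint32_t block_number;        /* block index for erase */
    uint16_t data;                /* write data in, read data out */
    uint16_t words[FLASH_WORDS];  /* modelled memory contents */
};

static void flash_model_init(struct flash_model *flash)
{
    assert(flash != NULL);
    flash->command = FLASH_CMD_NONE;
    flash->address = 0;
    flash->block_number = 0;
    flash->data = 0;
    memset(flash->words, 0xFF, sizeof flash->words);  /* erased state */
}

/* Service at most one pending request, then mark the server idle. */
static void flash_server_step(struct flash_model *flash)
{
    assert(flash != NULL);
    switch (flash->command) {
    case FLASH_CMD_READ:
        assert(flash->address < FLASH_WORDS);
        flash->data = flash->words[flash->address];
        break;
    case FLASH_CMD_WRITE:
        assert(flash->address < FLASH_WORDS);
        flash->words[flash->address] &= flash->data;  /* only clears bits */
        break;
    case FLASH_CMD_ERASE:
        assert(flash->block_number < FLASH_BLOCKS);
        memset(&flash->words[flash->block_number * FLASH_BLOCK_WORDS], 0xFF,
               FLASH_BLOCK_WORDS * sizeof flash->words[0]);
        break;
    default:
        break;  /* nothing pending */
    }
    flash->command = FLASH_CMD_NONE;
}
```

Because all state lives in the structure, several devices can be modelled by creating one structure per device and stepping each one separately, which mirrors running one server per device.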
DSM tutorials macro proc FlashWriteWord (FlashPtr, Address, Data) { par { FlashPtr->APICommand ! FlashAPICommandWriteWord; FlashPtr->APIAddress = Address; FlashPtr->APIData = Data; } } macro proc FlashEraseBlock (FlashPtr, BlockNumber) { par { FlashPtr->APICommand ! FlashAPICommandEraseBlock; FlashPtr->APIBlockNumber = BlockNumber; } } When the flash device driver is used, the application programmer must: • Declare a variable of type (Flash *) • Call FlashInit() with appropriate parameters to build inte
DSM tutorials static macro expr { "A17", "D15", "B15", "E13", "C13", "B13", }; FlashAddrPins = "C16", "D14", "E14", "A16", "C15", "A15", "F12", "C14", "B14", "A14", "D13", "E12", "A13", "B12", "D12", "C12", "D11" /* … */ static Flash *FlashPtr; The API implementation for the PSL is given in the following code: macro proc PSLFlashRun (ClockRate) { FlashInit (&FlashPtr, FlashAddrPins, FlashDataPins, FlashChipEnablePins, FlashOutputEnablePin, FlashWriteEnablePin, FlashStatusPin, FlashByteEnablePin, FlashEra
DSM tutorials • Making a driver portable (see page 55) 4.1 Handel-C language basics The TutorialHCBasics workspace illustrates the use of some of the Handel-C operators and constructs which are not present in C or C++. To open the workspace, select Start>Programs>Celoxica>Platform Developer's Kit>Tutorials>TutorialHCBasics.
DSM tutorials while (1) { /* * Run the two displays in parallel */ par { seq { /* * Increment up to 15, then wrap round to 0 */ Count++; /* * Write Count to display */ PalSevenSegWriteDigit (PalSevenSegCT (0), Count, 0); } seq { /* * Increment up to 5, then reset to 0 */ Circle = (Circle == 5) ? 0 : (Circle + 1); /* * Look up value in ROM, and set display */ PalSevenSegWriteShape (PalSevenSegCT (1), CircleDisplayEncode[Circle]); } } } Each iteration of the while(1) loop takes two clock cycles to complete,
DSM tutorials Swapping variable values The swapexample project in the TutorialHCBasics workspace shows how the values of two variables can be exchanged in a single clock cycle without using an intermediate location to store the contents of one of them. This is possible because a variable in Handel-C does not take on the value assigned to it until the end of a clock cycle.
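The single-cycle swap can be modelled in C by separating "sample" from "update". The temporaries in this sketch exist only to imitate the clock edge in sequential software; in hardware the two registers simply load each other's outputs at the same edge. The structure and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Two modelled hardware registers. */
struct registers {
    unsigned a;
    unsigned b;
};

/* One modelled clock cycle of "a = b" running in parallel with "b = a". */
static void clock_cycle_swap(struct registers *regs)
{
    assert(regs != NULL);
    /* Before the clock edge: every right-hand side sees the old values. */
    unsigned next_a = regs->b;
    unsigned next_b = regs->a;
    /* At the clock edge: both registers update together. */
    regs->a = next_a;
    regs->b = next_b;
}
```

Starting from a = 1 and b = 2, one call leaves a = 2 and b = 1, and a second call restores the original values.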
par
{
    while (1)
    {
        unsigned 1 Temp;
        do
        {
            par
            {
                Count++;
                PalSevenSegWriteDigit (PalSevenSegCT (0), Count, 0);
            }
        } while (Count != 0);
        CountChan ! 0;        /* Write to one channel */
        CircleChan ? Temp;    /* Read from other channel */
    }
    while (1)
    {
        unsigned 1 Temp;
        CountChan ? Temp;     /* Read from one channel */
        do
        {
            par
            {
                Circle++;
                PalSevenSegWriteShape (PalSevenSegCT (1), CircleDisplayEncode[Circle]);
            }
        } while (Circle != 6);
        Circle = 0;           /* Reset Circle for next loop */
        CircleChan ! 0;       /* Write to o
DSM tutorials The channelexample project is straightforward to run in hardware, but in simulation breakpoints must be set in each of the two parallel loops. This is necessary because otherwise the Debugger will continue to follow the thread it is currently in, and it will not be possible to step through the code in the other thread. By setting breakpoints on the Circle++ and Count++ lines, it will be possible to step through the code continuously, and see both displays operating cycle-by-cycle. 4.1.
DSM tutorials Take operator The takeexample project in the TutorialHCBasics workspace shows how to use the take bits <- operator. The source code is shown below: while (1) { par { /* * Increment up to 15, then wrap round to 0 */ Count++; /* * Write Count and Count <- 3 to display */ PalSevenSegWriteDigit (PalSevenSegCT (0), Count, 0); PalSevenSegWriteDigit (PalSevenSegCT (1), adju( (Count <- 3), 4), 0); } } The <- operator returns the n least significant bits from its operand.
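In C the same result can be obtained with a mask. The helper below is a hypothetical software equivalent of taking the n least significant bits; it is not a Handel-C construct.

```c
#include <assert.h>

/* Return the n least significant bits of value. */
static unsigned take_bits(unsigned value, unsigned n)
{
    assert(n >= 1 && n < sizeof(unsigned) * 8);
    return value & ((1u << n) - 1u);
}
```

For a 4 bit count of 0b1101, taking 3 bits yields 0b101, so the second display wraps from 7 back to 0 while the first display continues to 15.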
DSM tutorials while (1) { par { /* * Increment up to 15, then wrap round to 0 */ Count++; /* * Write Count and Count[2:1] to display */ PalSevenSegWriteDigit (PalSevenSegCT (0), Count, 0); PalSevenSegWriteDigit (PalSevenSegCT (1), adju( (Count[2:1]), 4), 0); } } The [m:n] operator returns bits m to n from its operand. The value of Count is shown on the first 7-segment display, while the second display shows the value of the middle two bits of Count.
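A software equivalent of the bit range selection shifts the value right by n and masks off m - n + 1 bits. The helper name is hypothetical and the sketch is a C model, not Handel-C.

```c
#include <assert.h>

/* Return bits m down to n of value, with bit 0 the least significant. */
static unsigned bit_range(unsigned value, unsigned m, unsigned n)
{
    assert(m >= n && m < sizeof(unsigned) * 8);
    unsigned width = m - n + 1;
    assert(width < sizeof(unsigned) * 8);
    return (value >> n) & ((1u << width) - 1u);
}
```

For a count of 0b1101, bits 2 to 1 are 0b10, so the second display shows 2.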
DSM tutorials while (1) { par { /* * Increment up to 15, then wrap round to 0 */ Count++; /* * Write Count and (Count[2:0] @ 0) to display */ PalSevenSegWriteDigit (PalSevenSegCT (0), Count, 0); PalSevenSegWriteDigit (PalSevenSegCT (1), Count[2:0] @ 0, 0); } } The @ operator joins together two operands to form a result whose width is equal to the sum of the operand widths.
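Concatenation can be modelled in C as a shift followed by an OR. The helper below is hypothetical; because C integers carry no bit width, the width of the low operand is passed explicitly. For the example above it is assumed that the constant 0 is one bit wide, so that the three bits of Count[2:0] plus one bit form a 4 bit result.

```c
#include <assert.h>

/* Join high and low so that low occupies the bottom low_width bits. */
static unsigned concat_bits(unsigned high, unsigned low, unsigned low_width)
{
    assert(low_width >= 1 && low_width < sizeof(unsigned) * 8);
    assert(low < (1u << low_width));
    return (high << low_width) | low;
}
```

With a count of 0b1101, Count[2:0] is 0b101 and joining a single 0 bit gives 0b1010, which is the lower three bits of the count multiplied by two.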
unsigned 4 Count1;
unsigned 4 Count2;
unsigned 4 Count3;
signal CountSig;

while (1)
{
    /*
     * Increment up to 15, then wrap round to 0
     */
    Count1++;
    par
    {
        CountSig = Count1 * 2;    /* Assign value to the signal, */
        Count2 = CountSig;        /* use the value from the signal */
        Count3 = CountSig + 1;    /* and use it again here */
    }
    /*
     * Write Count2 and Count3 to display
     */
    PalSevenSegWriteDigit (PalSevenSegCT (0), Count2, 0);
    PalSevenSegWriteDigit (PalSevenSegCT (1), Count3, 0);
}
4.
DSM tutorials BLOCK DIAGRAM 4.2.2 Seven-segment display hardware interface First define macro expressions for the pins which the seven-segment displays are connected to. The example shown is for the RC200: static macro expr SevenSeg0Pins = {"L5", "G4", "F3", "K3", "L4", "L3", "H4", "G3"}; static macro expr SevenSeg1Pins = {"K4", "G5", "H3", "L6", "F5", "H5", "J3", "J4"}; Now define registers to hold the values to be displayed, initialising them to zero.
DSM tutorials macro proc SevenSeg0WriteDigit (Value, DecimalPoint) { SevenSeg[0] = DecimalPoint @ TranslationROM0[Value]; } The two macros shown for displaying a shape and a digit are for a single seven-segment display, and a further copy of each will be required for each additional display. 4.3 Using PAL to create a generic device driver Rather than using the seven-segment PSL driver (see page 53), the tutorial will continue using the standard PAL seven-segment displays instead.
CREATING A NEW WORKSPACE
Creating a new project
Then, select the File>New menu again and create a new project in the workspace, as shown below. If you are targeting a board, the chip type must be set correctly; the figure below shows the setting for the Celoxica RC200. For simulation, the chip type is irrelevant.
CREATING A NEW PROJECT
DSM tutorials Creating simulation and hardware configurations Now, select the Build>Configurations menu, select the Debug configuration, and click the Add button. A dialog box will appear, where a new configuration name can be entered, and settings copied from an existing configuration. Create a new configuration called Sim, based on the existing Debug configuration, as shown below. Also create a configuration called RC200, based on the existing EDIF configuration.
Customizing the simulation configuration
The two new configurations can now be customized for their particular targets. Select the Project>Settings menu, and from the Settings for drop-down, select the newly created Sim configuration. On the General tab, change the output directories to match the configuration name (Sim in this case), as shown below.
SETTING OUTPUT DIRECTORIES
DSM tutorials On the Preprocessor tab, add USE_SIM to the Preprocessor definitions box, as shown below. This definition is used to specify which PAL target is to be used for this configuration. SETTING PREPROCESSOR DEFINITIONS The final step in setting up the new configuration is to go to the Linker tab in the Project Settings, and add libraries which are required for PAL. For simulation the target is the PalSim Virtual Platform, which requires the Handel-C libraries sim.hcl and pal_sim.hcl to be added.
C:\program files\celoxica\pdk\software\lib\palsim.lib
The Linker tab with all the libraries set up for simulation is shown below.
LINKER SETTINGS FOR SIMULATION
Customizing the hardware configuration
The RC200 configuration must be set up in a similar way to the simulation configuration, but the preprocessor definition should be USE_RC200, and the included Handel-C libraries should be rc200.hcl and pal_rc200.hcl.
DSM tutorials As the RC200 is a hardware target, a device type must also be specified. Go to the Chip tab in Project Settings, make sure that Family is set to Xilinx Virtex-II, Device is set to XC2V1000, Package is set to fg456 and Speed Grade is set to 4, as shown below. CHIP TYPE SETTINGS FOR RC200 4.3.2 Seven-segment project in PAL To use PAL in a project, a target clock rate must be set and the pal_master.hch header file must be included, as shown below.
PalSevenSegWriteDigit (PalSevenSegCT (0), (unsigned 4) 0xE, 0);
PalSevenSegWriteShape (PalSevenSegCT (1), (unsigned 8) 0b11110110);
The TutorialSevenSeg2 workspace has this code in it, set up for Sim and RC200. To open the tutorial, select Start>Programs>Celoxica>Platform Developer's Kit>Tutorials>TutorialSevenSeg2.
Tutorial: Handel-C and PSL basics 5 Tutorial: Handel-C and VGA graphics output The Handel-C and VGA graphics tutorial illustrates how to use Handel-C to generate simple VGA graphics and respond to user input. Three examples are used, each building on the previous one to add new features. The TutorialVGA workspace contains the code for each of the examples. To open the workspace, select Start>Programs>Celoxica>Platform Developer's Kit>Tutorials>TutorialVGA.
macro expr White = 0xFFFFFF;
macro expr Black = 0x000000;
macro expr Red   = 0xFF0000;
macro expr Green = 0x00FF00;
macro expr Blue  = 0x0000FF;
macro expr ScanX = PalVideoOutGetX (VideoOut);
macro expr ScanY = PalVideoOutGetY (VideoOut);
Having defined these simple macro expressions, it is now possible to make the RunOutput macro display graphics on a VGA output. The example in GraphicsDemo1 draws a white grid on a black background.
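One way to decide the colour of each pixel of such a grid is sketched below in C. It is an illustration, not the GraphicsDemo1 code: the function name is hypothetical, and the rule used (a pixel is white when its X or Y coordinate is a multiple of the grid spacing) and the spacing itself are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* 24 bit colours, matching the White and Black values defined above. */
enum { COLOUR_WHITE = 0xFFFFFF, COLOUR_BLACK = 0x000000 };

/* Colour of the pixel at (scan_x, scan_y) for a grid with the given spacing. */
static uint32_t grid_pixel(unsigned scan_x, unsigned scan_y, unsigned spacing)
{
    assert(spacing > 0);
    int on_grid_line = (scan_x % spacing == 0) || (scan_y % spacing == 0);
    return on_grid_line ? COLOUR_WHITE : COLOUR_BLACK;
}
```

If the spacing is a power of two, the modulo test reduces to checking that the low bits of the coordinate are all zero, which is inexpensive in hardware.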
Tutorial: Handel-C and PSL basics To run the example yourself, open the TutorialVGA workspace (Start>Programs>Celoxica>Platform Developer's Kit>Tutorials>TutorialVGA on the Start Menu), set GraphicsDemo1 as the active project, set the Active Configuration to Sim, then build and run the project. For a Celoxica RC200 board with a VGA monitor connected, set the Active Configuration to RC200, rebuild, then use the Place and Route tools to generate a bitfile to download to the board. 5.
Tutorial: Handel-C and PSL basics number of pixels in the Y direction. This is necessary for the display output code shown to work correctly, as attempting to store a negative result in an unsigned number results in a large (incorrect) positive number. The code below shows how the user interaction is performed. Two calls are made in parallel to PalSwitchRead() to get data from the two switches, and at the same time the data from the switches is checked and the box size updated.
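The unsigned wrap-around mentioned above, and a limited update that avoids it, can be demonstrated with a short C sketch. The function names, the meaning of the two switch inputs and the limits are all hypothetical; the sketch only illustrates why the box size must be clamped.

```c
#include <assert.h>

/* Unchecked shrink: in unsigned arithmetic 0 - 1 wraps to the largest value. */
static unsigned shrink_unchecked(unsigned size)
{
    return size - 1u;
}

/* Clamped update: grow or shrink by one pixel, staying within the limits.
 * If both switches or neither switch is active, the size is unchanged. */
static unsigned update_box_size(unsigned size, int grow, int shrink,
                                unsigned min_size, unsigned max_size)
{
    assert(min_size <= max_size);
    assert(size >= min_size && size <= max_size);
    if (grow && !shrink && size < max_size)
        return size + 1u;
    if (shrink && !grow && size > min_size)
        return size - 1u;
    return size;
}
```

Shrinking a box of size 0 without the check produces a very large size, whereas the clamped version leaves it at the minimum.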
Tutorial: Handel-C and PSL basics static macro proc Sleep (Milliseconds) { #ifdef USE_SIM macro expr Cycles = (10000 * Milliseconds) / 1000; #else macro expr Cycles = (ClockRate * Milliseconds) / 1000; #endif unsigned (log2ceil (Cycles)) Count; Count = 0; do { Count++; } while (Count != Cycles - 1); } The figure below shows the GraphicsDemo2 project running in simulation on the PalSim Virtual Platform.
5.3 Adding mouse input
The GraphicsDemo3 project in the TutorialVGA workspace (Start>Programs>Celoxica>Platform Developer's Kit>Tutorials>TutorialVGA on the Start Menu) contains the code for this example. This example extends GraphicsDemo2 by allowing the red box drawn on the screen to be moved around using a mouse, and by changing the colour of the box when the mouse buttons are pressed. To use the mouse under PAL, the pal_mouse.
while (1)
{
    par
    {
        XPos = MouseX;
        YPos = MouseY;
        if (MouseL == 1)
            BoxColour++;
        else
            delay;
        if (MouseR == 1)
            BoxColour = Red;
        else
            delay;
    }
}
The code for updating the box position and colour cannot go in the same while(1) loop as the code which reads the switches, as it needs to execute every cycle, and the switch code includes calls to Sleep().
Tutorial: Handel-C and PSL basics The figure below shows the GraphicsDemo3 project running in simulation on the PalSim Virtual Platform. To run the example yourself, open the TutorialVGA workspace (Start>Programs>Celoxica>Platform Developer's Kit>Tutorials>TutorialVGA on the Start Menu), set GraphicsDemo3 as the active project, set the Active Configuration to Sim, then build and run the project.
Tutorial: Handel-C and VGA graphics output 6 Tutorial: Handel-C code optimization The following examples illustrate different methods of optimizing Handel-C code to produce smaller and faster designs. A basic knowledge of Handel-C is assumed, and some knowledge of digital electronics and design techniques will also be helpful. Timing and area efficient code (see page 71) Loops and control code (see page 74) 6.
signal unsigned 16 temp1, temp2;
par
{
    temp1 = b << c;
    temp2 = b * d;
    a = temp1 + temp2;
}
The complex statement is still broken into three parts; however, because temp1 and temp2 are signals, all the operations must still be performed in one clock cycle. This is because signals do not store the values assigned to them, so the results from the first two lines of code are fed straight into the third line in the same cycle.
Tutorial: Handel-C and VGA graphics output ram unsigned 8 Memory[4]; This will create a more efficient structure in hardware, but will now be limited to a single access per clock cycle. The rom keyword can be used if a read-only memory is required and can be declared as static to allow initialization: static rom unsigned 8 Memory[4] = {23, 25, 26, 29}; Block Memory Many FPGAs have more than one method of implementing memories, optimized for different sizes.
Tutorial: Handel-C and VGA graphics output 6.1.3 Macro procedures vs. functions The main difference between a macro proc and a function in Handel-C is the number of hardware copies that result. Placing a block of frequently used code in a function means that one copy of the code will exist in the hardware and every time the function is called this single copy of the code will be used. A macro procedure builds a fresh copy of the code every time it is called.
assignment to take a single clock cycle. The result is that for() loops have a single clock cycle overhead per iteration, so the example below takes 20 cycles to execute, rather than 10:
    for (i = 0; i < 10; i++)
    {
        a[i] = 0;
    }
To improve the performance, a while() loop should be used instead as shown below. In this example the loop will now take 11 clock cycles instead of 20.
    static unsigned 1 Test = 1;
    unsigned 8 a;
    unsigned 32 b, c, d;
    while (Test == 1)
    {
        par
        {
            a++;
            Test = ((b * c) + d) > (d - b);
        }
    }
6.2.3 Avoiding combinatorial loops
A combinatorial loop is a series of logic components connected in a loop with no latches or delay elements inserted.
6.2.4 Nested control
Using nested if() statements, or long chains of if()...else() blocks, can result in a design having a low clock rate. This is because in the worst case all the nested conditions must be evaluated in a single cycle, so the delay can become significant. If possible, Handel-C code should be written to avoid nesting control statements more than a few layers deep.
Tutorial: Handel-C code optimization
7 Tutorial: Handel-C advanced optimization
The following examples illustrate advanced methods of optimizing Handel-C code to produce smaller and faster designs. This builds on the content of the Code Optimization Tutorial, which should be studied first. Two main techniques are covered: pipelining and client-server architectures. A thorough knowledge of Handel-C is assumed, and some knowledge of digital electronics and design techniques will also be helpful.
The behaviour and timing of the code is as follows:
• After the first clock cycle:
  • new values for the additions are calculated and stored in sum1 and sum2.
  • the value in a will be undefined, as it depends on sum1 and sum2 for its inputs, and they were undefined at the start of the cycle.
• After the second clock cycle:
  • another set of new values for the additions are calculated and stored in sum1 and sum2.
    #define WIDTH 8
    unsigned WIDTH sum[WIDTH];
    unsigned WIDTH a[WIDTH];
    unsigned WIDTH b[WIDTH];
    while(1)
    {
        par
        {
            sum[0] = ((a[0][0] == 0) ? 0 : b[0]);
            par (i=1; i<=(WIDTH-1); i++)
            {
                sum[i] = sum[i - 1] + ((a[i][0] == 0) ? 0 : b[i]);
                a[i] = a[i - 1] >> 1;
                b[i] = b[i - 1] << 1;
            }
        }
    }
The first line of code inside the while(1) loop sets the value of sum[0], then the replicated par moves the shifted inputs through the a[] and b[] arrays, and the results through the sum[] array
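The timing of the pipelined multiplier can be checked with a small software model. The sketch below is plain C rather than Handel-C, and it is only an illustration under stated assumptions: the type MultPipeline and the functions mult_pipeline_reset, mult_pipeline_clock and mult_pipeline_once are hypothetical names that do not appear in the tutorial workspace. Each call to mult_pipeline_clock stands for one clock cycle, and every register is updated from the values held before the clock edge, which is how the parallel assignments in the par block behave.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WIDTH 8

/* Pipeline registers: one entry per stage, mirroring the Handel-C arrays. */
typedef struct {
    uint8_t sum[WIDTH];
    uint8_t a[WIDTH];
    uint8_t b[WIDTH];
} MultPipeline;

/* Clear all registers (hypothetical helper, not part of the tutorial). */
void mult_pipeline_reset(MultPipeline *p)
{
    assert(p != NULL);
    memset(p, 0, sizeof *p);
}

/*
 * Model one clock cycle. a_in and b_in play the role of a[0] and b[0].
 * All new register values are computed from the values held before the
 * clock edge, then stored together.
 */
void mult_pipeline_clock(MultPipeline *p, uint8_t a_in, uint8_t b_in)
{
    MultPipeline next;
    int i;

    assert(p != NULL);
    p->a[0] = a_in;
    p->b[0] = b_in;
    next = *p;

    next.sum[0] = (uint8_t)((p->a[0] & 1u) ? p->b[0] : 0);
    for (i = 1; i < WIDTH; i++) {
        next.sum[i] = (uint8_t)(p->sum[i - 1] + ((p->a[i] & 1u) ? p->b[i] : 0));
        next.a[i] = (uint8_t)(p->a[i - 1] >> 1);
        next.b[i] = (uint8_t)(p->b[i - 1] << 1);
    }
    *p = next;
}

/* Multiply one pair of operands by pushing it through an empty pipeline. */
uint8_t mult_pipeline_once(uint8_t a, uint8_t b)
{
    MultPipeline p;
    int cycle;

    mult_pipeline_reset(&p);
    mult_pipeline_clock(&p, a, b);           /* operands enter stage 0 */
    for (cycle = 1; cycle < WIDTH; cycle++)  /* WIDTH - 1 further cycles */
        mult_pipeline_clock(&p, 0, 0);
    return p.sum[WIDTH - 1];                 /* low WIDTH bits of a * b */
}
```

In this model a product leaves sum[WIDTH - 1] after WIDTH calls to mult_pipeline_clock, and a new pair of operands can be presented on every call, so once the pipeline is full one result is available per cycle. The result is truncated to WIDTH bits, as it is in the Handel-C arrays.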
    struct _DivideStruct
    {
        unsigned 16 InputA;
        unsigned 16 InputB;
        unsigned 16 Result;
    };
    typedef struct _DivideStruct DivideStruct;
Now create a server process:
    macro proc DivideServer(DividePtr)
    {
        /* perform divide operations forever */
        while(1)
        {
            DividePtr->Result = DividePtr->InputA / DividePtr->InputB;
        }
    }
and a client API macro:
    macro proc Divide(DividePtr, a, b, ResultPtr)
    {
        /* send data to the divide server */
        par
        {
            DividePtr->InputA = a;
            DividePtr->InputB = b;
        }
        /*
7.3.2 Flash memory client-server example
The operation of flash memory is more complicated than asynchronous RAM. It is organized into blocks of data. An entire block must be erased before any locations within it can be programmed. This example is based on the Intel flash memory part 28F640J3A, which has a capacity of 64 Mbits, organized as 64 blocks. You can obtain the data sheet for this part from http://developer.intel.com.
    /*
     * Erase data from the block in the Flash referenced by BlockNumber
     * Parameters: FlashPtr    : input of type (Flash)*
     *             BlockNumber : input of type (unsigned 6)
     */
    extern macro proc FlashEraseBlock (FlashPtr, BlockNumber);
The macro procedure containing the server has the following prototype:
    /*
     * Run the Flash device driver server
     * Parameters: FlashPtr  : input of type (Flash)*
     *             ClockRate : clock rate in Hz
     */
    extern macro proc FlashRun (FlashPtr, ClockRate);
The stru
DataBus     Connects the server to the input expression of the flash data bus interface
StatusBus   Connects the server to the input expression of the flash status bus interface
CEn         Connects the server to the output expression on the flash chip enable pin
WEn         Connects the server to the output expression on the flash write enable pin
OEn         Connects the server to the output expression on the flash output enable pin
DataOE      Connects the server to the output enable expres
    macro proc FlashRun (FlashPtr, ClockRate)
    {
        // Initialization sequence
        unsigned 3 Command;
        do
        {
            FlashPtr->APICommand ? Command;
            switch (Command)
            {
                case FlashAPICmdRead:
                    // Read sequence goes here
                    FlashPtr->APICommand ! 0;
                    break;
                case FlashAPICmdWrite:
                    // Write sequence goes here
                    break;
                case FlashAPICmdErase:
                    // Erase sequence goes here
                    break;
                default:
                    delay;
                    break;
            }
        } while (1);
    }
The full implementation of the server can be found in the TutorialFlashRAM workspace.
    macro proc FlashWriteWord (FlashPtr, Address, Data)
    {
        par
        {
            FlashPtr->APICommand ! FlashAPICommandWriteWord;
            FlashPtr->APIAddress = Address;
            FlashPtr->APIData = Data;
        }
    }
    macro proc FlashEraseBlock (FlashPtr, BlockNumber)
    {
        par
        {
            FlashPtr->APICommand ! FlashAPICommandEraseBlock;
            FlashPtr->APIBlockNumber = BlockNumber;
        }
    }
When the flash device driver is used, the application programmer must:
• Declare a variable of type (Flash *)
• Call FlashInit() with appropriate p
Tutorial: Handel-C advanced optimization
8 Tutorial: Using the logic estimator
The following examples illustrate the use of the DK Logic Estimator to produce smaller and faster designs. A basic knowledge of Handel-C is assumed and some knowledge of digital electronics and design techniques will also be helpful. The tutorial workspace can be opened by selecting Start>Programs>Celoxica>Platform Developer's Kit>Tutorials>TutorialEstimator.
8.2 Using the logic estimator results
The results from the logic estimator can help you to improve the speed and reduce the size of a Handel-C design.
It should appear as below:
ESTIMATION SUMMARY FROM VERSION1 PROJECT
The first section of the summary provides an estimation of the logic area, described in terms of LUTs, FFs, memory bits and miscellaneous other components. The numbers of these components are listed per source file in the project, with a total at the end.
8.3 Reducing the logic delay
If you build the code for the TutorialEstimator version1 project, open the summary.html page, and then click on the Detailed path information link, you should see information on the longest paths in the project.
    do
    {
        par
        {
            C = A * B;
            D = A + B;
        }
        par
        {
            Output = C + D;
            Index++;
        }
    } while (Index < 10000);
Each iteration of the do...while loop now takes two cycles to execute, but the longest path has been reduced from 31.13ns to 21.81ns (for a grade 4 part), as shown in the new estimation summary below, from the version2 project in the TutorialEstimator workspace (accessible from Start>Programs>Celoxica>Platform Developer's Kit>Tutorials>TutorialEstimator).
directory. Open the file named Summary.html by double-clicking on it (this should load your computer's default web browser). The code can be altered to allow the loop to execute in one cycle again by implementing a two-stage pipeline, where the first stage calculates the values of C and D, and the second stage adds them together.
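The behaviour of such a two-stage pipeline can be sketched as a software model. The code below is plain C, not the Handel-C from the tutorial workspace, and the names TwoStagePipe, two_stage_clock and single_step are hypothetical. The 32-bit register width is also an assumption, since the widths of A, B, C and D are not given here. One call to two_stage_clock stands for one clock cycle in which both stages run at once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Registers of the two-stage pipeline (names follow the tutorial code). */
typedef struct {
    uint32_t C;
    uint32_t D;
    uint32_t Output;
} TwoStagePipe;

/*
 * Model one clock cycle. Both stages run in the same cycle, and each
 * reads the register values that were stored before the clock edge.
 */
void two_stage_clock(TwoStagePipe *p, uint32_t A, uint32_t B)
{
    TwoStagePipe next;

    assert(p != NULL);
    next.C = A * B;             /* stage 1: multiply */
    next.D = A + B;             /* stage 1: add */
    next.Output = p->C + p->D;  /* stage 2: uses last cycle's C and D */
    *p = next;
}

/* Reference result computed in one step, for comparison. */
uint32_t single_step(uint32_t A, uint32_t B)
{
    return A * B + (A + B);
}
```

In this model Output lags the inputs by one extra cycle: the value for a given A and B appears after the second call that follows them. After that initial delay a new result is produced on every cycle, while the longest path in any one cycle covers only one stage.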
Try modifying the code in the version2 project in the TutorialEstimator workspace to use this pipeline, rebuild it, and open the estimation summary again. You will see that the longest path is unchanged, and there has been no significant change in the number of LUTs or other logic elements used, despite calculating the values for C and D in two separate places.
If you open the summary.html page for the version2 project in the TutorialEstimator workspace (accessible from Start>Programs>Celoxica>Platform Developer's Kit>Tutorials>TutorialEstimator), and click on the link to version2.
First, the while condition on line number 57 uses a "less than" < comparison, when in fact a "not equal" != will perform the same function, as Index is only incremented by 1 each time through the loop. Try changing this line of code from < to != in the version2 project in the TutorialEstimator workspace, rebuild it, and look at the Estimator output again.
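The equivalence of the two loop conditions can be checked in software. The sketch below is plain C with hypothetical function names, and it only demonstrates that both forms run the loop body the same number of times when the counter starts below the limit and steps by exactly 1. It says nothing about logic area, which is what the Estimator output shows.

```c
#include <assert.h>

/* Count iterations of a do...while loop that exits on "Index < limit". */
unsigned count_with_less_than(unsigned limit)
{
    unsigned Index = 0;
    unsigned iterations = 0;

    assert(limit > 0);
    do {
        iterations++;
        Index++;
    } while (Index < limit);
    return iterations;
}

/* The same loop, exiting on "Index != limit" instead. */
unsigned count_with_not_equal(unsigned limit)
{
    unsigned Index = 0;
    unsigned iterations = 0;

    assert(limit > 0);  /* with limit == 0 this form would run until Index wraps */
    do {
        iterations++;
        Index++;
    } while (Index != limit);
    return iterations;
}
```

Both functions return 10000 for a limit of 10000. The substitution is only safe under the stated conditions: if the counter could skip past the limit, for example by stepping in twos towards an odd limit, the != form would not terminate where the < form does.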
The version3 project in the TutorialEstimator workspace (accessible from Start>Programs>Celoxica>Platform Developer's Kit>Tutorials>TutorialEstimator) contains these changes and the estimation summary from building it is shown below. Comparing this with the summary from the version2 project, it can be seen that a logic area reduction of over 15% has been achieved by changing only two lines of code.
ESTIMATION SUMMARY FROM VERSION3 PROJECT
One final change can be made to reduce the logic area further still, and it will have the side-effect of reducing the delay at the same time. In the version3 project of the TutorialEstimator workspace, open the Project Settings dialog, go to the Synthesis tab, and enable ALU mapping, as shown below:
ENABLING ALU MAPPING FOR VERSION3 PROJECT
As we are targeting a Xilinx Virtex-II device in this case, and the design contains a multiplier, the ALU Mapper will use a single embedded multiplier on the device to perform this operation. The logic estimator summary after this change is shown below:
LOGIC ESTIMATOR SUMMARY FOR VERSION3 PROJECT WITH ALU MAPPING ENABLED
You can see that with ALU mapping enabled there is another column in the area estimation, showing how many embedded ALUs were used. You can also see the dramatic reduction in logic area and delay compared to the original estimator output for the version3 project, shown earlier. Below is the detailed area estimation with ALU mapping enabled, where you can see that an ALUs column is now present, and one is used on the line of code with the multiplier.
Tutorial: Using the logic estimator
9 FIR Tutorial
The FIR tutorial illustrates how to implement a FIR (Finite Impulse Response) filter using Handel-C, starting with a software-style implementation and finishing with an efficient hardware implementation. This tutorial will not cover the theory of FIR filters. The TutorialFIR workspace contains the code for each of the examples. To open the workspace, select Start>Programs>Celoxica>Platform Developer's Kit>Tutorials>TutorialFIR.
When the coefficients are symmetrical, pairs of samples taken from the start and end of the series can be added together, as shown in the figure below. The advantage of this is that the number of multiplications required can be reduced by up to 50% (in this case it is now four, instead of the seven required in the diagram above). This is important for a hardware implementation of an FIR filter as multipliers require a significant amount of logic.
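The saving from symmetrical coefficients can be illustrated with a short software sketch. The code below is plain C, not the Handel-C used later in this tutorial, and the function names fir_direct and fir_symmetric are hypothetical. It assumes the coefficient list is symmetrical, that is coeffs[i] equals coeffs[taps - 1 - i], and it counts the multiplications so the two forms can be compared.

```c
#include <assert.h>
#include <stddef.h>

/* Direct form: one multiplication per tap. */
long fir_direct(const int *samples, const int *coeffs, size_t taps)
{
    long acc = 0;
    size_t i;

    assert(samples != NULL && coeffs != NULL && taps > 0);
    for (i = 0; i < taps; i++)
        acc += (long)coeffs[i] * samples[i];
    return acc;
}

/*
 * Folded form for symmetrical coefficients. Samples that share a
 * coefficient are added first and then multiplied once. If mults is
 * not NULL it receives the number of multiplications performed.
 */
long fir_symmetric(const int *samples, const int *coeffs, size_t taps,
                   size_t *mults)
{
    long acc = 0;
    size_t i;
    size_t count = 0;

    assert(samples != NULL && coeffs != NULL && taps > 0);
    for (i = 0; i < taps / 2; i++) {
        assert(coeffs[i] == coeffs[taps - 1 - i]);  /* symmetry check */
        acc += (long)coeffs[i] * ((long)samples[i] + samples[taps - 1 - i]);
        count++;
    }
    if (taps % 2 != 0) {  /* the middle tap has no partner */
        acc += (long)coeffs[taps / 2] * samples[taps / 2];
        count++;
    }
    if (mults != NULL)
        *mults = count;
    return acc;
}
```

For a seven-tap filter the folded form performs four multiplications instead of seven and returns the same result as the direct form. The price is one extra addition per pair of samples, which is generally much cheaper in logic than a multiplier.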
    /*
     * Structure of variables to interface to FIR filter
     */
    struct _FirStruct
    {
        unsigned 1 InputValid;
        unsigned 1 OutputValid;
        signed Input;
        signed Output;
        signed Coeffs[];
    };
    typedef struct _FirStruct FirStruct;
There are then prototypes for the FIR macro procedures:
    macro proc FirFilter (FirPtr, DataWidth, Taps, CoeffList);
    macro proc FirWrite (FirPtr, Data);
    macro proc FirRead (FirPtr, DataPtr);
The FirWrite and FirRead macros are shown below.
The FirFilter macro contains the code to perform the actual FIR filtering. Before the filter starts operation, the coefficients which were passed into the FirFilter macro are stored in the Coeffs[] array in the FIR interface structure:
    par (i = 0; i < Taps; i++)
    {
        FirPtr->Coeffs[i] = CoeffList[i];
    }
After storing the coefficients, the FirFilter macro enters a while(1) loop which contains several sequential stages within it.
    par
    {
        FirPtr->Output = Accumulator;
        FirPtr->OutputValid = 1;
    }
The main function in fir1.hcc is set up to read input data from a file using chanin during simulation, and to read from an interface when built for EDIF. Build the project for Debug, and start the simulation. The input data will be read from the file input.txt, and the filtered output will be written to the file output.txt.
SETTING THE CHIP TYPE
Now select the Synthesis tab and ensure that the settings are exactly as shown below, with the Technology Mapper enabled, and Retiming disabled.
SYNTHESIS SETTINGS
Finally, select the Linker tab, and check that Generate estimation info is enabled.
LINKER SETTINGS
Now rebuild the project for EDIF, and open Summary.html in the folder PDK/Tutorials/General/TutorialFIR/Version1/EDIF. The summary file shows logic area and delay estimation for the project, as shown below. As we improve the FIR in the next stages of the tutorial, you can refer back to the summary on this page to compare the area and delay of new versions.
The replicated par{} builds a copy of the line of code it contains for every tap in the FIR, and all the lines are executed in parallel. The results from the parallel multiplications are stored in the MultResults array, and are added together by a call to the RecurseAdd macro as shown below:
    Accumulator = RecurseAdd(MultResults, Taps-1);
RecurseAdd is a recursive macro expression which is passed an array and the index of the top element of that array.
A more efficient adder tree in terms of logic delay is shown below:
[Figure: IMPROVED ADDER TREE, inputs 7 to 0 combined into result]
Shown below is the logic estimator summary and longest path for the first version of RecurseAdd, used in the Version2 project in the TutorialFIR workspace. This summary can be viewed by building the project for EDIF, and opening Summary.html in the folder PDK/Tutorials/General/TutorialFIR/Version2/EDIF.
LOGIC ESTIMATION SUMMARY FOR VERSION2 PROJECT
LONGEST PATH SUMMARY FOR VERSION2 PROJECT
The RecurseAdd macro expression can be re-written to build such an adder tree. This is achieved by writing a recursive macro expression which locates the middle element of the array it has been asked to add, then makes two calls to itself: one from Bottom to Middle, and the other from Middle+1 to Top, as shown below.
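The difference between the two recursion patterns can be sketched in software. The code below is plain C, not the Handel-C macro expression from the TutorialFIR workspace, and the names add_chain, add_tree, chain_depth and tree_depth are hypothetical. The depth functions count the adders on the longest path from any input to the result, which is a rough stand-in for logic delay and ignores routing and carry-chain effects.

```c
#include <assert.h>
#include <stddef.h>

/* Linear chain: add the top element to the sum of everything below it. */
long add_chain(const int *v, size_t top)
{
    assert(v != NULL);
    return (top == 0) ? v[0] : v[top] + add_chain(v, top - 1);
}

/* Balanced tree: split at the middle element and add the two halves. */
long add_tree(const int *v, size_t bottom, size_t top)
{
    size_t middle;

    assert(v != NULL && bottom <= top);
    if (bottom == top)
        return v[bottom];
    middle = bottom + (top - bottom) / 2;
    return add_tree(v, bottom, middle) + add_tree(v, middle + 1, top);
}

/* Adders on the longest path of the chain for n inputs. */
unsigned chain_depth(size_t n)
{
    assert(n > 0);
    return (unsigned)(n - 1);
}

/* Adders on the longest path of the tree for n inputs. */
unsigned tree_depth(size_t n)
{
    unsigned depth = 0;

    assert(n > 0);
    while (n > 1) {
        n = (n + 1) / 2;  /* the larger half sets the depth */
        depth++;
    }
    return depth;
}
```

Both functions return the same sum, and both use n - 1 adders in total. For eight inputs the chain places seven adders in series while the tree places three, so the tree trades nothing in area for a much shorter path.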
The logic estimator summary and longest path are shown below. This summary can be viewed by building the project for EDIF, and opening Summary.html in the folder PDK/Tutorials/General/TutorialFIR/Version3/EDIF.
LOGIC ESTIMATION SUMMARY FOR VERSION3 PROJECT
LONGEST PATH SUMMARY FOR VERSION3 PROJECT
It can be seen that the logic delay is approximately one third of what it was in the first version of RecurseAdd, which is what would be expected.
the adder tree. Note that the logic area in the Estimator Summary is larger for the Version2 and Version3 projects than for Version1 (Initial version), which is to be expected as we now have a larger number of multipliers and adders. The tradeoff is that the number of clock cycles taken to process each data sample is significantly reduced. 9.
    struct _FirStruct
    {
        signed Input;
        signed Output;
        signed Coeffs[];
    };
    macro proc FirWrite (FirPtr, Data)
    {
        FirPtr->Input = Data;
    }
    macro proc FirRead (FirPtr, DataPtr)
    {
        *DataPtr = FirPtr->Output;
    }
If an application for the FIR filter was unable to provide new input data or accept output data every cycle, the interface structure could be modified to include an Enable register. All the code within the body of the FIR filter would then be put inside an if...
LOGIC ESTIMATION SUMMARY FOR VERSION3 PROJECT
9.5 Reducing logic area
There is one final optimization which we will make to the FIR filter to reduce the area it takes up on a device. It is possible to reduce the number of multipliers in the FIR filter by up to 50%, by taking advantage of the fact that FIR filters can have symmetrical coefficients.
FIR TAKING ADVANTAGE OF SYMMETRICAL COEFFICIENTS
The FIR filter can easily be modified to take advantage of symmetrical coefficients.
In the summary from the Logic Estimator below, the hardware usage can be seen to be significantly reduced from the previous version (Single cycle FIR), with the number of FFs down by 6%, LUTs down by 34% and other components down by 38%. This summary can be viewed by building the project for EDIF, and opening Summary.html in the folder PDK/Tutorials/General/TutorialFIR/Version4/EDIF.
LOGIC ESTIMATION SUMMARY FOR VERSION4 PROJECT
9.6 Using ALU Mapping
One of the new features introduced in DK 3.0 was ALU Mapping. This is only supported on FPGA devices which contain embedded ALU primitives, such as multipliers or MAC units. When this feature is enabled, DK will automatically target embedded ALUs, making use of them where they will result in the biggest increase in performance or reduction in logic area.
Open the alumapping1 project in the TutorialFIR workspace, accessible from Start>Programs>Celoxica>Platform Developer's Kit>Tutorials>TutorialFIR on the Start Menu. This project contains the same source code as the Version4 project in the same workspace, but has ALU Mapping enabled. Build the project for EDIF, and open Summary.html in the folder PDK/Tutorials/General/TutorialFIR/alumapping1/EDIF.
Compared to the summary for the previous project (Reducing logic area), it can be seen that the number of LUTs and "other" (e.g. fast carry chains) components has dropped significantly, while 11 ALUs are now used, and there has been an increase in the number of FFs. It can also be seen that with the use of the embedded ALUs, the estimated longest path has been reduced by almost 30%.
LONGEST PATH SUMMARY FOR ALUMAPPING1 PROJECT
Our goal is now to reduce the delay on this path further. We will do this by pipelining the adder tree which is currently built by the RecurseAdd macro.
To simplify the Handel-C code required to implement this adder tree, we will declare a 2-dimensional array, as wide and deep as the adder tree required for the specified number of taps in the FIR filter.
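The idea of holding a pipelined adder tree in a 2-dimensional array can be sketched as a software model. The code below is plain C rather than the Handel-C in the alumapping2 project, and the layout is an assumption: the names AdderPipe, adder_pipe_reset, adder_pipe_clock and adder_pipe_result are hypothetical, the number of inputs is fixed at a power of two to keep the sketch short, and the array in the tutorial may be arranged differently. Each row of the array is one level of registers, and one call to adder_pipe_clock stands for one clock cycle.

```c
#include <assert.h>
#include <string.h>

#define TAPS   8  /* assumed to be a power of two in this sketch */
#define LEVELS 3  /* log2(TAPS) register levels */

/* tree[level][i]: register i at a level; level 0 holds TAPS / 2 sums. */
typedef struct {
    long tree[LEVELS][TAPS / 2];
} AdderPipe;

/* Clear all registers (hypothetical helper). */
void adder_pipe_reset(AdderPipe *p)
{
    assert(p != NULL);
    memset(p, 0, sizeof *p);
}

/*
 * Model one clock cycle. Level 0 adds pairs of inputs. Every other
 * level adds pairs of registers from the level below, using the values
 * held before the clock edge.
 */
void adder_pipe_clock(AdderPipe *p, const long in[TAPS])
{
    AdderPipe next;
    int level;
    int i;

    assert(p != NULL && in != NULL);
    next = *p;
    for (i = 0; i < TAPS / 2; i++)
        next.tree[0][i] = in[2 * i] + in[2 * i + 1];
    for (level = 1; level < LEVELS; level++)
        for (i = 0; i < (TAPS >> (level + 1)); i++)
            next.tree[level][i] = p->tree[level - 1][2 * i]
                                + p->tree[level - 1][2 * i + 1];
    *p = next;
}

/* The single register at the top level holds the finished sum. */
long adder_pipe_result(const AdderPipe *p)
{
    assert(p != NULL);
    return p->tree[LEVELS - 1][0];
}
```

In this model a set of inputs reaches the result register after LEVELS clock cycles, and a new set can be accepted on every cycle. Only one adder sits between any two registers, which is why pipelining the tree shortens the longest path at the cost of extra flip-flops and latency.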
Open the alumapping2 project in the TutorialFIR workspace, accessible from Start>Programs>Celoxica>Platform Developer's Kit>Tutorials>TutorialFIR on the Start Menu. This contains the code shown above for the pipelined adder tree. Build the project for EDIF, and open Summary.html in the folder PDK/Tutorials/General/TutorialFIR/alumapping2/EDIF.
The longest path is now through the multiplier again, but as this is now an embedded ALU, it is not possible to break it down and gain any further increase in speed. The next step in the tutorial will look at an alternative approach: using retiming to increase the speed of the FIR filter.
9.8 Using Retiming
For this stage in the tutorial, we will return to the source code as used in "Reducing logic area".
The next step is to switch on the retimer. The settings for retiming are accessed through the Project->Settings menu, from which you must select the Synthesis tab, as shown below:
RETIMING SETTINGS
To enable retiming, simply check the box next to Enable Retiming. You must also have Enable Technology Mapper checked to use retiming.
RETIMER OUTPUT DURING BUILD
In this you can see that the retimer has found a path with a delay of 27.31ns, which is equivalent to the final delay in the estimation summary above for the Version4 project. The retimer has discovered a requirement for a delay of 8.333ns, and has tried to meet this, achieving 8.878ns. Although this is longer than the required delay, it is likely to be close enough for the PAR tools to achieve the requested clock rate.
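It can help to read these delays as clock rates, since the highest clock rate a path can support is the reciprocal of its delay. The helper below is a plain C sketch with a hypothetical name, delay_ns_to_mhz. It is simple arithmetic, not a replacement for the figures reported by the place and route tools.

```c
#include <assert.h>

/*
 * Convert a path delay in nanoseconds to the highest clock rate in MHz
 * that the path can support: f = 1 / T, and 1 / 1ns = 1000 MHz.
 */
double delay_ns_to_mhz(double delay_ns)
{
    assert(delay_ns > 0.0);
    return 1000.0 / delay_ns;
}
```

Applied to the figures above, the required delay of 8.333ns corresponds to a clock rate of about 120 MHz, the achieved delay of 8.878ns to about 112.6 MHz, and the original path of 27.31ns to about 36.6 MHz.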
9.9 Improving performance with retiming
The previous version of the FIR (Using Retiming) used retiming but did not change the design at all. Build the Retiming1 project in the TutorialFIR workspace, accessible from Start>Programs>Celoxica>Platform Developer's Kit>Tutorials>TutorialFIR on the Start Menu. Open the logic estimator summary - Summary.html in the folder PDK/Tutorials/General/TutorialFIR/retiming1/EDIF, and click on the Detailed path information link.