Stereo System User Manual

DSP96002

32-BIT

DIGITAL SIGNAL PROCESSOR

USER’S MANUAL

Motorola, Inc.

Semiconductor Products Sector

DSP Division

6501 William Cannon Drive, West

Austin, Texas 78735-8598

Summary of content (897 pages)

PAGE 1
DSP96002 32-BIT DIGITAL SIGNAL PROCESSOR USER’S MANUAL Motorola, Inc.
PAGE 2
SECTION 1 DSP96002 INTRODUCTION This manual describes the first member of a family of dual-port IEEE floating point programmable CMOS processors. The family concept defines a core as the Data ALU, Address Generation Unit, Program Controller and associated Instruction Set. The On-Chip Program Memory, Data Memories and Peripherals support many numerically intensive applications and minimize system size and power dissipation; however, they are not considered part of the core.
PAGE 3
1-2 DSP96002 USER’S MANUAL MOTOROLA
PAGE 4
SECTION 2 SIGNAL DESCRIPTION AND BUS OPERATION 2.1 PINOUT The functional signal groups of the DSP96002 are shown in Figure 2-2, and are described in the following sections. A pin allocation summary is shown in Figure 2-1. Specific pinout and timing information is available in the DSP96002 Technical Data Sheet (DSP96002/D). 2.1.1 Package The DSP96002 is available in a 223 pin PGA package. There are 176 signal pins (including 5 spares), 17 power pins and 30 ground pins.
PAGE 5
CPU Pins                          Pins
Reset and IRQs                       4
Clock Input                          1
OnCE Port                            4
CPU Spare                            1
Quiet Power                          4
Quiet Ground                         4
CPU Subtotal                        18

Power/Ground Planes               Pins
Package Noisy Power Plane            2
Package Noisy Ground Plane           5
Package Quiet Power Plane            1
Package Quiet Ground Plane           1
Power/Ground Plane Subtotal          9

Port A/B (columns: Each Port Pins, Both Ports Pins)
Data Bus, Address Bus, Data Power, Data Ground, Address Power, Address Ground, Addr/Data Subtotal
Port A/B Bus Control Signals, Bus Control Spare, Bus Control Power, Bus Control Ground Co
PAGE 6
[Figure: DSP96002 functional signal groups, 223 pins. Port A: address bus aA0-aA31 (32), data bus aD0-aD31 (32), bus control aS1, aS0, aR/W, aBS, aBL, aTT, aTS, aTA. Port B: address bus bA0-bA31 (32), data bus bD0-bD31 (32), bus control bS1, bS0, bR/W, bBS, bBL, bTT, bTS, bTA. Each bus group has its own Vcc (2) and Vss (4) pins. Other signals shown: aAE, aDE, bAE, bDE, aHS, aHA, aHR, bHS, bHA, bHR, aBR, aBG, aBB, aBA]
PAGE 7
ing hardware reset and becomes a level sensitive or negative edge triggered, maskable interrupt request input during normal instruction processing. MODA, MODB and MODC select one of 8 initial chip operating modes, latched into the operating mode register (OMR) when the RESET pin is deasserted. If IRQC is asserted synchronous to the input clock (CLK), multiple processors can be resynchronized using the WAIT instruction and asserting IRQC to exit the wait state. 2.1.
PAGE 8
Bus Control VCC(2) (Power) - isolated power for the bus control I/O drivers. Must be tied to all other chip power pins externally. User must provide adequate external decoupling capacitors. Bus Control VSS(4) (Ground) - isolated ground for the bus control I/O drivers. Must be tied to all other chip ground pins externally. User must provide adequate external decoupling capacitors. 2.1.
PAGE 9
and may change only when TS is deasserted. A0-A31 are three-stated during hardware reset. D0-D31 (Data Bus) - three-state, active high, bidirectional input/outputs when a bus master or not a bus master. The Data Enable (DE) input acts as an output enable control for D0-D31. As a bus master, the data lines are controlled by the CPU instruction execution or the DMA controller. D0-D31 are also the Host Interface data lines. If there is no external bus activity, D0-D31 are three-stated.
PAGE 10
an "early write" signal for DRAM interfacing. R/W is high for a read access and is low for a write access. The R/W pin is also the Host Interface read/write input. As an input, R/W may change asynchronous relative to the input clock. R/W goes high if the external bus is not used during an instruction cycle. R/W is three-stated during hardware reset. BS (Bus Strobe) - three-state, active low output when a bus master, three-stated when not a bus master.
PAGE 11
When a bus master, the combination of BS and TS can be decoded externally to determine the status of the current bus cycle and to generate hardware strobes useful for latching address and data signals. The encoding is shown in Figure 2-4.
PAGE 12
AE (Address Enable) - active low input, must be asserted and deasserted synchronous to the input clock (CLK) for proper operation. If a bus master, AE is asserted to enable the A0-A31 address output drivers. If AE is deasserted, the address output drivers are three-stated. If not a bus master, the address output drivers are three-stated regardless of whether AE is asserted or deasserted. The function of AE is to allow multiplexed bus systems to be implemented.
PAGE 13
register (IVR) onto the data bus outputs D0-D31. This provides an interrupt acknowledge capability compatible with MC68000 family processors. If the host interface is in DMA mode, HA is used as a DMA transfer acknowledge input and it is asserted by an external device to transfer data between the Host Interface registers and an external device. In DMA read mode, HA is asserted to read the Host Interface RX register on the data bus outputs D0-D31.
PAGE 14
(BSET, BCLR, BCHG) will not give up bus mastership until the end of the current instruction. BG is ignored during hardware reset. BA (Bus Acknowledge) - Open drain, active low output. When deasserting BA, the DSP96002 drives BA high during half a CLK cycle and then disables the active pull-up. In this way, only a weak external pull-up resistor is required to hold the line high.
PAGE 15
BL (Bus Lock) - active low output, never three-stated. Asserted at the start of an external indivisible Read-Modify-Write (RMW) bus cycle (providing an "early bus start" signal for DRAM interfacing) and deasserted at the end of the write bus cycle. BL remains asserted between the read and write bus cycles of the RMW bus sequence.
PAGE 16
When the Address and Memory Reference signals are stable, the data transfer is enabled by the Transfer Strobe TS signal. TS is asserted to "qualify" the Address and Memory Reference signals as stable and to perform the read or write data transfer. TS is asserted in the second phase of the bus cycle. Wait states are inserted into the bus cycle controlled by a wait state counter or by TA, whichever is longer. The wait state counter is loaded from the Bus Control Register.
PAGE 17
The disadvantage of this technique is that access time is measured from TS instead of from the address or BS. Hence faster memories are required. Figure 2-6. WE Controlled Writes Interface To Static RAM 3. 6.1.2 WE Controlled Writes This form of static interface uses the memory write enable (WE) as the write strobe. The DSP96002 R/W signal is used to form a late read/write indication by gating it with TS.
PAGE 18
The Port A/B bus control signals are designed for efficient interface to DRAM/VRAM devices in both random read/write cycles and fast access modes such as those listed above. The bus control signal timing is specified relative to the external clock (CLK) to enable synchronous control by an external state machine. An on-chip page circuit controls the TT pin, indicating to the external state machine when a slow or fast access is being made.
PAGE 19
4.11.2 The Arbitration Protocol The bus is arbitrated by a central bus arbitrator, using individual request/grant lines to each bus master. The arbitration protocol can operate in parallel with bus transfer activity so that the bus hand-over can be made without much performance penalty. The arbitration sequence occurs as follows: All candidates for bus ownership assert their respective BR signals as soon as they need the bus.
PAGE 20
An implementation of a bus arbitration scheme may hold BG asserted, for example, to the current bus owner if none of the other devices are requesting the bus. As a consequence, the current bus master may keep BA asserted after ceasing bus activity, regardless of whether BR is asserted or deasserted. This situation is called "bus parking" and allows the current bus master to use the bus repeatedly without re-arbitration until some other device requests the bus.
PAGE 21
[Figure 2-9. Bus Handshake State Diagram. States: IDLE (X), REQUEST_BUS (Y; BR=0, BA=1), ACTIVE_MASTER (Z; BR=0, BA=0), PARKING_MASTER (W). Arcs: XX, XY, XZ, XW, YY, YX (illegal), YZ, YW (illegal), ZX, ZY (delayed), ZZ (delayed), ZW, WX, WY (non-existent), WZ, WW]
Likewise, when executing the read part of a RMW access, the end_of_sequence signal is deasserted. This signal is used to give up bus ownership if BG is deasserted during bus transfers.
PAGE 22
v ( ext_acc_req & ^ D B G ) &^ BG
ZZ = ^end_of_sequence
ZW = ^ext_acc_req
WX = ^ext_acc_req & BG
WY = NON-EXISTENT ARC
WZ = ext_acc_req
WW = ^ext_acc_req & ^BG
(note 3) (note 2)
Notes:
1. Illegal arcs in DSP96002 since once the request of the bus is pending, it will not be canceled before the execution of the access.
2. Non-existent arc since if ext_acc_req arrives together with the negation of BG, the device becomes active master and begins its bus transfers.
3.
PAGE 23
5.16.3.5 Case 5 – Bus Lock during RMW If the device requesting mastership asserts BR and the arbiter asserts the requesting device's BG and BB is deasserted, then the requesting device will assert BA. If a read-modify-write (RMW) instruction which accesses external memory is being executed, and the bus arbiter deasserts BG, then BA will remain asserted until the entire RMW instruction completes execution. BA will then be deasserted thereby relinquishing the bus.
PAGE 24
SECTION 3 CHIP ARCHITECTURE 3.1 INTRODUCTION The DSP96002 architecture is a 32-bit highly-parallel multiple-bus IEEE floating-point processor. The architecture is designed to accommodate various IC family members with different memory and on-chip peripheral requirements while maintaining a standard programmable core. The overall chip architecture is presented and detailed block diagrams of the Data ALU and Address Generation Unit (AGU) core architecture are described. 3.
PAGE 25
Figure 3-1. DSP96002 Block Diagram 3.2.2 Address Buses Addresses are specified for internal X Data Memory and Y Data Memory on two unidirectional 32-bit buses, X Address Bus (XAB) and Y Address Bus (YAB). Internal address bus sizes depend on the amount of internal memory implemented.
PAGE 26
3.2.3 Data ALU The Data ALU performs all of the arithmetic and logical operations on data operands. The Data ALU consists of ten 96-bit general purpose registers, a 32-bit barrel shifter, a 32-bit adder, and a 32-bit parallel multiplier. Data ALU registers may be read or written over the XDB and YDB as 32 or 64-bit operands. The Data ALU is capable of multiplication, addition, subtraction, format conversion, shifting and logical operations in one instruction cycle.
PAGE 27
3.2.6 Y Data Memory The Y Data Memory may contain both data RAM and ROM. The Y Data RAM is a 32-bit wide internal memory and occupies the lowest 512 locations in Y Memory Space. The Y Data ROM is also a 32-bit wide internal memory and occupies 1024 locations in Y Memory Space. Addresses are received from the YAB and data transfers occur on the YDB. The Y memory is dual-access memory in the sense that it may be accessed twice during a cycle: once by the core and once by the DMA.
PAGE 28
A program loop begins execution after the DO instruction and continues until the program address fetched equals the loop address register contents (last address of program loop). The contents of the loop counter are then tested for one. If the loop counter is not one, the loop counter is decremented and the top location in the stack RAM is read (but not pulled) into the PC to return to the start of the loop.
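The end-of-loop test described above can be sketched in Python. This is only an illustrative model of the described sequence, not a description of the hardware: the function name `run_do_loop`, the address numbering (loop body at addresses 0 to `body_length - 1`) and the rejection of loop counts below one are all assumptions made for the example.

```python
def run_do_loop(loop_count, body_length):
    """Trace the addresses fetched by a DO loop (hypothetical model)."""
    if loop_count < 1 or body_length < 1:
        raise ValueError("this sketch assumes positive loop count and body length")
    lc = loop_count           # loop counter (LC)
    start = 0                 # address after the DO instruction (held on top of the stack)
    la = body_length - 1      # loop address register (LA): last address of the loop
    pc = start
    fetched = []
    while True:
        fetched.append(pc)
        if pc == la:          # fetched address equals the LA contents
            if lc == 1:       # LC is tested for one: loop is complete
                break
            lc -= 1           # otherwise LC is decremented ...
            pc = start        # ... and the top of stack is read (not pulled) into the PC
        else:
            pc += 1
    return fetched
```

For example, `run_do_loop(3, 2)` fetches the two-instruction body three times.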
PAGE 29
cessors, another DSP96002 or DMA hardware. The HI appears as a memory mapped peripheral occupying 16 words in the host processor address space. Separate transmit and receive data registers are doublebuffered to allow the DSP96002 and host processor to efficiently transfer data at high speed. Host processor communication with the HI is accomplished using standard Host processor data move instructions and addressing modes. Handshake flags are provided for polled or interrupt-driven data transfers. 3.2.11.
PAGE 30
Figure 3-2. Data ALU Block Diagram Data ALU Register File (D0-D9) The registers may also be treated as thirty 32-bit registers Dn.H, Dn.M, Dn.L, n=0,1,..,9. Each register may be read or written over the XDB or YDB as a word operand. When an individual 32-bit register is written over the XDB or YDB, no format conversion takes place and only the designated register is affected. The low portion of the registers, Dn.L, is used as source and/or destination for most integer operations.
PAGE 31
For the floating-point multiplication the Multiplier accepts two 44-bit input operands, and outputs one 44-bit result. The operation of the floating-point Multiplier occurs independently and in parallel with the operation of the floating-point Adder and with the XDB and YDB activity. For the fixed-point multiplication the Multiplier accepts two 32-bit input operands, and outputs one 64-bit result. The operation of the fixed point Multiplier occurs independently and in parallel with the XDB and YDB activity.
PAGE 32
All operations inside the Adder occur in one instruction cycle. Latches are provided on the Adder input operand buses to avoid race conditions. The major components of the Adder are • Add Unit • Subtract Unit • Barrel Shifter and Normalization Unit • Exponent Comparator and Update Unit • Special Function Unit 3.3.2.1 Add Unit The Add Unit is a high speed 32-bit asynchronous adder used in all floating-point non-multiply operations delivering a 32-bit result.
PAGE 33
Linkages are provided to shift in/out the condition code carry (C) bit. 3.3.2.4 Exponent Comparator and Update Unit EXC is an 11-bit subtracter which compares the exponents of the two operands of the add/subtract operations. It receives its inputs on the AEIA and AEIB buses from the high portion of the registers and delivers as result the largest exponent and the difference between the exponents.
PAGE 34
3.4 AGU The major components of the AGU are • Address Register Files • Offset Register Files • Modifier Register Files • Temporary Address Registers • Modulo Arithmetic Units • Address Output Multiplexers A block diagram of the AGU is shown in Figure 3-3. 3.4.1 Address Register Files Each of two Address Register Files consists of four 32-bit registers. The two files contain the address registers R0-R3 and R4-R7 respectively, which usually contain addresses used as pointers to memory.
PAGE 35
Figure 3-3. AGU Block Diagram ister during address register update calculations but they can hold data. Each modifier register may be read or written by the Global Data Bus. Each modifier register is automatically read when the same number address register is read and used as input to its associated modulo arithmetic unit. The registers accessed by the Global Data Bus and the Modulo Arithmetic Unit are not required to be the same. A separate write enable is provided for each register.
PAGE 36
one instruction cycle. In the following cycle, the contents of TempR are used to address X or Y memory. For all absolute addressing modes, the address of the operand is written into TempR and then used to address X, Y, or P memory. The temporary address registers TempN Low and TempN High are 32-bit registers which provide temporary storage for the PC loaded from the Program Address Bus, for use in the PC relative addressing mode.
PAGE 37
address output multiplexers are shared by the DMA and the AGU. The output multiplexers are time multiplexed – the first half instruction cycle is assigned to DMA transfers while the second half cycle is assigned to core transfers.
PAGE 38
Figure 3-4.
PAGE 40
SECTION 4 SOFTWARE ARCHITECTURE 4.1 PROGRAMMING MODEL The programmer can view the DSP96002 architecture as three execution units operating in parallel. The three execution units are the • Data ALU • Address Generation Unit • Program Controller The DSP96002 instruction set has been designed to allow flexible control of these parallel processing resources. Many instructions allow the programmer to keep each unit busy, thus enhancing program execution speed.
PAGE 41
[Figure 4-2. DSP96002 Programming Model – Data ALU and Address Generation Unit. Data ALU: ten 96-bit registers D0-D9 (bits 95-0), each made up of three 32-bit parts Dn.H, Dn.M and Dn.L. Address Generation Unit: 32-bit registers R0-R7, N0-N7 and M0-M7] 4.
PAGE 42
floating point number a format conversion to/from the internal representation takes place. The format conversion is performed automatically and is transparent to the user. The registers serve as input pipeline registers between the XDB and YDB and the multiplier and/or adder. They are used as Data ALU source and/or destination operands allowing also new operands to be loaded for the next instruction while the register contents are used by the current instruction.
PAGE 43
ister will be accessed for an address register update calculation involving an address register of the same number (i.e., M0 is accessed when R0 is to be updated, M1 for R1, etc.). Each modifier register is set to $FFFFFFFF on processor reset which specifies the default value for linear arithmetic register update calculations. 4.6 PROGRAM COUNTER (PC) This 32-bit register contains the address of the next location to be fetched from Program Memory Space.
PAGE 44
Bit    Name     Function

MR (bits 31-24)
31     LF       Loop Flag
30     *        Reserved
29-28  I1, I0   Interrupt Mask
27     FZ       Flush to Zero
26     MP       Multiply
25-24  *        Reserved

IER (bits 23-16)
23     *        Reserved
22-21  R1, R0   Rounding Mode
20     SIOP     IEEE Invalid Operation
19     SOVF     IEEE Overflow
18     SUNF     IEEE Underflow
17     SDZ      IEEE Divide-by-Zero
16     SINX     IEEE Inexact

ER (bits 15-8)
15     UNCC     Unordered Condition
14     NAN      Not-A-Number
13     SNAN     Signaling NaN
12     OPERR    Operand error
11     OVF      Overflow
10     UNF      Underflow
9      DZ       Divide-by-Zero
8      INX      Inexact

7 A, 6 R, 5 LR, 4 I, 3 N, 2 Z, 1 V
PAGE 45
4.7.2 CCR Overflow (V) Bit 1 The integer overflow bit is set if an arithmetic overflow occurred in a fixed point operation. This means that the result is not representable in the destination size. The V bit is not affected by floating point operations unless they have a fixed point result. The overflow bit is also modified by Address Generation Unit operation when executing MOVETA instructions. The V bit is cleared during processor reset. 4.7.
PAGE 46
4.7.10 ER Divide-by-Zero (DZ) Bit 9 The DZ flag in the DSP96002 can be set by software as part of an FDIV routine. No single DSP96002 instruction can set the DZ flag. The DZ bit is cleared during processor reset and during all floating-point instructions. 4.7.11 ER Underflow (UNF) Bit 10 The underflow bit is set if a result of a floating-point operation is too small to be represented in a floating-point data register (i.e., strictly between ±2^Emin). The test is done on the exponent before rounding.
PAGE 47
4.7.16 ER Unordered Condition (UNCC) Bit 15 The unordered condition bit is set if a non-aware floating-point conditional instruction (FBcc, FJcc, FIFcc, etc) is executed when the NaN bit is set (the unordered condition). The result of the condition tested by an instruction depends on being able to represent the operand on the real number line. By definition, if the operand is a NaN, it cannot be ordered or represented on the real number line and therefore the UNCC bit will be set.
PAGE 48
The Data ALU performs rounding of the result to the precision specified by the instruction. The DSP96002 supports only single extended and single precision results. The DSP96002 implements all four rounding modes specified by the IEEE standard. These modes are round to nearest (RN), round toward zero (RZ), round toward plus infinity (RP) and round toward minus infinity (RM). The rounding definitions are listed below.
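As a rough illustration of how the four rounding modes differ, the sketch below rounds a decimal value to an integer with Python's standard `decimal` module. Rounding to an integer rather than to a floating-point mantissa is a simplification chosen for readability, and the function name `round_to_integer` and the `MODES` table are hypothetical; round to nearest is taken to mean ties-to-even, as in the IEEE standard.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_DOWN, ROUND_CEILING, ROUND_FLOOR

# Mapping from the manual's mode names to decimal rounding constants (assumed equivalence).
MODES = {
    "RN": ROUND_HALF_EVEN,  # round to nearest, ties to even
    "RZ": ROUND_DOWN,       # round toward zero
    "RP": ROUND_CEILING,    # round toward plus infinity
    "RM": ROUND_FLOOR,      # round toward minus infinity
}

def round_to_integer(value, mode):
    """Round a decimal string to an integer using one of the four modes."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=MODES[mode]))
```

The modes only disagree when the exact result is not representable, for example for `"2.5"` and `"-2.5"`.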
PAGE 49
ization, respectively. If FZ is set, floating-point underflows are flushed to zero. Any denormalized source operand is considered as zero (with the sign of the denormalized source operand) and any underflowed results are flushed to zero (with the sign of the original underflowed result). Cleared during processor reset. FZ 0 1 4.7.
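The treatment of denormalized source operands when FZ is set can be sketched as follows. This is a minimal model under stated assumptions: it looks only at the IEEE single precision encoding of the operand, and the function name `flush_to_zero_single` is hypothetical. It does not model the flushing of underflowed results.

```python
import math
import struct

def flush_to_zero_single(x):
    """Return x, or a zero with the sign of x if x encodes as a single precision denormal."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    exponent = (bits >> 23) & 0xFF   # biased exponent field
    fraction = bits & 0x7FFFFF       # fraction field
    if exponent == 0 and fraction != 0:
        # Denormalized operand: treated as zero, keeping the operand's sign.
        return math.copysign(0.0, x)
    return x
```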
PAGE 50
are checked. If it is not one, the LC is decremented, and the next instruction is taken from the address at the top of the system stack; otherwise the PC is incremented, the loop flag is restored (pulled from stack), the stack is purged, the LA and LC registers are pulled from the stack and restored and instruction execution continues normally. The LA register is a 32-bit read/write register written into by a DO instruction and is read by the system stack for stacking the register. 4.
PAGE 51
UF SE P3 P2 P1 P0   Description
 1  1  1  1  1  0   Stack Underflow condition after double pull.
 1  1  1  1  1  1   Stack Underflow condition.
 0  0  0  0  0  0   Stack Empty (reset). Pull causes underflow.
 0  0  0  0  0  1   Stack location 1. Double pull causes underflow.
 0  0  0  0  1  0   Stack location 2.
 .  .  .  .  .  .
 0  0  1  1  0  1   Stack location 13.
 0  0  1  1  1  0   Stack location 14. Double push causes overflow.
 0  0  1  1  1  1   Stack location 15. (Stack full). Push causes overflow.
 0  1  0  0  0  0
 0  1  0  0  0  1
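The encodings in the table are consistent with the six bits UF, SE, P3-P0 behaving like a 6-bit counter that is incremented on a push and decremented on a pull. The sketch below models only that reading; it is an inference from the table, not a statement from the manual. The helper names are hypothetical, the bit position of SE (bit 4) is assumed from the column order, and the behavior of the flags once an error has already occurred is not modeled.

```python
def decode_sp(sp):
    """Split a 6-bit stack pointer value into its UF, SE and P3-P0 fields."""
    sp &= 0x3F
    return {"UF": (sp >> 5) & 1, "SE": (sp >> 4) & 1, "P": sp & 0xF}

def push(sp):
    """Model a push as a 6-bit increment (assumed behavior)."""
    return (sp + 1) & 0x3F

def pull(sp):
    """Model a pull as a 6-bit decrement (assumed behavior)."""
    return (sp - 1) & 0x3F
```

With this model, a pull from the empty stack gives the underflow pattern and a push from location 15 sets SE.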
PAGE 52
4.11.3 Underflow flag (UF) Bit 5 The Underflow flag (UF) is set when a stack underflow occurs. The UF flag is cleared when a stack overflow occurs. While the SE flag remains set, the UF flag does not change with Stack Pointer operations caused by instructions that refer implicitly to the Stack Pointer such as RTI, RTS, DO, ENDDO, JSR, etc. The UF flag is cleared by hardware reset (see Figure 4-5). Implicit stack pointer operations that do not produce a stack error (i.e.
PAGE 54
SECTION 5 DATA ORGANIZATION AND ADDRESSING MODES 5.1 OPERAND SIZES Operand sizes are defined as follows: a byte is 8 bits long, a short word is 16 bits long, a word is 32 bits long and a long word is 64 bits long. For floating-point operations the operand sizes are defined as follows: a single real is 32 bits long, a double real is 64 bits long and a register operand is 96 bits long.
PAGE 55
SIGNED WORD INTEGER
Bit:      31     30    ...   1     0
Weight:  -2^31   2^30  ...   2^1   2^0

SIGNED LONG WORD INTEGER
Bit:      63     62    ...   1     0
Weight:  -2^63   2^62  ...   2^1   2^0

Figure 5-1. Bit Weighting and Alignment of Signed Integer Operands

UNSIGNED WORD INTEGER
Bit:      31     30    ...   1     0
Weight:   2^31   2^30  ...   2^1   2^0

UNSIGNED LONG WORD INTEGER
Bit:      63     62    ...   1     0
Weight:   2^63   2^62  ...   2^1   2^0

Figure 5-2.
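The bit weights in Figures 5-1 and 5-2 can be checked with a short Python sketch that sums the weight of every set bit. The function names are hypothetical; the only difference between the signed and unsigned interpretation is the sign of the most significant bit's weight.

```python
def signed_value(bits, width=32):
    """Sum the bit weights, giving the top bit a negative weight."""
    value = 0
    for i in range(width):
        if (bits >> i) & 1:
            value += -(1 << i) if i == width - 1 else (1 << i)
    return value

def unsigned_value(bits, width=32):
    """Sum the bit weights, all of them positive."""
    return sum(1 << i for i in range(width) if (bits >> i) & 1)
```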
PAGE 56
is not aware that the data is represented in a floating point format. The range of the unbiased exponent, E, is every integer between Emin and Emax, inclusive (-Emin
PAGE 57
Denormalized Numbers:
Represents real numbers in the form (-1)^s x 2^(Emin-1+127) x 0.f
Bias of e: +127 ($7F)
e: 0 ($00)
f: Non-Zero
Mantissa: 0.f
Signed Zeros:
Represents real zeroes in the form (-1)^s x 2^(Emin-1+127) x 0.0
Bias of e: +127 ($7F)
e: 0 ($00)
f: Zero
Mantissa: 0.f = 0.00...
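The single precision classes can be told apart from the biased exponent e and fraction f alone. The sketch below uses the standard IEEE 754 single precision layout (1 sign bit, 8 exponent bits, 23 fraction bits) and the standard IEEE value of a denormalized number; the function names are hypothetical and the infinity and NaN cases follow the IEEE standard rather than this excerpt.

```python
def classify_single(bits):
    """Classify a 32-bit IEEE single precision pattern by its e and f fields."""
    e = (bits >> 23) & 0xFF
    f = bits & 0x7FFFFF
    if e == 0:
        return "zero" if f == 0 else "denormalized"
    if e == 0xFF:
        return "infinity" if f == 0 else "NaN"
    return "normalized"

def denormalized_value(bits):
    """Value of a denormalized pattern: (-1)^s x 2^-126 x 0.f (IEEE 754)."""
    s = (bits >> 31) & 1
    f = bits & 0x7FFFFF
    return (-1.0) ** s * (f / 2.0 ** 23) * 2.0 ** -126
```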
PAGE 58
Normalized Numbers:
Represents real numbers in the form (-1)^s x 2^(E+1023) x 1.f
E: unbiased exponent, -1022 < E < +1023
Bias of e: +1023 ($3FF)
e = E + bias: 0 < e < 2046 ($7FE)
f: Zero or Non-Zero
Mantissa: 1.f
Denormalized Numbers:
Represents real numbers in the form (-1)^s x 2^(Emin-1+1023) x 0.f
Emin: -1022
Bias of e: +1023 ($3FF)
e: 0 ($000)
f ..............
PAGE 59
Sets of 3 Data ALU registers may be concatenated to form ten 96 bit registers which may be accessed as single real or double real operands. Floating-point operands are always represented in an internal double precision format, described below. 5.3.1.1 Internal floating-point Data Format All DSP96002 internal floating-point operations are performed using single extended precision. All operands are converted to the internal double precision format when written into a Data ALU register.
PAGE 60
Bit:    95   94   93   92-75   74-64             63   62-11      10-0
Field:  S    U    V    Zero    Biased Exponent   I    Fraction   Zero

e = Biased Exponent: 11 bits
u = U tag: 1 bit
v = V tag: 1 bit
i = Integer Part: 1 bit
f = Fraction: 52 bits
z = Unused bits: 29 bits

Interpretation of Unused Bits:
Input: Don’t Care
Output: All Zeros
Unused bits should be written with zero for future compatibility.
PAGE 61
Mantissa: i.f = 1.00...00
NaNs (Not-a-Number):
s: Don’t care
Bias of e: n.a.
e: 2047 ($7FF)
i: 1
f: Non-Zero
Mantissa i.f:
  1.11...11  Legal QNaN
  1.1x...xx  QNaN
  1.0x...xx  SNaN
5.3.2 Address Generation Unit (AGU) Registers The notation Rn will be used to designate one of the 8 address registers R0-R7.
PAGE 62
• It has the same pattern for all precisions. • All bits of the fraction are set to one. • The biased exponent is set to all ones. • The sign bit is cleared. • In the internal floating-point format, the I bit is always set to one; note that if the I bit is set to zero, the pattern is not recognized as a legal pattern by the Data ALU hardware, and operations on these bit patterns may yield unexpected results.
PAGE 63
Single Precision → Double Precision

Memory Format bit    Internal Format bit
31              →    95      S
                     94      U - SET IF DENORMALIZED, CLEARED OTHERWISE
                     93      V - CLEARED
                     92-75   CLEARED
30              →    74
                     73      SET IF NAN OR INFINITY, CLEARED IF ZERO, INV(BIT 30) OTHERWISE
                     72      SET IF NAN OR INFINITY, CLEARED IF ZERO, INV(BIT 30) OTHERWISE
                     71      SET IF NAN OR INFINITY, CLEARED IF ZERO, INV(BIT 30) OTHERWISE
29-23           →    70-64
                     63      I - CLEARED IF DENORM. OR ZERO, SET OTHERWISE
22-0            →    62-40
                     39 . .  CLEARED
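The bit mapping for the single precision case can be written out as a short Python sketch that builds the 96-bit internal pattern as an integer. The function name `single_to_internal` is hypothetical, and the sketch assumes that every internal bit not listed in the mapping (including those below bit 40) is cleared.

```python
def single_to_internal(bits):
    """Map a 32-bit single precision pattern to the 96-bit internal format."""
    s = (bits >> 31) & 1
    e = (bits >> 23) & 0xFF
    f = bits & 0x7FFFFF
    is_zero = e == 0 and f == 0
    is_denorm = e == 0 and f != 0
    is_nan_or_inf = e == 0xFF
    b30 = (e >> 7) & 1                      # memory bit 30 goes to internal bit 74
    if is_nan_or_inf:
        fill = 1                            # bits 73-71 set for NaN or infinity
    elif is_zero:
        fill = 0                            # bits 73-71 cleared for zero
    else:
        fill = b30 ^ 1                      # otherwise the inverse of bit 30
    exp11 = (b30 << 10) | (fill << 9) | (fill << 8) | (fill << 7) | (e & 0x7F)
    u = 1 if is_denorm else 0               # U tag
    i = 0 if (is_denorm or is_zero) else 1  # integer part
    return (s << 95) | (u << 94) | (exp11 << 64) | (i << 63) | (f << 40)
```

For normalized numbers this mapping rebiases the exponent from 127 to 1023: the pattern for 1.0 (`0x3F800000`) yields an 11-bit exponent field of 1023 with the I bit set.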
PAGE 64
register is also the destination of the current operation). The DSP96002 does not support double precision. It does support single extended precision. 5.5.2 Conversion to the Memory Formats Conversions from the internal double precision format to either of the two memory floating-point formats are performed whenever a data register is to be stored in memory or any other location external to the Data ALU.
PAGE 65
Double Precision → Single Precision

Internal Format bit    Memory Format bit
95                →    31
94-75                  -
74                →    30
73-71                  -
70-64             →    29-23
63                     -
62-40             →    22-0
39-0                   -

Double Precision → Double Precision

Internal Format bit    Memory Format bit
95                →    63
94-75                  -
74-64             →    62-52
63                     -
62-11             →    51-0
10-0                   -

Figure 5-6. Conversion from Internal Format to Memory Formats

5.6.3 R Register References Register references (called R references) are references to the Data ALU, Address Generation Unit and Program Controller registers.
PAGE 66
5.6.4 Memory References Memory references are references to the 32-bit wide X or Y memory spaces and may be internal or external memory references depending on the effective address of the operand in the data bus movement field of the instruction. Data may be read or written from any address in either memory space. 5.6.4.1 X Memory References The operand is in X memory space and is a word reference. Data may be read from memory to a register or from a register to memory. 5.6.4.
PAGE 67
5.7 ADDRESSING MODES The DSP96002 instruction set contains a full set of operand addressing modes. All address calculations are performed in the Address Generation Unit to minimize execution time and loop overhead. Addressing modes specify whether the operand(s) is in a register or memory and provide the specific address of the operand(s). An effective address in an instruction will specify an addressing mode, and for some addressing modes the effective address will further specify an address register.
PAGE 68
5.7.2 Address Register Indirect Modes The effective address in the instruction specifies the address register Rn and the address calculation to be performed. These addressing modes specify that the operand(s) is in memory and provide the specific address of the operand(s). When an address register is used to point to a memory location, the addressing mode is called address register indirect.
PAGE 69
changed. The type of arithmetic used to increment Rn is determined by Mn. This reference is classified as a memory reference. 5.7.2.7 Predecrement by 1 -(Rn) The address of the operand is the contents of the address register Rn decremented by 1. Before the operand address is used, it is decremented (subtracted) by 1 and stored in the same address register. The type of arithmetic used to decrement Rn is determined by Mn. The Nn register is ignored. This reference is classified as a memory reference. 5.7.2.
PAGE 70
5.7.4.1 Immediate Data This addressing mode requires one word of instruction extension. The immediate data is a word operand in the extension word of the instruction. This reference is classified as a program reference. 5.7.4.2 Immediate Short Data The 8-, 16-, or 19-bit operand is in the instruction operation word. The 8-bit operand is used for ANDI and ORI instructions and it is zero extended.
PAGE 71
(pointers) rather than moving large blocks of data. The contents of the address modifier register Mn define the type of address arithmetic to be performed for addressing mode calculations, and for the case of modulo arithmetic, the contents of Mn also specify the modulus. All address register indirect modes may be used with any address modifier type. Each address register Rn has its own modifier register Mn associated with it. 5.8.
PAGE 72
Addressing Mode Modifier MMM Operand Reference P S C D A X Y L XY Register Direct Data or Control Register Address Register Address Modifier Register Address Offset Register No No No No Address Register Indirect No Update Postincrement by 1 Postdecrement by 1 Postincrement by Offset Nn Postdecrement by Offset Nn Indexed by Offset Nn Predecrement by 1 Long Displacement No Yes Yes Yes Yes Yes Yes Yes x x x x x x x PC Relative Long Displacement Short Displacement Address Register No No No x x x Spec
PAGE 73
On the DSP96002, the upper and lower boundaries are not explicitly needed. If the address register pointer increments past the upper boundary of the buffer (base address plus M-1) it will wrap around to the base address. If the address decrements past the lower boundary (base address) it will wrap around to the base address plus M-1. If an offset Nn is used in the address calculations, the 32-bit value |Nn| must be less than or equal to M for proper modulo addressing.
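The wrap-around behavior of a modulo (circular) buffer can be sketched in Python as follows. This is an illustration only: the function name `modulo_update` is hypothetical, and the base address is passed in explicitly here for clarity, whereas the text notes that on the DSP96002 the boundaries do not have to be given explicitly.

```python
def modulo_update(address, offset, base, modulus):
    """Update an address inside a circular buffer of `modulus` locations starting at `base`."""
    if abs(offset) > modulus:
        raise ValueError("|offset| must be less than or equal to the modulus")
    if not base <= address < base + modulus:
        raise ValueError("address must lie inside the buffer")
    # Python's % always returns a non-negative result, so this wraps in both directions.
    return base + (address - base + offset) % modulus
```

Incrementing from the last location returns to the base address, and decrementing from the base address lands on base plus M-1.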
PAGE 74
5.8.5 Address Modifier Type Encoding Summary Figure 5-8 contains a summary of the address modifier types discussed in the previous paragraphs.
PAGE 75
Modifier MMMMMM M M Address Calculation Arithmetic 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 2 Reverse Carry Modulo 2 Modulo 3 . . . . . . . 0 0 0 0 0 0 1 2 F F x x F F x x F F x x F F x x F F x x E F x x D E F F F F F F F x x 0 0 0 0 3 7 F x x 0 0 0 0 F F F . . Modulo 16,777,215 Modulo 16,777,216 reserved reserved . . . . . . . F F F F F F F F F x x 0 0 0 0 F F F x x 0 0 0 0 F F F x x 0 0 0 0 F F F x x 0 1 3 7 F F F (Bit Reversed Update) ((2**24)-1) (2**24) . .
PAGE 76
Figure 5-8.
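As an illustration of the reverse-carry (bit-reversed update) entry in the summary, the following Python sketch models reverse-carry addition as "bit-reverse both operands, add, bit-reverse the result". The function names and the `width` parameter are assumptions made for the example; a small width is used so the sequence is easy to follow.

```python
def bit_reverse(value: int, width: int) -> int:
    """Reverse the low `width` bits of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def reverse_carry_add(r: int, n: int, width: int = 32) -> int:
    """Add n to r with the carry propagating from the MSB toward the LSB."""
    mask = (1 << width) - 1
    total = (bit_reverse(r & mask, width) + bit_reverse(n & mask, width)) & mask
    return bit_reverse(total, width)
```

Repeatedly adding an offset of half the buffer length visits the buffer in bit-reversed order, which is the access pattern typically wanted for FFT data.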
PAGE 77
SECTION 6 INSTRUCTION SET AND EXECUTION 6.1 INTRODUCTION This chapter introduces the DSP96002 instruction set and instruction format. The complete range of instruction capabilities combined with the flexible addressing modes described in Chapter 5 provide a very powerful assembly language for digital signal processing and graphics algorithms. The instruction set has been designed to allow efficient coding for high-level language compilers and yet be easily programmed in assembly language.
PAGE 78
tion cycle in the IEEE mode if denormalized numbers are not detected, otherwise additional instruction cycles will be required. See Figure 6-1 for a list of the thirty eight floating point arithmetic instructions. FABS.S FABS.X FADD.S FADD.X FADDSUB.S FADDSUB.X FCLR FCMP FCMPG FCMPM FCOPYS.S FCOPYS.X FGETMAN FINT FLOAT.S FLOAT.X FLOATU.S FLOATU.X FLOOR FMPY FADD.S FMPY FADD.X FMPY FADDSUB.S FMPY FADDSUB.X FMPY FSUB.S FMPY FSUB.X FMPY.S FMPY.X FNEG.S FNEG.X FSCALE.S FSCALE.X FSEEDD FSEEDR FSUB.S FSUB.
PAGE 79
6.2.2 Fixed-Point Arithmetic Instructions The fixed-point arithmetic instructions perform all operations within the Data ALU. Arithmetic instructions are register-based (register direct addressing modes used for operands) so that the Data ALU operation indicated by the instruction does not use the X Data Bus, the Y Data Bus, or the Global Data Bus. This allows for parallel data movement over these buses during most Data ALU operations.
PAGE 80
6.2.3 Logical Instructions The logical instructions perform all of the logical operations, except ANDI and ORI, within the Data ALU. Logical instructions are register-based like the arithmetic instructions discussed previously. Optional data transfers may be specified in parallel with most logical instructions – over the X and Y data buses or over the Global Data Bus. This allows new data to be pre-fetched for use in following instructions and results calculated in previous instructions to be stored.
PAGE 81
6.2.5 Loop Instructions The loop instructions control hardware looping by initiating a program loop and setting up looping parameters, or by "cleaning up" the system stack when terminating a loop. Initialization includes saving registers used by a program loop (LA and LC) on the system stack so that program loops can be nested. The address of the first instruction in a program loop is also saved to allow no-overhead looping. See Figure 6-5 for a list of the four loop instructions.
PAGE 82
Bcc BRA BRCLR BRSET Branch Conditionally Branch Always Branch if Bit Clear Branch if Bit Set BScc BSCLR BSR BSSET DEBUG FBcc FBScc FFcc FFcc.U FJcc FJScc FTRAPcc IFcc IFcc.
PAGE 83
In an instruction word, one or more "effective addresses" may be specified. An effective address defines the way in which an operand location is derived. The effective address will include an addressing mode and may also include a selected register. The addressing mode selects the address update to be used (see Section 5.7). The register specified may be the location of an operand or it may be an address register used to calculate the address of an operand.
PAGE 84
field specifies the operands to be used by the adder/subtracter opcode. One of the Opcode fields must always be included in the source code. The X Bus Data field specifies an optional data transfer over the X Bus and the addressing mode to be used. The Y Bus Data field specifies an optional data transfer over the Y Bus and the addressing mode to be used. The address space qualifiers X:, Y: and L: indicate which address space is being referenced.
PAGE 85
6.4.2 Memory Access Processing One or more of the DSP96002 memory sources (X data memory, Y data memory and program memory) may be accessed during the execution of an instruction. Each of these memory sources may be internal or external to the DSP96002. Three address buses (XAB, YAB and PAB) and four data buses (XDB, YDB, PDB and GDB) are available for internal memory core (as opposed to DMA) accesses during one instruction cycle.
PAGE 86
PAGE 87
SECTION 7 EXPANSION PORTS AND I/O PERIPHERALS 7.1 INTRODUCTION The upper 128 locations of the X and Y Data memories are defined as the I/O space. The Y memory I/O space is wholly external, while the X memory I/O space is internal. The X memory I/O space is used to address the I/O Interface registers as well as the bus, port select and interrupt control registers. Both I/O spaces may be accessed by regular X and Y memory MOVE instructions.
PAGE 88
31 RH 16 LH BS 15 XE YE PE SF1 SF0 MF NS 12 11 External X Memory Wait Control ** ** 87 External Y Memory Wait Control P3 P2 P1 43 External Prog Memory Wait Control 0 External I/O Memory Wait Control 31 RH Port A Bus Control Register (BCRA) X:$FFFFFFFE P0 16 LH BS 15 XE YE PE SF1 SF0 MF NS 12 11 External X Memory Wait Control ** ** 87 External Y Memory Wait Control P3 P2 P1 43 External Prog Memory Wait Control Port B Bus Control Register (BCRB) X:$FFFFFFFD P0 0 Ext
PAGE 89
7.2.1.3 BCRx Reserved bits (Bits 20, 21) These reserved bits read as zero and should be written with zero for future compatibility. 7.2.1.4 BCRx Non-Sequential Fault Enable (NS) Bit 22 Non-sequential fault detection is enabled if the NS control bit is set. Non-sequential faults are ignored by the page circuit if the NS control bit is cleared. See Section 7.2.2 on Page Circuit Operation. Cleared by hardware reset. 7.2.1.
PAGE 90
7.2.1.9 BCRx X Data Memory Fault Enable (XE) Bit 28 If the X Data Memory Fault Enable bit XE is set, the page fault circuit will monitor X Data memory bus cycles. If XE is set and a fault is detected during an X Data memory bus cycle, TT will be deasserted. If XE is set and no fault is detected during an X Data memory bus cycle, TT will be asserted. If XE is cleared, the page fault circuit will be inactive for X Data memory bus cycles and TT will remain deasserted.
PAGE 91
external memory may use a fast access mode (page, static column, nibble or serial shift) during the current bus cycle. The page circuit must be programmed with the characteristics of the external memory which allow fast access modes. When the external memory cannot use a fast access mode in the current bus cycle, TT remains deasserted.
PAGE 92
Non-Sequential Fault TT is deasserted if the current address A is not the increment (+1) of the latched address A’. The non-sequential fault is enabled if the NS control bit is set, otherwise disabled. Nibble mode accesses on the random port or serial accesses on the serial port can cause non-sequential faults. Page and static column mode RAMs cannot have non-sequential faults and NS should be cleared. The page circuit checks for non-sequential faults for addresses that are inside the defined page.
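A minimal Python sketch of the non-sequential fault test described above, under simplifying assumptions: it models only the "+1 of the latched address" comparison and the NS enable bit, and ignores the page comparison and the other fault types. The function name is hypothetical.

```python
ADDRESS_MASK = 0xFFFFFFFF  # 32-bit addresses


def nonsequential_fault(latched: int, current: int, ns_enabled: bool) -> bool:
    """Return True when the access would count as a non-sequential fault."""
    if not ns_enabled:
        # With NS cleared, non-sequential faults are ignored.
        return False
    return current != ((latched + 1) & ADDRESS_MASK)
```

With NS set, an access to A' + 1 does not fault, while any other address does; with NS cleared the check is never reported.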
PAGE 93
SF1 SF0 Memory Spaces Mapped To Same Physical Address Memory Space Changes Detected as Faults 0 0 1 1 0 1 0 1 PXY share PY share XY share none, all none P→X,X→P,X→Y,Y→X P→X,X→P,P→Y,Y→P P→X,X→P,X→Y,Y→X,P→Y,Y→P same addresses same addresses same addresses addresses unique Figure 7-4a. Memory Space Change Detection — A DATA D SF1 CE Address A Data D PROGRAM Figure 7-4b.
PAGE 94
PE XE YE   TT Pin Activity for                   Current Bus Cycle Latched for
           P Space     X Space     Y Space       P Space  X Space  Y Space
0  0  0    Deasserted  Deasserted  Deasserted    No       No       No
0  0  1    Deasserted  Deasserted  Active        No       No       Yes
0  1  0    Deasserted  Active      Deasserted    No       Yes      No
0  1  1    Deasserted  Active      Active        No       Yes      Yes
1  0  0    Active      Deasserted  Deasserted    Yes      No       No
1  0  1    Active      Deasserted  Active        Yes      No       Yes
1  1  0    Active      Active      Deasserted    Yes      Yes      No
1  1  1    Active      Active      Active        Y
PAGE 95
machine is responsible for ensuring that RAS or CAS timeouts do not occur. Since typical RAS and CAS timeouts are 10-100 µsec, one of the simplest solutions is to perform a hardware refresh which deasserts both RAS and CAS. If refresh is performed often enough, RAS and CAS timeouts will never happen. The serial port of VRAM devices is clocked by a serial clock SC.
PAGE 96
ment that is defined as internal remains internal. The Port Select Register format is shown in Figure 7-6 and is described below. 31 * 24 23 16 15 8 7 0 X X X X X X X X Y Y Y Y Y Y Y Y P P P P P P P P 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 PSR Port Select Register X:$FFFFFFFC * - reserved, read as zeros, should be written with zeros for future compatibility.
PAGE 97
7.3.1.3 PSR X Data Memory Port Select (X0-X7) Bits 16-23 The X Data Memory Port Select control bits (X0-X7) determine the assignment of the 8 X Data Memory segments to Port A or B. If the segment bit is cleared, the X Data Memory segment is assigned to Port A. If the segment bit is set, the memory segment is assigned to Port B. The memory segment to control bit correlation is shown in Figure 7-6.
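The port assignment rule can be sketched as a small Python lookup. The X field position (bits 16-23) and the polarity (cleared = Port A, set = Port B) come from the text; the Y field at bits 8-15 and the P field at bits 0-7 are read off the layout in Figure 7-6 and are assumed here to use the same polarity. The function and table names are hypothetical.

```python
# Bit position of segment 0 for each memory space in the PSR.
# X is stated in the text; Y and P are assumptions taken from Figure 7-6.
FIELD_SHIFT = {"P": 0, "Y": 8, "X": 16}


def port_for_segment(psr: int, space: str, segment: int) -> str:
    """Return "A" or "B" for one of the 8 segments of a memory space."""
    if not 0 <= segment <= 7:
        raise ValueError("segment must be in the range 0-7")
    bit = (psr >> (FIELD_SHIFT[space] + segment)) & 1
    return "B" if bit else "A"
```

For instance, a PSR value with only bit 18 set assigns X segment 2 to Port B and leaves every other segment on Port A.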
PAGE 98
The HI appears as a memory mapped peripheral occupying 16 locations in the host processor address space. Separate transmit and receive data registers are double-buffered to allow the DSP96002 and host processor to efficiently transfer data at high speed. Host processor communication with the HI registers is accomplished using standard host processor instructions and addressing modes. Handshake flags are provided for polled or interrupt-driven data transfers with a host processor.
PAGE 99
If HR is used and the host processor reads RX or writes TX when the DSP96002 is in the Stop state, then HR will only be deasserted after exiting the Stop state. Register Name Register Contents HW/SW Reset HOST Reset ICS HMRC HRST DMAE HF3-HF2 HF1-HF0 HREQ INIT TYEQ TREQ RREQ TRDY TXDE RXDF HC HV7-HV0 0 1 0 0 0 0 0 0 0 0 1 1 0 0 $0E $0F $0F $0000 0 1 Note 1 1 1 0 - CVR INIT TREQ=1 RREQ=0 0 1 0 1 0 1 1 - IVR IV7-IV0 SEM SEM(15-0) Notes: 1. HREQ = TYEQ + TREQ 2.
PAGE 100
Register Name Register HW/SW Contents Reset HCR HYWE HYRE HXWE HXRE HPWE HPRE HRES HF3-HF2 HCIE HTIE HRIE HYWP HYRP HXWP HXRP HPWP HPRP HDMA HF1-HF0 HCP HTDE HRDF HSR HOST Reset 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 INIT TREQ=1 RREQ=0 0 0 0 0 0 0 0 INIT TREQ=0 RREQ=1 1 - INIT TREQ=1 RREQ=1 0 0 0 0 0 0 1 0 Comments Figure 7-8. Host Interface Reset - DSP96002 Side 7.4.4 HI Programming Model The HI block diagram is shown in Figure 7-9.
PAGE 101
Figure 7-9.
PAGE 102
31 ** 31 ** 7 ** 6 ** 5 HRES .... ** 14 ** 13 12 11 10 9 8 HYWE HYRE HXWE HXRE HPWE HPRE 7 HDMA 6 ** ....
PAGE 103
7 6 5 4 3 2 1 0 HREQ INIT TYEQ TREQ RREQ TRDY TXDE RXDF   15 14 13 12 11 10 9 8  HMRC ** HRST DMAE HF3 HF2 HF1 HF0   31 16  ** ** ** ** ** ** ** ** ** ** ** ** **   31 16 15 ** ** ** ** ** ** READ/WRITE INTERRUPT CONTROL/STATUS REGISTER ICS 0 SEM15 - SEM0 SEMAPHORE READ/WRITE REGISTER SEM 31 16 15 14 8 7 ** ** ** ** ** ** HC ** 0 HV READ/WRITE COMMAND VECTOR REGISTER CVR 31 8 7 0 ** ** ** ** ** ** ** ** ** IV7-IV0 31 READ/WRITE INTERRUPT VECTOR REGISTER IVR 0 RX 31 READ-ONLY RECEIVE DATA
PAGE 104
—H–R—H–A—H–SR/—WA5-A2Host Function x x x x x x x x x x x x x x x x x x x x x x 0 0 1 x x x x 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 x x x x x x x x 1 0 1 0 1 0 x 1 0 1 0 x x 0 0 0 0 0 0 0 1 1 0 x x x x x xxxx 1000 1000 1001 1001 1010 1010 1011 1100 1100 1101 1101 1110 1111 0000 0001 0010 0011 0100 0101 011x 0xxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx Host Interface disabled ICS register read ICS register write SEM register read SEM register wri
PAGE 105
HI Interrupt Sources (96002 side) INTERRUPT SOURCE Receive Data Full Transmit Data Empty X Memory Read Y Memory Read P Memory Read X Memory Write Y Memory Write P Memory Write Host Command STATUS HRDF HTDE HXRP HYRP HPRP HXWP HYWP HPWP HCP MASK HRIE HTIE HXRE HYRE HPRE HXWE HYWE HPWE HCIE Exception Port A Starting Address Port B $00000020 $00000030 $00000022 $00000032 $00000024 $00000034 $00000026 $00000036 $00000028 $00000038 $0000002A $0000003A $0000002C $0000003C $0000002E $0000003E 2*HV ($00000000-
PAGE 106
7.4.8.6 HCR Host Reset (HRES) Bit 5 The Host Reset (HRES) bit is used to reset the status bits of the HI and to initialize the transmit/receive paths to the same state produced by hardware or software reset. The HOST reset (Host Interface personal reset) is generated when HRES is set. The Host Interface exits the HOST reset state after this bit is cleared. HRES is set by HW/SW reset. 7.4.8.
PAGE 107
rupt request will occur if HYRP is set. The starting address of this interrupt is shown in Figure 7-13. HYRE is cleared by HW/SW reset. 7.4.8.13 HCR Host Y Memory Write Interrupt Enable (HYWE) Bit 13 The Host Y Memory Write Interrupt Enable (HYWE) bit is used to enable the Y Memory Write interrupt when the Host Y Memory Write Command Pending (HYWP) status bit in the Host Status Register (HSR) is set. When HYWE is cleared, HYWP interrupts are disabled.
PAGE 108
7.4.9.6 HSR Reserved bits (Bits 5, 6, 14-31) These status bits are reserved for future expansion and read as zero during DSP96002 read operations. 7.4.9.7 HSR DMA Status (HDMA) Bit 7 The DMA Status bit (HDMA) indicates that the host processor has enabled the external DMA handshake mode of the HI. When HDMA is cleared, it indicates that the DMA Mode is disabled (DMAE=0) in the Interrupt Control Register ICS. When HDMA is set, it indicates that the DMA Mode is enabled (DMAE=1). Cleared by HW/SW reset.
PAGE 109
interrupt". HYRP is set when data is transferred from the TX register to the HRX register. HYRP is cleared when the HTXC register is written by the DSP96002. HYRP is cleared by INIT (TREQ=1), HOST reset, and HW/SW reset. 7.4.9.
PAGE 110
that the host processor can force any of the existing exception handlers (IRQA, IRQB, etc.) and can use any of the reserved or otherwise unused starting addresses provided they have been pre-programmed in the DSP96002. The HV is set to a predefined value for each port by HW/SW reset (see Figure 7-7). If HC is set, the host processor should not change HV. 7.4.12.2 CVR Reserved bits (Bits 8-14, 16-31) Reserved bits are read by the host processor as zeros.
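Figure 7-13 lists the Host Command exception starting address as 2*HV. A one-function Python sketch of that relationship follows; the function name is hypothetical, and the 8-bit range check assumes HV is the HV7-HV0 field shown in the register summary.

```python
def host_command_start_address(hv: int) -> int:
    """Return the exception starting address selected by the host vector HV."""
    if not 0 <= hv <= 0xFF:
        raise ValueError("HV is assumed to be an 8-bit field (HV7-HV0)")
    # Each vector slot is two words wide, hence the factor of two.
    return 2 * hv
```

A host that writes HV = $10 would therefore select the handler starting at $00000020.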
PAGE 111
TXDE may be used to assert the Host Request HR pin if the Transmit Request Enable bit (TREQ) is set. TXDE provides valid status regardless of whether the TXDE interrupt is enabled or not so that polling techniques may be used by the host processor. 7.4.13.3 ICS Transmitter Ready (TRDY) Bit 2 The read-only Transmitter Ready (TRDY) status bit indicates that both the Transmit Register TX (on the host processor side) and Host Receive Data Register HRX (on the DSP96002 side) are empty.
PAGE 112
In DMA Mode (DMAE=1), TREQ must be set or cleared by software to select the direction of DMA transfers. Setting TREQ defines the direction of DMA transfer to be from external DMA→96002, and enables the HR pin to request these data transfers. See Figure 7-15 and Figure 7-16 for a summary of the effect of TREQ on the HR pin. Cleared by HW/SW reset. 7.4.13.6
PAGE 113
2. When not using the HR pin for handshake, use polling of the INIT bit in ICS to make sure it is cleared by the hardware (which means the INIT execution is completed). Then, start writing/reading data. 3. If using neither the HR pin for handshake nor polling the INIT bit, wait at least 3Tc+Th after the deassertion of TS that wrote ICS, before writing/reading data. This ensures that the INIT is completed. See Figure 7-14. The type of initialization done depends on the state of TREQ and RREQ.
PAGE 114
TREQ RREQ TYEQ   HREQ flag and HR pin
0    0    0      No interrupts (polling).
0    1    0      RX full or HTX full.
1    0    0      TX empty or HRX empty.
1    1    0      RX full, HTX full, TX empty or HRX empty.
x    0    1      TX empty and HRX empty.
x    1    1      All interrupts (no polling).
Figure 7-15. HREQ and HR Definition - Interrupt Mode (DMAE=0)

TREQ RREQ TYEQ   HREQ flag and HR pin
0    0    0      Reserved
0    1    0      DSP96002→DMA Request (RX full)
1    0    0      DMA→DSP96002 Request (TX empty)
1    1    0      Reserved
x    x    1      Reserved
Figure 7-16.
PAGE 115
es. In this mode, the HR pin can be used as an interrupt request to the host processor, and the HA pin may be used to support a 68K family interrupt acknowledge. When DMAE is set, the HI operates in the DMA Mode. When in DMA Mode, the RX and TX registers are accessed without regard to the address lines A2-A5, permitting data transfers under control of external devices, such as DMA controllers, that do not supply addresses.
PAGE 116
busy semaphore bit), several bits or write the whole 16 bits (which, for example, may be used as host processor ID). Host processors should use uninterruptible read/modify/write instructions (such as XMEM in the MC88000, CAS in the MC680x0, or BSET in the DSP96002) and examine which host processor has allocated the HI or set the semaphore bit by "bit test and set" instructions.
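The "bit test and set" idea can be sketched in Python as follows. This is a toy model, not the hardware: the class `SemaphoreRegister` and the helper `try_acquire` are hypothetical names, and the atomicity that the real read/modify/write instruction provides is simply assumed.

```python
class SemaphoreRegister:
    """Toy model of a 16-bit semaphore register (assumed atomic test-and-set)."""

    def __init__(self) -> None:
        self.value = 0

    def bset(self, bit: int) -> bool:
        """Set the bit and return its previous state, like a carry after BSET."""
        mask = 1 << bit
        was_set = bool(self.value & mask)
        self.value = (self.value | mask) & 0xFFFF
        return was_set

    def bclr(self, bit: int) -> None:
        """Clear the bit to release the semaphore."""
        self.value &= ~(1 << bit) & 0xFFFF


def try_acquire(sem: SemaphoreRegister, bit: int = 0) -> bool:
    """Return True if this caller obtained ownership (the bit was clear before)."""
    return not sem.bset(bit)
```

A host that gets False from `try_acquire` would retry until the current owner clears the bit, which mirrors a test-and-set loop on the semaphore bit.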
PAGE 117
The HI interrupt requests to the external host processor use the Host Request HR pin. HR is normally connected to a host processor interrupt input. The host processor acknowledges HI interrupts by executing an interrupt service routine. The MC680x0 processor family will assert the TS pin when both HR and HA are asserted to read the exception vector number from the IVR register of the HI.
PAGE 118
7.4.18 96002 Programmer Considerations 7.4.18.1 Reading Status Bits HF1, HF0, HCP, HPRP, HPWP, HXRP, HXWP, HYRP, HYWP, HTDE, and HRDF status bits are set or cleared by the host processor side of the HI. These bits are individually synchronized to the DSP96002 clock. The only system problem with reading status is HF1 and HF0 if they are encoded as a pair, e.g. the four combinations 00, 01, 10, and 11 each have significance.
PAGE 119
[Block diagram: a DSP96002 bus master (DMA source) performs write bus cycles from memory to a DSP96002 bus slave (DMA destination) in a Host → Memory DMA transfer. Signals shown: IRQ, S1, S0, A0-A31, A2-A5, HR, HS, HA, TA, TS, R/W, D0-D31. Status labels: Transmit Data Empty (TXDE=1), Host Data Full (HRDF=1).] Figure 7-17.
PAGE 120
[Block diagram: a DSP96002 bus master (DMA destination) performs read bus cycles into memory from a DSP96002 bus slave (DMA source) in a Memory → Host DMA transfer. Signals shown: IRQ, S1, S0, A0-A31, A2-A5, HR, HS, HA, TA, TS, R/W, D0-D31. Status labels: Receive Data Full (RXDF=1), Host Data Empty (HTDE=1).] Figure 7-18. DSP96002 to DSP96002 Data Read
where this pin is defined as a DMA service request input.
PAGE 121
[Block diagram: an external DMA controller (bus master) performs write bus cycles from memory to a DSP96002 bus slave in a Host → Memory DMA transfer. Signals shown: REQ, ACK, HR, HA, D0-D31. Status labels: Transmit Data Empty (TXDE=1), Host Data Full (HRDF=1).] Figure 7-19. External DMA to DSP96002 Data Write
should be cleared. Figure 7-19 contains a diagram showing the data paths and control lines used for the data transfers.
PAGE 122
[Block diagram: an external DMA controller (bus master) performs read bus cycles into memory from a DSP96002 bus slave in a Memory → Host DMA transfer. Signals shown: REQ, ACK, HR, HA, D0-D31. Status labels: Receive Data Full (RXDF=1), Host Data Empty (HTDE=1).] Figure 7-20. DSP96002 to External DMA Data Read
HR signal is asserted, indicating that its HI RX register is full and the data is ready to be read by the external DMA Controller.
PAGE 123
7.4.21.1 Semaphore Control Whenever a host transfer is to be executed, the host processor must first obtain ownership of the slave’s HI. This is done by semaphore control. The following is an example of code used by the host processor to obtain ownership of the HI. The LSB bit of the SEM register is used as a semaphore bit: clock words SEMA BSET #0,Y:SEMR JCS SEMA start host activity cycles 1 1 4 4 1 4 . . .
PAGE 124
7.4.21.4 ICS Register Read HICSR points to the slave ICS register (HS=0, HA=1, A5-A2=1000). The master executes the following instruction: MOVE Y:HICSR,R0 7.4.21.5 ICS Register Write HICSR points to the slave ICS register (HS=0, HA=1, A5-A2=1000). The master executes the following instruction: MOVE R0,Y:HICSR 7.4.21.6 68K Interrupt Acknowledge Sequence The MC680x0 interrupt acknowledge sequence is as follows: 1.
PAGE 125
MC680x0 PROCESSOR INTERRUPTING DEVICE - DSP96002 Acknowledge Interrupt Request Interrupt 1.Compare Interrupt Request Level with Interrupt Mask. — 2.Set R/ W to Read. 3.Set Function Code to CPU Space and output IACK address. — – 4.Assert Address Strobe ( A S) and — – Data Strobe ( D S). Provide 68K Vector Place IVR contents on Data Bus in response — – — – to H A=0 & T S=0, when — – H R=0. Acquire vector number 1.Latch Vector Number — – — – 2.
PAGE 126
5. In the DSP96002 side, the X/Y/P Memory Write interrupt vector should point to a routine that first reads HRX to get the address A, stores A in an address pointer Rn, and then again reads HRX to retrieve the data D and store D into the DSP96002 memory location pointed by Rn. 6. The host processor may test TRDY to see if both A and D were removed from the input double buffer (TX/HRX). DSP96002 MASTER PROCESSOR Semaphore Control DSP96002 SLAVE PROCESSOR Semaphore Register (SEM) 1.
PAGE 127
                                          words   clock cycles
_LOOP1   JCLR   #TXDE,X:(R4),_LOOP1       2       6
         MOVE   R1,X:(R3)                 1       2
_LOOP2   JCLR   #TXDE,X:(R4),_LOOP2       2       6
         MOVE   R0,X:(R3)                 1       2
                                          6       16
The minimal memory write is 6 program words and 16 clock cycles. The second move triggers the X Memory Write interrupt request in the slave. The interrupt service routine in the slave takes 10-14 clock cycles to execute. If there are other interrupts with higher priority the response to this interrupt may be delayed.
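The poll-then-write pattern of this sequence can be illustrated with a Python sketch against a toy slave. Everything here is an assumption made for illustration: `FakeSlaveHI` is not the real host interface, the number of busy polls is arbitrary, and the ordering (address word first, data word second) follows the step description of the slave reading the address and then the data from HRX.

```python
class FakeSlaveHI:
    """Toy stand-in for the slave's host interface (not the real hardware)."""

    def __init__(self, busy_polls: int = 2) -> None:
        self._busy_polls = busy_polls  # polls for which TXDE reads 0 after a write
        self._remaining = 0
        self.received = []             # words written to TX, in order

    def txde(self) -> bool:
        """Return True when the TX register is empty (TXDE=1)."""
        if self._remaining > 0:
            self._remaining -= 1
            return False
        return True

    def write_tx(self, word: int) -> None:
        """Accept one 32-bit word and mark TX as busy for a few polls."""
        self.received.append(word & 0xFFFFFFFF)
        self._remaining = self._busy_polls


def master_memory_write(slave: FakeSlaveHI, address: int, data: int) -> int:
    """Poll TXDE, write the address, poll again, write the data.

    Returns the number of unsuccessful polls, for illustration only.
    """
    polls = 0
    for word in (address, data):
        while not slave.txde():
            polls += 1
        slave.write_tx(word)
    return polls
```

Each word is written only after TXDE reads as set, so the slave never sees the data word before it has taken the address word.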
PAGE 128
4. The host processor polls the ICS register until HMRC is cleared and then reads the data D from the RX register. 96K MASTER PROCESSOR Semaphore Control 96K SLAVE PROCESSOR Semaphore Register (SEM) 1.Set Semaphore in slave’s Semaphore Register using BSET Instruction. 2.If Semaphore was set before repeat step 1 else continue X Memory Read Interrupt/Status Register 1.Check if the slave’s TX register is empty (TXDE=1) 2.If TXDE=0 repeat step 1 else continue 3.
PAGE 129
The minimal memory read procedure is 6 program words and 16 clock cycles. The first move triggers the X Memory Read interrupt request in the slave. The interrupt service routine in the slave takes 8-12 clock cycles to execute. If there are other interrupts with higher priority the response to this interrupt may be delayed. Only then can the master continue with the second move to read the data. 7.4.21.
PAGE 130
The table in Figure 7-24 shows the data transfers that the DMA Controller is capable of. The number of cycles specified in the Figure 7-24 notes are for the operation of one channel using a continuous block transfer. Int. Int. Ext. Ext. Ext. Int. Int. Ext. Ext. Int. DMA data transfers mem Int. mem Int. mem Int. mem Int. mem Ext. mem Int. mem Int. mem Int. mem Int. I/O Int.
PAGE 131
7.5.2 DMA Controller Programming Model The registers comprising the DMA Controller are shown in Figure 7-25 and Figure 7-26.
PAGE 132
0 DMA Source Modifier Register DSM1 addr X:$FFFFFFD7 DMA Source Address Register DSR1 addr X:$FFFFFFD6 DMA Source Offset Register DSN1 addr X:$FFFFFFD5 DMA Destination Modifier Register DDM1 addr X:$FFFFFFD3 DMA Destination Address Register DDR1 addr X:$FFFFFFD2 DMA Destination Offset Register DDN1 addr X:$FFFFFFD1 DMA Counter DCO1 addr X:$FFFFFFD4 31 31 30 29 28 27 26 25 24 DE DIE * DTD * 23 22 21 20 19 18 17 16 DCP * * * * * * * 15 14 13 12 11 10 9 8 * M6 M5 M4 M
PAGE 133
7.5.3.1 DCS DMA Destination Space Control (DDS2-DDS0) Bits 0,1,2 The DMA Destination Space control bits (DDS2-DDS0) specify the memory or I/O space that will be referenced as destination by the DMA. The DDS2-DDS0 bits are cleared by Hardware and Software Reset. DDS2 0 0 0 0 1 1 1 1 DDS1 DDS0 0 0 0 1 1 0 1 1 0 0 1 1 7.5.3.
PAGE 134
the DMA transfer. If an input is unmasked, asserting that input will set the latch and initiate a DMA transfer. The DMA state machine clears the latch when accessing the DMA source address. If more than one requesting device input is enabled, the first edge on any input is latched and triggers a DMA transfer, and any other edge that appears before the latch is cleared will be ignored. DMA Request Mask BitRequesting Device M0 M1 M2 M3 M4 M5 M6 7.5.3.
PAGE 135
7.5.3.7 DCS DMA Transfer Mode – (DTM1–DTM0) Bits 25,26 DMA Transfer Mode bits (DTM1-DTM0) specify the mode of operation of the DMA channel. DTM1-DTM0 are cleared by Hardware and Software Reset. When DTM1-DTM0=00, a single block is transferred, the length of the block is determined by the counter, the transfer is initiated by setting the DE bit, and the transfer is completed when the counter decrements to zero.
PAGE 136
Block transfer mode is selected. Clearing DE during DMA operation will stop the DMA only after the present DMA transfer has been completed (the data is stored in the destination), setting DTD.
DE   DMA Operation
0    Disabled
1    Enabled
7.5.4 DMA Counter (DCO) The DMA Counter is a read/write 32-bit register that contains the number of DMA data transfers to be done.
PAGE 137
able memory would typically load the Offset register with +4 to perform 32-bit aligned accesses. DMA transfers to/from I/O peripherals would load the Offset register with zero to continuously access the same address. 7.5.9 DMA Addressing Modes The DMA Controller may be programmed for address calculation and updates in the same manner as the registers in the Address Generation Unit. The DMA Modifier registers are completely identical to the Modifier registers M0-M7.
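The effect of the Offset register on successive DMA addresses can be sketched in Python. This is a simplified model that assumes linear (non-modulo) address arithmetic and a postincrement-by-offset update after every transfer; the function name and parameters are hypothetical.

```python
ADDRESS_MASK = 0xFFFFFFFF  # 32-bit addresses


def dma_addresses(start: int, offset: int, count: int) -> list:
    """Return the addresses touched by `count` transfers, assuming linear updates."""
    addresses = []
    addr = start & ADDRESS_MASK
    for _ in range(count):
        addresses.append(addr)
        addr = (addr + offset) & ADDRESS_MASK
    return addresses
```

An offset of +4 steps through memory in 4-address strides, while an offset of zero keeps hitting the same address, as a transfer to or from a single peripheral register would.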
PAGE 138
3. If the core is doing one external access and the DMA is also doing an external access through the other port, and the core access is delayed, the access by the DMA in the other port is also delayed. This happens because the chip clock generates wait states and the whole chip stops. Also, the arbitration between DMA and core cannot continue if the core is frozen. 4.
PAGE 139
$FFFFFFFF $FFFFFFFE $FFFFFFFD $FFFFFFFC : $FFFFFFF0 $FFFFFFEF $FFFFFFEE $FFFFFFED $FFFFFFEC : $FFFFFFE7 $FFFFFFE6 $FFFFFFE5 $FFFFFFE4 : $FFFFFFE0 $FFFFFFDF $FFFFFFDE $FFFFFFDD $FFFFFFDC $FFFFFFDB $FFFFFFDA $FFFFFFD9 $FFFFFFD8 $FFFFFFD7 $FFFFFFD6 $FFFFFFD5 $FFFFFFD4 $FFFFFFD3 $FFFFFFD2 $FFFFFFD1 $FFFFFFD0 $FFFFFFCF : $FFFFFF80 X DATA Memory Space IPR - Interrupt Priority Register BCRA - Port A Bus Control Register BCRB - Port B Bus Control Register PSR - Port Select Register RESERVED Reserved for OnCE Opera
PAGE 140
PAGE 141
SECTION 8 EXCEPTION PROCESSING 8.1 INTRODUCTION This section describes the actions of the DSP96002 which are outside the normal processing associated with the execution of instructions. The sequence of actions taken by the DSP96002 on exception conditions is described. Also, the interrupt priority level (IPL) of the processor and interrupt sources is described. 8.2 PROCESSING STATES The DSP96002 is always in one of five processing states: normal, exception, reset, wait, or stop.
PAGE 142
8.2.3 Wait Processing State The wait processing state is a low power consumption mode entered by execution of the WAIT instruction. In wait mode, the internal clock is disabled from all internal circuitry except the internal peripherals (the interrupt controller and host interfaces). All internal processing is halted until any unmasked interrupt occurs, the DSP96002 is reset, or DR is asserted.
PAGE 143
Int ctl cyc1 Int ctl cyc2 Fetch Decode Execute * i i n3 n2 n1 i n4 n3 n2 ii1 n4 n3 ii2 ii1 n4 n5 ii2 ii1 n6 n5 ii2 n7 n6 n5 i n8 n7 n6 ii3 n8 n7 ii4 ii3 n8 ii4 ii3 i = interrupt ii = interrupt instruction word n = normal single word instruction * subsequent interrupts are enabled at this time Figure 8-1.
PAGE 144
interrupt instructions are being fetched, the PC is inhibited from being updated. After the two interrupt words have been fetched, the PC is used for any following instruction fetches. After both interrupt instruction words have been fetched, they are guaranteed to be executed. This is true even if the instruction that is currently being executed is a change of flow instruction (i.e., JMP, JSR, etc.) that would normally ignore the instructions in the pipe.
PAGE 145
Int ctl cyc1 Int ctl cyc2 Fetch Decode Execute f i ii n n4 n5 * = = = = = = * i i n3 n2 n1 i n4 n3 n2 ii1 n4 n3 ii2 f1 NOP n4 f2 f1 n5 n4 f2 n6 -n4 i n7 n6 -- ii3 n7 n6 ii4 f3 n7 n8 f4 f3 fast interrupt instruction word (non-control-flow-change) interrupt interrupt instruction word normal single word instruction 2-word instruction 2nd word of n4 subsequent interrupts are enabled at this time Figure 8-2.
PAGE 146
2. The status register is modified as follows: the interrupt mask bits I1, I0 in the MR are updated to mask interrupts of the same or lower priority (except that illegal instruction, stack error and (F)TRAPcc can always interrupt). 3. The PC will be altered by the JSR instruction so that instruction execution will continue with the instructions located in the address pointed to by the JSR instruction. 2. Long interrupt routines are interruptible by higher priority interrupts.
PAGE 147
Int ctl cyc1 Int ctl cyc2 Fetch Decode Execute i ii n n3 n4 n5 † * = = = = = = i n3 n2 n1 † i n4 REP n2 i* n5 NOP REP n4 NOP n4 n4 i n6 n5 n4 ii1 n6 n5 ii2 ii1 n6 n7 ii2 ii1 n8 n7 ii2 n9 n8 n7 interrupt interrupt instruction word normal instruction word REP #2 instruction instruction being repeated twice instruction that waits in the backup instruction latch interrupt rejected at this time interrupt can be reenabled at this time Figure 8-5.
PAGE 148
Interrupt Starting Address interrupt Source $FFFFFFFE $00000000 $00000002 $00000004 $00000006 $00000008 $0000000A $0000000C $0000000E $00000010 $00000012 $00000014 $00000016 $00000018 $0000001A $0000001C $0000001E $00000020 $00000022 $00000024 $00000026 $00000028 $0000002A $0000002C $0000002E $00000030 $00000032 $00000034 $00000036 $00000038 $0000003A $0000003C $0000003E $00000040 : $000000FE $00000100 : $000001FE Hardware RESET Hardware RESET Stack Error Illegal Instruction (F)TRAPcc (default) IRQA IRQB
PAGE 149
which are maskable. Additionally, each of these interrupts has independent enable control. When the IRQA, IRQB or IRQC interrupts are disabled in the interrupt priority register, pending requests will be discarded, no new requests will be accepted, and the edge-detection latch will remain in the reset state. Also, if the interrupt is defined as level-sensitive, its edge-detection latch will remain in the reset state.
PAGE 150
I1  I0   Exceptions Permitted   Exceptions Masked
0   0    IPL 0, 1, 2, 3         None
0   1    IPL 1, 2, 3            IPL 0
1   0    IPL 2, 3               IPL 0, 1
1   1    IPL 3                  IPL 0, 1, 2
Figure 8-8.
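The mask table reduces to a simple rule: the two-bit value I1:I0 is the lowest interrupt priority level that is still permitted. A short Python sketch of that reading follows; the function name is hypothetical.

```python
def permitted_ipls(i1: int, i0: int) -> list:
    """Return the interrupt priority levels permitted for mask bits I1, I0."""
    lowest_permitted = ((i1 & 1) << 1) | (i0 & 1)
    return list(range(lowest_permitted, 4))
```

For example, I1:I0 = 10 permits IPL 2 and 3 and masks IPL 0 and 1.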
PAGE 151
xxL1 xxL0   Enabled   Int. Priority Level (IPL)
0    0      no        -
0    1      yes       0
1    0      yes       1
1    1      yes       2
Figure 8-10. Interrupt Priority Level Bits

IxL2   Trigger Mode
0      level
1      neg. edge

IRxS   Status
0      Serviced
1      Pending
Figure 8-11. External Interrupt Trigger Mode and Status
8.4.
PAGE 152
8.5.2 Interrupt Priority Register (IPR) This read/write register specifies the interrupt priority level for each of the interrupting devices (Host, DMA, IRQA, IRQB, IRQC). In addition, this register specifies the trigger mode of each external interrupt source and shows the status of the external interrupt request. The register is cleared on Hardware reset or by the RESET instruction. The Interrupt Priority Register is shown in Figure 8-9. Figure 8-10 defines the interrupt priority level bits.
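A minimal Python sketch of how one two-bit priority field of the IPR can be decoded, following the encoding in Figure 8-10 (00 disables the source; 01, 10 and 11 enable it at IPL 0, 1 and 2). The function name is hypothetical, and the two bits are passed in separately so that no bit positions within the register are assumed.

```python
from typing import Optional


def decode_ipl_field(l1: int, l0: int) -> Optional[int]:
    """Return the IPL for an xxL1:xxL0 pair, or None if the source is disabled."""
    code = ((l1 & 1) << 1) | (l0 & 1)
    if code == 0:
        return None  # interrupt source disabled
    return code - 1
```

The same decoding would apply to each source's field, for example the IRQB or Host B priority bits.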
PAGE 153
IBL1 IBL0   Enabled   Int. Priority Level (IPL)
0    0      no        -
0    1      yes       0
1    0      yes       1
1    1      yes       2
8.5.2.5 IRQB Trigger Mode - IBL2 (Bit 6) The IRQB Trigger Mode (IBL2) bit specifies the trigger method for the external interrupt input IRQB.
IBL2   Trigger Mode
0      level
1      negative edge
8.5.2.6 IRQB Status - IRBS (Bit 7) The read-only IRQB Status (IRBS) bit indicates the status of the interrupt request for the external interrupt input IRQB.
PAGE 154
ICL2   Trigger Mode
0      level
1      negative edge
8.5.2.9 IRQC Status - IRCS (Bit 11) The read-only IRQC Status (IRCS) bit indicates the status of the interrupt request for the external interrupt input IRQC. If the IRQC interrupt is defined as edge-sensitive and it is enabled, the IRCS bit indicates the state of the edge-detection latch. If the IRQC interrupt is defined as level-sensitive or is disabled, the IRCS bit indicates the state of the IRQC pin after internal synchronization. IRCS 0 1 8.5.2.
PAGE 155
HAL1  HAL0   Enabled   Int. Priority Level (IPL)
0     0      no        -
0     1      yes       0
1     0      yes       1
1     1      yes       2
8.5.2.14 Host B Interrupt Priority Level - HBL1-HBL0 (Bits 22-23) The Host B Interrupt Priority Level (HBL1-HBL0) bits are used to enable and specify the priority level of all interrupt sources located in the Port B Host Interface.
HBL1  HBL0   Enabled   Int.
0     0      no
0     1      yes
1     0      yes
1     1      yes
PAGE 156
8.5.3 Exception Priorities within an IPL If more than one exception is pending when an instruction is executed, the interrupt with the highest priority level is serviced first. Within a given interrupt priority level, a second priority structure determines which interrupt is serviced when multiple interrupt requests with the same IPL are pending. The priority of equal IPL interrupts is given in Figure 8-12. Also given in Figure 8-12 are the interrupt enable bits for all interrupts.
PAGE 157
SECTION 9 CHIP OPERATING MODES AND MEMORY MAPS 9.1 OPERATING MODES AND PROGRAM MEMORY MAPS The operating mode bits MA, MB, and MC in the OMR register determine the bus expansion mode for program memory and the startup procedure when the DSP96002 leaves the RESET state. The Data ROM Enable bit DE in the OMR determines the bus expansion mode for the data memories. The MODA, MODB, and MODC pins are used to load MA, MB and MC with the initial operating mode of the DSP96002.
PAGE 158
9.1.2 Mode 1 (Internal PRAM enabled, Reset at $FFFFFFFE, Port B) In Mode 1, the internal program memory occupies the lower portion of the program memory space. Addresses higher than the highest internal program memory location are directed to external program memory. The address of the hardware reset vector is $FFFFFFFE, located in the Port B external program memory space. The program memory map for this mode is shown in Figure 9-2. 9.1.
PAGE 159
If the Host Interface flag HF1 is set, the bootstrap program assumes that the external host processor is a 32-bit wide source which will supply up to 1,024 32-bit words to load into the program RAM. The external host processor may terminate the bootstrap program by setting the Host Interface flag HF0. 5. Enter Mode 0 or 1 by writing to the OMR. This action will begin a timed delay to remove the bootstrap ROM from the program memory map. 6.
PAGE 160
        PAGE 132,50,0,10
; BOOTSTRAP CODE FOR DSP96002 - Copyright 1988 Motorola Inc.
;
; Host algorithm / AND / external bus method.
;
; This is the Bootstrap program contained in the DSP96002. This program
; can load the internal program memory from one of 4 external sources.
; The program reads the OMR bits MA and MB to decide which external
; source to access.
; If MB:MA = 0X - load from 4,096 consecutive byte-wide P: memory
; locations (starting at P:$FFFF0000).
PAGE 161
; The second routine loads the internal PRAM using the Host
; Interface logic.
; If HF1=0, it will load 4,096 bytes from the external host
; processor. These will be condensed into 1,024 32-bit words and
; stored in contiguous internal PRAM memory locations starting at
; P:$0. Note that the routine loads data starting with the least
; significant byte of P:$0 first.
; If HF1=1, it will load 1,024 32-bit words from the external host
; processor.
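The byte-to-word condensing described in these comments can be illustrated off-chip. The Python sketch below is a hypothetical host-side model, not the bootstrap ROM code, and `pack_bytes_lsb_first` is a name invented for the example. It packs each group of four bytes into one 32-bit word with the first byte received ending up in the least significant byte, using the same shift-right-by-8 and OR-in-at-bit-24 pattern that the listing applies with LSR, LSL and OR.

```python
def pack_bytes_lsb_first(data: bytes) -> list:
    """Condense a byte stream into 32-bit words, first byte = least significant."""
    if len(data) % 4 != 0:
        raise ValueError("byte count must be a multiple of 4")
    words = []
    for i in range(0, len(data), 4):
        word = 0
        for b in data[i:i + 4]:
            # Shift the bytes gathered so far down, then place the new byte on top.
            word = (word >> 8) | (b << 24)
        words.append(word)
    return words
```

Under this model, a full 4,096-byte image yields the 1,024 words mentioned above.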
PAGE 162
; Host load routine _HOSTR _LBL11 JCLR ENDDO JMP #3,X:(R2),_LBL22 _LBL22 JCLR #0,X:(R2),_LBL11 JCLR MOVE JMP #4,X:(R2),_LBL33 X:(R3),D0.L <_STORE _LBL33 DO LSR #4,_LOOP4 #8,D0 ; Get 4 bytes into D0.L ; Shift previous byte down _LBL1 JCLR ENDDO ENDDO JMP JCLR #3,X:(R2),_LBL2 ; if HF0=1, stop loading data. ; Must terminate the do loops <_BOOTEND #0,X:(R2),_LBL1 MOVE LSL OR X:(R3),D1.L #24,D1 D1,D0 ; ; ; ; ; MOVEM D0.L,P:(R0)+ ; Store 32-bit result in P mem.
PAGE 163
9.2.1 Internal Data RAMs The on-chip X and Y Data RAMs occupy locations $00000000 to $000001FF in X and Y Data Memory maps, respectively, and they are always enabled. 9.2.2 Internal Data ROMs The X and Y Data Memory expansion mode is affected by the DE bit located in the OMR. The on-chip X and Y Data ROMs occupy locations $00000400 to $000007FF in X and Y Data Memory maps, respectively, when enabled by setting DE=1 in the Operating Mode Register.
PAGE 164
Address range            X DATA                    Y DATA
$FFFFFF80-$FFFFFFFF      On-Chip Peripherals       External Peripherals
$00000800-$FFFFFF7F      External X Data Memory    External Y Data Memory
$00000400-$000007FF      Internal X Data ROM       Internal Y Data ROM
$00000200-$000003FF      Internal Reserved         Internal Reserved
$00000000-$000001FF      Internal X Data RAM       Internal Y Data RAM
Figure 9-5.
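As a rough aid to reading the X data memory map, the Python sketch below classifies a 32-bit X address using the boundaries given in Sections 9.2.1 and 9.2.2 and in the figure. The function name is hypothetical, and treating the ROM range as external memory when DE=0 is an assumption, since this excerpt does not show what that range maps to when the ROMs are disabled.

```python
def classify_x_data_address(addr: int, de: bool) -> str:
    """Name the X data region an address falls in (illustrative model only)."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("address must fit in 32 bits")
    if addr <= 0x000001FF:
        return "internal X data RAM"          # always enabled
    if addr <= 0x000003FF:
        return "internal reserved"
    if addr <= 0x000007FF:
        # Assumption: with DE=0 the ROM range is served by external memory.
        return "internal X data ROM" if de else "external X data memory"
    if addr >= 0xFFFFFF80:
        return "on-chip peripherals"
    return "external X data memory"
```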
PAGE 165
PROGRAM MEMORY
MC MB MA   HW RESET VECTOR               MODE   INTERNAL PROGRAM SPACE   EXTERNAL PROGRAM SPACE   PORT
0  0  0    $FFFFFFFE                     0      $00000000-$000003FF      $00000400-$FFFFFFFF      A
0  0  1    $FFFFFFFE                     1      $00000000-$000003FF      $00000400-$FFFFFFFF      B
0  1  0    $00000000                     2      none                     $00000000-$FFFFFFFF      A
0  1  1    $00000000                     3      none                     $00000000-$FFFFFFFF      B
1  X  0    $00000000 in Bootstrap ROM    4,6    For reading (Boot ROM): $00000000-$0000003F       $00000400-$FFFFFFFF      A
                                                For writing (Prog RAM): $00000000-$000003FF
1  X  1    $00000000 in Bootstrap ROM    5,7    For rea
PAGE 166
9 - 10 DSP96002 USER’S MANUAL MOTOROLA
PAGE 167
SECTION 10 ON-CHIP EMULATOR 10.1 INTRODUCTION Conventional methods of system development (for example the DSP56001) consist of a program which resides in the DSP program memory (monitor). An interface circuit which either uses on-chip resources or an additional program memory address communicates with a host computer or terminal. This technique is not transparent, loads the DSP bus and sometimes interferes with the user system configuration.
PAGE 168
 Figure 10-1. OnCE Block Diagram 10.2.2 Debug Serial Clock/Chip Status 1 (DSCK/OS1) The serial clock is supplied to the OnCE through the DSCK/OS1 pin when it is an input. The serial clock provides pulses required to shift data into and out of the OnCE serial port. Data is clocked into the OnCE  on the falling edge and is clocked out of the OnCE serial port on the rising edge.
PAGE 169
10.2.3 Debug Serial Output (DSO) The debug serial output provides the data contained in one of the OnCE controller registers as specified by the last command received from the external command controller. When idle, this pin is held high. When the requested data is available, the DSO line will be asserted (negative true logic) for two T cycles (2T = period of DSP96002 master clock) to indicate that the serial shift registers are ready to receive clocks in order to deliver the data.
PAGE 170
Figure 10-2. OnCE Controller and Serial Interface been received), and two signals indicating that the core was halted and the DMA was halted. The ODEC generates all the strobes required for reading and writing the selected OnCE registers. 10.3.4 OnCE Status and Control Register (OSCR) The Status and Control Register is a 32-bit register used to select the events that will put the chip in Debug Mode (see Figure 10-3). Breakpoints may be disabled or enabled on one or more memory spaces.
PAGE 171
Bit(s)   Field
31-19    *
18       TO
17       DBO
16       PBO
15-9     *
8        TME
7        DBS1
6        DBS0
5        DBE1
4        DBE0
3        PBS1
2        PBS0
1        PBE1
0        PBE0
* Read as zeroes, should be written with zero for future compatibility.
Figure 10-3.
PAGE 172
10.3.4.5 Trace Mode Enable (TME) Bit 8 This control bit, when set, enables the Trace Mode causing the chip to enter the Debug Mode whenever the execution of an instruction is completed and the Trace Counter is zero. This bit is cleared on hardware reset. 10.3.4.6 Reserved (Bits 9-15, 20-31) These bits are reserved for future use. They are read as zero and should be written as zero for future compatibility. 10.3.4.
PAGE 173
Figure 10-4. Program Memory Breakpoint Logic cific point to examine/change registers or memory. Using address comparators to set breakpoints enables the user to set breakpoints in RAM or ROM and while in any operating mode. The low address comparator will cause a logic true signal when the address on the bus is greater than or equal to the low boundary. The high address comparator will cause a logic true signal when the address on the bus is less than or equal to the high boundary.
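The two comparators can be modeled as a pair of inclusive bounds checks. The Python sketch below is illustrative only; combining the two outputs with a logical AND to form an "inside the range" condition is an assumption about how the breakpoint logic uses them, and the function name is hypothetical.

```python
def address_in_breakpoint_range(addr: int, low: int, high: int) -> bool:
    """Model the low and high address comparators for one bus address."""
    low_true = addr >= low     # low comparator: address >= low boundary
    high_true = addr <= high   # high comparator: address <= high boundary
    # Assumption: a range hit requires both comparators to be true.
    return low_true and high_true
```

Setting `low` equal to `high` in this model selects a single address.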
PAGE 174
Figure 10-5. Data Memory Breakpoint Logic dress is ignored. Program memory address breakpoints occur after the opcode or operand is executed and the breakpoint counter has been decremented to zero. Data memory address breakpoints also occur after the execution of the instruction which formed the data memory address and the breakpoint counter has decremented to zero. All breakpoint registers are controlled by the debug status and control register, OSCR. 10.4.
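The interaction between address matches and the breakpoint counter can be sketched as a simple down-counter. The Python model below is hypothetical: the class name is invented, and the exact count at which the hardware signals the breakpoint is an assumption (here, the breakpoint fires on the match that brings the counter to zero).

```python
class BreakpointCounter:
    """Count matching accesses and report when the counter reaches zero."""

    def __init__(self, count: int):
        if count < 1:
            raise ValueError("count must be at least 1 in this model")
        self.count = count

    def on_match(self) -> bool:
        """Record one matching access; return True once the counter is zero."""
        if self.count > 0:
            self.count -= 1
        return self.count == 0
```

Loading the counter with n in this model delays the breakpoint until the n-th matching access.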
PAGE 175
provides a means of checking hot spots in program segments as well as peripheral or data memory accesses. Program hot spots may be statistically evaluated by setting the breakpoint counter to a value, setting a program address in the program address comparator registers, passing control of the DSP96002 back to the user program and checking to see if a breakpoint occurs after n iterations of the program memory access.
PAGE 176
10.4.9 Data Memory Address Latch (ODAL) The Data Memory Address Latch is a 32-bit register that latches the XAB or YAB on every cycle during the core or DMA slot according to the DBS1-DBS0 bits in OSCR. 10.4.10 Data Memory Upper Limit Register (ODULR) The Data Memory Upper Limit Register is a 32-bit register that stores the data memory breakpoint upper limit. ODULR can only be read or written through the serial interface. Before enabling breakpoints, ODULR must be loaded by the command controller. 10.
PAGE 177
software developer debug sections of code which do not have a normal flow or are getting hung up in infinite loops. The trace counter also enables the user to debug areas of code which are time critical.
PAGE 178
Figure 10-6. Breakpoint and Trace Counter Logic for any newly fetched instruction including instructions fetched by the interrupt processing or instructions that will be killed by the interrupt processing. 10.7.3 External Request During STOP Asserting DR when the chip is in the STOP state (i.e., has executed a STOP instruction) causes the chip to exit the STOP state and enter the Debug Mode. After receiving the acknowledge, the command controller must negate DR.
PAGE 179
10.7.4 External Request During WAIT Asserting DR when the chip is in the WAIT state (i.e., has executed a WAIT instruction) causes the chip to exit the WAIT state and enter the Debug Mode. After receiving the acknowledge, the command controller must negate DR. Note that in this case, the chip completes the execution of the WAIT instruction and halts after the next instruction enters the instruction latch. 10.7.
PAGE 180
through the serial interface. This register is affected by the operations performed during the Debug Mode and must be restored by the command controller when returning to normal mode. 10.8.3 PIL Register (OPILR) The PIL Register is a 32-bit latch that stores the value of the Instruction Latch before the Debug Mode is entered. OPILR can only be read through the serial interface.
PAGE 181
Figure 10-8.
PAGE 182
10.9.1 PAB Register for Fetch (OPABFR) The PAB Register for Fetch is a 32-bit register that stores the address of the last instruction that was fetched before the Debug Mode was entered. OPABFR can only be read through the serial interface. This register is not affected by the operations performed during the Debug Mode. 10.9.2 PAB Register for Decode (OPABDR) The PAB Register for Decode is a 32-bit register that stores the address of the instruction currently in the Instruction Latch.
PAGE 183
RS4-RS0 00000 00001 00010 00011 00100 00101 00110 00111 01000 01001 01010 01011 01100 01101 01110 01111 10000 10001 Register Selected Debug Status/Control (OSCR) Breakpoint Counter Program (OPBC) Breakpoint Counter Data (ODBC) Trace Counter (OTC) Breakpoint Data Memory Higher-Equal (ODULR) Breakpoint Data Memory Lower-Equal (ODLLR) Breakpoint Program Memory Higher-Equal (OPULR) Breakpoint Program Memory Lower-Equal (OPLLR) Transfer Register (OGDBR) Program Data Bus Latch (OPDBR) Program Address Bus Latch f
PAGE 184
0   write the data associated with the command into the register specified by RS4-RS0
1   read the data contained in the register specified by RS4-RS0
10.11 DSP96002 TARGET SITE DEBUG SYSTEM REQUIREMENTS A typical DSP96002 debug environment consists of a target system where the DSP96002 resides in the user-defined hardware. The debug serial port interfaces to the command converter over a 6-wire link consisting of the 4 OnCE wires, a ground wire and a reset wire.
PAGE 185
6. CLK
7. Send command READ FIFO REGISTER (and increment pointer).
8. ACK
9. CLK
10. Send command READ FIFO REGISTER (and increment pointer).
11. ACK
12. CLK
13. Send command READ FIFO REGISTER (and increment pointer).
14. ACK
15. CLK
16. Send command READ FIFO REGISTER (and increment pointer).
17. ACK
18. CLK
19. Send command READ FIFO REGISTER (and increment pointer).
20. ACK
21. CLK
10.12.2 Displaying a specified register
1. Send command WRITE PDB REGISTER and GO (no EX).
PAGE 186
4. ACK
5. Send command READ GDB REGISTER (ODEC selects GDB as source for serial data and an acknowledge is issued to the command controller.)
6. ACK
7. CLK (The command controller generates 32 clocks that shift out the contents of the GDB register. The value of R0 is thus saved and will be restored before exiting the Debug Mode.)
8. Send command WRITE PDB REGISTER (no GO, no EX). (ODEC selects PDB as destination for serial data.)
9. ACK
10.
PAGE 187
(ODEC releases the chip from the "halt" state and the instruction is executed again in a "REPEAT-like" fashion. The signal that marks the end of the instruction returns the chip to the "halt" state and an acknowledge is issued to the command controller.)
24. ACK
25. Send command READ GDB REGISTER (ODEC selects GDB as source for serial data and an acknowledge is issued to the command controller.)
26. ACK
27. CLK
28. Repeat from step 23 until the entire memory area is examined.
PAGE 188
4. ACK
5. Send command WRITE PDB REGISTER (GO, EX). (ODEC selects PDB as destination for serial data.)
6. ACK
7. Send 32 bits of the target absolute address ($xxxxxxxx). The chip will resume fetching from the target address (you do not have to worry about the pipeline). Note that the trace counter will count this instruction so the current trace counter may need to be corrected if the trace mode enable bit in the OSCR has been set. (e.g.
PAGE 189
APPENDIX A INSTRUCTION SET DETAILS A.1 INTRODUCTION This appendix contains detailed information about each instruction defined in the DSP96002 instruction set. They are arranged in alphabetical order. A.2 ADDRESSING MODES Addressing modes are categorized by the ways in which they may be used. The following classifications will be used in the instruction definitions. Figure A-1 shows the various categories to which each addressing mode belongs.
PAGE 190
Mode Reg Addressing Mode Addressing Categories U P M A Assembler Syntax Register Direct Data or Control Register Address Register Address Offset Register Address Modifier Register – – – – – – – – X X X X Note 1 Rn Nn Mn Address Register Indirect No Update Postincrement by 1 Postdecrement by 1 Postincrement by Offset Nn Postdecrement by Offset Nn Indexed by Offset Nn Predecrement by 1 Long Displacement 100 011 010 001 000 101 111 – Rn Rn Rn Rn Rn Rn Rn Rn X X X X X X X X X X X X X X X X X X X
PAGE 191
Modifier MMMMMM M M Address Calculation Arithmetic 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 2 Reverse Carry Modulo 2 Modulo 3 . . . . . . . 0 0 0 0 0 0 1 2 F F x x F F x x F F x x F F x x F F x x E F x x D E F F F F F F F x x 0 0 0 0 3 7 F x x 0 0 0 0 F F F . . Modulo 16,777,215 Modulo 16,777,216 reserved reserved . . . . . . . F F F F F F F F F x x 0 0 0 0 F F F x x 0 0 0 0 F F F x x 0 0 0 0 F F F x x 0 1 3 7 F F F (Bit Reversed Update) ((2**24)-1) (2**24) . .
PAGE 192
– compare instruction. The R bit is cleared during processor reset. See the example for the FCMPG instruction for additional information. A(Accept) The A bit is only affected by the compare instructions CMP, CMPG, FCMP and FCMPG. The A bit is calculated based on its previous value and the results of the current compare instruction. The A bit is cleared during processor reset. See the example for the FCMPG instruction for additional information.
PAGE 193
Mnemonic A – R LR I N Z V C Special Definitions ABS ADD ADDC AND ANDC – – – – – – – – – – – – – – – – – – – – * – * * * * * ? * * * * * 0 0 – * * – – Note 1 ANDI ASL ASR ? – – ? – – ? – – ? – – ? * * ? * * ? ? 0 ? ? ? Note 2 Note 3,4 Note 3 Bcc – – – – – – – – BCHG ? ? ? ? ? ? ? ? Note 29 BCLR BFIND BRA BRCLR BRSET ? – – – – ? – – – – ? – – – – ? – – – – ? ? – – – ? ? – – – ? 0 – – – ? – – – – Note 30 Note 15,24 BScc BSCLR BSET BSR BSSET – – ? – –
PAGE 194
Mnemonic A – R LR I N Z V C FCOPYS.X FDEBUGcc FFcc FFcc.U FGETMAN – – – ? – – – – ? – – – – ? – * – – ? * * – – ? * * – – ? * – – – ? – – – – ? – FINT FJcc FJScc FLOAT.S FLOAT.X – – – – – – – – – – – – – – – * – – * * * – – * * * – – * * – – – – – – – – – – FLOATU.S FLOATU.X FLOOR FMPY//FADD.S FMPY//FADD.X – – – – – – – – – – – – – – – * * * ? ? * * * ? ? * * * ? ? – – – – – – – – – – FMPY//FADDSUB.S FMPY//FADDSUB.X FMPY//FSUB.S FMPY//FSUB.X FMPY.
PAGE 195
Mnemonic A – N Z V C JOIN JOINB JScc JSCLR JSET – – – – – – – – – – R LR I – – – – – – – – – – * * – – – * * – – – 0 0 – – – – – – – – Special Definitions JSR JSSET LEA LRA LSL – – ? ? – – – ? ? – – – ? ? – – – ? ? – – – ? ? * – – ? ? * – – ? ? 0 – – ? ? ? LSR MOVE MOVEC MOVEI MOVEM – – ? ? ? – – ? ? ? – – ? ? ? – – ? ? ? * – ? ? ? * – ? ? ? 0 – ? ? ? ? – ? ? ? Note 3 MOVEP MOVES MOVETA MPYS MPYU ? ? – – – ? ? – – – ? ? – – – ? ? – – – ? ? ? * 0 ? ? ? * * ? ? ? ? ?
PAGE 196
Note 1   Z - Cleared if the result is not zero. Unchanged otherwise.
Note 2   All ? Bits - Cleared if corresponding bit in immediate data is cleared and the operand is CCR. Not affected otherwise.
Note 3   C - Set if the last bit shifted out of the operand is set. Cleared otherwise. Cleared for a shift count of zero.
Note 4   V - Set if the MSB is changed any time during the shift operation. Cleared otherwise.
Note 5   C - Set if bit #n of the source operand is set. Cleared otherwise.
PAGE 197
Note 28 Note 29 Note 30 Note 31
All ? bits - If SR is specified as a destination operand, set according to the corresponding bit of the source operand. Not affected otherwise.
All ? bits - If SR is specified as destination operand, and A, R, LR, I, N, Z, V or C is selected, then the selected bit will be changed. If SR is not specified, then C will be set if bit #n of the source operand is set and cleared if bit #n of the source operand is cleared. Not affected otherwise.
PAGE 198
DZ (Division by Zero) - Set if the dividend is a finite nonzero number and the divisor is zero. The result will be a correctly signed infinity (generated by the exclusive OR of the signs of the source operands). Cleared otherwise. The DZ bit is not affected by fixed point operations. The DZ bit is cleared during processor reset.
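The DZ rule above can be illustrated with ordinary host floating-point values. The Python sketch below is a hypothetical model, not DSP96002 code: it returns the quotient together with a DZ indication, and builds the sign of the infinity from the exclusive OR of the operand signs. Cases with a zero divisor that do not meet the DZ definition (0/0, infinity/0, NaN/0) are deliberately left out of the model.

```python
import math

def divide_with_dz(dividend: float, divisor: float):
    """Return (result, dz) following the DZ definition for finite nonzero dividends."""
    dz = divisor == 0.0 and dividend != 0.0 and math.isfinite(dividend)
    if dz:
        dividend_negative = math.copysign(1.0, dividend) < 0
        divisor_negative = math.copysign(1.0, divisor) < 0
        # Sign of the infinity is the exclusive OR of the operand signs.
        negative = dividend_negative != divisor_negative
        return (-math.inf if negative else math.inf, True)
    if divisor == 0.0:
        raise ValueError("0/0, infinity/0 and NaN/0 are outside this sketch")
    return (dividend / divisor, False)
```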
PAGE 199
need for the UNCC bit. This would be true except for the way in which the 754-standard treats the equal and "not equal" predicates. From the condition code tables associated with the floating-point conditional instructions, it can be seen that the UNCC bit will not be set if one or both of the operands is a NaN. This is because the 754-standard recognizes that operands do not have to be ordered to be tested for equality (i. e., UNCC will not be affected when executing FBEQ or FBNE).
PAGE 200
Mnemonic UNCC NAN SNAN OPERR OVF UNF DZ INX Special Definitions ABS ADD ADDC1 AND ANDC – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – ANDI ASL ASR Bcc BCHG ? – – – ? ? – – – ? ? – – – ? ? – – – ? ? – – – ? ? – – – ? ? – – – ? ? – – – ? Note 9 BCLR BFIND BRA BRCLR BRSET ? – – – – ? – – – – ? – – – – ? – – – – ? – – – – ? – – – – ? – – – – ? – – – – Note 14 BScc BSCLR BSET BSR BSSET – – ? – – – – ? – – – – ? – – – – ? – – – – ? – – – – ? –
PAGE 201
Mnemonic UNCC NAN SNAN OPERR OVF UNF DZ INX Special Definitions FCOPYS.X FDEBUGcc FFcc FFcc.U FGETMAN 0 * * * 0 * – – ? * * – – ? * 0 – – ? ? * – – ? 0 * – – ? 0 0 – – ? 0 * – – ? 0 FINT FJcc FJScc FLOAT.S FLOAT.X 0 * * 0 0 * – – 0 0 * – – 0 0 0 – – 0 0 0 – – 0 0 0 – – 0 0 0 – – 0 0 * – – * * FLOATU.S FLOATU.X FLOOR FMPY//FADD.S FMPY//FADD.
PAGE 202
Mnemonic – N Z V C JOIN JOINB JScc JSCLR JSET – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – JSR JSSET LEA LRA LSL – – ? ? – – – ? ? – – – ? ? – – – ? ? – – – ? ? – – – ? ? – – – ? ? – – – ? ? – LSR MOVE MOVEC MOVEI MOVEM – – ? ? ? – – ? ? ? – – ? ? ? – – ? ? ? – – ? ? ? – – ? ? ? – – ? ? ? – – ? ? ? MOVEP MOVES MOVETA MPYS MPYU ? ? – – – ? ? – – – ? ? – – – ? ? – – – ? ? – – – ? ? – – – ? ? – – – ? ? – – – NEG NEGC NOP NOTB OR – –
PAGE 203
Note 1   SNAN - Set if any one of the source operands is a signaling NaN. Cleared otherwise.
Note 2   OPERR - Set if the operands of the floating-point addition are opposite-signed infinities or if the operands of the floating-point subtraction are like-signed infinities. Cleared otherwise.
Note 3   UNF - Set if the addition or subtraction operation underflows. Cleared otherwise.
Note 4   INX - Set if the addition or subtraction result is inexact. Cleared otherwise.
PAGE 204
Note 31  OPERR - Set if the source operand is less than zero. Cleared otherwise.
A.5 IEEE EXCEPTION BITS COMPUTATION
The IEEE Exception bits are the five exception bits required by the IEEE standard for trap disabled operations. They actually record a history of all floating-point exceptions which have occurred since the user last cleared the IER register.
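The "history" behavior amounts to sticky accumulation: a flag, once set, stays set until it is explicitly cleared. The Python sketch below is a minimal illustration, with a hypothetical class name, that uses the IER flag names appearing elsewhere in this appendix (SINX, SDZ, SUNF, SOVF, SIOP); it does not model bit positions or how each flag is computed.

```python
class StickyFlags:
    """Accumulate exception flags until the user clears them."""

    NAMES = ("SINX", "SDZ", "SUNF", "SOVF", "SIOP")

    def __init__(self):
        self.flags = set()

    def record(self, raised):
        """Merge the exceptions raised by one operation into the history."""
        raised = set(raised)
        unknown = raised - set(self.NAMES)
        if unknown:
            raise ValueError("unknown flag(s): %s" % sorted(unknown))
        self.flags |= raised   # an operation with no exceptions clears nothing

    def clear(self):
        """Model the user clearing the register."""
        self.flags.clear()
```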
PAGE 205
Operands
Data ALU
Dn      Data ALU Registers, n= 0-9, SP/SEP/Integer reference as specified by the Data ALU operation.
Dn.S    Floating-Point Registers, n= 0-9 (96 bits) SP reference
Dn.D    Floating-Point Registers, n= 0-9 (96 bits) DP reference
Dn.L    Integer Registers, n= 0-9 (32 bits, Low part of Dn)
Dn.M    Integer Registers, n= 0-9 (32 bits, Middle part of Dn)
Dn.H    Integer Registers, n= 0-9 (32 bits, High part of Dn)
Dn.ML   Long Integer Register, n=0-9 (Dn.M:Dn.
PAGE 206
Operators
Miscellaneous
#xx                      Immediate short data (16 bits sign extended)
#xxx                     Immediate short data (19 bits zero extended)
#Data                    Immediate data (32 bits)
#shift, #bit, or #bits   Immediate short data (5 or 6 bits)
#byte                    Immediate short data (8 bits)
S,Sn                     Source operand register
D,Dn                     Destination operand register
D{n}                     Bit n of D affected
D(8,9)                   Destination Operand Register D8 or D9 only
D(MS)                    Most significant word of double precision or long integer destination
D(LS)                    Least significant word of double precision o
PAGE 207
ABS Absolute Value
Operation: |D.L| → D.L (parallel data bus move)
Assembler Syntax: ABS D (move syntax - see the MOVE instruction description.)
Description: Take the absolute value of the destination operand low portion and store the result in the low portion of D.
Input Operand(s) Precision: 32-bit integer.
Output Operand Precision: 32-bit integer.
CCR Condition Codes:
C - Not affected.
V - Set if result overflows. Cleared otherwise.
Z - Set if result is zero. Cleared otherwise.
PAGE 208
ADD Add
Operation: D.L + S.L → D.L (parallel data bus move)
Assembler Syntax: ADD S,D (move syntax - see the MOVE instruction description.)
Description: Add the low portion of the two specified operands and store the result in the low portion of the destination operand D.
Input Operand(s) Precision: 32-bit integer.
Output Operand Precision: 32-bit integer.
CCR Condition Codes:
C - Set if carry is generated from MSB of the result. Cleared otherwise.
V - Set if result overflows.
PAGE 209
ADDC Add with Carry
Operation: D.L + S.L + C → D.L (parallel data bus move)
Assembler Syntax: ADDC S,D (move syntax - see the MOVE instruction description.)
Description: Add the low portion of the two specified operands along with the C bit of the condition code register and store the result in the low portion of destination operand D. When doing multiple precision addition, the higher precision long words of the input variables must be moved to the low portion of the Dn register.
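To show how the carry links the steps of a multiple precision addition, the Python sketch below models a 64-bit add built from two 32-bit steps: an ADD on the low long words followed by an ADDC on the high long words. It is an illustration on the host, not DSP96002 code; `add32` and `add64` are hypothetical helper names.

```python
MASK32 = 0xFFFFFFFF

def add32(d: int, s: int, carry_in: int = 0):
    """Return (32-bit sum, carry out) for D.L + S.L + C."""
    total = (d & MASK32) + (s & MASK32) + carry_in
    return total & MASK32, total >> 32

def add64(a: int, b: int):
    """Add two 64-bit values using one ADD step and one ADDC step."""
    low, carry = add32(a & MASK32, b & MASK32)                            # ADD
    high, carry = add32((a >> 32) & MASK32, (b >> 32) & MASK32, carry)    # ADDC
    return (high << 32) | low, carry
```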
PAGE 210
AND Logical AND
Operation: D.L & S.L → D.L (parallel data bus move)
Assembler Syntax: AND S,D (move syntax - see the MOVE instruction description.)
Description: Logically AND the low portion of the two specified operands and store the result in the low portion of D.
Input Operand(s) Precision: 32-bit integer.
Output Operand Precision: 32-bit integer.
CCR Condition Codes:
C - Not affected.
V - Always cleared.
Z - Set if result is zero. Cleared otherwise.
N - Set if result is negative.
PAGE 211
ANDC Logical AND with Complement
Operation: D.L & ~S.L → D.L (parallel data bus move)
Assembler Syntax: ANDC S,D (move syntax - see the MOVE instruction description.)
Description: Logically AND the low portion of D with the logical complement of the low portion of S, and store the result in the low portion of D.
Input Operand(s) Precision: 32-bit integer.
Output Operand Precision: 32-bit integer.
CCR Condition Codes:
C - Not affected.
V - Always cleared.
Z - Set if result is zero.
PAGE 212
Instruction Fields: D ddd Dn.L nnn S sss Dn.
PAGE 213
ANDI AND Immediate to Control Register
Operation: D & #xx → D
Assembler Syntax: AND(I) #Byte,D
Description: Logically AND the contents of the control register with an 8-bit immediate operand. The result is stored back into the specified control register. See Section A.10 for restrictions.
CCR Condition Codes:
For CCR operand:
C - Cleared if bit 0 of the immediate operand is cleared. Not affected otherwise.
V - Cleared if bit 1 of the immediate operand is cleared. Not affected otherwise.
PAGE 214
For OMR, MR, IER, CCR operands: INX - Not affected. DZ - Not affected. UNF - Not affected. OVF - Not affected. OPERR - Not affected. SNAN - Not affected. NAN - Not affected. UNCC - Not affected. IER Flags: For IER operand: SINX - Cleared if bit 0 of the immediate operand is cleared. Not affected otherwise. SDZ - Cleared if bit 1 of the immediate operand is cleared. Not affected otherwise. SUNF - Cleared if bit 2 of the immediate operand is cleared. Not affected otherwise.
PAGE 215
ASL Arithmetic Shift Left
Operation: C ← [31 ... 0] ← 0 (parallel data bus move)
Assembler Syntax:
ASL D (move syntax - see the MOVE instruction description.)
ASL S,D (move syntax - see the MOVE instruction description.)
ASL #shift,D
Description: Single-bit shift: Arithmetically shift the low portion of the specified operand one bit to the left. The carry bit receives the MSB shifted out of the low portion of the source operand.
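The single-bit case can be modeled in a few lines. The Python sketch below is a host-side illustration with a hypothetical function name; it returns the shifted value, the carry (the MSB shifted out), and an overflow indication that follows Note 3 and Note 4 of the condition code notes (V set if the MSB changes during the shift).

```python
MASK32 = 0xFFFFFFFF

def asl1(value: int):
    """Shift a 32-bit value left by one; return (result, C, V)."""
    value &= MASK32
    carry = value >> 31                        # MSB shifted out goes to C
    result = (value << 1) & MASK32             # zero enters at bit 0
    overflow = int((result >> 31) != carry)    # V: MSB changed by the shift
    return result, carry, overflow
```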
PAGE 216
Instruction Format: ASL 31 D(move syntax - see the MOVE instruction description.) 14 13 DATA BUS MOVE FIELD 10 0101 0 uu01 1ddd OPTIONAL EFFECTIVE ADDRESS EXTENSION OR IMMEDIATE LONG DATA Instruction Format: ASL 31 S,D(move syntax - see the MOVE instruction description.
PAGE 217
ASR Arithmetic Shift Right
Operation: [31 ... 0] → C (parallel data bus move)
Assembler Syntax:
ASR D (move syntax - see the MOVE instruction description.)
ASR S,D (move syntax - see the MOVE instruction description.)
ASR #shift,D
Description: Single-bit shift: Arithmetically shift the low portion of the specified operand one bit to the right. The carry bit receives the LSB shifted out of the low portion of the source operand. The MSB of the operand is held constant.
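A matching host-side model for the single-bit right shift is sketched below in Python. The function name is hypothetical; the code keeps the MSB in place while the remaining bits move right and the LSB is captured as the carry.

```python
MASK32 = 0xFFFFFFFF

def asr1(value: int):
    """Shift a 32-bit value right by one with the MSB held; return (result, C)."""
    value &= MASK32
    carry = value & 1                              # LSB shifted out goes to C
    result = (value >> 1) | (value & 0x80000000)   # MSB is held constant
    return result, carry
```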
PAGE 218
Instruction Format: ASR 31 D(move syntax - see the MOVE instruction description.) 14 13 DATA BUS MOVE FIELD 10 0000 0 uu11 1ddd OPTIONAL EFFECTIVE ADDRESS EXTENSION OR IMMEDIATE LONG DATA Instruction Format: ASR 31 S,D(move syntax - see the MOVE instruction description.
PAGE 219
Bcc Branch Conditionally
Operation:                                   Assembler Syntax:
If cc, then PC+xx → PC, else PC+1 → PC       Bcc label (short)
If cc, then PC+xxxx → PC, else PC+1 → PC     Bcc label
If cc, then PC+Rn → PC, else PC+1 → PC       Bcc Rn
Description: If the specified condition is true, program execution continues at location PC+displacement. The PC contains the address of the next instruction. If the specified condition is false, the PC is incremented and program execution continues sequentially.
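The branch target arithmetic for the short form can be sketched as follows. This Python model is illustrative only: the helper names are hypothetical, the 15-bit width comes from the instruction fields of the short format, and sign-extending the short displacement before adding it to the PC is an assumption consistent with a 2's complement displacement.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a 2's complement number."""
    sign = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ sign) - sign

def branch_target(pc: int, short_disp: int) -> int:
    """Return PC + displacement, wrapped to 32 bits (short 15-bit form)."""
    return (pc + sign_extend(short_disp, 15)) & 0xFFFFFFFF
```

Here `pc` stands for the value the PC holds when the displacement is added.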
PAGE 220
Instruction Format: Bcc 31 0000 0011 label (short) 14 13 10aa Instruction Format: Bcc 31 0000 0011 aaaa aa 0 1c cccc 0aaa aaaa label 14 13 0000 0000 00 0 1c cccc 0000 0000 PC RELATIVE DISPLACEMENT Instruction Format: Bcc 31 0000 0011 Rn 14 13 0000 001R 0 1c cccc 0000 0000 Instruction Fields: Rn - R0-R7 Long Displacement - 32 bits Short Displacement - aaaaaaaaaaaaaaa (15 bits) Mnemonic EQ PL CC(HS) GE GT VC HI c 0 0 0 0 0 0 0 c 1 1 1 1 1 1 1 c 0 0 0 0 1 1 1 c 0 0 1 1 0 0 1 c
PAGE 221
BCHG Bit Test and Change
Operation:                     Assembler Syntax:
D{n} → C; ~D{n} → D{n}         BCHG #bit,X: ea
D{n} → C; ~D{n} → D{n}         BCHG #bit,X: aa
D{n} → C; ~D{n} → D{n}         BCHG #bit,X: pp
D{n} → C; ~D{n} → D{n}         BCHG #bit,Y: ea
D{n} → C; ~D{n} → D{n}         BCHG #bit,Y: aa
D{n} → C; ~D{n} → D{n}         BCHG #bit,Y: pp
D{n} → C; ~D{n} → D{n}         BCHG #bit,D
Description: The nth bit of the destination operand is tested and the state of the nth bit is reflected in the C condition code bit.
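The test-and-change operation can be modeled on a 32-bit value as shown in the Python sketch below. The function name is hypothetical and the model covers only ordinary destination operands, not the special handling of SR.

```python
MASK32 = 0xFFFFFFFF

def bchg(value: int, n: int):
    """Test bit n, then complement it; return (new value, C)."""
    if not 0 <= n <= 31:
        raise ValueError("bit number must be in 0..31")
    carry = (value >> n) & 1                  # D{n} -> C
    return (value ^ (1 << n)) & MASK32, carry  # ~D{n} -> D{n}
```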
PAGE 222
For other destination operands: C - Set if bit tested is set. Cleared otherwise. V - Not affected. Z - Not affected. N - Not affected. I - Not affected. LR - Not affected. R - Not affected. A - Not affected. ER Status Bits: For destination operand SR: INX - Changed if bit 8 is specified. Not affected otherwise. DZ - Changed if bit 9 is specified. Not affected otherwise. UNF - Changed if bit 10 is specified. Not affected otherwise. OVF - Changed if bit 11 is specified.
PAGE 223
For other destination operands: SINX - Not affected. SDZ - Not affected. SUNF - Not affected. SOVF - Not affected. SIOP - Not affected.
PAGE 224
D ddddddd D0.S-D7.S 0000nnn D0.L-D7.L 0001nnn D0.M-D7.M 0010nnn D0.H-D7.H 0011nnn D8.L 0100000 D9.L 0100001 D8.M 0100010 D9.M 0100011 D8.H 0100100 D9.H 0100101 D8.S 0100110 D9.
PAGE 225
BCLR Bit Test and Clear BCLR Assembler Syntax: Operation: D{n} → C; 0 → D{n} BCLR #bit,X: ea D{n} → C; 0 → D{n} BCLR #bit,X: aa D{n} → C; 0 → D{n} BCLR #bit,X: pp D{n} → C; 0 → D{n} BCLR #bit,Y: ea D{n} → C; 0 → D{n} BCLR #bit,Y: aa D{n} → C; 0 → D{n} BCLR #bit,Y: pp D{n} → C; 0 → D{n} BCLR #bit,D Description: The nth bit of the destination operand is tested and the state of the nth bit is reflected in the C condition code bit.
PAGE 226
For other destination operands: C - Set if bit tested is set. Cleared otherwise. V - Not affected. Z - Not affected. N - Not affected. I - Not affected. LR - Not affected. R - Not affected. A - Not affected. ER Status Bits: For destination operand SR: INX - Cleared if bit 8 is specified. Not affected otherwise. DZ - Cleared if bit 9 is specified. Not affected otherwise. UNF - Cleared if bit 10 is specified. Not affected otherwise. OVF - Cleared if bit 11 is specified.
PAGE 227
For other destination operands: SINX - Not affected. SDZ - Not affected. SUNF - Not affected. SOVF - Not affected. SIOP - Not affected.
PAGE 228
D ddddddd D0.S-D7.S 0000nnn D0.L-D7.L 0001nnn D0.M-D7.M 0010nnn D0.H-D7.H 0011nnn D8.L 0100000 D9.L 0100001 D8.M 0100010 D9.M 0100011 D8.H 0100100 D9.H 0100101 D8.S 0100110 D9.
PAGE 229
BFIND Find Leading One BFIND Operation: Assembler Syntax: Leading One(S.L) → D.H (Parallel data bus move) BFIND S,D (move syntax - see the MOVE instruction description.) Description: Return the position of the source operand S leading one, considered from left to right, as a 2’s complement integer in the high portion of destination operand D. If the source operand is zero then return $80000000. Input Operand(s) Precision: 32-bit integer. Output Operand Precision: 32-bit integer.
PAGE 230
BRA Branch Always
Operation:        Assembler Syntax:
PC+xx → PC        BRA label (short)
PC+xxxx → PC      BRA label
PC+Rn → PC        BRA Rn
Description: Program execution continues at location PC+displacement. The PC contains the address of the next instruction. The displacement is a 2’s complement 32-bit integer that represents the relative distance from the current PC to the destination PC. Short Displacement, Long Displacement and Address Register PC Relative addressing modes may be used.
PAGE 231
BRCLR Branch if Bit Clear
Operation (the same for each assembler form): If S{n} = 0, then PC + xxxx → PC, else PC + 1 → PC
Assembler Syntax: BRCLR #bit,X:ea,label; BRCLR #bit,X:aa,label; BRCLR #bit,X:pp,label; BRCLR #bit,Y:ea,label; BRCLR #bit,Y:aa,label
PAGE 232
Instruction Format: 31 0000 BRCLR #bit,S,label 14 13 0010 1011 dddd dd 0 d0 0100 000b bbbb PC RELATIVE DISPLACEMENT Instruction Format: BRCLR BRCLR #bit,X: pp, label #bit,Y: pp, label 31 14 13 0000 0010 1010 1ppp pp 0 pp 010S 000b bbbb PC RELATIVE DISPLACEMENT Instruction Format: BRCLR BRCLR #bit,X: aa, label #bit,Y: aa, label 31 14 13 0000 0010 1010 0aaa aa 0 aa 010S 000b bbbb PC RELATIVE DISPLACEMENT Instruction Format: BRCLR BRCLR #bit,X: ea, label #bit,Y: ea,
PAGE 233
D ddddddd D0.S-D7.S 0000nnn D0.L-D7.L 0001nnn D0.M-D7.M 0010nnn D0.H-D7.H 0011nnn D8.L 0100000 D9.L 0100001 D8.M 0100010 D9.M 0100011 D8.H 0100100 D9.H 0100101 D8.S 0100110 D9.
PAGE 234
BRSET Branch if Bit Set
Operation (the same for each assembler form): If S{n} = 1, then PC + xxxx → PC, else PC + 1 → PC
Assembler Syntax: BRSET #bit,X:ea,label; BRSET #bit,X:aa,label; BRSET #bit,X:pp,label; BRSET #bit,Y:ea,
PAGE 235
Instruction Format: 31 0000 BRSET #bit,S,label 14 13 0010 1011 dddd dd 0 d0 1100 000b bbbb PC RELATIVE DISPLACEMENT Instruction Format: BRSET BRSET #bit,X: pp, label #bit,Y: pp, label 31 14 13 0000 0010 1010 1ppp pp 0 pp 110S 000b bbbb PC RELATIVE DISPLACEMENT Instruction Format: BRSET BRSET #bit,X: aa, label #bit,Y: aa, label 31 14 13 0000 0010 1010 0aaa aa 0 aa 110S 000b bbbb PC RELATIVE DISPLACEMENT Instruction Format: BRSET BRSET #bit,X: ea, label #bit,X: ea,
PAGE 236
D ddddddd D0.S-D7.S 0000nnn D0.L-D7.L 0001nnn D0.M-D7.M 0010nnn D0.H-D7.H 0011nnn D8.L 0100000 D9.L 0100001 D8.M 0100010 D9.M 0100011 D8.H 0100100 D9.H 0100101 D8.S 0100110 D9.
PAGE 237
BScc Branch to Subroutine Conditionally
Operation / Assembler Syntax:
If cc, then PC → SSH; SR → SSL; PC+xx → PC, else PC + 1 → PC      BScc label (short)
If cc, then PC → SSH; SR → SSL; PC+xxxx → PC, else PC + 1 → PC    BScc label
If cc, then PC → SSH; SR → SSL; PC+Rn → PC, else PC + 1 → PC      BScc Rn
Description: If the specified condition is true, the address of the instruction immediately following the BScc instruction and the status register are pushed onto the stack.
PAGE 238
ER Status Bits: Not affected. IER Flags: Not affected.
PAGE 239
BSCLR Branch to Subroutine if Bit Clear
Operation (the same for each assembler form): If S{n} = 0, then PC → SSH; SR → SSL; PC + xxxx → PC, else PC + 1 → PC
PAGE 240
Instruction Format: 31 0000 BSCLR #bit,S,label 14 13 0010 1111 dddd dd 0 d0 0100 000b bbbb PC RELATIVE DISPLACEMENT Instruction Format: BSCLR BSCLR #bit,X: pp, label #bit,Y: pp, label 31 14 13 0000 0010 1110 1ppp pp 0 pp 010S 000b bbbb PC RELATIVE DISPLACEMENT Instruction Format: BSCLR BSCLR #bit,X: aa, label #bit,Y: aa, label 31 14 13 0000 0010 1110 0aaa aa 0 aa 010S 000b bbbb PC RELATIVE DISPLACEMENT Instruction Format: BSCLR BSCLR #bit,X: ea, label #bit,Y: ea, l
PAGE 241
D0.S-D7.S 0000nnn D0.L-D7.L 0001nnn D0.M-D7.M 0010nnn D0.H-D7.H 0011nnn D8.L 0100000 D9.L 0100001 D8.M 0100010 D9.M 0100011 D8.H 0100100 D9.H 0100101 D8.S 0100110 D9.
PAGE 242
BSET Bit Test and Set
Operation: D{n} → C; 1 → D{n}
Assembler Syntax: BSET #bit,X:ea; BSET #bit,X:aa; BSET #bit,X:pp; BSET #bit,Y:ea; BSET #bit,Y:aa; BSET #bit,Y:pp; BSET #bit,D
Description: The nth bit of the destination operand is tested and the state of the nth bit is reflected in the C condition code bit. After the test, the nth bit is set in the destination.
PAGE 243
For other destination operands: C - Set if bit tested is set. Cleared otherwise. V - Not affected. Z - Not affected. N - Not affected. I - Not affected. LR - Not affected. R - Not affected. A - Not affected. ER Status Bits: For destination operand SR: INX - Set if bit 8 is specified. Not affected otherwise. DZ - Set if bit 9 is specified. Not affected otherwise. UNF - Set if bit 10 is specified. Not affected otherwise. OVF - Set if bit 11 is specified. Not affected otherwise.
PAGE 244
For other destination operands: SINX - Not affected. SDZ - Not affected. SUNF - Not affected. SOVF - Not affected. SIOP - Not affected.
PAGE 245
D ddddddd D0.S-D7.S 0000nnn D0.L-D7.L 0001nnn D0.M-D7.M 0010nnn D0.H-D7.H 0011nnn D8.L 0100000 D9.L 0100001 D8.M 0100010 D9.M 0100011 D8.H 0100100 D9.H 0100101 D8.S 0100110 D9.
PAGE 246
BSR Branch to Subroutine
Operation / Assembler Syntax:
PC → SSH; SR → SSL; PC+xx → PC      BSR label (short)
PC → SSH; SR → SSL; PC+xxxx → PC    BSR label
PC → SSH; SR → SSL; PC+Rn → PC      BSR Rn
Description: The address of the instruction immediately following the BSR instruction and the status register are pushed onto the stack. Program execution then continues at location PC+displacement. The PC contains the address of the next instruction.
PAGE 247
Instruction Fields: Rn - R0-R7 Long PC Relative Displacement - 32 bits Short PC Relative Displacement - aaaaaaaaaaaaaaa (15 bits) Timing: 6 + jx oscillator clock cycles Memory: 1 + ea program words
PAGE 248
BSSET BSSET Branch to Subroutine if Bit Set Assembler Syntax: Operation: BSSET #bit,X: ea, label If S{n} = 1, then PC → SSH; SR → SSL; PC + xxxx → PC else PC + 1 → PC BSSET #bit,X: aa, label BSSET #bit,X: pp, label If S{n} = 1, then PC → SSH; SR → SSL; PC + xxxx → PC else PC + 1 → PC BSSET #bit,Y: ea, label If S{n} = 1, then PC → SSH; SR → SSL; PC + xxxx → PC else PC + 1 → PC BSSET #bit,Y: aa, label If S{n} = 1, then PC → SSH; SR → SSL; PC + xxxx → PC else PC + 1 → PC BSSET #bit,Y: pp, lab
PAGE 249
Instruction Format: 31 0000 BSSET #bit,S,label 14 13 0010 1111 dddd dd 0 d0 1100 000b bbbb PC RELATIVE DISPLACEMENT Instruction Format: BSSET BSSET #bit,X: pp, label #bit,Y: pp, label 31 14 13 0000 0010 1110 1ppp pp 0 pp 110S 000b bbbb PC RELATIVE DISPLACEMENT Instruction Format: BSSET BSSET #bit,X: aa, label #bit,Y: aa, label 31 14 13 0000 0010 1110 0aaa aa 0 aa 110S 000b bbbb PC RELATIVE DISPLACEMENT Instruction Format: BSSET BSSET #bit,X: ea, label #bit,Y: ea, la
PAGE 250
D ddddddd D0.S-D7.S 0000nnn D0.L-D7.L 0001nnn D0.M-D7.M 0010nnn D0.H-D7.H 0011nnn D8.L 0100000 D9.L 0100001 D8.M 0100010 D9.M 0100011 D8.H 0100100 D9.H 0100101 D8.S 0100110 D9.
PAGE 251
BTST Bit Test Operation: BTST Assembler Syntax: S{n} → C BTST #bit,X: ea S{n} → C BTST #bit,X: aa S{n} → C BTST #bit,X: pp S{n} → C BTST #bit,Y: ea S{n} → C BTST #bit,Y: aa S{n} → C BTST #bit,Y: pp S{n} → C BTST #bit,S Description: The nth bit of the source operand is tested and the state of the nth bit is reflected in the C condition code bit. All memory alterable addressing modes may be used. Register Direct, Absolute Short and I/O Short addressing may also be used.
PAGE 252
Instruction Format: 31 0000 BTST #bit,S 14 13 0010 0111 dddd dd 0 d0 1100 000b bbbb PC RELATIVE DISPLACEMENT Instruction Format: BTST BTST #bit,X: pp #bit,Y: pp 31 14 13 0000 0010 0110 1ppp pp 0 pp 110S 000b bbbb PC RELATIVE DISPLACEMENT Instruction Format: BTST BTST #bit,X: aa #bit,Y: aa 31 14 13 0000 0010 0110 0aaa aa 0 aa 110S 000b bbbb PC RELATIVE DISPLACEMENT Instruction Format: BTST BTST #bit,X: ea #bit,Y: ea 31 14 13 0000 0010 0100 MMMR 0 00 110S 000
PAGE 253
D ddddddd D0.S-D7.S 0000nnn D0.L-D7.L 0001nnn D0.M-D7.M 0010nnn D0.H-D7.H 0011nnn D8.L 0100000 D9.L 0100001 D8.M 0100010 D9.M 0100011 D8.H 0100100 D9.H 0100101 D8.S 0100110 D9.
PAGE 254
CLR Clear an Operand Operation: CLR Assembler Syntax: 0 → D.L (parallel data bus move) CLR D (move syntax - see the MOVE instruction description.) Description: The low portion of the destination operand is cleared to zero. This instruction is implemented by executing ANDC D,D. Input Operand(s) Precision: 32-bit integer. Output Operand Precision: 32-bit integer. CCR Condition Codes: C - Not affected. V - Always cleared. Z - Always set. N - Always cleared. I - Not affected.
PAGE 255
CMP Compare Operation: CMP Assembler Syntax: S2.L - S1.L (parallel data bus move) CMP S1,S2 (move syntax - see the MOVE instruction description.) Description: Subtract the low portion of the two operands as specified in the operation column above. No result is stored; however, the condition codes are affected as described below. CMPG and CMP differ primarily in the definition of the CCR condition code bits LR and R.
PAGE 256
A - Cleared if result is negative without overflow. Cleared if result is positive with overflow. Not affected otherwise. See the example for the FCMPG instruction. ER Status Bits: Not affected. IER Flags: Not affected. Instruction Format: CMP S1,S2 (move syntax - see the MOVE instruction description.) 31 14 13 DATA BUS MOVE FIELD 00 0 0sss uu11 1ddd OPTIONAL EFFECTIVE ADDRESS EXTENSION OR IMMEDIATE LONG DATA Instruction Fields: S2 S1 sss Dn.L nnn where nnn = 0-7 (u u) D ddd Dn.
PAGE 257
CMPG Graphics Compare with Trivial Accept/Reject Flags Operation: Assembler Syntax: S2.L - S1.L (parallel data bus move) CMPG CMPG S1,S2 (move syntax - see the MOVE instruction description.) Description: Subtract the low portion of the two operands as specified in the operation column above. No result is stored; however, the condition codes are affected as described below. CMPG and CMP differ primarily in the definition of the CCR condition code bits LR and R.
PAGE 258
A - Cleared if result is negative without overflow. Cleared if result is positive with overflow. Not affected otherwise. See the example for the FCMPG instruction. ER Status Bits: Not affected. IER Flags: Not affected. Instruction Format: CMPG S1,S2 (move syntax - see the MOVE instruction description.) 31 14 13 DATA BUS MOVE FIELD 11 0 0sss 0110 1ddd OPTIONAL EFFECTIVE ADDRESS EXTENSION OR IMMEDIATE LONG DATA Instruction Fields: S1 sss Dn.L nnn where nnn = 0-7 S2 ddd Dn.
PAGE 259
DEBUGcc Enter Debug Mode Conditionally Operation: Assembler Syntax: If cc, then enter debug mode. DEBUGcc DEBUGcc Description: If the specified condition is true, enter Debug mode and wait for OnCE commands. If the specified condition is false, continue with the next instruction.
PAGE 260
Instruction Fields:
Mnemonic  ccccc    Mnemonic  ccccc
EQ        01000    NE(Q)     11000
PL        01001    MI        11001
CC(HS)    01010    CS(LO)    11010
GE        01011    LT        11011
GT        01100    LE        11100
VC        01101    VS        11101
HI        01110    LS        11110
AL        11111
Timing: 4 oscillator clock cycles
Memory: 1 program word
PAGE 261
DEC Decrement by One
Operation: D.L - 1 → D.L (parallel data bus move)
Assembler Syntax: DEC D (move syntax - see the MOVE instruction description.)
Description: Decrement by one the low portion of the specified operand. The result is stored in the low portion of D. Input Operand(s) Precision: 32-bit integer. Output Operand Precision: 32-bit integer. CCR Condition Codes: C - Set if a borrow is generated from the MSB of the result. Cleared otherwise. V - Set if result overflows.
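The C and V rules for DEC can be illustrated with a short model. This is a Python sketch under stated assumptions, not DSP96002 code: the name dec32 is hypothetical, the operand is treated as a 32-bit 2's complement value, and only the C and V bits described above are modeled.

```python
MASK32 = 0xFFFFFFFF


def dec32(d_low: int) -> dict:
    """Model D.L - 1 -> D.L on a 32-bit value and report C and V."""
    d_low &= MASK32
    result = (d_low - 1) & MASK32
    # A borrow out of the MSB happens only when subtracting 1 from 0.
    carry = 1 if d_low == 0 else 0
    # Signed overflow happens only when the most negative value wraps
    # around to the most positive value.
    overflow = 1 if d_low == 0x80000000 else 0
    return {"result": result, "C": carry, "V": overflow}


if __name__ == "__main__":
    for value in (5, 0, 0x80000000):
        print(hex(value), dec32(value))
```

The two flag cases never coincide: a borrow occurs at 0, while overflow occurs at $80000000.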
PAGE 262
DO Start Hardware Loop
Operation / Assembler Syntax:
LA → SSH; LC → SSL; X:ea → LC; PC → SSH; SR → SSL; expr → LA; 1 → LF    DO X:ea,label
LA → SSH; LC → SSL; Y:ea → LC; PC → SSH; SR → SSL; expr → LA; 1 → LF    DO Y:ea,label
LA → SSH; LC → SSL; S → LC; PC → SSH; SR → SSL; expr → LA; 1 → LF       DO S,label
LA → SSH; LC → SSL; #xxx → LC; PC → SSH; SR → SSL; expr → LA; 1 → LF    DO #count,label
Description: Begin a hardware DO loop that is to be repeated the number of times specified in the instruction’
PAGE 263
"end-of-loop" processing begins. When executing a DO loop, the instructions are actually fetched each time through the loop. Therefore, a DO loop can be interrupted. DO loops can also be nested. When DO loops are nested, the end-of-loop addresses must also be nested and are not allowed to be equal. The assembler generates an error message when DO loops are improperly nested. Nested DO loops are illustrated in the example.
PAGE 264
Other restrictions: BSR, (F)BScc, JSR, (F)JScc, JSCLR, JSSET, BSCLR and BSSET to (LA), if Loop Flag is set. A DO instruction cannot be repeated using the REP instruction.
PAGE 265
Example:
        DO    #n1,END1
        DO    #n2,END2
        MOVE  D0,X:(R0)+
END2
        ADD   D1,D2   X:(R1)+,D3
END1
The assembler calculates the end of loop address (LA) (absolute address extension word xxxx) by evaluating the end of loop expression and subtracting one. Thus the end of loop expression in the source code represents the "next address" after the end of the loop. If a simple end of loop address label is used, it should be placed after the last instruction in the loop.
PAGE 266
Instruction Format: DO S,label 31 14 13 0000 0001 1010 0000 00 0 00 0000 1ddd dddd ABSOLUTE ADDRESS Instruction Format: DO X:ea,label and DO Y:ea,label 31 14 13 0000 0001 100S MMMR 00 0 0000 1000 0000 ABSOLUTE ADDRESS Instruction Fields: Rn - R0-R7 (Address Register Indirect Modes except (Rn+xxx) ) Absolute Address - 32 bits Immediate Short Data - iiiiiiiiiiiiiiiiiii (19 bits) Memory Space S X Memory 0 Y Memory 1 D D0.S-D7.S D0.L-D7.L D0.M-D7.M D0.H-D7.H D8.L D9.L D8.
PAGE 267
DOR Start PC Relative Hardware Loop
Operation / Assembler Syntax:
LA → SSH; LC → SSL; X:ea → LC; PC → SSH; SR → SSL; PC+xxxx → LA; 1 → LF    DOR X:ea,label
LA → SSH; LC → SSL; Y:ea → LC; PC → SSH; SR → SSL; PC+xxxx → LA; 1 → LF    DOR Y:ea,label
LA → SSH; LC → SSL; S → LC; PC → SSH; SR → SSL; PC+xxxx → LA; 1 → LF       DOR S,label
LA → SSH; LC → SSL; #xxx → LC; PC → SSH; SR → SSL; PC+xxxx → LA; 1 → LF    DOR #count,label
Description: This instruction initiates the beginning of a PC relative hardwar
PAGE 268
represents the "next address" after the end of the loop. If a simple end of loop address label is used, it should be placed after the last instruction in the loop. The LA register is compared to the PC to determine when the end of loop is reached. If the end of loop is reached, the loop counter (LC) is tested for one. If LC is not equal to one then it is decremented by one. If LC is equal to one, the system stack is purged and instruction fetches continue at the incremented PC address.
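The end-of-loop test described above (compare PC with LA, test LC for one, otherwise decrement) can be sketched as a small software model. This is an illustrative Python sketch, not DSP96002 code; run_hardware_loop and its arguments are hypothetical names. How a loop count of zero behaves is not stated in this excerpt, so the sketch simply rejects it.

```python
from typing import Callable


def run_hardware_loop(count: int, body: Callable[[], None]) -> int:
    """Run body() once per pass and return the number of passes made."""
    if count < 1:
        raise ValueError("a zero or negative count is not modeled here")
    lc = count            # loop counter (LC) loaded by DO/DOR
    passes = 0
    while True:
        body()            # instructions between the DO/DOR and the loop end
        passes += 1
        # End of loop reached (PC == LA): test LC for one.
        if lc == 1:
            break         # stack is purged; execution falls through
        lc -= 1           # LC != 1: decrement and repeat the loop
    return passes


if __name__ == "__main__":
    print(run_hardware_loop(3, lambda: None))
```

Because the counter is compared with one before it is decremented, the body runs exactly count times and LC never reaches zero inside the loop.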
PAGE 269
PC displacement - 32 bits Immediate Short Data - iiiiiiiiiiiiiiiiiii (19 bits) Memory Space S X Memory 0 Y Memory 1 D ddddddd D0.S-D7.S 0000nnn D0.L-D7.L 0001nnn D0.M-D7.M 0010nnn D0.H-D7.H 0011nnn D8.L 0100000 D9.L 0100001 D8.M 0100010 D9.M 0100011 D8.H 0100100 D9.H 0100101 D8.S 0100110 D9.
PAGE 270
ENDDO End Current DO Loop
Operation: SSL(LF) → SR; SP-1 → SP; SSH → LA; SSL → LC; SP-1 → SP
Assembler Syntax: ENDDO
Description: This instruction will cause the termination of the current hardware DO loop before the current loop counter (LC) equals one. If the value of the current DO loop counter is needed, it must be read before the execution of the ENDDO instruction.
PAGE 271
EOR Logical Exclusive OR
Operation: D.L ⊕ S.L → D.L (parallel data bus move)
Assembler Syntax: EOR S,D (move syntax - see the MOVE instruction description.)
Description: Logically exclusive OR the low portion of the two specified operands and store the result in the low portion of D. Input Operand(s) Precision: 32-bit integer. Output Operand Precision: 32-bit integer. CCR Condition Codes: C - Not affected. V - Always cleared. Z - Set if result is zero. Cleared otherwise.
PAGE 272
EXT Sign Extend Half Word Operation: Assembler Syntax: D.L {15:0} → D.L {15:0} (parallel data bus move) D.L {15} → D.L {31:16} EXT D EXT (move syntax - see the MOVE instruction description.) Description: Sign extend the lower 16 bits of D.L into the upper 16 bits of D.L. Input Operand(s) Precision: 16-bit integer. Output Operand Precision: 32-bit integer. CCR Condition Codes: C - Not affected. V - Always cleared. Z - Set if result is zero. Cleared otherwise. N - Set if result is negative.
PAGE 273
EXTB Sign Extend Byte EXTB Operation: Assembler Syntax: D.L {7:0} → D.L {7:0} (parallel data bus move) D.L {7} → D.L {31:8} EXTB D (move syntax - see the MOVE instruction description.) Description: Sign extend the lower byte of D.L into the upper 24 bits of D.L. Input Operand(s) Precision: 8-bit integer. Output Operand Precision: 32-bit integer. CCR Condition Codes: C - Not affected. V - Always cleared. Z - Set if result is zero. Cleared otherwise. N - Set if result is negative.
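EXT and EXTB differ only in the width of the field that is sign extended (16 bits and 8 bits). The following Python sketch models both on a 32-bit value; it is not DSP96002 code, and the names sign_extend, ext and extb are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sign_extend(value: int, width: int) -> int:
    """Copy bit (width-1) of value into bits 31..width; keep the low bits."""
    low = value & ((1 << width) - 1)       # D.L{width-1:0} is kept unchanged
    sign = 1 << (width - 1)                # position of the sign bit
    return ((low ^ sign) - sign) & MASK32  # replicate the sign bit upward


def ext(d_low: int) -> int:
    """Model EXT: D.L{15} -> D.L{31:16}."""
    return sign_extend(d_low, 16)


def extb(d_low: int) -> int:
    """Model EXTB: D.L{7} -> D.L{31:8}."""
    return sign_extend(d_low, 8)


if __name__ == "__main__":
    print(hex(ext(0x12348000)), hex(extb(0x123456F0)))
```

Whatever was in the upper bits before the operation is discarded; only the low half word or byte determines the result.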
PAGE 274
FABS.S Absolute Value Operation: FABS.S Assembler Syntax: D → ROUND(SP) → D (parallel data bus move) FABS.S D (move syntax - see the MOVE instruction description.) Description: Take the absolute value of the destination operand, round to single precision and store the result in the destination operand D. Input Operand(s) Precision: SEP Floating-Point. Output Operand Precision: SP Floating-Point. CCR Condition Codes: C - Not affected. V - Not affected. Z - Set if result is zero.
PAGE 275
Instruction Format: FABS.S D (move syntax - see the MOVE instruction description.
PAGE 276
FABS.X Absolute Value Operation: D → D FABS.X Assembler Syntax: (parallel data bus move) FABS.X D (move syntax - see the MOVE instruction description.) Description: Take the absolute value of the destination operand and store the result in the destination operand D. Input Operand(s) Precision: SEP Floating-Point. Output Operand Precision: SEP Floating-Point. CCR Condition Codes: C - Not affected. V - Not affected. Z - Set if result is zero. Cleared otherwise. N - Always cleared.
PAGE 277
Instruction Format: FABS.X D 31 (move syntax - see the MOVE instruction description.
PAGE 278
FADD.S Floating-Point Add Operation: Assembler Syntax: D + S → ROUND(SP) → D (parallel data bus move) FADD.S S,D FADD.S (move syntax - see the MOVE instruction description.) Description: Add the two specified operands, round to single precision and store the result in the destination operand D. Input Operand(s) Precision: SEP Floating-Point. Output Operand Precision: SP Floating-Point. CCR Condition Codes: C - Not affected. V - Not affected. Z - Set if result is zero. Cleared otherwise.
PAGE 279
Instruction Fields: 31 14 13 DATA BUS MOVE FIELD 01 0 0sss uu01 1ddd OPTIONAL EFFECTIVE ADDRESS EXTENSION OR IMMEDIATE LONG DATA (u u) D ddd Dn nnn S sss Dn nnn where nnn = 0-7 where nnn = 0-7 Timing: 2 + mv + da oscillator clock cycles Memory: 1 + mv program words
PAGE 280
FADD.X Floating-Point Add FADD.X Operation: Assembler Syntax: D + S → ROUND(SEP) → D (parallel data bus move) FADD.X S,D (move syntax - see the MOVE instruction description.) Description: Add the two specified operands, round to single extended precision and store the result in the destination operand D. Input Operand(s) Precision: SEP Floating-Point. Output Operand Precision: SEP Floating-Point. CCR Condition Codes: C - Not affected. V - Not affected. Z - Set if result is zero.
PAGE 281
Instruction Format: FADD.X S,D 31 (move syntax - see the MOVE instruction description.
PAGE 282
FADDSUB.S Add and Subtract
Operation: D1 + D2 → ROUND(SP) → D2; D1 - D2 → ROUND(SP) → D1 (parallel data bus move)
Assembler Syntax: FADDSUB.S D1,D2 (move syntax - see the MOVE instruction description.)
Description: Add and subtract the two specified operands and round to single precision. Store the rounded result of the addition in D2 and of the subtraction in D1. Input Operand(s) Precision: SEP Floating-Point. Output Operand(s) Precision: SP Floating-Point.
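The paired sum and difference can be sketched in Python as follows. This is only a model under stated assumptions: the names round_sp and faddsub_s are hypothetical, Python's double-precision floats stand in for the SEP register format, and rounding to single precision is approximated by packing the value as an IEEE 754 binary32 number.

```python
import struct
from typing import Tuple


def round_sp(x: float) -> float:
    """Round a Python float to IEEE 754 single precision (binary32)."""
    # struct.pack raises OverflowError if x is too large for binary32.
    return struct.unpack("<f", struct.pack("<f", x))[0]


def faddsub_s(d1: float, d2: float) -> Tuple[float, float]:
    """Return the new (D1, D2) pair: D1 = D1 - D2, D2 = D1 + D2."""
    total = round_sp(d1 + d2)   # D1 + D2 -> ROUND(SP) -> D2
    diff = round_sp(d1 - d2)    # D1 - D2 -> ROUND(SP) -> D1
    return diff, total


if __name__ == "__main__":
    print(faddsub_s(3.0, 1.0))
```

Both results are computed from the original operand values, so the order in which the two registers are written does not matter in this model.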
PAGE 283
Instruction Format: FADDSUB.S D1,D2 (move syntax - see the MOVE instruction description.
PAGE 284
FADDSUB.X Add and Subtract FADDSUB.X Operation: Assembler Syntax: D1 + D2 → ROUND(SEP) → D2 FADDSUB.X D1,D2 (move syntax - see the MOVE instruction description.) (parallel data bus move) D1 - D2 → ROUND(SEP) → D1 Description: Add and subtract the two specified operands and round to single extended precision. Store the result of the addition in D2 and of the subtraction in D1. Input Operand(s) Precision: SEP Floating-Point. Output Operand(s) Precision: SEP Floating-Point.
PAGE 285
Instruction Format: FADDSUB.X D1,D2 (move syntax - see the MOVE instruction description.
PAGE 286
FBcc Floating-Point Branch Conditionally
Operation / Assembler Syntax:
If cc, then PC+xx → PC, else PC+1 → PC      FBcc label (short)
If cc, then PC+xxxx → PC, else PC+1 → PC    FBcc label
If cc, then PC+Rn → PC, else PC+1 → PC      FBcc Rn
Description: If the specified floating-point condition is true, program execution continues at location PC+displacement.
PAGE 287
ER Status Bits: INX - Not affected. DZ - Not affected. UNF - Not affected. OVF - Not affected. OPERR- Not affected. SNAN - Not affected. NAN - Not affected. UNCC - Set if NAN is set and a non-aware floating-point condition is tested ("cc" conditions marked "YES" above). Not affected otherwise. IER Flags: SINX - Not affected. SDZ - Not affected. SUNF - Not affected. SOVF - Not affected.
PAGE 288
Rn - R0-R7
Long Displacement - 32 bits
Short Displacement - aaaaaaaaaaaaaaa (15 bits)
Mnemonic  ccccc    Mnemonic  ccccc
GT        00000    NGT       10000
LT        00001    NLT       10001
GE        00010    NGE       10010
LE        00011    NLE       10011
GL        00100    NGL       10100
INF       00101    NINF      10101
GLE       00110    NGLE      10110
OR        00111    UN        10111
EQ        01000    NE(Q)     11000
PL        01001    MI        11001
ERR       01111
Timing: 6 + jx oscillator clock cycles
Memory: 1 + ea program words
PAGE 289
FBScc Floating-Point Branch To Subroutine Conditionally Operation: Assembler Syntax: If cc, then PC → SSH; SR → SSL; PC+xx → PC else PC+1 → PC FBScc label (short) If cc, then PC → SSH; SR → SSL; PC+xxxx → PC else PC+1 → PC FBScc label If cc, then PC → SSH; SR → SSL; PC+Rn → PC else PC+1 → PC FBScc Rn FBScc Description: If the specified floating-point condition is true, the address of the instruction immediately following the FBScc instruction and the status register are pushed onto the stack.
PAGE 290
"cc" may specify the following conditions: Non-aware* Mnemonic EQ - equal ERR - error Condition Set UNCC Z=1 No UNCC v SNAN v OPERR v No OVF v UNF v DZ = 1 GE - greater than or equal NAN v (N & ~Z) = 0 Yes GL - greater or less than NAN v Z = 0 Yes GLE - greater, less or equal NAN = 0 Yes GT - greater than NAN v Z v N = 0 Yes INF - infinity I=1 Yes LE - less than or equal NAN v ~(N v Z) = 0 Yes LT - less than NAN v Z v ~N = 0 Yes MI - minus N=1 No NE(Q) - not equal Z=0 No NGE - not(greater than or equal) NA
PAGE 291
Instruction Format: FBScc 31 0000 0011 label (short) 14 13 11aa aaaa aa 0 1c cccc 0aaa aaaa OPTIONAL LONG DISPLACEMENT EXTENSION Instruction Format: FBScc 31 0000 0011 Rn 14 13 0100 001R 0 1c cccc 0000 0000 OPTIONAL LONG DISPLACEMENT EXTENSION Instruction Format: FBScc 31 0000 0011 label 14 13 0100 0000 00 0 1c cccc 0000 0000 OPTIONAL LONG DISPLACEMENT EXTENSION Instruction Fields: Rn - R0-R7 Long Displacement - 32 bits Short Displacement - aaaaaaaaaaaaaaa (15 bits) Mnemonic
PAGE 292
FCLR Clear Floating-Point Register Operation: FCLR Assembler Syntax: +0 → D (parallel data bus move) FCLR D (move syntax - see the MOVE instruction description.) Description: All 96 bits of the destination operand are cleared to zero. Input Operand(s) Precision: DEP Floating-Point. Output Operand Precision: DEP Floating-Point. CCR Condition Codes: C - Not affected. V - Not affected. Z - Always set. N - Always cleared. I - Always cleared. LR - Not affected. – R A - Not affected.
PAGE 293
Instruction Fields: (u u) D ddd Dn nnn where nnn = 0-7 Timing: 2 + mv + da oscillator clock cycles Memory: 1 + mv program words
PAGE 294
FCMP Compare Two Floating-Point Operands Operation: FCMP Assembler Syntax: S2 - S1 (parallel data bus move) FCMP S1,S2 (move syntax - see the MOVE instruction description.) Description: Subtract the two operands as specified in the operation column above. No result is stored; however, the condition codes are affected as described. This instruction differs from FSUB when S1=S2; in this case, the result is always +0 and therefore, N is cleared. Note that this is true even if S1, S2 are infinity.
PAGE 295
Instruction Format: FCMP 31 S1,S2 (move syntax - see the MOVE instruction description.
PAGE 296
FCMPG Graphics Compare with Trivial Accept/Reject Flags Operation: S2 - S1 FCMPG Assembler Syntax: (parallel data bus move) FCMPG S1,S2 (move syntax - see the MOVE instruction description.) Description: Subtract the two operands as specified in the operation column above. No result is stored; however, the condition codes are affected as described. This instruction differs from FSUB when S1=S2; in this case, the result is always +0 and therefore, N is cleared.
PAGE 297
CCR Condition Codes: (Note: Since there is no destination, there is no rounding and therefore the condition codes are set assuming an infinite precision result) C - Set if result is a NaN. Set if result is negative and not zero. Cleared otherwise. V - Not affected. Z - Set if source operands are equal. Cleared otherwise. N - Set if result is negative. Cleared otherwise. I - Set if either operand is infinity. Cleared otherwise.
PAGE 298
FCMPM Compare Magnitude of Two Floating-Point Operands Operation: S2 - S1 FCMPM Assembler Syntax: (parallel data bus move) FCMPM S1,S2 (move syntax - see the MOVE instruction description.) Description: Subtract the absolute value (magnitude) of the two operands as specified in the operation column above. No result is stored; however, the condition codes are affected as described. This instruction differs from FSUB when S1=S2; in this case, the result is always +0 and therefore, N is cleared.
PAGE 299
IER Flags: Flags changed according to standard definition. Instruction Format: FCMPM S1,S2 31 (move syntax - see the MOVE instruction description.
PAGE 300
FCOPYS.S Copy Sign
Operation: Sign of S → D → ROUND(SP) → D (parallel data bus move)
Assembler Syntax: FCOPYS.S S,D (move syntax - see the MOVE instruction description.)
Description: Copy the sign of the floating-point operand S to the floating-point operand D, round the resulting operand D to single precision and store the result in the specified destination D. Input Operand(s) Precision: SEP Floating-Point. Output Operand Precision: SP Floating-Point.
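A copy-sign followed by rounding can be sketched with the Python standard library. The names round_sp and fcopys_s are hypothetical, and Python doubles stand in for the SEP operands, so this is a model of the behavior described above and not DSP96002 code.

```python
import math
import struct


def round_sp(x: float) -> float:
    """Round a Python float to IEEE 754 single precision (binary32)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def fcopys_s(s: float, d: float) -> float:
    """Return d with the sign of s, rounded to single precision."""
    return round_sp(math.copysign(d, s))


if __name__ == "__main__":
    print(fcopys_s(-2.0, 3.5))
```

Only the sign of S is used; its magnitude has no effect on the result. math.copysign also honors the sign of a negative zero.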
PAGE 301
Instruction Format: FCOPYS.S S,D 31 (move syntax - see the MOVE instruction description.
PAGE 302
FCOPYS.X Copy Sign Operation: FCOPYS.X Assembler Syntax: Sign of S → D (parallel data bus move) FCOPYS.X S,D (move syntax - see the MOVE instruction description.) Description: Copy the sign of the floating-point operand S to the floating-point operand D. Since both S and D are single extended precision operands, rounding is not performed. Input Operand(s) Precision: SEP Floating-Point. Output Operand Precision: SEP Floating-Point. CCR Condition Codes: C - Not affected. V - Not affected.
PAGE 303
Instruction Format: FCOPYS.X S,D 31 (move syntax - see the MOVE instruction description.
PAGE 304
FDEBUGcc Enter Debug Mode Conditionally Operation: Assembler Syntax: If cc, then enter debug mode. FDEBUGcc FDEBUGcc Description: If the specified floating-point condition is true, enter Debug mode and wait for OnCE commands. If the specified condition is false, continue with the next instruction. Non-aware floating-point conditions set the SIOP flag in the IER register and the UNCC bit in the ER register if the NAN bit is set.
PAGE 305
ER Status Bits: INX - Not affected. DZ - Not affected. UNF - Not affected. OVF - Not affected. OPERR- Not affected. SNAN - Not affected. NAN - Not affected. UNCC - Set if NAN is set and a non-aware floating-point condition is tested ("cc" conditions marked "YES" above). Not affected otherwise. IER Flags: SINX - Not affected. SDZ - Not affected. SUNF - Not affected. SOVF - Not affected.
PAGE 306
FGETMAN Extract the Mantissa
Operation: Normalized mantissa of S → D (parallel data bus move)
Assembler Syntax: FGETMAN S,D (move syntax - see the MOVE instruction description.)
Description: Extract the mantissa and sign of the floating-point operand S, normalize the mantissa, force the exponent to "ebias" so the result is in the range 1-2, and store the result as a floating-point value in the specified destination D regardless of whether the mantissa is denormalized or not.
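The effect of forcing the exponent to the bias can be illustrated with math.frexp, which splits a float into a fraction and a power of two. The function name fgetman is hypothetical and this Python sketch is only a model: it returns a signed value whose magnitude lies in [1, 2) for nonzero inputs, and it does not model the status bits or the handling of infinities and NaNs.

```python
import math


def fgetman(s: float) -> float:
    """Return the signed, normalized mantissa of s."""
    if math.isnan(s) or math.isinf(s):
        raise ValueError("infinities and NaNs are not modeled in this sketch")
    fraction, _exponent = math.frexp(s)   # 0.5 <= |fraction| < 1, or 0.0
    return fraction * 2.0                 # rescale so that 1 <= |result| < 2


if __name__ == "__main__":
    for value in (8.0, -6.0, 0.15625):
        print(value, fgetman(value))
```

A zero input yields a zero result in this model, and the exponent discarded by the sketch is the part of the operand that FGETMAN replaces with "ebias".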
PAGE 307
CCR Condition Codes: C - Not affected. V - Not affected. Z - Set if result is zero. Cleared otherwise. N - Set if result is negative. Cleared otherwise. I - Always cleared. LR - Not affected. – R A - Not affected. - Not affected. ER Status Bits: INX -Always cleared. DZ -Always cleared. UNF -Always cleared. OVF -Always cleared. OPERR-Set if the source operand is infinity. Cleared otherwise. SNAN -Set if operand is a signaling NaN. Cleared otherwise. NAN -Set if result is a NaN.
PAGE 308
FINT Extract the Integer Part FINT Operation: Assembler Syntax: S → ROUND TO INTEGER → D FINT S,D (move syntax - see the MOVE instruction description.) (parallel data bus move) Description: Round the floating-point source operand S to an integer value using the current rounding mode specified by bits R1-R0 in the IER register, and store the result as a floating-point number in the specified destination D. The rounding precision is always SEP. For example: if the rounding is to +∞, then 110.
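The dependence of the result on the rounding mode can be shown with a short Python model. The mode names used here ("nearest", "zero", "up", "down") are hypothetical labels for the four IEEE rounding directions; the encoding of bits R1-R0 is not shown in this excerpt and is not modeled. The function name fint is also an assumption, and infinities, NaNs and the sign of a zero result are not handled.

```python
import math


def fint(s: float, mode: str = "nearest") -> float:
    """Round s to an integer value and return it as a float."""
    if mode == "nearest":
        return float(round(s))       # halfway cases go to the even integer
    if mode == "zero":
        return float(math.trunc(s))  # round toward zero
    if mode == "up":
        return float(math.ceil(s))   # round toward plus infinity
    if mode == "down":
        return float(math.floor(s))  # round toward minus infinity
    raise ValueError("unknown rounding mode: " + mode)


if __name__ == "__main__":
    for mode in ("nearest", "zero", "up", "down"):
        print(mode, fint(-1.2, mode), fint(2.5, mode))
```

The result stays in floating-point form, as in the instruction description; no conversion to an integer register format takes place.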
PAGE 309
Instruction Format: FINT S,D (move syntax - see the MOVE instruction description.
PAGE 310
FJcc Floating-Point Jump Conditionally Operation: Assembler Syntax: If cc, then xx → PC else PC+1 → PC FJcc label (short) If cc, then ea → PC else PC+1 → PC FJcc ea FJcc Description: If the specified floating-point condition is true, program execution then continues at a location specified by an effective address in the instruction. If the specified condition is false, the PC is incremented and the effective address is ignored.
PAGE 311
CCR Condition Codes: Not affected. ER Status Bits: INX - Not affected. DZ - Not affected. UNF - Not affected. OVF - Not affected. OPERR- Not affected. SNAN - Not affected. NAN - Not affected. UNCC - Set if NAN is set and a non-aware floating-point condition is tested ("cc" conditions marked "YES" above). Not affected otherwise. IER Flags: SINX - Not affected. SDZ - Not affected. SUNF - Not affected. SOVF - Not affected.
PAGE 312
Instruction Fields:
ea Rn - R0-R7 (Memory alterable addressing modes only)
Absolute Address - 32 bits
Short Jump Address - aaaaaaaaaaaaaaa (15 bits)
Mnemonic  ccccc    Mnemonic  ccccc
GT        00000    NGT       10000
LT        00001    NLT       10001
GE        00010    NGE       10010
LE        00011    NLE       10011
GL        00100    NGL       10100
INF       00101    NINF      10101
GLE       00110    NGLE      10110
OR        00111    UN        10111
EQ        01000    NE(Q)     11000
PL        01001    MI        11001
ERR       01111
Timing: 6 + jx oscillator clock cycles
Memory: 1 + ea program words
PAGE 313
FJScc Floating-Point Jump To Subroutine Conditionally Operation: Assembler Syntax: If cc, then PC → SSH; SR → SSL; xx → PC else PC+1 → PC FJScc FJScc label (short) If cc, then PC → SSH; SR → SSL; ea → PC else PC+1 → PC Description: If the specified floating-point condition is true, the address of the instruction immediately following the FJScc instruction and the status register are pushed onto the stack. Program execution then continues at the effective address in program memory.
PAGE 314
ER Status Bits: INX - Not affected. DZ - Not affected. UNF - Not affected. OVF - Not affected. OPERR- Not affected. SNAN - Not affected. NAN - Not affected. UNCC -Set if NAN is set and a non-aware floating-point condition is tested ("cc" conditions marked "YES" above). Not affected otherwise. IER Flags: SINX - Not affected. SDZ - Not affected. SUNF - Not affected. SOVF - Not affected.
PAGE 315
ea    Rn - R0-R7 (Memory alterable addressing modes only)
      Absolute Address - 32 bits
      Short Jump Address - aaaaaaaaaaaaaaa (15 bits)

Mnemonic  ccccc    Mnemonic  ccccc
GT        00000    NGT       10000
LT        00001    NLT       10001
GE        00010    NGE       10010
LE        00011    NLE       10011
GL        00100    NGL       10100
INF       00101    NINF      10101
GLE       00110    NGLE      10110
OR        00111    UN        10111
EQ        01000    NE(Q)     11000
PL        01001    MI        11001
ERR       01111

Timing: 2 + mv + da oscillator clock cycles
Memory: 1 + mv program words
PAGE 316
FLOAT.S Integer to Floating-Point Conversion Operation: Assembler Syntax: D.L → CONVERT TO FP → ROUND(SP) → D FLOAT.S D (parallel data bus move) FLOAT.S (move syntax - see the MOVE instruction description.) Description: Convert the 2’s complement 32-bit integer located in the low portion of the operand D into a floating-point operand, round to single precision and store the result in the operand D. Input Operand(s) Precision: 32-bit 2’s complement integer.
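As a rough model of the conversion described above, the following Python sketch reinterprets a 32-bit pattern as a 2's complement integer and rounds it to single precision. It assumes round-to-nearest-even, which is what `struct` packing provides; the processor's programmed rounding mode may differ. The function name `float_s` is illustrative only.

```python
import struct


def float_s(word: int) -> float:
    """Model FLOAT.S: signed 32-bit integer -> single-precision value."""
    word &= 0xFFFFFFFF
    # Reinterpret the 32-bit pattern as a 2's complement integer.
    signed = word - (1 << 32) if word & 0x80000000 else word
    # Packing as 'f' rounds to IEEE single precision (nearest, ties to even).
    return struct.unpack("<f", struct.pack("<f", float(signed)))[0]
```

Integers that need more than 24 significant bits lose their low bits in this step, for example 16777217 becomes 16777216.0.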
PAGE 317
Instruction Format: FLOAT.S D 31 (move syntax - see the MOVE instruction description.
PAGE 318
FLOAT.X Integer to Floating-Point Conversion Operation: Assembler Syntax: D.L → CONVERT TO FP → D FLOAT.X D (parallel data bus move) FLOAT.X (move syntax - see the MOVE instruction description.) Description: Convert the 2’s complement 32-bit integer located in the low portion of the operand D into a floating-point operand and store the result in the operand D. The rounding precision is SEP. Input Operand(s) Precision: 32-bit 2’s complement integer. Output Operand Precision: SEP Floating-Point.
PAGE 319
Instruction Format: FLOAT.X D 31 (move syntax - see the MOVE instruction description.) 14 13 DATA BUS MOVE FIELD 10 0100 uu10 0 0ddd OPTIONAL EFFECTIVE ADDRESS EXTENSION OR IMMEDIATE LONG DATA Instruction Fields: (u u) D ddd Dn.
PAGE 320
FLOATU.S Unsigned Integer to Floating-Point Conversion Operation: Assembler Syntax: D.L → CONVERT TO FP → ROUND(SP) → D FLOATU.S D (parallel data bus move) FLOATU.S (move syntax - see the MOVE instruction description.) Description: Convert the unsigned 32-bit integer located in the low portion of the operand D into a floating-point operand, round to single precision and store the result in the operand D. Input Operand(s) Precision: 32-bit unsigned integer.
PAGE 321
Instruction Format: FLOATU.S D move syntax - see the MOVE instruction description.
PAGE 322
FLOATU.X Unsigned Integer to Floating-Point Conversion Operation: Assembler Syntax: D.L → CONVERT TO FP → D FLOATU.X D (parallel data bus move) FLOATU.X (move syntax - see the MOVE instruction description.) Description: Convert the unsigned 32-bit integer located in the low portion of the operand D into a floating-point operand and store the result in the operand D. The rounding precision is SEP. Input Operand(s) Precision: 32-bit unsigned integer. Output Operand Precision: SEP Floating-Point.
PAGE 323
Instruction Fields: (u u) D ddd Dn nnn where nnn = 0-7 Timing: 2 + mv + da oscillator clock cycles Memory: 1 + mv program words
PAGE 324
FLOOR Extract the Integer Part FLOOR Operation: Assembler Syntax: S→ ROUND TO INTEGER → D FLOOR S,D (move syntax - see the MOVE instruction description.) (parallel data bus move) Description: Round the floating-point source operand S to an integer value using the round to minus infinity mode and store the result as a floating-point number in the specified destination D. The rounding precision is always SEP.
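The behaviour described above (round toward minus infinity, but keep a floating-point result) can be sketched in Python as follows. Passing NaN and infinity through unchanged is an assumption of this sketch, and the name `floor_fp` is illustrative.

```python
import math


def floor_fp(s: float) -> float:
    """Model FLOOR: round toward minus infinity, return a float."""
    if math.isnan(s) or math.isinf(s):
        return s  # assumption: special values pass through unchanged
    # math.floor returns an int; convert back to a floating-point value.
    return float(math.floor(s))
```

Note that negative inputs move away from zero: -2.5 becomes -3.0, while 2.9 becomes 2.0.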
PAGE 325
IER Flags: Flags changed according to standard definition. Instruction Format: FLOOR S,D (move syntax - see the MOVE instruction description.
PAGE 326
FMPY//FADD.S Floating-Point Multiply and Add FMPY//FADD.S Operation: Assembler Syntax: S1 * S2 → ROUND(MP) → D1 FMPY S1,S2,D1 FADD.S S3,D2 (parallel data bus move) S3 + D2 → ROUND(SP) → D2 (move syntax - see the MOVE instruction description.) FMPY S2,S1,D1 FADD.S S3,D2 (move syntax - see the MOVE instruction description.) Description: Multiply the two operands S1 and S2, round to the precision indicated by the MP mode bit and store the result in the specified destination register D1.
PAGE 327
SNAN -Set if any one of the source operands is a signaling NaN. Cleared otherwise. NAN -Set if result of the addition is a NaN. Cleared otherwise. UNCC -Always cleared. IER Flags: Flags changed according to standard definition. Instruction Format: FMPY S1,S2,D1 FADD.S S3,D2 (move syntax - see the MOVE instruction description.
PAGE 328
FMPY//FADD.X Floating-Point Multiply and Add FMPY//FADD.X Operation: Assembler Syntax: S1 * S2 → ROUND(SEP) → D1 FMPY S1,S2,D1 FADD.X S3,D2 (parallel data bus move) S3 + D2 → ROUND(SEP) → D2 (move syntax - see the MOVE instruction description.) FMPY S2,S1,D1 FADD.X S3,D2 (move syntax - see the MOVE instruction description.) Description: Multiply the two operands S1 and S2, round to single extended precision and store the result in the specified destination register D1.
PAGE 329
IER Flags: Flags changed according to standard definition. Instruction Format: FMPY S1,S2,D1 FADD.X S3,D2 (move syntax - see the MOVE instruction description.) FMPY S2,S1,D1 FADD.X S3,D2 (move syntax - see the MOVE instruction description.
PAGE 330
FMPY//FADDSUB.S FMPY//FADDSUB.S Floating-Point Multiply, Add, and Subtract Operation: Assembler Syntax: S1 * S2 → ROUND(MP) → D1 FMPY S1,S2,D1 FADDSUB.S D3,D2 (parallel data bus move) D3 + D2 → ROUND(SP) → D2 (move syntax - see the MOVE instruction description.) FMPY S2,S1,D1 FADDSUB.S D3,D2 D3 - D2 → ROUND(SP) → D3 (move syntax - see the MOVE instruction description.
PAGE 331
ER Status Bits: INX -Set if the result of the addition, subtraction or multiplication is inexact. Cleared otherwise. DZ -Always cleared. UNF -Set if the result of the addition, subtraction or multiplication underflows. Cleared otherwise. OVF -Set if the result of the addition, subtraction or multiplication overflows. Cleared otherwise. OPERR-Set if one of the multiply operands is infinity and the other is zero. Set if the addition operands are opposite-signed infinities.
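FMPY//FADDSUB.S produces three results from one instruction: the product S1 * S2 in D1, the sum D3 + D2 in D2, and the difference D3 - D2 in D3. The Python sketch below assumes that all three are computed from the register values held before the instruction executes, and that the MP mode selects single precision for the product. The helper names are illustrative.

```python
import struct


def round_sp(x: float) -> float:
    """Round a Python float to IEEE single precision (nearest even)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def fmpy_faddsub_s(s1: float, s2: float, d2: float, d3: float):
    """Return the new (D1, D2, D3) after one FMPY//FADDSUB.S."""
    new_d1 = round_sp(s1 * s2)  # assumption: MP mode = single precision
    new_d2 = round_sp(d3 + d2)  # sum goes to D2
    new_d3 = round_sp(d3 - d2)  # difference goes to D3
    return new_d1, new_d2, new_d3
```

The simultaneous sum and difference form the add/subtract pair of a radix-2 FFT butterfly, which is a common use for this kind of instruction.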
PAGE 332
FMPY//FADDSUB.X FMPY//FADDSUB.X Floating-Point Multiply, Add, and Subtract Operation: Assembler Syntax: S1 * S2 → ROUND(SEP) → D1 FMPY S1,S2,D1 FADDSUB.X D3,D2 (parallel data bus move) D3 + D2 → ROUND(SEP) → D2 (move syntax - see the MOVE instruction description.) FMPY S2,S1,D1 FADDSUB.X D3,D2 D3 - D2 → ROUND(SEP) → D3 (move syntax - see the MOVE instruction description.
PAGE 333
ER Status Bits: INX - Set if the result of the addition, subtraction or multiplication is inexact. Cleared otherwise. DZ - Always cleared. UNF - Set if the result of the addition, subtraction or multiplication underflows. Cleared otherwise. OVF - Set if the result of the addition, subtraction or multiplication overflows. Cleared otherwise. OPERR- Set if one of the multiply operands is infinity and the other is zero. Set if the addition operands are opposite-signed infinities.
PAGE 334
FMPY//FSUB.S Floating-Point FMPY//FSUB.S Multiply and Subtract Operation: Assembler Syntax: S1 * S2 → ROUND(MP) → D1 FMPY S1,S2,D1 FSUB.S S3,D2 (move syntax - see the MOVE instruction description.) (parallel data bus move) D2 - S3 → ROUND(SP) → D2 FMPY S2,S1,D1 FSUB.S S3,D2 (move syntax - see the MOVE instruction description.) Description: Multiply the two operands S1 and S2, round to the precision indicated by the MP mode bit and store the result in the specified destination register D1.
PAGE 335
OPERR-Set if one of the multiply operands is infinity and the other is zero. Set if the subtract operands are like-signed infinities. Cleared otherwise. SNAN -Set if any one of the source operands is a signaling NaN. Cleared otherwise. NAN -Set if result of the subtraction is a NaN. Cleared otherwise. UNCC -Always cleared. IER Flags: Flags changed according to standard definition. Instruction Format: FMPY S1,S2,D1 FSUB.S S3,D2 (move syntax - see the MOVE instruction description.)
PAGE 336
FMPY S2,S1,D1 FSUB.S S3,D2 (move syntax - see the MOVE instruction description.
PAGE 337
PAGE 338
FMPY//FSUB.X Floating-Point FMPY//FSUB.X Multiply and Subtract Operation: Assembler Syntax: S1 * S2 → ROUND(SEP) → D1 (parallel data bus move) FMPY S1,S2,D1 FSUB.X S3,D2 (move syntax - see the MOVE instruction description.) D2 - S3 → ROUND(SEP) → D2 FMPY S2,S1,D1 FSUB.X S3,D2 (move syntax - see the MOVE instruction description.) Description: Multiply the two operands S1 and S2, round to single extended precision and store the result in the specified destination register D1.
PAGE 339
NAN -Set if result of the subtraction is a NaN. Cleared otherwise. UNCC -Always cleared. IER Flags: Flags changed according to standard definition. Instruction Format: FMPY S1,S2,D1 FSUB.X S3,D2 (move syntax - see the MOVE instruction description.) FMPY S2,S1,D1 FSUB.X S3,D2 (move syntax - see the MOVE instruction description.
PAGE 340
FMPY.S Floating-Point Multiply FMPY.S Operation: Assembler Syntax: S1 * S2 → ROUND(SP) → D (parallel data bus move) FMPY.S S1,S2,D (move syntax - see the MOVE instruction description.) S1 * S2 → ROUND(SP) → D (parallel data bus move) FMPY.S S2,S1,D (move syntax - see the MOVE instruction description.) Description: Multiply the two specified operands S1 and S2, round to single precision and store the result in the destination operand D. Input Operand(s) Precision: SEP Floating-Point.
PAGE 341
Instruction Format: FMPY.S 31 S1,S2,D (move syntax - see the MOVE instruction description.) 14 13 DATA BUS MOVE FIELD 11 1sss SSS1 0 0ddd OPTIONAL EFFECTIVE ADDRESS EXTENSION OR IMMEDIATE LONG DATA Instruction Format: FMPY.S S1,S2(8,9),D (move syntax - see the MOVE instruction description.
PAGE 342
FMPY.X Floating-Point Multiply FMPY.X Operation: Assembler Syntax: S1 * S2 → ROUND(SEP) → D (parallel data bus move) FMPY.X S1,S2,D (move syntax - see the MOVE instruction description.) S1 * S2 → ROUND(SEP) → D (parallel data bus move) FMPY.X S2,S1,D (move syntax - see the MOVE instruction description.) Description: Multiply the two specified operands S1 and S2, round to single extended precision and store the result in the destination operand D. Input Operand(s) Precision: SEP Floating-Point.
PAGE 343
Instruction Format: FMPY.X 31 S1,S2,D (move syntax - see the MOVE instruction description.) 14 13 DATA BUS MOVE FIELD 11 1sss SSS0 0 0ddd OPTIONAL EFFECTIVE ADDRESS EXTENSION OR IMMEDIATE LONG DATA Instruction Format: FMPY.X 31 S1,S2(8,9),D (move syntax - see the MOVE instruction description.
PAGE 344
FNEG.S Negate FNEG.S Operation: Assembler Syntax: 0 - D → ROUND(SP) → D (parallel data bus move) FNEG.S D (move syntax - see the MOVE instruction description.) Description: Subtract the destination operand D from zero, round to single precision and store the result in the destination operand D. Input Operand(s) Precision: SEP Floating-Point. Output Operand Precision: SP Floating-Point. CCR Condition Codes: C - Not affected. V - Not affected. Z - Set if result is zero. Cleared otherwise.
PAGE 345
Instruction Format: FNEG.S D 31 (move syntax - see the MOVE instruction description.
PAGE 346
FNEG.X Negate Operation: FNEG.X Assembler Syntax: 0-D→D (parallel data bus move) FNEG.X D (move syntax - see the MOVE instruction description.) Description: Subtract the destination operand D from zero and store the result in the destination operand D. Input Operand(s) Precision: SEP Floating-Point. Output Operand Precision: SEP Floating-Point. CCR Condition Codes: C - Not affected. V - Not affected. Z - Set if result is zero. Cleared otherwise. N - Set if result is negative. Cleared otherwise.
PAGE 347
Instruction Fields: D Dn (u u) ddd nnn where nnn = 0-7 Timing: 2 + mv + da oscillator clock cycles Memory: 1 + mv program words
PAGE 348
FSCALE.S Scale a Floating-Point Operand FSCALE.S Operation: Assembler Syntax: 2^(S.H) * D → ROUND(SP) → D (parallel data bus move) FSCALE.S S,D (move syntax - see the MOVE instruction description.) FSCALE.S #byte,D 2^nn * D → ROUND(SP) → D Description: Scale the destination operand D according to the scale factor contained in the 11 LSBs of the high portion of the source register S, round to single precision and store the result in the destination operand D.
PAGE 349
ER Status Bits: INX -Set if result is inexact. Cleared otherwise. DZ -Always cleared. UNF -Set if result underflows. Cleared otherwise. OVF -Set if result overflows. Cleared otherwise. OPERR-Always cleared. SNAN -Set if operand is a signaling NaN. Cleared otherwise. NAN -Set if result is a NaN. Cleared otherwise. UNCC -Always cleared. IER Flags: Flags changed according to standard definition. Instruction Format: FSCALE.S S,D (move syntax - see the MOVE instruction description.
PAGE 350
FSCALE.X Scale a Floating-Point Operand Operation: FSCALE.X Assembler Syntax: 2^(S.H) * D → ROUND(SEP) → D FSCALE.X S,D (move syntax - see the MOVE instruction description.) (parallel data bus move) FSCALE.X #byte,D 2^nn * D → ROUND(SEP) → D Description: Scale the destination operand D according to the scale factor contained in the 11 LSBs of the high portion of the source register S, round to single extended precision and store the result in the destination operand D.
PAGE 351
ER Status Bits: INX -Set if result is inexact. Cleared otherwise. DZ -Always cleared. UNF -Set if result underflows. Cleared otherwise. OVF -Set if result overflows. Cleared otherwise. OPERR-Always cleared. SNAN -Set if operand is a signaling NaN. Cleared otherwise. NAN -Set if result is a NaN. Cleared otherwise. UNCC -Always cleared. Instruction Format: FSCALE.X S,D (move syntax - see the MOVE instruction description.
PAGE 352
FSEEDD Reciprocal Approximation Operation: Assembler Syntax: Approximation(1/S) → D FSEEDD S,D FSEEDD Description: Take the contents of the specified source operand S, determine an approximation to 1.0/S, and store the result in the destination operand D. The 9 MSBs of the destination significand are determined by using a lookup ROM. The remaining bits of the significand are zeroed. This instruction is useful for initializing floating-point divide algorithms.
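To illustrate how a 9-bit reciprocal seed can initialize a divide algorithm, the Python sketch below refines the seed with Newton-Raphson steps. The contents of the lookup ROM are not given here, so `fseedd` merely emulates it by truncating the significand of 1/S to 9 bits; that emulation, the iteration count and the function names are all assumptions of this sketch.

```python
import math


def fseedd(s: float) -> float:
    """Crude stand-in for FSEEDD: 1/s with only 9 significand bits kept."""
    m, e = math.frexp(1.0 / s)        # 1/s = m * 2**e, with 0.5 <= |m| < 1
    m = math.trunc(m * 512) / 512     # keep 9 bits, zero the remaining bits
    return math.ldexp(m, e)


def divide(n: float, d: float, iterations: int = 3) -> float:
    """Approximate n / d from the seed using Newton-Raphson refinement."""
    x = fseedd(d)
    for _ in range(iterations):
        x = x * (2.0 - d * x)         # each step roughly squares the error
    return n * x
```

Because each step roughly doubles the number of correct bits, a seed good to about 8 bits needs only a few iterations to reach the working precision.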
PAGE 353
ER Status Bits: INX -Always cleared. DZ -Always cleared. UNF -Set if result underflows. Cleared otherwise. OVF -Set if result overflows. Cleared otherwise. OPERR-Always cleared. SNAN -Set if the source operand is a signaling NaN. Cleared otherwise. NAN -Set if result is a NaN. Cleared otherwise. UNCC -Always cleared. IER Flags: Flags changed according to standard definition.
PAGE 354
FSEEDR Square Root Reciprocal Approximation Operation: Assembler Syntax: Approximation(1/SQRT(S)) → D FSEEDR S,D FSEEDR Description: Take the contents of the specified source operand S, determine an approximation to sqrt(1.0/S), and store the result in the destination operand D. The 9 MSBs of the destination significand are determined by using a lookup ROM. The remaining bits of the significand are zeroed. This instruction is useful for initializing floating-point square root algorithms.
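A square root routine can start from the reciprocal square root seed and refine it with Newton-Raphson steps. In the Python sketch below, `fseedr` only emulates the lookup ROM by truncating the significand of 1/sqrt(S) to 9 bits; the ROM contents, the iteration count and the function names are assumptions of this sketch.

```python
import math


def fseedr(s: float) -> float:
    """Crude stand-in for FSEEDR: 1/sqrt(s) with 9 significand bits kept."""
    m, e = math.frexp(1.0 / math.sqrt(s))
    m = math.trunc(m * 512) / 512     # keep 9 bits, zero the remaining bits
    return math.ldexp(m, e)


def sqrt_newton(s: float, iterations: int = 3) -> float:
    """Approximate sqrt(s) by refining the reciprocal square root seed."""
    y = fseedr(s)
    for _ in range(iterations):
        y = y * (1.5 - 0.5 * s * y * y)   # Newton-Raphson step for 1/sqrt(s)
    return s * y                          # sqrt(s) = s * (1/sqrt(s))
```

The refinement loop uses only multiplies and subtracts, which is why a reciprocal square root seed is convenient on hardware without a divide instruction.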
PAGE 355
ER Status Bits: INX -Always cleared. DZ -Always cleared. UNF -Always cleared. OVF -Always cleared. OPERR-Set if the source operand is less than zero. Cleared otherwise. SNAN -Set if the source operand is a signaling NaN. Cleared otherwise. NAN -Set if result is a NaN. Cleared otherwise. UNCC -Always cleared. IER Flags: Flags changed according to standard definition.
PAGE 356
FSUB.S Floating-Point Subtract FSUB.S Operation: Assembler Syntax: D - S → ROUND(SP) → D (parallel data bus move) FSUB.S S,D (move syntax - see the MOVE instruction description.) Description: Subtract the two specified operands, round to single precision and store the result in the destination operand D. For the special case of |S| = |D|, the result can be +0 or -0; the sign of the resulting zero will be the sign of the input operand in D. Input Operand(s) Precision: SEP Floating-Point.
PAGE 357
Instruction Format: FSUB.S S,D (move syntax - see the MOVE instruction description.
PAGE 358
FSUB.X Floating-Point Subtract FSUB.X Operation: Assembler Syntax: D - S → ROUND(SEP) → D (parallel data bus move) FSUB.X S,D (move syntax - see the MOVE instruction description.) Description: Subtract the two specified operands, round to single extended precision and store the result in the destination operand D. For the special case of |S| = |D|, the result can be +0 or -0; the sign of the resulting zero will be the sign of the input operand in D. Input Operand(s) Precision: SEP Floating-Point.
PAGE 359
Instruction Format: FSUB.X S,D (move syntax - see the MOVE instruction description.
PAGE 360
FTFR.S Transfer Floating-Point Data ALU Register FTFR.S Operation: Assembler Syntax: S → ROUND(SP) → D (parallel data bus move) FTFR.S S,D (move syntax - see the MOVE instruction description.) Description: Take the contents of the specified source operand S, round to single precision and store the result in the destination operand D. If S and D are the same register, this is equivalent to a “Round to Single Precision” instruction. Input Operand(s) Precision: SEP Floating-Point.
PAGE 361
Instruction Format: FTFR.S S,D (move syntax - see the MOVE instruction description.
PAGE 362
FTFR.X Transfer Floating-Point Data ALU Register Operation: FTFR.X Assembler Syntax: S→D (parallel data bus move) FTFR.X S,D (move syntax - see the MOVE instruction description.) Description: Take the contents of the specified source operand S and store in the destination operand D. Input Operand(s) Precision: SEP Floating-Point. Output Operand Precision: SEP Floating-Point. CCR Condition Codes: C - Not affected. V - Not affected. Z - Set if result is zero. Cleared otherwise.
PAGE 363
Instruction Format: FTFR.X S,D (move syntax - see the MOVE instruction description.
PAGE 364
FTRAPcc Conditional Software Interrupt Operation: Assembler Syntax: If cc, then begin software exception processing. FTRAPcc FTRAPcc Description: If the specified floating-point condition is true, normal instruction execution is suspended and software exception processing is initiated. The interrupt priority level (I1,I0) is set to 3 in the status register if a long interrupt service routine is used. If the specified condition is false, continue with the next instruction. See Section A.
PAGE 365
INX - Not affected. DZ - Not affected. UNF - Not affected. OVF - Not affected. OPERR- Not affected. SNAN - Not affected. UNCC - Set if NAN is set and a non-aware floating-point condition is tested ("cc" conditions marked "YES" above). Not affected otherwise. IER Flags: SINX - Not affected. SDZ - Not affected. SUNF - Not affected. SOVF - Not affected. SIOP - Set if NAN is set and a non-aware floating-point condition is tested ("cc" conditions marked "YES" above). Not affected otherwise.
PAGE 366
FTST Test a Floating-Point Operand Operation: S-0 FTST Assembler Syntax: (parallel data bus move) FTST S (move syntax - see the Move instruction description.) Description: Compare the specified operand with zero. No result is stored; however, the condition codes are affected as described. Input Operand(s) Precision: SEP Floating-Point. Output Operand Precision: n.a.
PAGE 367
Instruction Format: FTST 31 S (move syntax - see the Move instruction description.
PAGE 368
GETEXP Extract Exponent Operation: GETEXP Assembler Syntax: Exponent(S) → D.L (parallel data bus move) GETEXP S,D (move syntax - see the Move instruction description.) Description: Extract the exponent of the single extended precision floating-point operand S and store it as an unbiased, 2’s complement, 32-bit integer in the low portion of D . The exponent value is decremented by the number of shifts needed to normalize the mantissa if the floating-point number was denormalized.
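The unbiased exponent described above can be modelled in Python with `math.frexp`, which also normalizes denormalized inputs. The sketch below uses a Python double as a stand-in for the single extended precision operand and raises an exception where this instruction's status bits report OPERR (source is zero, infinity or NaN). The name `getexp` is illustrative.

```python
import math


def getexp(s: float) -> int:
    """Model GETEXP: unbiased exponent of s as a signed integer."""
    if s == 0.0 or math.isinf(s) or math.isnan(s):
        raise ValueError("OPERR: source is zero, infinity or NaN")
    _, e = math.frexp(s)   # s = m * 2**e with 0.5 <= |m| < 1
    return e - 1           # exponent for a significand in [1, 2)
```

For a denormalized input such as 5e-324, `frexp` normalizes the significand first, so the returned exponent already includes the shift count.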
PAGE 369
ER Status Bits: INX -Always cleared. DZ -Always cleared. UNF -Always cleared. OVF -Always cleared. OPERR-Set if the source operand is infinity, zero or NaN. Cleared otherwise. SNAN -Set if the source operand is a signaling NaN. Cleared otherwise. NAN -Set if the source operand is a NaN. Cleared otherwise. UNCC -Always cleared. IER Flags: Flags changed according to standard definition.
PAGE 370
ILLEGAL Illegal Instruction Interrupt Operation: Assembler Syntax: Begin Illegal Instruction exception processing. ILLEGAL ILLEGAL Description: Normal instruction execution is suspended and Illegal Instruction exception processing is initiated. The interrupt priority level (I1,I0) is set to 3 in the status register if a long interrupt service routine is used. CCR Condition Codes: Not affected. ER Status Bits: Not affected. IER Flags: Not affected.
PAGE 371
INC Increment by One Operation: INC Assembler Syntax: D.L + 1 → D.L (parallel data bus move) INC D (move syntax - see the Move instruction description.) Description: Increment by one the low portion of the specified operand. The result is stored in the low portion of D. Input Operand(s) Precision: 32-bit integer. Output Operand Precision: 32-bit integer. CCR Condition Codes: C - Set if carry is generated from the MSB of the result. Cleared otherwise. V - Set if result overflows.
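The carry and overflow conditions for a 32-bit increment can be sketched as follows. Reading "overflow" as signed (2's complement) overflow is an assumption of this sketch, and the name `inc32` is illustrative.

```python
def inc32(d_low: int):
    """Model INC on a 32-bit value; return (result, carry, overflow)."""
    d_low &= 0xFFFFFFFF
    total = d_low + 1
    result = total & 0xFFFFFFFF
    carry = total > 0xFFFFFFFF          # carry out of bit 31
    overflow = d_low == 0x7FFFFFFF      # largest positive value wraps negative
    return result, carry, overflow
```

The two flags fire at different points: 0xFFFFFFFF + 1 produces a carry but no signed overflow, while 0x7FFFFFFF + 1 produces a signed overflow but no carry.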
PAGE 372
INT Floating-Point to Integer Conversion Operation: INT Assembler Syntax: Integer(D) → D.L (parallel data bus move) INT D (move syntax - see the Move instruction description.) Description: Convert the specified floating-point operand to 32-bit, 2’s complement integer. The rounding mode is that programmed in the SR. The result is stored in the low portion of D. The high and middle portions of D remain unchanged.
PAGE 373
OVF -Always cleared. OPERR-Set if source operand is a NaN or infinity. Set if overflow occurred. Cleared otherwise. SNAN -Set if operand is a signaling NaN. Cleared otherwise. NAN -Set if source operand is a NaN. Cleared otherwise. UNCC -Always cleared. IER Flags: Flags changed according to standard definition. Instruction Format: INT D (move syntax - see the Move instruction description.
PAGE 374
INTRZ Floating-Point INTRZ to Integer Conversion with Round to Zero Operation: Assembler Syntax: Integer(D) → D.L (parallel data bus move) INTRZ D (move syntax - see the Move instruction description.) Description: Convert the specified floating-point operand to 32-bit, 2’s complement integer rounding towards zero. The result is stored in the low portion of D. The high and middle portions of D remain unchanged. Since this operation is frequently required (e. g.
PAGE 375
ER Status Bits: INX -Set if the floating-point operand has no exact integer representation. Cleared otherwise. DZ -Always cleared. UNF -Always cleared. OVF -Always cleared. OPERR-Set if source operand is a NaN or infinity. Set if overflow occurred. Cleared otherwise. SNAN -Set if operand is a signaling NaN. Cleared otherwise. NAN -Set if source operand is a NaN. Cleared otherwise. UNCC -Always cleared. IER Flags: Flags changed according to standard definition.
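The INTRZ conversion and the INX and OPERR conditions listed above can be modelled as in the Python sketch below. The value left in the destination when OPERR is set is not stated here, so the sketch returns `None` for it and leaves INX false in that case as a simplification. The name `intrz` is illustrative.

```python
import math


def intrz(d: float):
    """Model INTRZ; return (low_word, inx, operr)."""
    if math.isnan(d) or math.isinf(d):
        return None, False, True        # OPERR: NaN or infinity
    t = math.trunc(d)                   # round toward zero
    if not -2**31 <= t <= 2**31 - 1:
        return None, False, True        # OPERR: 32-bit overflow
    inx = t != d                        # no exact integer representation
    return t & 0xFFFFFFFF, inx, False
```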
PAGE 376
INTU Floating-Point to Unsigned Integer Conversion INTU Operation: Assembler Syntax: Unsigned Integer(D) → D.L (parallel data bus move) INTU D (move syntax - see the Move instruction description.) Description: Convert the specified floating-point operand to 32-bit, unsigned integer. The rounding mode is that specified in the SR. The result is stored in the low portion of D. The high and middle portions of D remain unchanged.
PAGE 377
ER Status Bits: INX -Set if the floating-point operand has no exact integer representation. Cleared otherwise. DZ -Always cleared. UNF -Always cleared. OVF -Always cleared. OPERR-Set if source operand is a NaN, infinity or negative non-zero. Also set if overflow occurred. Cleared otherwise. SNAN -Set if operand is a signaling NaN. Cleared otherwise. NAN -Set if source operand is a NaN. Cleared otherwise. UNCC -Always cleared. IER Flags: Flags changed according to standard definition.
PAGE 378
INTURZ Floating-Point INTURZ to Unsigned Integer with Round to Zero Operation: Assembler Syntax: Unsigned Integer(D) → D.L (parallel data bus move) INTURZ D (move syntax - see the Move instruction description.) Description: Convert the specified floating-point operand to 32-bit, unsigned integer rounding towards zero. The result is stored in the low portion of D. The high and middle portions of D remain unchanged. Since this operation is frequently required (e. g.
PAGE 379
ER Status Bits: INX -Set if the floating-point operand has no exact integer representation. Cleared otherwise. DZ -Always cleared. UNF -Always cleared. OVF -Always cleared. OPERR-Set if source operand is a NaN, infinity or negative non-zero. Also set if overflow occurred. Cleared otherwise. SNAN -Set if operand is a signaling NaN. Cleared otherwise. NAN -Set if source operand is a NaN. Cleared otherwise. UNCC -Always cleared. IER Flags: Flags changed according to standard definition.
PAGE 380
Jcc Jump Conditionally Operation: Assembler Syntax: If cc, then xx → PC else PC + 1 → PC Jcc label (short) If cc, then ea → PC else PC + 1 → PC Jcc ea Jcc Description: If the specified condition is true, program execution continues at a location specified by an effective address in the instruction. If the specified condition is false, the PC is incremented and the effective address is ignored.
PAGE 381
Instruction Format: Jcc label (short)
31                     14 13                0
0000 0011 10aa aaaa aa    1c cccc 1aaa aaaa

Instruction Format: Jcc ea
31                     14 13                0
0000 0011 0000 MMMR       1c cccc 1000 0000
OPTIONAL EFFECTIVE ADDRESS EXTENSION

Instruction Fields:
ea    Rn - R0-R7 (Memory alterable addressing modes only)
      Absolute Address - 32 bits
      Short Jump Address - aaaaaaaaaaaaaaa (15 bits)

Mnemonic  ccccc
EQ        01000
PL        01001
CC(HS)    01010
GE        01011
GT        01100
VC        01101
HI        01110
Mnemonic NE(Q) M
PAGE 382
JCLR Jump if Bit Clear Operation: Assembler Syntax: If S{n} = 0, then xxxx → PC else PC + 1 → PC If S{n} = 0, then xxxx → PC else PC + 1 → PC If S{n} = 0, then xxxx → PC else PC + 1 → PC If S{n} = 0, then xxxx → PC else PC + 1 → PC If S{n} = 0, then xxxx → PC else PC + 1 → PC If S{n} = 0, then xxxx → PC else PC + 1 → PC If S{n} = 0, then xxxx → PC else PC + 1 → PC JCLR #bit,X: ea, label JCLR #bit,X: aa, label JCLR #bit,X: pp, label JCLR #bit,Y: ea, label JCLR #bit,Y: aa, label JCLR #bit,Y: p
PAGE 383
Instruction Format: JCLR 31 0000 0010 #bit,S,label 14 13 1011 dddd dd d0 0 0100 100b bbbb ABSOLUTE ADDRESS EXTENSION Instruction Format: JCLR JCLR #bit,X: pp, label #bit,Y: pp, label 31 14 13 0000 0010 1010 1ppp pp pp 0 010S 100b bbbb ABSOLUTE ADDRESS EXTENSION Instruction Format: JCLR JCLR #bit,X: aa, label #bit,Y: aa, label 31 14 13 0000 0010 1010 0aaa aa aa 0 010S 100b bbbb ABSOLUTE ADDRESS EXTENSION Instruction Format: JCLR JCLR #bit,X: ea, label #bit,Y: ea, label
PAGE 384
D            ddddddd
D0.S-D7.S    0000nnn
D0.L-D7.L    0001nnn
D0.M-D7.M    0010nnn
D0.H-D7.H    0011nnn
D8.L         0100000
D9.L         0100001
D8.M         0100010
D9.M         0100011
D8.H         0100100
D9.H         0100101
D8.S         0100110
D9.
PAGE 385
JMP Jump JMP Operation: Assembler Syntax: xx → PC JMP label (short) ea → PC JMP ea Description: Program execution continues at the effective address in program memory. All memory alterable addressing modes may be used for the effective address. A fast Short Jump addressing mode may also be used. The 15-bit data is sign extended to form the effective address. See Section A.10 for restrictions. CCR Condition Codes: Not affected. ER Status Bits: Not affected. IER Flags: Not affected.
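Sign extension of the 15-bit short jump field can be sketched in Python as follows. The function name `short_jump_target` is illustrative, and the result is shown as a 32-bit address.

```python
def short_jump_target(field: int) -> int:
    """Sign-extend a 15-bit short jump field to a 32-bit address."""
    field &= 0x7FFF
    if field & 0x4000:          # bit 14 is the sign bit of the field
        field -= 0x8000         # interpret as a negative 15-bit value
    return field & 0xFFFFFFFF   # wrap to an unsigned 32-bit address
```

Fields with bit 14 clear reach the lowest 16K locations of program memory; fields with bit 14 set reach the highest 16K locations.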
PAGE 386
JOIN Join Two 16-bit Integers JOIN Operation: Assembler Syntax: S.L {15:0} → D.L {31:16} (parallel data bus move) JOIN S,D (move syntax - see the Move instruction description.) D.L {15:0} → D.L {15:0} Description: Transfer the 16 LSBs of the lower portion of source operand S into the 16 MSBs of the lower portion of destination D. The 16 LSBs of the lower portion of D remain unchanged. Input Operand(s) Precision: 16-bit integer. Output Operand Precision: 32-bit integer.
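The bit-field transfer performed by JOIN can be modelled on plain 32-bit integers. The function name `join` is illustrative; only the low portions of S and D are represented.

```python
def join(s_low: int, d_low: int) -> int:
    """Model JOIN: S.L{15:0} -> D.L{31:16}; D.L{15:0} is unchanged."""
    upper = (s_low & 0xFFFF) << 16   # 16 LSBs of S become the 16 MSBs
    lower = d_low & 0xFFFF           # 16 LSBs of D are kept
    return upper | lower
```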
PAGE 387
JOINB Join Two 8-bit Integers JOINB Operation: Assembler Syntax: D.L {7:0} → D.L {7:0} (parallel data bus move) JOINB S,D (move syntax - see the Move instruction description.) S.L {7:0} → D.L {15:8} 0 → D.L {31:16} Description: Transfer the 8 LSBs of the lower portion of source operand S into bits 15-8 of the lower portion of destination D. The 8 LSBs of the lower portion of D remain unchanged. The 16 MSBs of the lower portion of D are zeroed. Input Operand(s) Precision: 8-bit integer.
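One possible use of JOINB is packing four separate bytes into a 32-bit word, using two JOINB steps followed by a JOIN. The Python sketch below models that sequence on plain integers. The function names and the packing routine are illustrative assumptions, not a sequence taken from the manual.

```python
def joinb(s_low: int, d_low: int) -> int:
    """Model JOINB: S.L{7:0} -> D.L{15:8}; D.L{7:0} kept; D.L{31:16} = 0."""
    return ((s_low & 0xFF) << 8) | (d_low & 0xFF)


def join(s_low: int, d_low: int) -> int:
    """Model JOIN: S.L{15:0} -> D.L{31:16}; D.L{15:0} is unchanged."""
    return ((s_low & 0xFFFF) << 16) | (d_low & 0xFFFF)


def pack_bytes(b3: int, b2: int, b1: int, b0: int) -> int:
    """Pack four bytes (b3 most significant) into one 32-bit word."""
    low_half = joinb(b1, b0)         # 0x0000 b1 b0
    high_half = joinb(b3, b2)        # 0x0000 b3 b2
    return join(high_half, low_half)
```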
PAGE 388
JScc Jump to Subroutine Conditionally Operation: Assembler Syntax: If cc, then PC → SSH; SR → SSL; xx → PC else PC + 1 → PC JScc label (short) If cc, then PC → SSH; SR → SSL; ea → PC else PC + 1 → PC JScc ea JScc Description: If the specified condition is true, the address of the instruction immediately following the JScc instruction and the status register are pushed onto the stack. Program execution then continues at the effective address in program memory.
PAGE 389
Instruction Format: JScc label (short)
31                     14 13                0
0000 0011 11aa aaaa aa    1c cccc 1aaa aaaa

Instruction Format: JScc ea
31                     14 13                0
0000 0011 0100 MMMR       1c cccc 1000 0000
OPTIONAL EFFECTIVE ADDRESS EXTENSION

Instruction Fields:
ea    Rn - R0-R7 (Memory alterable addressing modes only)
      Absolute Address - 32 bits
      Short Jump Address - aaaaaaaaaaaaaaa (15 bits)

Mnemonic  ccccc
EQ        01000
PL        01001
CC(HS)    01010
GE        01011
GT        01100
VC        01101
HI        01110
AL        11111
PAGE 390
JSCLR Jump to Subroutine if Bit Clear Operation: Assembler Syntax: If S{n} = 0, then PC → SSH; SR → SSL; xxxx → PC else PC + 1 → PC JSCLR #bit,X: ea, label If S{n} = 0, then PC → SSH; SR → SSL; xxxx → PC else PC + 1 → PC JSCLR #bit,X: aa, label If S{n} = 0, then PC → SSH; SR → SSL; xxxx → PC else PC + 1 → PC JSCLR #bit,X: pp, label If S{n} = 0, then PC → SSH; SR → SSL; xxxx → PC else PC + 1 → PC JSCLR #bit,Y: ea, label If S{n} = 0, then PC → SSH; SR → SSL; xxxx → PC else PC + 1 → PC JSCLR
PAGE 391
Instruction Format: JSCLR 31 0000 0010 #bit,S,label 1111 14 13 dddd dd d0 0 0100 100b bbbb ABSOLUTE ADDRESS EXTENSION Instruction Format: JSCLR JSCLR #bit,X: pp, label #bit,Y: pp, label 31 14 13 0000 0010 1110 1ppp pp pp 0 010S 100b bbbb ABSOLUTE ADDRESS EXTENSION Instruction Format: JSCLR JSCLR #bit,X: aa, label #bit,Y: aa, label 31 14 13 0000 0010 1110 0aaa aa aa 0 010S 100b bbbb ABSOLUTE ADDRESS EXTENSION Instruction Format: JSCLR JSCLR #bit,X: ea, label #bit,Y: ea,
PAGE 392
Rn - R0-R7 (Address Register Indirect Modes except (Rn+xxx))
Absolute Address - 32 bits
Immediate Short Data - bbbbb (5 bits)
Absolute Short Address - aaaaaaa (7 bits)
I/O Short Address - ppppppp (7 bits)

D            ddddddd
D0.S-D7.S    0000nnn
D0.L-D7.L    0001nnn
D0.M-D7.M    0010nnn
D0.H-D7.H    0011nnn
D8.L         0100000
D9.L         0100001
D8.M         0100010
D9.M         0100011
D8.H         0100100
D9.H         0100101
D8.S         0100110
D9.
PAGE 393
JSET Jump if Bit Set Operation: If S{n} = 1, then xxxx → PC else PC + 1 → PC Assembler Syntax: JSET #bit,X: ea, label If S{n} = 1, then xxxx → PC else PC + 1 → PC JSET #bit,X: aa, label If S{n} = 1, then xxxx → PC else PC + 1 → PC JSET #bit,X: pp, label If S{n} = 1, then xxxx → PC else PC + 1 → PC JSET #bit,Y: ea, label JSET #bit,Y: aa, label JSET #bit,Y: pp, label JSET #bit,S,label If S{n} = 1, then xxxx → PC else PC + 1 → PC If S{n} = 1, then xxxx → PC else PC + 1 → PC JSET If S{n} =
PAGE 394
Instruction Format: JSET 31 0000 0010 #bit,S,label 14 13 1011 dddd dd 0 d0 1100 100b bbbb ABSOLUTE ADDRESS EXTENSION Instruction Format: JSET JSET #bit,X: pp, label #bit,Y: pp, label 31 14 13 0000 0010 1010 1ppp pp 0 pp 110S 100b bbbb ABSOLUTE ADDRESS EXTENSION Instruction Format: JSET JSET #bit,X: aa, label #bit,Y: aa, label 31 14 13 0000 0010 1010 0aaa aa 0 aa 110S 100b bbbb ABSOLUTE ADDRESS EXTENSION Instruction Format: JSET JSET #bit,X: ea, label #bit,X: ea, label
PAGE 395
D            ddddddd
D0.S-D7.S    0000nnn
D0.L-D7.L    0001nnn
D0.M-D7.M    0010nnn
D0.H-D7.H    0011nnn
D8.L         0100000
D9.L         0100001
D8.M         0100010
D9.M         0100011
D8.H         0100100
D9.H         0100101
D8.S         0100110
D9.
PAGE 396
JSR Jump to Subroutine
Operation:                        Assembler Syntax:
PC → SSH; SR → SSL; xx → PC       JSR label (short)
PC → SSH; SR → SSL; ea → PC       JSR ea
Description: The address of the instruction immediately following the JSR instruction and the status register are pushed onto the stack. Program execution then continues at the effective address in program memory. All memory alterable addressing modes may be used for the effective address. A fast Short Jump addressing mode may also be used.
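As a rough illustration of the JSR operation (PC → SSH; SR → SSL; target → PC), the following Python sketch models the system stack as a list of (SSH, SSL) pairs. The `Machine` class and `jsr` function are hypothetical names introduced for this example; stack depth limits and the stack pointer register are not modeled.

```python
from dataclasses import dataclass, field

@dataclass
class Machine:
    """Minimal, hypothetical processor state for illustration."""
    pc: int = 0
    sr: int = 0
    stack: list = field(default_factory=list)  # entries are (SSH, SSL)

def jsr(m: Machine, return_address: int, target: int) -> None:
    """Push the return address and SR, then continue at target."""
    m.stack.append((return_address, m.sr))  # PC → SSH; SR → SSL
    m.pc = target                           # ea → PC

m = Machine(pc=0x100, sr=0x5)
jsr(m, return_address=0x101, target=0x400)
print(hex(m.pc), m.stack)
```

Here `return_address` stands for the address of the instruction immediately following the JSR, as the description states.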
PAGE 397
JSSET Jump to Subroutine if Bit Set Operation: If S{n} = 1, then PC → SSH; SR → SSL; xxxx → PC else PC + 1 → PC Assembler Syntax: JSSET #bit,X: ea, label JSSET #bit,X: aa, label JSSET #bit,X: pp, label JSSET #bit,Y: ea, label If S{n} = 1, then PC → SSH; SR → SSL; xxxx → PC else PC + 1 → PC JSSET #bit,Y: aa, label If S{n} = 1, then PC → SSH; SR → SSL; xxxx → PC else PC + 1 → PC JSSET #bit,Y: pp, label JSSET #bit,S,label If S{n} = 1, then PC → SSH; SR → SSL; xxxx → PC else PC + 1 → PC If S{n}
PAGE 398
Instruction Format:JSSET 31 0000 0010 #bit,S,label 14 13 1011 dddd dd 0 d0 1100 100b bbbb ABSOLUTE ADDRESS EXTENSION Instruction Format: JSSET JSSET #bit,X: pp, label #bit,Y: pp, label 31 14 13 0000 0010 1010 1ppp pp 0 pp 110S 100b bbbb ABSOLUTE ADDRESS EXTENSION Instruction Format: JSSET JSSET #bit,X: aa, label #bit,Y: aa, label 31 14 13 0000 0010 1010 0aaa aa 0 aa 110S 100b bbbb ABSOLUTE ADDRESS EXTENSION Instruction Format: JSSET JSSET #bit,X: ea, label #bit,Y: ea,
PAGE 399
D            ddddddd
D0.S-D7.S    0000nnn
D0.L-D7.L    0001nnn
D0.M-D7.M    0010nnn
D0.H-D7.H    0011nnn
D8.L         0100000
D9.L         0100001
D8.M         0100010
D9.M         0100011
D8.H         0100100
D9.H         0100101
D8.S         0100110
D9.
PAGE 400
LEA Load Effective Address
Operation:            Assembler Syntax:
ea → D                LEA ea,D
Rn+xxxx → D           LEA (Rn+displacement),D
Description: The address calculation specified is executed and the resulting effective address is stored in the destination register. The source address registers are not affected. Post-update and Long Displacement address register indirect addressing modes may be used. Note that if D is SSH, the SP will be preincremented by one. CAUTION See restrictions in Section A.10.
PAGE 401
ER Status Bits: For destination operand SR: INX -Set according to bit 8 of the source operand. DZ -Set according to bit 9 of the source operand. UNF -Set according to bit 10 of the source operand. OVF -Set according to bit 11 of the source operand. OPERR-Set according to bit 12 of the source operand. SNAN -Set according to bit 13 of the source operand. NAN -Set according to bit 14 of the source operand. UNCC -Set according to bit 15 of the source operand.
PAGE 402
Instruction Format: LEA 31 0000 0000 ea,D 14 13 0100 Instruction Format: LEA 31 0000 0000 0MMR 10 0 0000 1ddd dddd (Rn+displacement),D 14 13 0100 000R 00 0 0000 1ddd dddd LONG DISPLACEMENT Instruction Fields: ea Rn - R0-R7 (Post-update addressing modes only) Long Displacement - 32 bits D D0.S-D7.S D0.L-D7.L D0.M-D7.M D0.H-D7.H D8.L D9.L D8.M D9.M D8.H D9.H D8.S D9.
PAGE 403
LRA Load PC Relative Address
Operation:            Assembler Syntax:
PC+Rn → D             LRA Rn,D
                      LRA label,D
Description: The PC is added to the specified displacement and the result is stored in destination D. The PC contains the address of the next instruction. The displacement is a 2’s complement 32-bit integer that represents the relative distance from the current PC to the destination PC. Long Displacement and Address Register PC Relative addressing modes may be used. See Section A.10 for restrictions.
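The address arithmetic described for LRA can be sketched in Python as follows. The function name `lra` is hypothetical, and wrapping the sum to 32 bits is an assumption made for this illustration; the description only states that the displacement is a 2's complement 32-bit integer added to the PC, which holds the address of the next instruction.

```python
MASK32 = 0xFFFFFFFF

def lra(next_pc: int, displacement: int) -> int:
    """Return PC + displacement as a 32-bit address (hypothetical helper).

    next_pc:      address of the instruction following the LRA
    displacement: signed distance, treated as a 2's complement 32-bit value
    """
    return (next_pc + (displacement & MASK32)) & MASK32

# A displacement of -4 points four words before the next instruction.
print(hex(lra(0x1000, -4)))
```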
PAGE 404
PC+xxxx → D
PAGE 405
ER Status Bits: For destination operand SR: INX -Set according to bit 8 of the source operand. DZ -Set according to bit 9 of the source operand. UNF -Set according to bit 10 of the source operand. OVF -Set according to bit 11 of the source operand. OPERR-Set according to bit 12 of the source operand. SNAN -Set according to bit 13 of the source operand. NAN -Set according to bit 14 of the source operand. UNCC -Set according to bit 15 of the source operand.
PAGE 406
Instruction Format: LRA 31 0000 Rn,D 14 13 0000 0100 Instruction Format: LRA 001R 00 0 0000 0ddd dddd label,D 31 14 13 0000 0000 0100 000R 00 00 0 0000 0ddd dddd OPTIONAL LONG DISPLACEMENT EXTENSION Instruction Fields: Rn - R0-R7 Long Displacement - 32 bits D D0.S-D7.S D0.L-D7.L D0.M-D7.M D0.H-D7.H D8.L D9.L D8.M D9.M D8.H D9.H D8.S D9.
PAGE 407
LSL Logical Shift Left
Operation: C ← [31 ... 0] ← 0 (parallel data bus move)
Assembler Syntax:
LSL D (move syntax - see the Move instruction description.)
LSL S,D (move syntax - see the Move instruction description.)
LSL #bits,D
Description: Single-bit shift: Logically shift the low portion of the specified operand one bit to the left. The carry bit receives the MSB shifted out of the low portion of the source operand. A zero is shifted into the least significant bit of the destination operand.
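The single-bit case described above can be modeled in a few lines of Python. This is a sketch only: `lsl1` is a hypothetical name, the operand is treated as a plain 32-bit word, and the multi-bit forms are not modeled.

```python
MASK32 = 0xFFFFFFFF

def lsl1(d_low: int) -> tuple[int, int]:
    """Single-bit logical shift left of a 32-bit word.

    Returns (result, carry): carry receives the MSB that is shifted
    out, and a zero is shifted into the least significant bit.
    """
    d_low &= MASK32
    carry = (d_low >> 31) & 1
    result = (d_low << 1) & MASK32
    return result, carry

result, carry = lsl1(0x80000001)
print(hex(result), carry)
```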
PAGE 408
Instruction Format: LSL 31 D (move syntax - see the Move instruction description.) 14 13 DATA BUS MOVE FIELD 10 0100 0 uu01 1ddd OPTIONAL EFFECTIVE ADDRESS EXTENSION OR IMMEDIATE LONG DATA Instruction Format: LSL S.H,D (move syntax - see the Move instruction description.
PAGE 409
LSR Logical Shift Right
Operation: 0 → [31 ... 0] → C (parallel data bus move)
Assembler Syntax:
LSR D (move syntax - see the Move instruction description.)
LSR S,D (move syntax - see the Move instruction description.)
LSR #shift,D
Description: Single-bit shift: Logically shift the low portion of the specified operand one bit to the right. The carry bit receives the LSB shifted out of the low portion of the source operand. A zero is shifted into bit 31 of the operand.
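For comparison with the left shift, a minimal Python model of the single-bit right shift is given here. The name `lsr1` is hypothetical and only the single-bit case from the description is covered.

```python
MASK32 = 0xFFFFFFFF

def lsr1(d_low: int) -> tuple[int, int]:
    """Single-bit logical shift right of a 32-bit word.

    Returns (result, carry): carry receives the LSB that is shifted
    out, and a zero is shifted into bit 31.
    """
    d_low &= MASK32
    carry = d_low & 1
    result = d_low >> 1   # bit 31 becomes 0 because d_low is non-negative
    return result, carry

result, carry = lsr1(0x80000001)
print(hex(result), carry)
```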
PAGE 410
Instruction Format: LSR 31 D (move syntax - see the Move instruction description.) 14 13 DATA BUS MOVE FIELD 10 0000 0 uu01 1ddd OPTIONAL EFFECTIVE ADDRESS EXTENSION OR IMMEDIATE LONG DATA Instruction Format: LSR S.H,D (move syntax - see the Move instruction description.
PAGE 411
MOVE Move Data Registers Operation: Assembler Syntax: Parallel data bus move MOVE MOVE (See the MOVE instruction description.) Description: Move the contents of the specified source to the specified destination. This instruction is a Data ALU NOP instruction with the parallel data move operations described in the following pages. Some parallel data move operations differentiate between integer or floating-point operands according to the kind of Data ALU operation specified.
PAGE 412
Move Move No Parallel Data Move Operation: Assembler Syntax: Opcode Operation – none Opcode-Operands Description: No data bus move activity. Instruction Format: Opcode-operands 31 14 13 0000 0000 0110 0000 01 uu 0 uuuu uuuu uuuu Instruction Fields: None.
PAGE 413
Move R Register To Register Parallel Move Operation: Move R Assembler Syntax: Opcode Operation S1 → D1 Opcode-Operands S1,D1 Opcode Operation S2 → D2 Opcode-Operands S2,D2 Description: Move the source register to the destination register. Single precision to single precision moves (S1,D1) or double precision to double precision moves (S2,D2) may be specified.
PAGE 414
Instruction Fields: S1 or D DD DD D D1 dddddd D0.S-D7.S 000nnn D0.L-D7.L 001nnn D0.M-D7.M 010nnn D0.H-D7.H 011nnn D8.S D9.S D8.L D9.L D8.M D9.M D8.H D9.H 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 00 01 10 11 00 01 10 11 R0-R7 N0-N7 M0-M7 101nnn 110nnn 111nnn where nnn = 0-7 S2 or D2 D0.ML-D7.ML D0.D-D7.D D DD DD ddddd 1 1 n n n where nnn = 0-7 10nnn reserved 01x xx D9.ML D8.ML D9.D D8.
PAGE 415
Move U Move Update (Effective Address Calculation) Operation: Move U Assembler Syntax: Opcode Operation ea Opcode-Operands ea Description: The specified effective address calculation is executed. The specified address register is updated according to the addressing mode. All update addressing modes may be used. The No Update mode (Rn) is useful, in conjunction with the MOVETA instruction, to test address registers.
PAGE 416
Move X: X Memory Move Operation: X: → Move X: Assembler Syntax: D X: ea, D X: → D X:(Rn+displacement),D S → X: S,X: ea S → X: S,X:(Rn+displacement) #xxxx → D #Data,D Description: Move one word operand to/from X memory. One effective address is specified. All memory addressing modes, including absolute address and immediate data, may be used. Long displacement addressing may also be used. A memory to register or register to memory direction may be specified.
PAGE 417
Instruction Format - Opcode-operands: S,X: ea X: ea, D #Data,D 31 14 13 0011 W0DD DDDD MMMR uu 0 uuuu uuuu uuuu OPTIONAL EFFECTIVE ADDRESS EXTENSION OR IMMEDIATE LONG DATA Instruction Format - Opcode-operands: 31 0000 11DD DDDD S,X:(Rn+displacement) X:(Rn+displacement),D 14 13 0W1R uu 0 uuuu uuuu uuuu LONG DISPLACEMENT
PAGE 418
Instruction Fields: Rn - R0-R7 (Memory addressing modes only) Register Read S Write D W 0 1 S1 or D1 D0.S-D7.S D0.L-D7.L D0.M-D7.M D0.H-D7.H D DD DD D dddddd 000nnn 001nnn 010nnn 011nnn D8.S D9.S D8.L D9.L D8.M D9.M D8.H D9.
PAGE 419
Move X: R Move X: R X Memory and Register Move Operation: Assembler Syntax: X: → D1 S2 → D2 X: ea, D1 S2,D2 S1 → S2 → D2 S1,X: ea S2,D2 S2 → #Data,D1 S2,D2 X: #xxxx → D1 D2 Description: Move one word operand to/from X memory and one word operand from register to register. One effective address is specified. A memory to register or register to memory direction may be specified in the effective address.
PAGE 420
Instruction Fields: Rn - R0-R7 (Memory addressing modes only) Register W Read S1 0 Write D1 1 Integer Opcodes Floating-Point Opcodes S1,D1 D0.L-D7.L S1,D1 D0.S-D7.S XXX nnn XXX nnn where nnn = 0-7 S2 d D4.L 0 D5.L 0 D6.L 1 D7.L 1 d 0 1 0 1 D2 D0.L D1.L D2.L D3.L Y YY 000 001 010 011 S2 D4.S D5.S D6.S D7.S d 0 0 1 1 d 0 1 0 1 D2 D0.S D1.S D2.S D3.S Y YY 000 001 010 011 D0.L 0 D1.L 0 D2.L 1 D3.L 1 0 1 0 1 D4.L D5.L D6.L D7.L 1 1 1 1 D0.S D1.S D2.S D3.S 0 0 1 1 0 1 0 1 D4.S D5.S D6.
PAGE 421
Move Y: Move Y: Y Memory Move Operation: Assembler Syntax: Opcode Operation Y: → Opcode Operation D Opcode-Operands Y: ea, D Y: → D Opcode-Operands Y:(Rn+displacement),D Opcode Operation S → Y: Opcode-Operands S,Y: ea Opcode Operation S → Y: #xxxx → D Opcode-Operands S,Y:(Rn+displacement) Opcode Operation Opcode-Operands #Data,D Description: Move one word operand to/from Y memory. One effective address is specified.
PAGE 422
Instruction Format - Opcode-operands: S,Y: ea Y: ea, D #Data,D 31 14 13 0011 W1DD DDDD MMMR uu 0 uuuu uuuu uuuu OPTIONAL EFFECTIVE ADDRESS EXTENSION OR IMMEDIATE LONG DATA Instruction Format - Opcode-operands: S,Y:(Rn+displacement) Y:(Rn+displacement),D 31 14 13 0000 11DD DDDD 1W1R uu 0 uuuu uuuu uuuu LONG DISPLACEMENT Register W Read S 0 Write D 1 S1 or D1 D0.S-D7.S D0.L-D7.L D0.M-D7.M D0.H-D7.H D DD DD D dddddd 000nnn 001nnn 010nnn 011nnn D8.S D9.S D8.L D9.L D8.M D9.M D8.H D9.
PAGE 423
Move Y: R Move Y: R Y Memory and Register Move Operation: Assembler Syntax: Opcode Operation S1 → D1 Y: → Opcode Operation S1 → D1 S2 → Opcode Operation S1 → D1 #xxxx → D2 D2 Y: Opcode-Operands Opcode-Operands Opcode-Operands S1,D1 S1,D1 S1,D1 Y: ea, D2 S2,Y: ea #Data,D2 Description: Move one word operand to/from Y memory and one word operand from register to register. One effective address is specified.
PAGE 424
Register Read S2 Write D2 W 0 1 Integer Opcodes Floating-Point Opcodes S2,D2 D0.L-D7.L S2,D2 D0.S-D7.S YYY nnn YYY nnn where nnn = 0-7 S1 d D4.L 0 D5.L 0 D6.L 1 D7.L 1 d 0 1 0 1 D1 D0.L D1.L D2.L D3.L X XX 000 001 010 011 S1 D4.S D5.S D6.S D7.S d 0 0 1 1 d 0 1 0 1 D1 D0.S D1.S D2.S D3.S X XX 000 001 010 011 D0.L 0 D1.L 0 D2.L 1 D3.L 1 0 1 0 1 D4.L D5.L D6.L D7.L 1 1 1 1 D0.S D1.S D2.S D3.S 0 0 1 1 0 1 0 1 D4.S D5.S D6.S D7.
PAGE 425
Move L: Move L: Long Memory Move Operation: Assembler Syntax: X: → D(MS) Y: → X: → D(MS) Y: → D(LS) S(MS) → X: S(LS) → Y: S(MS) → X: S(LS) → Y: D(LS) L: ea, D L:(Rn+displacement),D S,L: ea S,L:(Rn+displacement) Description: This instruction allows long word operand data moves to/from one effective address in L (X:Y) memory.
PAGE 426
Instruction Fields: Rn - R0-R7 (Memory alterable addressing modes only) Register Read S Write D W 0 1 S2 or D2 D0.ML-D7.ML D0.D-D7.D D DD DD ddddd 1 1 n n n where nnn = 0-7 10nnn D9.ML D8.ML D9.D D8.
PAGE 427
Move X: Y: Move X: Y: XY Memory Operation: Assembler Syntax: X: → D1 Y: → D2 X: ea, D1 Y: ea, D2 X: → D1 S2 → Y: X: ea, D1 S2,Y: ea S1 → X: Y: → D2 S1,X: ea Y: ea, D2 S1 → X: S2 → Y: S1,X: ea S2,Y: ea X: → D1 Y:<> → D2 X: ea, D1 Y:,D2 S1 → S2 → Y:<> S1,X: ea S2,Y: X: → D1 Y:<> → D2 X:(Rn+displacement),D1 Y:,D2 S1 → S2 → Y:<> S1,X:(Rn+displacement) S2,Y: X: X: Description: Move two word operands to/from X and
PAGE 428
Instruction Format - Opcode-operands: 31 1mmw WrYY YXXX rMMR Instruction Format - Opcode-operands: 31 0010 1WYY YXXX X: ea, D1 X: ea, D1 S1,X: ea S1,X: ea 14 13 Y: ea, D2 S2,Y: ea Y: ea, D2 S2,Y: ea 0 uu uuuu X: ea, D1 S1,X: ea 14 13 MMMR uuuu uuuu Y:,D2 S2,Y: 0 uu uuuu uuuu uuuu Instruction Format - Opcode-operands:X: ea, D1(8,9) Y:,D2(8,9) 31 14 13 0001 010W Y11X MMMR uu uuuu Instruction Format - Opcode-operands: X:(Rn+displacement),D1 S1,X:(Rn+displacement) S2,Y: 31 14 13 0000
PAGE 429
For a single effective address: Register Read S1,S2 Write D1,D2 W 0 1 Effective Address X: ea = Y: ea MMM RRR X: ea = Y: ea RRR (Memory alterable addressing modes only) (Long displacement addressing mode) Integer Opcodes S1,D1 X XX D0.L-D7.L n n n Floating-Point Opcodes S1,D1 X XX D0.7.S nnn where nnn = 0-7 S2,D2 Y YY D0.7-D7.L n n n S2,D2 Y YY D0.S-D7.S n n n S1,D1 D8.L D9.L X 0 1 S1,D1 D8.S D9.S X 0 0 S2,D2 D8.L D9.L Y 0 1 S2,D2 D8.S D9.
PAGE 430
Move FFcc Move FFcc Floating-Point iF Conditional Instruction without CCR, ER, IER update Operation: Assembler Syntax: If cc, then Opcode-Operands S,D Opcode-Operands FFcc Opcode Operation S → D FFcc Description: If the specified floating-point condition is true, transfer data from the specified source S to the specified destination D. Also, store result(s) of the specified Data ALU operation. If the specified condition is false, no destinations are altered.
PAGE 431
CAUTION See restrictions in Section A.10.6 concerning Rn, Mn, and Nn registers as a destination. CCR Condition Codes: C - Not affected. V - Not affected. Z - Not affected. N - Not affected. I - Not affected. LR - Not affected. R̄ - Not affected. A - Not affected. ER Status Bits: INX - Not affected. DZ - Not affected. UNF - Not affected. OVFD - Not affected. OPERR - Not affected. SNAN - Not affected. NAN - Not affected.
PAGE 432
Instruction Fields:
S    ttt    Rn  nnn    where nnn = 0-7
D    TTT    Rn  nnn    where nnn = 0-7
Mnemonic  ccccc    Mnemonic  ccccc
GT        00000    NGT       10000
LT        00001    NLT       10001
GE        00010    NGE       10010
LE        00011    NLE       10011
GL        00100    NGL       10100
INF       00101    NINF      10101
GLE       00110    NGLE      10110
OR        00111    UN        10111
EQ        01000    NE(Q)     11000
PL        01001    MI        11001
ERR       01111
Timing: 2 + da oscillator clock cycles
Memory: 1 program words
PAGE 433
Move FFcc.U Move FFcc.U Floating-Point iF Conditional Instruction with CCR, ER, IER Update Operation: Assembler Syntax: If cc, then opcode operation Opcode-Operands S → D S,D FFcc.U FFcc.U Description: If the specified floating-point condition is true, transfer data from the specified source S to the specified destination D. Also, store result(s) of the specified Data ALU operation and update the CCR, ER and IER registers with the status information generated by the Data ALU operation.
PAGE 434
CAUTION See restrictions in Section A.10.6 concerning Rn, Mn, and Nn registers as a destination. CCR Condition Codes: C- Affected by the accompanying Data ALU operation if the specified condition is true. Not affected otherwise. V- Affected by the accompanying Data ALU operation if the specified condition is true. Not affected otherwise. Z - Affected by the accompanying Data ALU operation if the specified condition is true. Not affected otherwise.
PAGE 435
Instruction Format - Opcode-operands: S,D FFcc.U FFcc.
PAGE 436
Move IFcc Move IFcc Integer iF Conditional Instruction without CCR Update Operation: Assembler Syntax: If cc, then opcode operation Opcode-Operands S,D S → D IFcc IFcc Description: If the specified integer condition is true, transfer data from the specified source S to the specified destination D. Also, store result(s) of the specified Data ALU operation. If the specified condition is false, no destinations are altered.
PAGE 437
CCR Condition Codes: C - Not affected. V - Not affected. Z - Not affected. N - Not affected. I - Not affected. LR - Not affected. R̄ - Not affected. A - Not affected. ER Status Bits: INX - Not affected. DZ - Not affected. UNF - Not affected. OVFD - Not affected. OPERR - Not affected. SNAN - Not affected. NAN - Not affected. UNCC - Not affected. IER Flags: SINX - Not affected. SDZ - Not affected. SUNF - Not affected. SOVF - Not affected. SIOP - Not affected.
PAGE 438
Instruction Fields:
S    ttt    Rn  nnn    where nnn = 0-7
D    TTT    Rn  nnn    where nnn = 0-7
Mnemonic  ccccc    Mnemonic  ccccc
EQ        01000    NE(Q)     11000
PL        01001    MI        11001
CC(HS)    01010    CS(LO)    11010
GE        01011    LT        11011
GT        01100    LE        11100
VC        01101    VS        11101
HI        01110    LS        11110
AL        11111
Timing: 2 + da oscillator clock cycles
Memory: 1 program words
PAGE 439
Move IFcc.U Integer iF Conditional Instruction with CCR, ER, and IER Update Operation: Assembler Syntax: If cc, then opcode operation Opcode-Operands S,D S → D Move IFcc.U IFcc.U IFcc.U Description: If the specified integer condition is true, transfer data from the specified source S to the specified destination D. Also, store result(s) of the specified Data ALU operation and update the CCR, ER and IER registers with the status information generated by the Data ALU operation.
PAGE 440
CCR Condition Codes: C - Affected by the accompanying Data ALU operation if the specified condition is true. Not affected otherwise. V - Affected by the accompanying Data ALU operation if the specified condition is true. Not affected otherwise. Z - Affected by the accompanying Data ALU operation if the specified condition is true. Not affected otherwise. N - Affected by the accompanying Data ALU operation if the specified condition is true. Not affected otherwise.
PAGE 441
Instruction Fields:
S    ttt    Rn  nnn    where nnn = 0-7
D    TTT    Rn  nnn    where nnn = 0-7
Mnemonic  ccccc    Mnemonic  ccccc
EQ        01000    NE(Q)     11000
PL        01001    MI        11001
CC(HS)    01010    CS(LO)    11010
GE        01011    LT        11011
GT        01100    LE        11100
VC        01101    VS        11101
HI        01110    LS        11110
AL        11111
Timing: 2 + da oscillator clock cycles
Memory: 1 program words
PAGE 442
MOVE(C) Operation: Move Control Register MOVE(C) MOVE(C) S3,D2 S3 → D2 MOVE(C) S2,D1 S2 → D1 MOVE(C) #Data,D1 #xxxx → D1 MOVE(C) X: ea, D1 X: → D1 MOVE(C) X:(Rn+displacement),D1 MOVE(C) S1,X: ea MOVE(C) S1,X:(Rn+displacement) MOVE(C) Y: ea, D1 MOVE(C) Y:(Rn+displacement),D1 MOVE(C) S1,Y: ea MOVE(C) S1,Y:(Rn+displacement) X: → D1 S1 → X: S1 → X: Y: → D1 Y: → D1 S1 → Y: S1 → Y: Description: Assembler Syntax: Move the contents o
PAGE 443
For destination operand SR: C - Set according to bit 0 of the source operand. V - Set according to bit 1 of the source operand. Z - Set according to bit 2 of the source operand. N - Set according to bit 3 of the source operand. I - Set according to bit 4 of the source operand. LR - Set according to bit 5 of the source operand. R̄ - Set according to bit 6 of the source operand. A - Set according to bit 7 of the source operand. For destination operands other than SR: C - Not affected.
PAGE 444
For destination operand SR: SINX - Set according to bit 16 of the source operand. SDZ - Set according to bit 17 of the source operand. SUNF - Set according to bit 18 of the source operand. SOVF - Set according to bit 19 of the source operand. SIOP - Set according to bit 20 of the source operand. For destination operands other than SR: SINX - Not affected. SDZ - Not affected. SUNF - Not affected. SOVF - Not affected. SIOP - Not affected. Instruction Format:
PAGE 445
S3 S1, D1 SR OMR SP SSH SSL LA LC DD DD DD D ddddddd 1111001 1111010 1111011 1111100 1111101 1111110 1111111 S2 D2 D0.S-D7.S D0.L-D7.L D0.M-D7.M D0.H-D7.H D8.L D9.L D8.M D9.M D8.H D9.H D8.S D9.
PAGE 446
MOVE(I) Immediate Short Data Move
Operation:            Assembler Syntax:
#xx → D               MOVE(I) #Data,D
Description: The 16-bit immediate short operand is sign extended to a word operand and is stored in the destination register D. Care should be taken if the specified destination register is D0.S-D9.S, since there is no special formatting for short floating-point data and the sign-extended immediate short operand may produce small positive denormalized numbers or negative NaNs. See Section A.
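The warning about D0.S-D9.S destinations can be checked numerically. The Python sketch below sign extends a 16-bit immediate to 32 bits and then reinterprets that word as an IEEE single-precision bit pattern. The helper names are hypothetical, and treating the register contents as a raw IEEE single-precision word is an assumption made for this illustration.

```python
import math
import struct

def sign_extend16(imm: int) -> int:
    """Sign extend a 16-bit immediate to a 32-bit word."""
    imm &= 0xFFFF
    return imm | 0xFFFF0000 if imm & 0x8000 else imm

def as_single(word: int) -> float:
    """Reinterpret a 32-bit word as an IEEE single-precision value."""
    return struct.unpack(">f", struct.pack(">I", word & 0xFFFFFFFF))[0]

small = as_single(sign_extend16(0x0001))     # tiny positive denormalized value
negative = as_single(sign_extend16(0xFFFF))  # exponent all ones, nonzero fraction
print(small, math.isnan(negative))
```

A positive immediate leaves the exponent field at zero, which gives a denormalized number; a negative immediate fills the upper 16 bits with ones, which sets the sign bit and an all-ones exponent with a nonzero fraction, that is, a NaN.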
PAGE 447
ER Status Bits: For destination operand SR: INX -Set according to bit 8 of the source operand. DZ -Set according to bit 9 of the source operand. UNF -Set according to bit 10 of the source operand. OVF -Set according to bit 11 of the source operand. OPERR-Set according to bit 12 of the source operand. SNAN -Set according to bit 13 of the source operand. NAN -Set according to bit 14 of the source operand. UNCC -Set according to bit 15 of the source operand.
PAGE 448
Instruction Fields:
Immediate Short Data - iiiiiiiiiiiiiiii (16 bits)
D            ddddddd
D0.S-D7.S    0000nnn
D0.L-D7.L    0001nnn
D0.M-D7.M    0010nnn
D0.H-D7.H    0011nnn
D8.L         0100000
D9.L         0100001
D8.M         0100010
D9.M         0100011
D8.H         0100100
D9.H         0100101
D8.S         0100110
D9.
PAGE 449
MOVE(M) Move Program Memory Operation: Assembler Syntax: P: → D MOVE(M) P: ea, D S → P: MOVE(M) S,P: ea MOVE(M) Description: Move the specified program memory word operand to the specified destination register or move the specified source register to the specified program memory location. The registers S and D may be any register. All memory alterable addressing modes may be used.
PAGE 450
For destination operands other than SR: C - Not affected. V - Not affected. Z - Not affected. N - Not affected. I - Not affected. LR - Not affected. R̄ - Not affected. A - Not affected. ER Status Bits: For destination operand SR: INX - Set according to bit 8 of the source operand. DZ - Set according to bit 9 of the source operand. UNF - Set according to bit 10 of the source operand. OVF - Set according to bit 11 of the source operand. OPERR - Set according to bit 12 of the source operand.
PAGE 451
Instruction Format: MOVE(M) 31 0000 0001 P: ea, D 0110 MMMR MOVE(M) 14 13 RR 1W S,P: ea 0 0001 0ddd dddd OPTIONAL EFFECTIVE ADDRESS EXTENSION Instruction Fields: Rn - R0-R7 (Memory alterable addressing modes only) Absolute Address - 32 bits Register W Read S 0 Write D 1 D D0.S-D7.S D0.L-D7.L D0.M-D7.M D0.H-D7.H D8.L D9.L D8.M D9.M D8.H D9.H D8.S D9.
PAGE 452
MOVE(P) Move Peripheral Data Operation: Assembler Syntax: X: → D MOVE(P) X: pp, D S → X: MOVE(P) S,X: pp #xxxx → X: MOVE(P) #Data,X: pp Y: → D MOVE(P) Y: pp, D S → Y: #xxxx → Y: X: → X: X: → X: X: → Y: Y: → X: MOVE(P) MOVE(P) S,Y: pp MOVE(P) #Data,Y: pp MOVE(P) X: pp, X: ea MOVE(P) X: ea, X: pp MOVE(P) X: pp, Y: ea Y: → X: MOVE(P) Y: ea, X: pp X: → Y: MOVE(P) Y: pp, X: ea Y: → Y: MOVE(P) X: ea, Y: pp Y:
PAGE 453
the system stack pointer SP is preincremented by 1 before SSH is written. This allows the system stack to be efficiently extended using software stack pointer operations. See Section A.10 for restrictions that apply to this instruction. CAUTION See restrictions in Section A.10.6 concerning Rn, Mn, and Nn registers as a destination. CCR Condition Codes: For destination operand SR: C - Set according to bit 0 of the source operand. V - Set according to bit 1 of the source operand.
PAGE 454
For destination operands other than SR: INX - Not affected. DZ - Not affected. UNF - Not affected. OVF - Not affected. OPERR- Not affected. SNAN - Not affected. NAN - Not affected. UNCC - Not affected. IER Flags: For destination operand SR: SINX -Set according to bit 16 of the source operand. SDZ -Set according to bit 17 of the source operand. SUNF -Set according to bit 18 of the source operand. SOVF -Set according to bit 19 of the source operand. SIOP -Set according to bit 20 of the source operand.
PAGE 455
Instruction Format: MOVE(P) X: pp, P: ea MOVE(P) P: ea, X: pp MOVE(P) Y: pp, P: ea MOVE(P) P: ea, Y: pp 31 14 13 0000 0000 0111 MMMR RR 11 0 01SW 1ppp pppp OPTIONAL EFFECTIVE ADDRESS EXTENSION Instruction Format: MOVE(P) X: pp, D MOVE(P) S,X: pp MOVE(P) Y: pp, D MOVE(P) S,Y: pp 14 13 31 0000 0000 0111 dddd dd d0 00SW 0 1ppp pppp Instruction Fields: Rn - R0-R7 X: or Y: reference (Memory addressing modes only) P: reference (Memory Alterable addressing modes only) Absolute Addres
PAGE 456
S,D          ddddddd
D0.S-D7.S    0000nnn
D0.L-D7.L    0001nnn
D0.M-D7.M    0010nnn
D0.H-D7.H    0011nnn
D8.L         0100000
D9.L         0100001
D8.M         0100010
D9.M         0100011
D8.H         0100100
D9.H         0100101
D8.S         0100110
D9.
PAGE 457
MOVE(S) Move Absolute Short Operation: Assembler Syntax: X: → D1 MOVE(S) X: aa, D1 S1 → X : MOVE(S) S1,X: aa #xxxx → X: MOVE(S) #Data,X: aa Y: → D1 MOVE(S) Y: aa, D1 S1 → Y: #xxxx → Y: L: → D2 S2 → L: X: → X: X: → X: MOVE(S) MOVE(S) S1,Y: aa MOVE(S) #Data,Y: aa MOVE(S) L: aa, D2 MOVE(S) S2,L: aa MOVE(S) X: aa, X: ea X: → Y: MOVE(S) X: ea, X: aa Y: → X: MOVE(S) X: aa, Y: ea Y: → X: MOVE(S) Y: ea, X: aa X: →
PAGE 458
If the system stack register SSH is specified as a source operand, the system stack pointer SP is postdecremented by 1 after SSH is read. If the system stack register SSH is specified as a destination operand, the system stack pointer SP is preincremented by 1 before SSH is written. This allows the system stack to be efficiently extended using software stack pointer operations. See Section A.10 for restrictions that apply to this instruction. CAUTION See restrictions in Section A.10.
PAGE 459
For destination operands other than SR: INX - Not affected. DZ - Not affected. UNF - Not affected. OVF - Not affected. OPERR- Not affected. SNAN - Not affected. NAN - Not affected. UNCC - Not affected. IER Flags: For destination operand SR: SINX -Set according to bit 16 of the source operand. SDZ -Set according to bit 17 of the source operand. SUNF -Set according to bit 18 of the source operand. SOVF -Set according to bit 19 of the source operand.
PAGE 460
SIOP - Not affected.
PAGE 461
Instruction Format: MOVE(S) X: aa, P: ea MOVE(S) P: ea, X: aa 31 0000 0000 0111 MMMR MOVE(S) Y: aa, P: ea MOVE(S) P: ea, Y: aa 14 13 RR 11 01SW 0 0aaa aaaa OPTIONAL EFFECTIVE ADDRESS EXTENSION Instruction Format: MOVE(S) X: aa, D1 MOVE(S) S1,X: aa MOVE(S) Y: aa, D1 MOVE(S) S1,Y: aa 14 13 31 0000 0000 Instruction Format: 0111 dddd dd d0 00SW 0 0aaa aaaa MOVE(S) L: aa, D2 MOVE(S) S2,L: aa 31 14 13 0000 0000 0111 DDDD DD D1 0 000W 0aaa aaaa Instruction Fields: Rn - R0-R
PAGE 462
S1, D1       ddddddd
D0.S-D7.S    0000nnn
D0.L-D7.L    0001nnn
D0.M-D7.M    0010nnn
D0.H-D7.H    0011nnn
D8.L         0100000
D9.L         0100001
D8.M         0100010
D9.M         0100011
D8.H         0100100
D9.H         0100101
D8.S         0100110
D9.S         0100111
R0-R7        0101nnn
N0-N7        0110nnn
M0-M7        0111nnn
SR           1111001
OMR          1111010
SP           1111011
SSH          1111100
SSL          1111101
LA           1111110
LC           1111111
where nnn = 0-7
S2,D2        DDDDDDD
D0.ML-D7.ML  1011nnn
D0.D-D7.D    1010nnn
where nnn = 0-7
D9.ML        1000111
D8.ML        1000110
D9.D         1000101
D8.
PAGE 463
MOVETA Move Data Registers and Test Address MOVETA Operation: Assembler Syntax: parallel data bus move MOVETA (move syntax - see the Move instruction description). Description: Move the contents of the specified source to the specified destination and update the C, V, N and Z flags in the CCR according to the result of the address calculation. Only Address Register Indirect addressing modes will give meaningful flag updates.
PAGE 464
Instruction Format: MOVETA (Integer NOP) Instruction Fields: See the MOVE instruction description for Data Bus Move Field encoding.
PAGE 465
MPYS Signed Multiply MPYS Operation: Assembler Syntax: S1.L * S2.L → D.M:D.L (parallel data bus move) MPYS S1,S2,D ( See the MOVE instruction description.) MPYS S2,S1,D ( See the MOVE instruction description.) Description: Multiply two signed operands and store the product in the specified destination register. The two source operands are 32-bit integers and are taken from the low portion of S1 and S2. The result is a 64-bit signed integer stored in the middle and low portions of D.
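The signed 32 x 32 → 64-bit product and its split across D.M:D.L can be illustrated with the Python sketch below. The names `to_signed32` and `mpys` are hypothetical; the sketch assumes D.M holds the upper 32 bits of the product and D.L the lower 32 bits, following the D.M:D.L notation in the operation line.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def to_signed32(word: int) -> int:
    """Interpret a 32-bit word as a 2's complement signed integer."""
    word &= MASK32
    return word - (1 << 32) if word & 0x80000000 else word

def mpys(s1_low: int, s2_low: int) -> tuple[int, int]:
    """Signed multiply; returns (D.M, D.L) as 32-bit words."""
    product = (to_signed32(s1_low) * to_signed32(s2_low)) & MASK64
    return product >> 32, product & MASK32

# -1 * 2 = -2, which is 0xFFFFFFFF_FFFFFFFE as a 64-bit value.
d_m, d_l = mpys(0xFFFFFFFF, 0x00000002)
print(hex(d_m), hex(d_l))
```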
PAGE 466
Instruction Format: MPYS 31 S1,S2,D ( See the MOVE instruction description.) 14 13 DATA BUS MOVE FIELD 11 1sss 0 SSS0 1ddd OPTIONAL EFFECTIVE ADDRESS EXTENSION OR IMMEDIATE LONG DATA Instruction Format: MPYS 31 S2(8,9),S1,D ( See the MOVE instruction description.) 14 13 DATA BUS MOVE FIELD 11 0sss 0 11S0 1ddd OPTIONAL EFFECTIVE ADDRESS EXTENSION OR IMMEDIATE LONG DATA IER Flags: Not affected.
PAGE 467
MPYU Unsigned Multiply Operation: MPYU Assembler Syntax: S1.L * S2.L → D.M:D.L (parallel data bus move) MPYU S1,S2,D ( See the MOVE instruction description.) MPYU S2,S1,D ( See the MOVE instruction description.) Description: Multiply two unsigned operands and store the product in the specified destination register. The two source operands are 32-bit integers and are taken from the low portion of S1 and S2. The result is a 64-bit unsigned integer stored in the middle and low portions of D.
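The unsigned product differs from the signed one only in how the source words are interpreted. The following Python sketch is a hedged illustration: `mpyu` is a hypothetical name, and D.M is assumed to hold the upper 32 bits of the 64-bit result with D.L holding the lower 32 bits.

```python
MASK32 = 0xFFFFFFFF

def mpyu(s1_low: int, s2_low: int) -> tuple[int, int]:
    """Unsigned multiply; returns (D.M, D.L) as 32-bit words."""
    product = (s1_low & MASK32) * (s2_low & MASK32)  # fits in 64 bits
    return product >> 32, product & MASK32

# 0xFFFFFFFF is 4294967295 when unsigned, so the product is 0x1_FFFFFFFE.
d_m, d_l = mpyu(0xFFFFFFFF, 0x00000002)
print(hex(d_m), hex(d_l))
```

The same inputs interpreted as signed values would represent -1 * 2, giving an upper word of 0xFFFFFFFF rather than 0x00000001.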
PAGE 468
Instruction Format: MPYU 31 S1,S2,D ( See the MOVE instruction description.) 14 13 DATA BUS MOVE FIELD 11 1sss 0 SSS1 1ddd OPTIONAL EFFECTIVE ADDRESS EXTENSION OR IMMEDIATE LONG DATA Instruction Format: MPYU 31 S2(8,9),S1,D ( See the MOVE instruction description.
PAGE 469
NEG Negate Operation: NEG Assembler Syntax: 0 - D.L → D.L (parallel data bus move) NEG D ( See the MOVE instruction description.) Description: The low portion of the destination operand is subtracted from zero. The result is stored in the low portion of D. This instruction is preferable to using the SUB instruction since it is not necessary to zero an input operand. Input Operand(s) Precision: 32-bit integer. Output Operand Precision: 32-bit integer.
PAGE 470
NEGC Negate with Carry Operation: NEGC Assembler Syntax: 0 - D.L - C → D.L (parallel data bus move) NEGC D ( See the MOVE instruction description.) Description: Subtract the low portion of the destination operand D from zero along with the C bit of the condition code register and store the result in the low portion of D. This instruction is useful when negating a multiple precision number since it is not necessary to first zero an input operand as would be the case if the SUB instruction were used.
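The multiple-precision use mentioned above can be sketched in Python by negating a 64-bit value held as two 32-bit words. All function names are hypothetical. The borrow produced by the low-word negation (1 whenever the low word is nonzero) is an assumption about how the C bit is set; that detail is not stated in the text shown here.

```python
MASK32 = 0xFFFFFFFF

def neg(d_low: int) -> tuple[int, int]:
    """0 - D.L; returns (result, borrow). Borrow convention is assumed."""
    d_low &= MASK32
    return (0 - d_low) & MASK32, 1 if d_low != 0 else 0

def negc(d_low: int, c: int) -> int:
    """0 - D.L - C, truncated to 32 bits."""
    return (0 - (d_low & MASK32) - c) & MASK32

def negate64(high: int, low: int) -> tuple[int, int]:
    """Negate a 64-bit value stored as (high, low) 32-bit words."""
    new_low, borrow = neg(low)
    new_high = negc(high, borrow)
    return new_high, new_low

# Negating 1 gives -1, i.e. all ones in both words.
print([hex(w) for w in negate64(0x00000000, 0x00000001)])
```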
PAGE 471
NOP No Operation NOP Operation: Assembler Syntax: None NOP Description: No operation occurs. The processor state, other than the program counter, is not affected. Execution continues with the instruction following the NOP. CCR Condition Codes: Not affected. ER Status Bits: Not affected. IER Flags: Not affected.
PAGE 472
NOT Logical Complement NOT Operation: Assembler Syntax: ~D{31:0} → D{31:0} (parallel data bus move) NOT D ( See the MOVE instruction description.) Description: The one’s complement of the low portion of the destination operand is taken and the result is stored in D. This instruction is a 32-bit operation and is performed on bits 0-31 of D. The remaining bits of D are not affected. Input Operand(s) Precision: 32-bit integer. Output Operand Precision: 32-bit integer.
PAGE 473
OR Logical Inclusive OR Operation: OR Assembler Syntax: D.L v S.L → D.L (parallel data bus move) OR S,D ( See the MOVE instruction description.) Description: Logically inclusive OR the low portion of the two specified operands and store the result in the low portion of D. Input Operand(s) Precision: 32-bit integer. Output Operand Precision: 32-bit integer. CCR Condition Codes: C - Not affected. V - Always cleared. Z - Set if result is zero. Cleared otherwise. N - Set if result is negative.
PAGE 474
ORC Logical Inclusive OR with Complement Operation: ORC Assembler Syntax: D.L v ~S.L → D.L (parallel data bus move) ORC S,D ( See the MOVE instruction description.) Description: Logically inclusive OR the low portion of D with the logical complement of the low portion of S, and store the result in the low portion of D. This instruction is useful for manipulating bit maps in graphic operations. Input Operand(s) Precision: 32-bit integer. Output Operand Precision: 32-bit integer.
PAGE 475
ORI OR Immediate to Control Register Operation: Assembler Syntax: D v #xx → D OR(I) ORI #Mask,D Description: Logically OR the contents of the control register with an 8-bit immediate operand. The result is stored back into the specified control register. See Section A.10 for restrictions. CCR Condition Codes: For CCR operand: C - Set if bit 0 of the immediate operand is set. Not affected otherwise. V - Set if bit 1 of the immediate operand is set. Not affected otherwise.
PAGE 476
For OMR, MR, IER, CCR operands: INX - Not affected. DZ - Not affected. UNF - Not affected. OVF - Not affected. OPERR- Not affected. SNAN - Not affected. NAN - Not affected. UNCC - Not affected. IER Flags: For IER operand: SINX -Set if bit 0 of the immediate operand is set. Not affected otherwise. SDZ -Set if bit 1 of the immediate operand is set. Not affected otherwise. SUNF -Set if bit 2 of the immediate operand is set. Not affected otherwise.
PAGE 477
REP Repeat Next Instruction
Assembler Syntax: REP X:ea
Operation: LC → TEMP; X:ea → LC. Repeat next instruction until LC = 1. TEMP → LC
Assembler Syntax: REP Y:ea
Operation: LC → TEMP; Y:ea → LC. Repeat next instruction until LC = 1. TEMP → LC
Assembler Syntax: REP S
Operation: LC → TEMP; S → LC. Repeat next instruction until LC = 1. TEMP → LC
Assembler Syntax: REP #Count
Operation: LC → TEMP; #xxx → LC. Repeat next instruction until LC = 1.
PAGE 478
Instruction Format: REP 31 0000 0001 #Count 14 13 1111 Instruction Format: REP iiii ii ii 0 iiii 1iii iiii S 31 14 13 0000 0001 Instruction Format: 1110 REP REP 0000 00 00 0 0000 1ddd dddd X: ea Y: ea 31 14 13 0000 0001 110s MMMR RR 00 0 0000 1000 0000 Instruction Fields: Rn - R0-R7 (Address Register Indirect Modes except (Rn+xxx) ) Immediate Short Data - iiiiiiiiiiiiiiiiiii (19 bits) Memory Space s X Memory 0 Y Memory 1
PAGE 479
S ddddddd D0.S-D7.S 0000nnn D0.L-D7.L 0001nnn D0.M-D7.M 0010nnn D0.H-D7.H 0011nnn D8.L 0100000 D9.L 0100001 D8.M 0100010 D9.M 0100011 D8.H 0100100 D9.H 0100101 D8.S 0100110 D9.
PAGE 480
RESET Reset Peripheral Devices Operation: Reset all on-chip peripherals and the Interrupt Priority Register. Assembler Syntax: RESET Description: All on-chip peripherals and the Interrupt Priority Register are reset. See Chapter 7 for a description of the effect of the RESET instruction on the peripherals. The processor state is not affected and execution continues with the next instruction, but all maskable interrupt sources are disabled.
PAGE 481
ROL Rotate Left ROL Operation: 31 0 C (parallel data bus move) Assembler Syntax: ROL D ( See the MOVE instruction description.) Description: Rotate the low portion of the specified operand one bit to the left. The carry bit receives the previous value of bit 31 of the operand. The previous value of the carry bit is shifted into bit 0 of the operand. The result is stored in the low portion of D. This instruction is a 32 bit operation and is performed on bits 0-31 of D.
PAGE 482
ROR Rotate Right ROR Operation: 31 0 C (parallel data bus move) Assembler Syntax: ROR D ( See the MOVE instruction description.) Description: Rotate the low portion of the specified operand one bit to the right. The carry bit receives the previous value of bit 0 of the operand. The previous value of the carry bit is shifted into bit 31 of the operand. The result is stored in the low portion of D. This instruction is a 32 bit operation and is performed on bits 0-31 of D.
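The ROL and ROR descriptions amount to a 33-bit rotation through the carry bit. The following host-side sketch models that behavior on a 32-bit value; the function names are hypothetical and only the result and the C bit are modeled, not the other condition codes.

```python
MASK32 = 0xFFFFFFFF

def rol32(d, c):
    """Model of ROL: old bit 31 -> C, old C -> bit 0. Returns (result, carry)."""
    new_c = (d >> 31) & 1
    result = ((d << 1) & MASK32) | (c & 1)
    return result, new_c

def ror32(d, c):
    """Model of ROR: old bit 0 -> C, old C -> bit 31. Returns (result, carry)."""
    new_c = d & 1
    result = ((d & MASK32) >> 1) | ((c & 1) << 31)
    return result, new_c
```

Applying `rol32` and then `ror32` (feeding the carry from the first into the second) restores the original value and carry, which is a quick sanity check on the model.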
PAGE 483
RTI RTI Return from Interrupt Assembler Syntax: Operation: RTI Description: The program counter and the status register are pulled from the system stack. The interrupt routine program counter and status register are lost. RTI is functionally identical to RTR but has been made a separate instruction to be upward compatible with future parts and to simplify porting software. Due to pipelining, the RTI instruction must not be immediately preceded by some instructions. See Section A.
PAGE 484
SSH → PC; SSL → SR; SP – 1 → SP
PAGE 485
Instruction Format: RTI 31 0000 0000 14 13 0000 0000 00 00 0 0000 0000 1100 Instruction Fields: None.
PAGE 486
RTR Return from Subroutine with Restore Operation: SSH → PC; SSL → SR; SP – 1 → SP Assembler Syntax: RTR Description: The program counter and the status register are pulled from the system stack. The subroutine program counter and status register are lost. RTR is functionally identical to RTI but has been made a separate instruction to be upward compatible with future parts and to simplify porting software.
PAGE 487
Instruction Format: RTR 31 0000 0000 14 13 0000 0000 00 00 0 0000 0000 1000 Instruction Fields: None.
PAGE 488
RTS Return from Subroutine Operation: SSH → PC; SP – 1 → SP Assembler Syntax: RTS Description: The program counter is pulled from the system stack. The status register is not affected. The subroutine program counter is lost. Due to pipelining, the RTS instruction must not be immediately preceded by some instructions. See Section A.10 for the list of restricted instructions. CCR Condition Codes: Not affected. ER Status Bits: Not affected. IER Flags: Not affected.
PAGE 489
SETW Set Long Word Operand Operation: SETW Assembler Syntax: $FFFFFFFF → D.L (parallel data bus move) SETW D (move syntax - see the Move instruction description.) Description: The low portion (long word) of the destination operand is set to all ones. Input Operand(s) Precision: 32-bit integer. Output Operand Precision: 32-bit integer. CCR Condition Codes: C - Not affected. V - Always cleared. Z - Always cleared. N - Always set. I - Not affected. LR - Not affected.
PAGE 490
SPLIT Extract a 16-bit Integer Operation: S.L {31:16} → D.L {15:0}; S.L {31} → D.L {31:16} (parallel data bus move) Assembler Syntax: SPLIT S,D (move syntax - see the Move instruction description.) Description: Transfer the 16 MSBs of the lower portion of source operand S into the 16 LSBs of the lower portion of destination D and sign-extend to 32 bits. Input Operand(s) Precision: 32-bit integer. Output Operand Precision: 32-bit integer. CCR Condition Codes: C - Not affected.
PAGE 491
SPLITB Extract an 8-bit Integer Operation: SPLITB Assembler Syntax: S.L {15:8} → D.L {7:0} (parallel data bus move) S.L {15} → D.L {31:8} SPLITB S,D (move syntax - see the Move instruction description.) Description: Transfer bits 15-8 of the lower portion of source operand S into the 8 LSBs of the lower portion of destination D and sign-extend to 32 bits. Input Operand(s) Precision: 32-bit integer. Output Operand Precision: 32-bit integer. CCR Condition Codes: C - Not affected.
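The bit-field transfers described for SPLIT and SPLITB can be modeled on a host as shift, mask, and sign-extend steps. The sketch below uses hypothetical function names and models only the 32-bit result, not the condition codes.

```python
def split(s):
    """Model of SPLIT: S.L{31:16} -> D.L{15:0}, sign-extended from S.L{31}."""
    value = (s >> 16) & 0xFFFF
    if value & 0x8000:          # bit 31 of the source was set
        value |= 0xFFFF0000     # replicate it into D.L{31:16}
    return value

def splitb(s):
    """Model of SPLITB: S.L{15:8} -> D.L{7:0}, sign-extended from S.L{15}."""
    value = (s >> 8) & 0xFF
    if value & 0x80:            # bit 15 of the source was set
        value |= 0xFFFFFF00     # replicate it into D.L{31:8}
    return value
```

Both functions return the unsigned 32-bit pattern that would be held in D.L.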
PAGE 492
STOP Stop Instruction Processing Operation: Enter the STOP processing state and stop the clock oscillator. Assembler Syntax: STOP Description: When a STOP instruction is executed, the processor enters the STOP processing state. The clock oscillator is gated off. All activity in the processor is suspended until the RESET or IRQA pin is asserted. The STOP processing state is the lowest-power stand-by state.
PAGE 493
SUB Subtract SUB Operation: Assembler Syntax: D.L - S.L → D.L (parallel data bus move) SUB S,D (move syntax - see the Move instruction description.) Description: Subtract the low portion of the specified source operand S from the low portion of the destination operand D and store the result in the low portion of D. Input Operand(s) Precision: 32-bit integer. Output Operand Precision: 32-bit integer. CCR Condition Codes: C - Set if a borrow is generated from the MSB of the result. Cleared otherwise.
PAGE 494
SUBC Subtract with Carry SUBC Operation: Assembler Syntax: D.L - S.L - C → D.L (parallel data bus move) SUBC S,D (move syntax - see the Move instruction description.) Description: Subtract the low portion of the specified source operand S from the low portion of the destination operand D along with the C bit of the condition code register and store the result in the low portion of D. This instruction is useful in multiple precision integer arithmetic routines.
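As an illustration of the multiple-precision use mentioned for SUBC, the sketch below models a 64-bit subtraction built from a SUB on the low words followed by a SUBC on the high words. The function names are hypothetical, and C is modeled as a borrow out of bit 31, following the SUB condition code wording.

```python
MASK32 = 0xFFFFFFFF

def sub32(d, s):
    """Model of SUB: D.L - S.L -> D.L. Returns (result, carry)."""
    total = d - s
    return total & MASK32, (1 if total < 0 else 0)

def subc32(d, s, c):
    """Model of SUBC: D.L - S.L - C -> D.L. Returns (result, carry)."""
    total = d - s - c
    return total & MASK32, (1 if total < 0 else 0)

def sub64(d_hi, d_lo, s_hi, s_lo):
    """64-bit D - S using SUB on the low words, then SUBC on the high words."""
    lo, c = sub32(d_lo, s_lo)
    hi, c = subc32(d_hi, s_hi, c)
    return hi, lo, c
```

For example, `sub64(1, 0, 0, 1)` models 2^32 - 1: the low-word subtraction borrows, and SUBC removes that borrow from the high word.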
PAGE 495
TFR Transfer Data ALU Register TFR Operation: Assembler Syntax: S.L → D.L (parallel data bus move) TFR S,D (move syntax - see the Move instruction description.) Description: Transfer data from the low portion of the specified source Data ALU register to the low portion of the specified destination Data ALU register. TFR uses the internal Data ALU paths but does not affect the condition code bits. When the S and D registers are the same, this instruction is equivalent to an integer rounding operation.
PAGE 496
TRAPcc Conditional Software Interrupt Operation: If cc, then begin software exception processing. Assembler Syntax: TRAPcc Description: If the specified integer condition is true, normal instruction execution is suspended and software exception processing is initiated. The interrupt priority level (I1,I0) is set to 3 in the status register if a long interrupt service routine is used. If the specified condition is false, continue with the next instruction. See Section A.10 for restrictions.
PAGE 497
Instruction Fields: Mnemonic EQ PL CC(HS) GE GT VC HI AL c 0 0 0 0 0 0 0 1 c 1 1 1 1 1 1 1 1 c 0 0 0 0 1 1 1 1 c 0 0 1 1 0 0 1 1 c 0 1 0 1 0 1 0 1 Mnemonic NE(Q) MI CS(LO) LT LE VS LS c 1 1 1 1 1 1 1 c 1 1 1 1 1 1 1 cc 00 00 01 01 10 10 11 c 0 1 0 1 0 1 0 Timing: 10 oscillator clock cycles Memory: 1 program words
PAGE 498
TST Test an Operand Operation: S-0 TST Assembler Syntax: (parallel data bus move) TST S (move syntax - see the Move instruction description.) Description: Compare the low portion of the specified operand with zero. No result is stored, however the condition codes are affected. Input Operand(s) Precision: 32-bit integer. Output Operand Precision: n.a. CCR Condition Codes: C - Not affected. V - Always cleared. Z - Set if result is zero. Cleared otherwise. N - Set if result is negative.
PAGE 499
WAIT Wait for Interrupt Operation: Enter WAIT processing state and stop all internal processing. Wait for an unmasked interrupt to occur. Assembler Syntax: WAIT Description: When a WAIT instruction is executed, the processor enters the WAIT state. The internal clocks to the processor core, memories, and DMA are gated off and all activity in the processor is suspended until an unmasked interrupt occurs. However, the clock oscillator and the internal I/O peripheral clocks remain active.
PAGE 500
A.8 INSTRUCTION ENCODING SUMMARY The encoding for each instruction is provided with the instruction descriptions in subsection A.7. An instruction encoding summary is available upon request. Some instructions have legal operation codes but specify the same destination for two or more simultaneous operations. These instructions are called insane instructions.
PAGE 501
A.9 INSTRUCTION TIMING Figure A-7 shows the number of words and the number of clock cycles required for instruction execution. The symbols used reference other tables to complete the instruction word and cycle count. The number of words per instruction is dependent on the addressing mode and the type of parallel data bus move operation specified.
PAGE 502
Instruction    Words     Cycles
BScc           1 + ea    6 + jx
BSCLR          2         8 + jx
BSET           1 + ea    4 + mvb
BSR            1 + ea    6 + jx
BSSET          2         8 + jx
BTST           1 + ea    4 + mvb
CLR            1 + mv    2 + mv
CMP            1 + mv    2 + mv
CMPG           1 + mv    2 + mv
DEBUGcc        1         4
DEC            1 + mv    2 + mv
DO             2         6 + mv
DOR            2         8 + mv
ENDDO          1         2
EOR            1 + mv    2 + mv
EXT            1 + mv    2 + mv
EXTB           1 + mv    2 + mv
FABS.S         1 + mv    2+mv+da
FABS.X         1 + mv    2+mv+da
FADD.S         1 + mv    2+mv+da
FADD.X         1 + mv    2+mv+da
FADDSUB.S      1 + mv    2+mv+da
FADDSUB.
PAGE 503
FJScc 1 + ea 6 + jx FLOAT.S 1 + mv 2+mv+da FLOAT.X 1 + mv 2+mv+da FLOATU.S 1 + mv 2+mv+da FLOATU.X 1 + mv 2+mv+da FLOOR 1 + mv 2+mv+da FMPY//FADD.S 1 + mv 2+mv+da FMPY//FADD.X 1 + mv 2+mv+da FMPY//FADDSUB.S 1 + mv 2+mv+da FMPY//FADDSUB.X 1 + mv 2+mv+da FMPY//FSUB.S 1 + mv 2+mv+da FMPY//FSUB.X 1 + mv 2+mv+da FMPY.S 1 + mv 2+mv+da FMPY.X 1 + mv 2+mv+da FNEG.S 1 + mv 2+mv+da FNEG.X 1 + mv 2+mv+da FSCALE.S 1 + mv 2+mv+da FSCALE.X 1 + mv 2+mv+da FSCALE.
PAGE 504
Jcc 1 + ea 4 + jx JCLR 2 6 + jx JMP 1 + ea 4 + jx JOIN 1 + mv 2 + mv JOINB 1 + mv 2 + mv JScc 1 + ea 4 + jx JSCLR 2 6 + jx JSET 2 6 + jx JSR 1 + ea 4 + jx 2 6 + jx LEA 1 + ea 4 + le LRA 1 + lr 4 + lr LSL 1 + mv 2 + mv JSSET LSL #shift LSR 1 1 + mv LSR #shift 1 2 2 + mv 2 MOVE 1 + mv 2 + mv MOVEC 1 + ea 2 + mvc MOVEI 1 MOVEM 1 + ea 6 + mvm MOVEP 1 + ea 2 + mvp MOVES 1 + ea 2 + mvs MOVETA 1 + mv 2 + mv MPYS 1 + mv 2 + mv MPYU 1 + mv 2 + mv
PAGE 505
Instruction    Words     Cycles    Comments
RTR            1         4 + rx
RTS            1         4 + rx
SETW           1 + mv    2 + mv
SPLIT          1 + mv    2 + mv
SPLITB         1 + mv    2 + mv
STOP           1         n/a       Note 1
SUB            1 + mv    2 + mv
SUBC           1 + mv    2 + mv
TFR            1 + mv    2 + mv
TRAPcc         1         10
TST            1 + mv    2 + mv
WAIT           1         n/a       Note 2
Figure A-7 Instruction Timing Summary (Continued)
Note 1: The STOP instruction disables all internal clocks.
Note 2: The WAIT instruction takes a minimum of 16 clock cycles to execute when an internal interrupt is pending during the execution of the WAIT instruction.
A.9.
PAGE 506
Data ALU Operation (IEEE Mode)   + da Cycles   Cycles worst case   Comments
FABS.S                           das           6                   Worst case: res=1, den=1
FABS.X                           dax           4                   Worst case:
FADD.S                           das           8                   Worst case: res=1, den=2
FADD.X                           dax           6                   Worst case:
FADDSUB.S                        das           10                  Worst case: res=2, den=2
FADDSUB.X                        dax           6                   Worst case: den=2
FCLR                             0             0
FCMP                             dax           6                   Worst case: den=2
FCMPG                            dax           6                   Worst case: den=2
FCMPM                            dax           6                   Worst case: den=2
FCOPYS.S                         das           8                   Worst case: res=1, den=2
FCOPYS.X                         dax           6                   Worst case: den=2
FFcc                             daff          n/a
FFcc.
PAGE 507
Data ALU Operation (IEEE Mode)   + da Cycles   Cycles worst case   Comments
FNEG.X                           dax           4                   Worst case:
FSCALE.S                         dam           6                   Worst case: res=1, den=1
FSCALE.X                         dam           6                   Worst case: res=1, den=1
FSEEDD                           dam           6                   Worst case: res=1, den=1
FSEEDR                           dam           4                   Worst case: res=0 den=1
FSUB.S                           das           8                   Worst case: res=1, den=2
FSUB.X                           dax           6                   Worst case:
FTFR.S                           das           6                   Worst case: res=1, den=1
FTFR.
PAGE 508
res = number of multiplier de/unnormalized results.
den = number of multiplier source operands with U-tag or V-tag set + number of add/sub source operands with U-tag set.
i = 0, if den=0; 1 otherwise.

das = 2 * (res + i * (1 + den)) clock cycles
  res = number of de/unnormalized results.
  den = number of de/unnormalized source operands (U-tag set).
  i = 0, if den=0; 1 otherwise.

dax = 2 * i * (1 + den) clock cycles
  den = number of de/unnormalized source operands (U-tag set).
  i = 0, if den=0; 1 otherwise.
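The das and dax expressions can be evaluated directly. The short sketch below (function names chosen here for illustration) computes both terms from res and den as defined for each formula, and reproduces the worst-case figures listed in the timing table for FABS.S (6), FADD.S (8), FADDSUB.S (10) and the extended-precision entries (4 or 6).

```python
def das(res, den):
    """Extra clock cycles for single precision IEEE-mode operations.

    res: number of de/unnormalized results
    den: number of de/unnormalized source operands (U-tag set)
    """
    i = 0 if den == 0 else 1
    return 2 * (res + i * (1 + den))

def dax(den):
    """Extra clock cycles for extended precision IEEE-mode operations.

    den: number of de/unnormalized source operands (U-tag set)
    """
    i = 0 if den == 0 else 1
    return 2 * i * (1 + den)
```

With fully normalized operands and results (res = 0, den = 0) both terms evaluate to zero, so the base cycle count in the instruction timing summary applies unchanged.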
PAGE 509
If there are wait states, (i.e., assumption 4 is not applicable) then to each 1-word instruction timing a "+ap" term should be added and to each 2-word instruction a "+(2 * ap)" term should be added to account for the program memory wait states spent to fetch an instruction word to fill the pipeline. A.9.
PAGE 510
A.9.5 MOVEP Timing Summary
MOVEP Operation           + mvp Cycles          Comments
Register ↔ Peripheral     2 + aio
X Memory ↔ Peripheral     2 + ea + ax + aio     Note 1
Y Memory ↔ Peripheral     2 + ea + ay + aio     Note 1
P Memory ↔ Peripheral     4 + ea + ap + aio
Note 1: The ax(ay) term does not apply to MOVE IMMEDIATE DATA.
Figure A-12 MOVEP Timing Summary
If there are wait states, (i.e.
PAGE 511
A.9.7 LEA Timing Summary + le Cycles MOVEC Operation Update Addressing Modes 0 Long Displacement 2 Comments Figure A-14 LEA Timing Summary If there are wait states, (i.e., assumption 4 is not applicable) then to each 1-word instruction timing a "+ap" term should be added and to each 2-word instruction a "+(2 * ap)" term should be added to account for the program memory wait states spent to fetch an instruction word to fill the pipeline. A.9.
PAGE 512
If there are wait states, (i.e., assumption 4 is not applicable) then to each 1-word instruction timing a "+ap" term should be added and to each 2-word instruction a "+(2 * ap)" term should be added to account for the program memory wait states spent to fetch an instruction word to fill the pipeline. A.9.
PAGE 513
The term "2 * ap" comes from the two instruction fetches done by the RTS/RTR/RTI instruction to refill the pipeline. A.9.
PAGE 514
A.9.
PAGE 515
These restricted instructions include: at LA-2, LA-1 and LA: DO BCHG/BCLR/BSET LA, LC, SR, SP, SSH, or SSL BTST SSH JCLR/JSET/JSCLR/JSSET SSH LEA to LA, LC, SR, SP, SSH, or SSL LRA to LA, LC, SR, SP, SSH, or SSL MOVEC/M/P/S from SSH MOVEC/I/M/P/S to LA, LC, SR, SP, SSH, or SSL ANDI MR ORI MR at LA: any two word instruction (F)Jcc, JMP, (F)JScc, JSR, (F)Bcc, BRA, (F)BScc, BSR, LRA, REP, RESET, RTI, RTR, RTS, STOP, WAIT Other restrictions: BSR to (LA), if Loop Flag is set (F)BScc to (LA), if Loop Flag is set
PAGE 516
A.10.3 ENDDO Restrictions Due to pipelining, the ENDDO instruction must not be immediately preceded by any of the following instructions: BCHG/BCLR/BSET LA, LC, SR, SSH, SSL or SP LEA to LA, LC, SR, SSH, SSL or SP LRA to LA, LC, SR, SSH, SSL or SP MOVEC/I/M/S to LA, LC, SR, SSH, SSL or SP MOVEC/M/S from SSH ANDI MR ORI MR A.10.
PAGE 517
and 1. BCHG/BCLR/BSET SP 2. JCLR/JSET/JSCLR/JSSET SSH or SSL and 1. MOVEC/I/M/S to SP 2. JCLR/JSET/JSCLR/JSSET SSH or SSL and 1. LEA to SP 2. JCLR/JSET/JSCLR/JSSET SSH or SSL and 1. LRA to SP 2. JCLR/JSET/JSCLR/JSSET SSH or SSL Also, the instruction MOVEC SSH, SSH is illegal. A.10.6 R, N, and M Register Restrictions If an address register Rn is the destination of a MOVE instruction, the new contents will not be available for use as an address pointer until the second following instruction.
PAGE 518
A.10.8 REP Restrictions The REP instruction can repeat any single word instruction except the REP instruction itself and any instruction that changes program flow.
PAGE 519
PAGE 520
APPENDIX B DSP BENCHMARKS B.1 DSP96002 STANDARD DSP BENCHMARKS Program size and instruction cycle counts for the DSP56000/1 are in parentheses on the line following the DSP96002 program size and instruction cycle count. All floating-point data ALU operations are performed using single precision operations (".s" extension on opcode) rather than in extended precision (".x" extension on opcode).
PAGE 521
B.1.2 N Real Multiplies
c(I) = a(I) * b(I) , I=1,...,N
                                                        Program   ICycles
                                                        Words
    move    #aaddr,r0                                     1         1
    move    #baddr,r4                                     1         1
    move    #caddr,r1                                     1         1
    move                    x:(r0)+,d4.s  y:(r4)+,d6.s    1         1
    do      #n,end                                        2         3
    fmpy.s  d4,d6,d0        x:(r0)+,d4.s  y:(r4)+,d6.s    1         1
    move                    d0.s,x:(r1)+                  1         1
end
                                                         ---       ---
    Totals:                                               8        2N+7
                                                         (8        2N+7)
B.1.3 Real Update
d=c+a*b
                                                        Program   ICycles
                                                        Words
    move                    x:(r0),d4.s   y:(r4),d6.s     1         1
    fmpy.s  d4,d6,d1        x:(r1),d0.s                   1         1
    fadd.s  d1.s,d0.s                                     1         1
    move    d0.
PAGE 522
B.1.4 N Real Updates d(I) = c(I) + a(I) * b(I), I=1,2,...,N Program ICycles Words 1 1 move #aaddr,r0 move #baddr,r4 1 1 move #caddr,r1 1 1 move #daddr,r5 1 1 x:(r0)+,d4.s y:(r4)+,d6.s 1 1 x:(r1)+,d0.s 1 1 2 3 x:(r0)+,d4.s y:(r4)+,d6.s 1 1 x:(r1)+,d0.s d0.s,y:(r5)+ 1 1 move fmpy.s do d4,d6,d1 #N,_end fadd.s d1,d0 fmpy.s d4,d6,d1 _end --- --- Totals: 10 2N+9 (10 2N+9) B.1.
PAGE 523
B.1.6 Real * Complex Correlation Or Convolution (FIR Filter) cr(n) + jci(n) = SUM(I=0,...,N-1) {( ar(I) + jai(I)) * b(n-I)} cr(n) = SUM(I=0,...,N-1) { ar(I) * b(n-I) } ci(n) = SUM(I=0,...,N-1) { ai(I) * b(n-I) } move Program ICycles Words 1 1 #aaddr,r0 fclr d0 #baddr+n,r4 1 1 fclr d1 x:(r0),d4.s 1 1 fclr d2 x:(r4)-,d5.s y:(r0)+,d6.s 1 1 do #n,end 2 3 fmpy d4,d5,d2 fadd.s d2,d1 x:(r0),d4.s 1 1 fmpy d6,d5,d2 fadd.s d2,d0 x:(r4)-,d5.s y:(r0)+,d6.s 1 1 fadd.
PAGE 524
B.1.8 N Complex Multiplies cr(I) + jci(I) = ( ar(I) + jai(I) ) * ( br(I) + jbi(I) ), I=1,...,N cr(I) = ar(I) * br(I) - ai(I) * bi(I) ci(I) = ar(I) * bi(I) + ai(I) * br(I) R1 → cr,ci R0 → ar,ai R4 → br,bi D5 = ar D6 = bi D4 = br D7 = ai Program Words 1 move #aaddr,r0 move #baddr,r4 1 1 move #caddr-1,r1 1 1 1 1 1 1 1 1 2 3 y:(r0),d7.s 1 1 move x:(r0),d5.s fmpy.s d6,d5,d1 fmpy.s d4,d7,d2 do y:(r4),d6.s x:(r4)+,d4.s y:(r0)+,d7.s #N,_end 1 fmpy d6,d7,d2 fadd.s d2,d1 fmpy.
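When checking the output of this benchmark, a host-side reference for the cr(I) and ci(I) equations is useful. The following sketch is not a translation of the assembly listing; it simply evaluates the two equations element by element, with a hypothetical function name and complex values passed as (real, imaginary) pairs.

```python
def complex_multiply(a, b):
    """Element-wise complex product of two equal-length sequences.

    a, b: sequences of (real, imag) pairs.
    Returns a list of (cr, ci) pairs with
        cr = ar*br - ai*bi
        ci = ar*bi + ai*br
    """
    result = []
    for (ar, ai), (br, bi) in zip(a, b):
        cr = ar * br - ai * bi
        ci = ar * bi + ai * br
        result.append((cr, ci))
    return result
```

Each output element costs four real multiplies and two real additions, which is the work the benchmark loop schedules per iteration.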
PAGE 525
B.1.9 Complex Update dr + jdi = ( cr + jci ) + ( ar + jai ) * ( br + jbi ) dr = cr + ar * br - ai * bi R0 → a R4 → b R1 → c R → d di = ci + ar * bi + ai * br Program Words y:(r1),d1.s 1 move 1 move x:(r0),d5.s y:(r4),d6.s 1 1 fmpy.s d6,d5,d2 x:(r4),d4.s y:(r0),d7.s 1 1 fmpy d4,d7,d2 fadd.s d2,d1 x:(r1),d0.s 1 1 fmpy d4,d5,d2 fadd.s d2,d1 1 1 fmpy d6,d7,d2 fadd.s d2,d0 d1.s,y:(r2) 1 1 fsub.s d2,d0 1 1 1 1 --- --- 8 8 (7 7) move d0.s,x:(r2) Totals: B.1.
PAGE 526
Program ICycles Words 1 1 move #aaddr+1,r0 move #3,n0 1 1 move #baddr,r4 1 1 move #caddr,r1 1 1 move #daddr-1,r5 1 1 1 1 1 1 2 3 move x:(r0)-,d4.s fclr d2 y:(r4)+,d6.s x:(r0)+n0,d5.s y:(r5),d0.s do #n,end fmpy d5,d6,d2 fadd.s d2,d0 x:(r1)+,d1.s y:(r4)+,d7.s 1 1 fmpy d4,d7,d2 fadd.s d2,d1 x:(r1)+,d0.s d0.s,y:(r5)+ 1 1 fmpy d4,d6,d2 fsub.s d2,d1 x:(r0)-,d4.s y:(r4)+,d6.s 1 1 fmpy d5,d7,d2 fadd.s d2,d0 x:(r0)+n0,d5.s d1.s,y:(r5)+ 1 1 1 1 1 1 end fadd.
PAGE 527
move move move move #aaddr,r0 #baddr,r4 #caddr,r1 r1,r6 move #daddr,r5 move r5,r2 move move fmpy.s d4,d6,d2 fmpy.s d5,d6,d3 fmpy d5,d7,d2 do #N,_end fmpy d4,d7,d2 fmpy d4,d6,d2 fmpy d5,d6,d3 fmpy d5,d7,d3 x:(r4),d6.s x:(r0),d4.s fadd.s d2,d0 y:(r0)+,d5.s x:(r1)+,d0.s y:(r4)+,d7.s x:(r4),d6.s fsub.s fadd.s fadd.s fadd.s x:(r0),d4.s d0.s,x:(r5)+ x:(r1)+,d0.s x:(r4),d6.s d2,d0 d2,d1 d3,d1 d2,d0 y:(r6)+,d1.s y:(r0)+,d5.s y:(r4)+,d7.s d1.s,y:(r2)+ _end Totals: B.1.
PAGE 528
B.1.12 Nth Order Power Series (Real) c = SUM (I=0,...,N) { a(I) * bI } move #baddr,r4 move #aaddr,r0 c = aNbN + aN-1bN-1 + ... + a1b1 + a0 Program Words 1 d2 do #N,end fmpy d6,d7,d1 fadd.s d2,d0 fmpy.s d6,d4,d2 1 1 1 y:(r4),d7.s 1 1 x:(r0)+,d0.s y:(r4),d6.s 1 1 2 3 x:(r0)+,d4.s 1 1 d1.s,d6.s 1 1 1 --- 1 --- 9 2N+8 (9 2N+8) move fclr ICycles end fadd.s d2,d0 Totals: B.1.
PAGE 529
B.1.14 N Cascaded Real Biquad IIR Filters w(n) = x(n) - a1 * w(n-1) - a2 * w(n-2) y(n) = w(n) + b1 * w(n-1) + b2 * w(n-2) X Memory Organization Y Memory Organization b1N Coef. + 4N-1 b2N a1N a2N wN(n-1) Data + 2N-1 . wN(n-2) . . b11 . b21 w1(n-1) a11 R1,R0 → w1(n-2) Data R4 → a21 Coef.
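The two difference equations for each section can be exercised with a short host-side reference model. This sketch assumes a straightforward cascade in which the output y(n) of one section is the input x(n) of the next; the function name and the list-based state layout are illustrative and do not mirror the X/Y memory organization used by the DSP code.

```python
def biquad_cascade(x, sections, state):
    """Run one input sample through N cascaded biquad sections.

    x:        input sample x(n)
    sections: list of (a1, a2, b1, b2) coefficient tuples, one per section
    state:    list of [w(n-1), w(n-2)] lists, one per section (updated in place)
    Returns the output sample of the last section.
    """
    for (a1, a2, b1, b2), w in zip(sections, state):
        wn = x - a1 * w[0] - a2 * w[1]      # w(n) = x(n) - a1*w(n-1) - a2*w(n-2)
        y = wn + b1 * w[0] + b2 * w[1]      # y(n) = w(n) + b1*w(n-1) + b2*w(n-2)
        w[1] = w[0]                         # shift the delay line
        w[0] = wn
        x = y                               # feed the next section
    return x
```

A usage example: with `sections = [(0.5, 0.0, 0.0, 0.0)]` and `state = [[0.0, 0.0]]`, feeding the impulse 1, 0, 0 yields 1.0, -0.5, 0.25.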
PAGE 530
DSP96002 IMPLEMENTATION ProgramICycles Words move #$ffffffff,m0 2 2 move m0,m4 1 1 move m0,m1 1 1 move #data,r0 2 2 move r0,r1 1 1 move #coef,r4 2 2 x:input,d0.s 1 2 x:(r0)+,d4.s y:(r4)+,d6.s 1 1 2 3 movep fclr d1 do #n,end fmpy d4,d6,d1 fadd.s d1,d0 x:(r0)+,d5.s y:(r4)+,d6.s 1 1 fmpy d5,d6,d1 fsub.s d1,d0 d5.s,x:(r1)+ y:(r4)+,d6.s 1 1 fmpy d4,d6,d1 fsub.s d1,d0 x:(r0)+,d4.s y:(r4)+,d6.s 1 1 fmpy d5,d6,d1 fadd.s d1,d0 d0.s,x:(r1)+ y:(r4)+,d6.
PAGE 531
; +Sine value (1/2 cycle) in Y memory ; Table size can be i*points/2, i=1,2,... ; ; Macro Call - metr2a points,data,coef,coefsize ; ; points number of points (2 - 2,147,483,648, power of 2) ; data start of data buffer ; coef start of 1/2 cycle sine/cosine table ; coefsize number of table points in sine/cosine table ; = i*points/2, i=1,2,...
PAGE 532
inc d0 m0,m4 lsr d2 m0,m5 move d2.l,n6 do n1,_end_pass move r2,r0 move d0.l,n2 lsr d1 m2,r6 dec d1 d1.l,n0 move d1.l,n1 move n0,n4 move n0,n5 lea (r0)+n0,r1 lea (r0)-,r4 lea (r1)-,r5 do n2,_end_grp move x:(r6)+n6,d9.s y:,d8.s move x:(r1)+,d6.s y:,d7.s fmpy.s d8,d7,d3 y:(r5),d2.s fmpy.s d9,d6,d0 y:(r4),d5.s fmpy.s d9,d7,d1 y:(r1),d7.s do n0,_end_bfy fmpy d8,d6,d2 fadd.s fmpy d3,d0 x:(r0),d4.s d2.s,y:(r5)+ d8,d7,d3 faddsub.s d4,d0 x:(r1)+,d6.s d5.
PAGE 533
; ; Faster FFT using Programming Tricks found in Typical FORTRAN Libraries ; ; First two passes combined as a four butterfly loop since ; multiplies are trivial. ; 2.25 cycles internal (4 cycles external) per Radix 2 ; butterfly. ; Middle passes performed with traditional, triple-nested DO loop. ; 4 cycles internal (8 cycles external) per Radix 2 butterfly ; plus overhead. ; being used to minimize overhead.
PAGE 534
; r4 = b pointer in and out ; r1 = c pointer in and out ; r5 = d pointer in and out ; n5 = 2 ; move #points,d1.l move #passes,d9.l move #data,d0.l move #coef,m2 move #coefsize,d2.l lsr d1 d0.l,r0 lsr d1 r0,r2 add d1,d0 d1.l,d8.l add d1,d0 d0.l,r4 add d1,d0 d0.l,r1 lsr d2 d0.l,r5 lsr d2 r0,r6 move #2,n5 move d2.l,n6 move #-1,m0 move m0,m1 move m0,m4 move m0,m5 move m0,m6 move x:(r0),d1.s move x:(r1),d0.s move x:(r5)-,d2.s move y:(r5)+,d4.s faddsub.
PAGE 535
faddsub.s d1,d7 d0.s,x:(r4) y:(r1)+,d2.s faddsub.s d3,d2 d1.s,x:(r5)- faddsub.s d2,d6 x:(r0)-,d1.s d4.s,y:(r5)+n5 faddsub.s d3,d5 x:(r1)-,d0.s d2.s,y:(r4)+ faddsub.s d1,d0 x:(r5),d2.s d6.s,y:(r0)+ ftfr.s d5,d4 x:(r4),d5.s d3.s,y:(r1) faddsub.s d5,d2 d7.s,x:(r1)+ y:(r4),d7.s _twopass move d4.s,y:(r5)+ ; ; Middle passes ; tfr d9,d3 #4,d0.l clr d2 d8.l,d1.l sub d0,d3 d2.l,m6 do d3.l,_end_pass move d0.l,n2 move r2,r0 lsr d1 m2,r6 dec d1 d1.l,n0 dec d1 d1.
PAGE 536
fmpy fmpy d9,d6,d0 fsub.s d9,d7,d1 d1,d2 faddsub.s d5,d2 fmpy d8,d6,d2 fadd d3,d0 fmpy d8,d7,d3 faddsub.s d4,d0 d0.s,x:(r4) d4.s,x:(r5) y:(r0)+,d5.s y:(r1),d7.s x:(r0),d4.s d2.s,y:(r5)+ x:(r1)+,d6.s d5.s,y:(r4)+ _end_bfy move (r1)+n1 fmpy d9,d6,d0 fsub.s d1,d2 d0.s,x:(r4) y:(r0)+,d5.s fmpy d9,d7,d1 faddsub.s d5,d2 d4.s,x:(r5) y:(r1),d7.s fmpy d8,d6,d2 fadd.s x:(r0),d4.s d2.s,y:(r5)+ d3,d0 move x:(r6)+n6,d9.s y:,d8.s fmpy d8,d7,d3 faddsub.s d4,d0 x:(r1)+,d6.s d5.
PAGE 537
do n2,_end_next fmpy d9,d6,d0 fsub.s d1,d2 d0.s,x:(r4) y:(r0)+,d5.s fmpy d9,d7,d1 faddsub.s d5,d2 d4.s,x:(r5) y:(r1),d7.s fmpy d8,d6,d2 fadd.s x:(r0),d4.s d2.s,y:(r5)+ x:(r6)+n6,d9.s y:,d8.s d3,d0 move fmpy d8,d7,d3 faddsub.s d4,d0 x:(r1)+,d6.s d5.s,y:(r4)+ fmpy d9,d6,d0 fsub.s d1,d2 d0.s,x:(r4) y:(r0)+n0,d5.s fmpy d9,d7,d1 faddsub.s d5,d2 d4.s,x:(r5) y:(r1),d7.s fmpy d8,d6,d2 fadd.s x:(r0),d4.s d2.s,y:(r5)+n5 fmpy d8,d7,d3 faddsub.s d4,d0 x:(r1)+n1,d6.s d5.
PAGE 538
move fmpy d8,d7,d3 faddsub.s d4,d0 x:(r6)+n6,d9.s y:,d8.s x:(r1)+n1,d6.s d5.s,y:(r4)+n4 _end_last B.1.15.
PAGE 539
; ; t3 = dr + br ; t4 = dr - br ; ; t5 = ai + ci ; t6 = ai - ci ; ; t7 = bi + di ; t8 = bi - di ; ; t9 = t2 + t8 ; t10 = t2 - t8 ; ; t11 = t6 + t4 ; t12 = t6 - t4 ; ; ar’ = t1 + t3 ; t13 = t1 - t3 ; ; ai’ = t5 + t7 ; t14 = t5 - t7 ; ; br’ = t9*wr1 + t11*wi1 ; bi’ = t11*wr1 - t9*wi1 ; ; cr’ = t13*wr2 + t14*wi2 ; ci’ = t14*wr2 - t13*wi2 ; ; dr’ = t10*wr3 + t12*wi3 ; di’ = t12*wr3 - t10*wi3 ; ; Address pointers are organized as follows: ; B-20 ; r0 = ar,ai,br,bi point
PAGE 540
; r6 = temp storage pointer n6 = not used ; r7 = not used n7 = not used ; ; Alters Data ALU Registers ; d0 d4 d8 ; d1 d5 d9 ; d2 d6 ; d3 d7 ; ; Alters Address Registers ; r0 n0 m0 ; r1 n1 m1 ; r2 n2 m2 ; r3 n3 m3 ; r4 n4 m4 ; r5 n5 m5 ; r6 m6 ; ; Alters Program Control Registers ; pc sr ; ; Uses 6 locations on System Stack ; ; This program has not been exhaustively tested and may contain errors.
PAGE 541
move #temp,r2 ;initialize temp storage pointers 2 2 move (r2)+,r6 ; 1 1 move #0,r3 ;initialize group index counter 1 1 move #coef+table/4,r1 ;initialize wr (cos) pointer 2 2 move #coef,r5 ;initialize wi (sin) pointer 2 2 " ; ; Perform all FFT passes with triple nested DO loops ; do #@cvi(@log(points)/@log(4)+0.
PAGE 542
fmpy d5,d9,d0 fsub.s d1,d2 y:(r1)-n1,d8.s 1 fmpy d6,d8,d1 fadd.s d0,d3 y:(r5)-n5,d9.s 1 fmpy.s d6,d9,d0 d3.s,x:(r4) y:(r2),d4.s 1 fmpy.s d4,d8,d3 d2.s,y:(r4)+n4 1 fmpy d4,d9,d2 fsub.s d0,d3 y:(r1)-n1,d8.s 1 fadd.s d2,d1 y:(r5)-n5,d9.s 1 move d1.s,x:(r4) d3.s,y: 1 _end_bfy move #coef,r5 ;point at wi0 2 move #coef+table/4,r1 ;point at wr0 2 move #0,r3 ;reset group index counter 1 _end_grp move n0,d0.l ;get butterflies per group 1 lsr d0.l ; 1 lsr d0.l n2,d1.l ;divide butterflies/group by 4 1 lsl d1.l d0.
PAGE 543
Notation and symbols: x(n) - Input sample at time n. d(n) - Desired signal at time n. f(n) - FIR filter output at time n. H(n) - Filter coefficient vector at time n. H={h0,h1,h2,h3} X(n) - Filter state variable vector at time n. X={x0,x1,x2,x3} u - Adaptation gain. ntaps - Number of coefficient taps in the filter. For this example, ntaps=4.
PAGE 544
org y:0 ds ntaps org y:10 dsig ds 1 xsig ds 1 org p:$50 cbuf start move #sbuf,r0 ;point to state buffer move #cbuf,r4 ;point to coefficient buffer move r4,r5 ;extra pointer move #ntaps-1,m0 ;mod on pointers move #ntaps-1,m4 move #ntaps-1,m5 move #-3,n0 ;final adjustment move #u,d7.s ;adaptation constant main fclr d1 fclr d0 rep #ntaps fmpy d4,d5,d1 fadd.s d1,d0 y:xsig,d4.s d4.s,x:(r0)+ fadd.s d1,d0 y:(r4)+,d5.s x:(r0)+,d4.s x:(r0)-,d4.s move y:(r4)+,d5.
PAGE 545
On the delayed LMS algorithm, the coefficients are updated with the error from the previous iteration while the FIR filter is being computed for the current iteration. In the following implementation, two coefficients are updated with each pass of the loop. Delayed LMS Algorithm iter 50 ;Number of LMS iterations conv_fact equ 0.01 ;Convergence factor org x:$0 ds 11 org y:$0 coef ds 10 ;LMS coefficients e dc 0.
PAGE 546
move x:(r0)+,d6.s y:(r4)+,d7.s fmpy.s d7,d6,d1 x:(r0)+,d4.s y:(r4)+,d5.s fmpy.s d9,d4,d2 fmpy d5,d4,d0 fadd.s d7,d2 x:(r0)+,d6.s do #4,_lms_loop fmpy d9,d6,d3 fadd.s d0,d1 fmpy d7,d6,d0 fadd.s d5,d3 fmpy d9,d4,d2 fadd.s d0,d1 fmpy d5,d4,d0 fadd.s d7,d2 y:(r4)+,d7.s x:(r0)+,d4.s d2.s,y:(r5)+ y:(r4)+,d5.s x:(r0)+,d6.s d3.s,y:(r5)+ _lms_loop fmpy d9,d6,d3 fadd.s d0,d1 fadd.s d5,d3 d2.s,y:(r5)+ (r0)- move d3.s,y:(r5)+ move y:dsig,d2.s fsub.s d1,d2 move d2.
PAGE 547
COEFFICIENT AND STATE VARIABLE STORAGE
R0 → x: S1 S2 S3 Sx    M0=3 (mod 4)
R4 → y: k1 k2 k3       M4=2 (mod 3)
SINGLE SECTION (figure: lattice section with inputs t, s, coefficient k, delay Z-1, outputs t’, s’)
equations: t’=s*k+t, t’→t
           s’=t*k+s
PAGE 548
DSP56000 IMPLEMENTATION Program ICycles Words move move move move #state,r0 #N,m0 #k,r4 #N-1,m4 ;point to state variable storage ;N=number of k coefficients ;point to k coefficients ;mod for k’s movep y:datin,b ;get input move b,x:(r0)+ y:(r4)+,y0 ;save 1st state, get k do #N,_elat ;do each section move x:(r0),a b,y1 ;get s, copy t for mul macr y1,y0,a a,y0 ;t*k+s, copy s macr x0,y0,b a,x:(r0)+ y:(r4)+,y0 ;s*k+t, sv st, nxt k _elat move x:(r0)-,x0 y:(r4)-,y0 ;adj r0,r4 w/dummy loads movep b,y:datout
PAGE 549
B.1.
PAGE 550
DSP56000 IMPLEMENTATION Program ICycles Words move #k+N-1,r0 ;point to k move #N-1,m0 ;number of k’s-1 move #state,r4 ;point to filter states move m0,m4 ;mod for states movep y:datin,a ;get input sample move x:(r0)-,x0 y:(r4)+,y0 ;first k, first s macr -x0,y0,a x:(r0)-,x0 y:(r4)-,y0 ;t’=t-k*s 1 1 1 1 do #n-1,_endlat macr -x0,y0,a ;do sections ;t’-k*s, save state 2 1 3 1 ;copy t’,get s again ;fnd s,get s,get k 1 1 1 1 b,y:(r4)+ move a,x1 y:(r4)+,b macr x1,x0,b x:(r0)-,x0 y:(r4)-,y0 _endlat move b,
PAGE 551
B.1.
PAGE 552
SINGLE SECTION EQUATIONS: ∑ t t’ t’=t-k*s k s’=s+k*t’ t’→t output= sum(s’*w) k’ ∑ s’ s Z-1 w DSP56000 IMPLEMENTATION Program ICycles Words move #k,r0 move #2*N,m0 ;point to coefficients ;mod 2*(# of k’s)+1 move #state,r4 ;point to filter states move #N,m4 ;mod on filter states movep y:datin,a ;get input sample move x:(r0)+,x0 y:(r4)-,y0 ;get first k, first s do #N,_el ;do filter macr -x0,y0,a b,y:(r4)+ ;t-k*s, save prev s 1 2 1 1 3 1 move macr 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 _el move c
PAGE 553
Program Words move #k,r0 ;point to coefficients move #2*N,m0 ;mod 2*(# of k’s)+1 move #state,r4 ;point to filter states move #N,m4 ;mod on filter states move p y:datin,d1 move ;get input sample #2,n4 move do x:(r0)+,d5.s y:(r4)-,d6.s #N,_elat fmpy d5,d6,d0 ICycles fadd.s d0,d3 fadd.s d0,d1 d6.s,d3.s fmpy.s d5,d1,d0 d3.s,y:(r4)+n4 x:(r0)+,d5.s y:(r4)-,d6.s 1 1 1 1 2 3 1 1 1 1 1 1 1 1 _elat fadd.s d0,d3 fclr d0 d3.s,y:(r4)+ 1 1 fclr d1 d1.
PAGE 554
B.1.
PAGE 555
SINGLE SECTION ∑ q t EQUATIONS: t’ t’=t*q-k*s u’=t*k+s*q t’→t k k’ output=sum (w*u’) ∑ u’ q s Z-1 u w DSP56000 IMPLEMENTATION Program ICycles Words move move #coef,r0 #3*N,m0 ;point to coefficients ;mod on coefficients move move #state,r4 #N,m4 ;point to state variables ;mod on filter states movep y:datin,y0 ;get input sample move do mpy x:(r0)+,x1 ;get first Q in table #order,_endnlat x1,y0,a x:(r0)+,x0 y:(r4),y1 ;q*t, get k, get s macr -x0,y1,a b,y:(r4)+ ;q*t-k*s, save new s 1 2 1
PAGE 556
Program ICycles Words move #coef,r0 ;point to coefficients move #3*N,m0 ;mod on coefficients move #state,r4 ;point to state variables move #N,m4 ;mod on filter states move p y:datin,d5.s ;get input sample move do x:(r0)+,d6.s ;get q 1 1 2 3 ; 1 1 1 1 1 1 1 1 1 1 d3.s,y:(r4)+ ;save 2nd s 1 1 #N,_elat t*q fmpy d5,d6,d2 k*w+q*s ; get k get s fadd.s d1,d3 x:(r0)+,d4.s y:(r4)+,d7.s k*s save s fmpy.s d4,d7,d0 ; t*k d3.s,y:(r4)+ fmpy d5,d4,d1 w*q-k*s ; q*s fsub.
PAGE 557
B.1.21 1x3 3x3 and 1x4 4x4 Matrix Multiply 1x3 3x3 Matrix Multiply Program ICycles Words move #mat_a,r0 ;point to A matrix move #2,m0 ;mod 3 move #mat_b,r4 ;point to B matrix move #-1,m4 ;set for linear addressing move #mat_c,r1 ;output C matrix move x:(r0)+,d4.s y:(r4)+,d5.s ;a11,b11 1 1 fmpy.s d4,d5,d3 x:(r0)+,d4.s y:(r4)+,d5.s ;a12,b21 1 1 fmpy.s d4,d5,d0 x:(r0)+,d4.s y:(r4)+,d5.s ;a13,b31 1 1 fmpy d4,d5,d3 fadd.s d3,d0 x:(r0)+,d4.s y:(r4)+,d5.
PAGE 558
fmpy d7,d4,d0 fadd.s d2,d1 fmpy.s d7,d3,d2 y:(r4)+,d7.s ;b23 1 1 d1.s,x:(r1)+ y:(r4)+,d7.s ;b33 1 1 fmpy d7,d5,d2 fadd.s d2,d0 y:(r4)+,d7.s ;b43 1 1 fmpy d7,d6,d2 fadd.s d2,d0 y:(r4)+,d7.s ;b14 1 1 fmpy d7,d4,d1 fadd.s d2,d0 y:(r4)+,d7.s ;b24 1 1 d0.s,x:(r1)+ y:(r4)+,d7.s ;b34 1 1 y:(r4)+,d7.s ;b44 1 1 fmpy.s d7,d3,d0 fmpy d7,d5,d0 fadd.s d0,d1 fmpy d7,d6,d0 fadd.s d0,d1 1 1 fadd.s d0,d1 1 1 1 1 move d1.s,x:(r1)+ --- --Totals: 19 19 B.1.
PAGE 559
rep #N-1 mac x0,y0,a macr x0,y0,a move ;sum 1 1 x:(r1)+,x0 y:(r5)+n5,y0 1 2 (r4)+ ;finish, next column B 1 1 a,y:(r6)+ ;save output 1 1 move (r0)+n0 ;next row A 1 1 move #mat_b,r4 ;first element B 1 1 ----19 ----- _ecols _erows
((8+(N-1))N+5)N+8 = N**3 + 7*N**2 + 5*N + 8
At a DSP56000/1 clock speed of 20.5 MHz, a [10x10][10x10] can be computed in .1715 ms.
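The cycle count formula and the quoted timing can be checked with a few lines of arithmetic. The sketch below assumes that one instruction cycle takes two clock cycles; the function names are illustrative and are not part of the manual.

```python
def matmul_icycles(n: int) -> int:
    """Instruction cycles for an [NxN][NxN] multiply, using the formula above."""
    return ((8 + (n - 1)) * n + 5) * n + 8


def matmul_time_ms(n: int, clock_hz: float = 20.5e6, clocks_per_icycle: int = 2) -> float:
    """Execution time in milliseconds.

    Assumption: each instruction cycle takes `clocks_per_icycle` clock cycles.
    """
    return matmul_icycles(n) * clocks_per_icycle / clock_hz * 1e3


if __name__ == "__main__":
    print(matmul_icycles(10))
    print(matmul_time_ms(10))
```

Under that assumption the N=10 case evaluates to 1758 instruction cycles, which agrees with the quoted figure of about .1715 ms.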
PAGE 560
B.1.23 N Point 3x3 2-D FIR Convolution The two dimensional FIR uses a 3x3 coefficient mask: c(1,1) c(1,2) c(1,3) c(2,1) c(2,2) c(2,3) c(3,1) c(3,2) c(3,3) Stored in Y memory in the order: c(1,1), c(1,2), c(1,3), c(2,1), c(2,2), c(2,3), c(3,1), c(3,2), c(3,3) The image is an array of 512x512 pixels. To provide boundary conditions for the FIR filtering, the image is surrounded by a set of zeros such that the image is actually stored as a 514x514 array. i.e. 514 ... 0 ... . . 512 . . . 0 0 .
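The zero border and the 3x3 mask can be illustrated with a small high-level sketch. It assumes the mask is applied without flipping, so c(1,1) multiplies the upper-left neighbour; the function name and the list-of-lists image layout are illustrative only.

```python
def fir2d_3x3(image, mask):
    """Apply a 3x3 mask to an image that is padded with a one-pixel zero border."""
    rows, cols = len(image), len(image[0])
    # Surround the image with zeros, e.g. 512x512 becomes 514x514.
    padded = [[0.0] * (cols + 2)]
    padded += [[0.0] + list(row) + [0.0] for row in image]
    padded += [[0.0] * (cols + 2)]
    out = [[0.0] * cols for _ in range(rows)]
    for n in range(rows):
        for m in range(cols):
            acc = 0.0
            for i in range(3):
                for j in range(3):
                    # Assumption: no mask flip, mask[0][0] is c(1,1).
                    acc += mask[i][j] * padded[n + i][m + j]
            out[n][m] = acc
    return out
```

Because the border is stored with the image, the inner loop needs no boundary tests, which is the reason the manual gives for storing the 514x514 array.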
PAGE 561
r0 r1 r2 →image(n,m) →image(n+514,m) →image(n+2*514,m) r4 r5 →FIR coefficients →output image image(n,m+1) image(n,m+2) image(n+514,m+1) image(n+514,m+2) image(n+2*514,m+1) image(n+2*514,m+2) DSP56000 IMPLEMENTATION Program Words ;point to coefficients 1 ;mod 9 1 ;top boundary 1 ;left of first pixel 1 ;left of first pixel 2nd row 1 move move move move move #mask,r4 #8,m4 #image,r0 #image+514,r1 #image+2*514,r2 move move #2,n1 n1,n2 ;adjustment for end of row 1 1 1 1 move #imageout,r5 ;output i
PAGE 562
DSP96002 IMPLEMENTATION Program ICycles Words 1 1 move #mask,r4 ;point to coefficients move #8,m4 ;mod 9 1 1 move #image,r0 ;top boundary 1 1 move #image+514,r1 ;left of first pixel 1 1 move #image+2*514,r2 ;left of first pixel 2nd row 1 1 move #2,n1 ;adjustment for end of row 1 1 move n1,n2 1 1 move #imageout,r5 1 1 move x:(r0)+,d4.s y:(r4)+,d5.s ;preload, get c(1,1) 1 1 fmpy.s d4,d5,d0 x:(r0)+,d4.s y:(r4)+,d6.
PAGE 563
B.1.24 Table Lookup with Linear Interpolation Between Points
This performs a table lookup and linear interpolation between points in the table. The spacing between the known values (breakpoints) is assumed to be constant. No range checking is performed on the input number because it is assumed that previous calculations have already performed limiting and range checking.
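As a rough high-level illustration of the method, the following sketch performs the same steps: adjust the input relative to the first index, scale by the reciprocal of the index spacing, split the result into integer and fractional parts, and interpolate. The function and parameter names are illustrative, and, as in the description, no range checking is done.

```python
import math


def lookup_interp(x, table, first_index, index_spacing):
    """Table lookup with linear interpolation between equally spaced breakpoints."""
    pos = (x - first_index) * (1.0 / index_spacing)  # scaled position in the table
    i = math.floor(pos)                              # index of the lower breakpoint
    frac = pos - i                                   # fractional distance to the next one
    return table[i] + frac * (table[i + 1] - table[i])
```

For example, with breakpoints 0, 10 and 20 spaced 5 apart starting at 0, an input of 2.5 falls halfway between the first two entries.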
PAGE 564
indspc equ 5.0 ;index spacing rindspc equ 1.0/indspc ;reciprocal of index spacing move #table,n0 ;point to start of table move #firstindex,d6.s ;value of first index move #rindspc,d7.s fsub.s d6,d0 ;adjust input relative to index 1 1 fmpy.s d7,d0,d0 ;reduce range and create index 1 1 floor d0,d1 ;get index 1 1 int d1 1 1 fsub.s d2,d0 1 1 ;clear address ALU pipe 1 1 ;reciprocal of index spacing d1.s,d2.s ;convert index to int,copy int part d1.
PAGE 565
rmin    equ -3.14159
range   equ 2*3.14159
o_range equ 1.0/range
Program                                           ICycles Words
move   #range,d7.s    ;load desired range
move   #rmin,d2.s     ;load range min
move   #o_range,d3.s  ;load reciprocal of range
fadd.s d2,d0          ;adjust to rmin                1     1
fmpy.s d0,d3,d0       ;scale the input               1     1
floor  d0,d1          ;get integer part              1     1
fsub.s d1,d0          ;get fractional part           1     1
fmpy.s d7,d0,d0       ;spread out fraction to range  1     1
fadd.s d2,d0          ;adjust to rmin                1     1
                                                    ---   ---
Totals:                                              6     6
The output is in d0.
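The same reduction can be sketched in a high-level language. The version below uses the general form (x - rmin)/range for the first step; for the symmetric constants shown above, where range = -2*rmin, adding rmin instead differs only by one whole period and gives the same result. The function name and defaults are illustrative.

```python
import math


def reduce_range(x, rmin=-math.pi, rng=2.0 * math.pi):
    """Map x into the interval [rmin, rmin + rng)."""
    o_range = 1.0 / rng                  # reciprocal of the range
    scaled = (x - rmin) * o_range        # position measured in whole periods
    frac = scaled - math.floor(scaled)   # keep only the fractional part
    return frac * rng + rmin             # spread the fraction over the range
```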
PAGE 566
Program Words 1 ICycles #2.0,d2.s 2 2 d2.s,d3.s 1 1 d2.s,d4.s 1 1 1 1 1 1 --- --- 7 7 fseedd d5,d4 fmpy.s d5,d4,d5 fmpy d0,d4,d0 fsub.s d5,d2 fmpy.s d5,d2,d5 fmpy d0,d4,d0 fsub.s d5,d3 fmpy.s d0,d3,d0 Totals: 1 Operation table: d0 (dividend) / 0.0 number infinity / d5 (divisor) ------------------------------------/ NaN NaN 0.0 number NaN NaN NaN 0.0 infinity number NaN infinity B.1.
PAGE 567
--- ---
Totals: 4 4
2. Static rotate right 1-32 bits. The 32 bit integer to be rotated is in d0.l. The number of bits to rotate is N. The resulting carry is the value of bit N-1 of the register. For example, if N=3 (three bit rotate right), then the resulting carry will be the value of bit 2 of the register.
d0.l,d1.
PAGE 568
Totals: 10 10 The following code assumes a rotating model of the form: 31 0 In this model, the carry does not participate in the rotations. The carry assumes the value of the bit that was rotated around the end of the register. 1. Static rotate left 0-32 bits. The 32 bit integer to be rotated is in d0.l. The number of bits to rotate is N. The resulting carry is the value of bit 32-N of the register.
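A minimal sketch of this rotating model, written with two shifts and an OR as in the listings: the carry is computed separately and does not take part in the rotation. The function name is illustrative, and the carry for a zero count follows the special case stated for the dynamic rotate left (the most significant bit).

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32 bit integer left by n (0..32) bits; return (result, carry)."""
    if not 0 <= n <= 32:
        raise ValueError("rotate count must be in 0..32")
    x &= MASK32
    # Shift one part up, the other part down, then merge the bits together.
    result = ((x << n) | (x >> (32 - n))) & MASK32
    # Carry is bit 32-N of the original value; for N=0 use the most significant bit.
    carry = (x >> 31) & 1 if n == 0 else (x >> (32 - n)) & 1
    return result, carry
```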
PAGE 569
                 ;shift other part      1 1
or   d1,d0       ;merge bits together   1 1
                                       --- ---
Totals:                                 4 4
3. Dynamic rotate left 0-32 bits. The 32 bit integer to be rotated is in d0.l. The number of bits to rotate is in d2.l. In the special case of a zero shift count, the resulting carry is the most significant bit. In the special case of a 32 shift count, the resulting carry is the least significant bit. In both cases, the register shifted is unchanged.
Program ICycles Words
move #32,d1.l    ;get 32                1 1
sub d2,d1 d2.l,d1.
PAGE 570
Program                                            ICycles Words
move #32,d1.l               ;get 32                   1     1
sub  d2,d1     d2.l,d1.h    ;32-shift, move shift     1     1
move d1.l,d0.h              ;move other shift         1     1
lsl  d0,d0     d0.l,d1.l    ;shift, copy input        1     1
lsr  d1,d1                  ;shift other part         1     1
or   d1,d0                  ;merge bits together      1     1
                                                     ---   ---
Totals:                                               6     6
B.1.28 Bit Field Extraction/Insertion
The process of bit field extraction is performed on a 32 bit integer in the lower part of a register.
PAGE 571
3. Dynamic bit field extraction, zero extend. Register d1.l contains FOFF, d2.l contains FSIZE.
Program                                               ICycles Words
move #32,d3.l               ;register size               1     1
sub  d2,d3                  ;32-fsize                    1     1
sub  d1,d3     d3.l,d4.h    ;32-fsize-foff, 32-fsize     1     1
move d3.l,d0.h              ;move 32-fsize-foff          1     1
lsl  d0,d0     d4.h,d0.h    ;shift off upper bits        1     1
lsr  d0,d0                  ;right justify               1     1
                                                        ---   ---
Totals:                                                  6     6
4. Dynamic bit field extraction, sign extend. Register d1.l contains FOFF, d2.l contains FSIZE.
Program ICycles Words
move #32,d3.
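The zero-extending extraction can be sketched as two shifts, mirroring the comments in the listing: shift left by 32-FSIZE-FOFF to push the unwanted upper bits off the end, then shift right by 32-FSIZE to right justify. The sketch assumes FOFF is the bit number of the field's least significant bit and that FOFF+FSIZE does not exceed 32; the function name is illustrative.

```python
MASK32 = 0xFFFFFFFF


def extract_field(value, foff, fsize):
    """Extract an fsize-bit field starting at bit foff, zero extended."""
    if fsize < 1 or foff < 0 or foff + fsize > 32:
        raise ValueError("field must lie within a 32 bit word")
    value &= MASK32
    shifted_up = (value << (32 - fsize - foff)) & MASK32  # shift off upper bits
    return shifted_up >> (32 - fsize)                     # right justify
```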
PAGE 572
6. Dynamic bit field insertion. Register d2.l contains FOFF, d3.l contains FSIZE.
Program                                               ICycles Words
move #32,d4.l               ;get 32                      1     1
sub  d3,d4     #-1,d5.l     ;32-fsize, load 1’s mask     2     2
sub  d2,d4     d4.l,d5.h    ;32-(fsize+foff)             1     1
lsl  d5,d5     d4.l,d5.h    ;shift one’s mask up         1     1
lsr  d5,d5                  ;shift one’s mask down       1     1
andc d5,d0     d2.l,d1.h    ;invert mask and clear       1     1
lsl  d1,d1                  ;move bits to field          1     1
or   d1,d0                  ;insert bit field            1     1
                                                        ---   ---
Totals:                                                  9     9
7. Static bit field clear.
PAGE 573
9. Dynamic bit field clear. Register d1.l contains FOFF, d2.l contains FSIZE.
Program                                               ICycles Words
move #32,d3.l               ;register size               1     1
sub  d2,d3     #-1,d2.l     ;32-fsize, get 1s mask       2     2
move d3.l,d3.h              ;move shift count            1     1
lsr  d3,d2     d1.l,d1.h    ;trim mask, get foff         1     1
lsl  d1,d2                  ;align mask                  1     1
andc d2,d0                  ;invert mask and clear       1     1
                                                        ---   ---
Totals:                                                  7     7
10. Dynamic bit field set. Register d1.l contains FOFF, d2.l contains FSIZE.
move #32,d3.l ;register size d2,d3 #-1,d2.l ;32-fsize, get 1s mask 2 d3.l,d3.
PAGE 574
seedr d5,d4 fmpy.s d4,d4,d2 #.5,d7.s fmpy.s d5,d2,d2 #3.0,d3.s ;x*y*y, get 3.0 fmpy d4,d7,d2 fmpy.s d2,d3,d4 d6.s,d3.s ;y/2*(3-x*y*y) 1 1 fmpy.s d4,d4,d2 ;y*y 1 1 fmpy.s d5,d2,d2 ;x*y*y 1 1 fmpy d4,d7,d2 d3.s,d6.s ;y/2, 3-x*y*y 1 1 fmpy.s d2,d3,d4 fsub.s d2,d3 fsub.s d2,d3 ;y approx 1/sqrt(x) 1 1 ;y*y, get .5 2 2 2 2 d3.s,d6.s ;y/2, 3-x*y*y 1 1 ;y/2*(3-x*y*y) Totals: B.1.
PAGE 575
Unsigned 32 Bit Integer Division of d0 = d0/d2 eor d1,d1 ; clear d1 do #32,dloop ;32 quotient bits 2 3 rol d0 ;dividend bit out, q bit in 1 1 rol d1 ;put in temp 1 1 cmp d2,d1 ;check for q bit 1 1 sub d2,d1 ;update if less 1 1 ifcc Program Words ICycles dloop rol d0 ;last q bit 1 1 not d0 ;complement q bits 1 1 --- --- 8 133 Totals: The final remainder is not produced.
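The routine above develops one quotient bit per loop pass by shifting a dividend bit into a temporary register, comparing it with the divisor, and conditionally subtracting. A high-level sketch of the same shift-and-subtract idea follows. It sets each quotient bit directly, whereas the listing rotates the carry into d0 and complements the quotient bits at the end; the function name is illustrative.

```python
def udiv32(dividend, divisor):
    """Unsigned 32 bit division by shift and subtract; returns the quotient only."""
    if divisor == 0:
        raise ZeroDivisionError("divisor is zero")
    quotient = 0
    temp = 0
    for bit in range(31, -1, -1):                       # 32 quotient bits
        temp = (temp << 1) | ((dividend >> bit) & 1)    # dividend bit out, into temp
        quotient <<= 1
        if temp >= divisor:                             # check for quotient bit
            temp -= divisor                             # update the partial remainder
            quotient |= 1
    return quotient
```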
PAGE 576
dive2big eor do rol rol cmp sub divloop_slow rol not divdone d2,d2 #32,divloop_slow d0 d2 d1,d2 d1,d2 ifhs d0 d0 ;same algorithm as 1st routine end The final quotient is not produced. This program may calculate only the number of quotient bits required and has variable execution time. Unsigned 32 Bit Integer Remainder of d0 = d0 rem d1, d0>=d1 cmp d1,d0 d0.l,d2.m jlo divdone bfind d0,d0 #0,d2.
PAGE 577
Signed 32 Bit Integer Division of d0 = d5/d2 eor d2,d5 d5.l,d0.l abs d2 d0.l,d3.
PAGE 578
divloop_fast not lsl lsr tst neg divdone d0 d2,d0 d2,d0 d2 d0 d8.l,d3.l d1.m,d2.l ifmi The final quotient is destroyed in the generation of the remainder. This program calculates only the number of quotient bits required and has variable execution time. Signed 32 Bit Integer Remainder of d0 = d0 rem d1, d0 >= d1 abs d1 d0.l,d2.l abs d2 d0.l,d1.m cmp d1,d2 d2.l,d2.m jlo divdone bfind d2,d0 bfind d1,d2 d0.h,d0.l move d2.h,d2.l sub d0,d2 d2.m,d0.l inc d2 d2.l,d2.h lsl d2,d1 d2.l,d2.h do d2.
PAGE 579
Registers:
d0 = x        d4 = limit
d1 = y        d5 = unused
d2 = z        d6 = unused
d3 = unused   d7 = unused
Memory Map:
X Memory      Y Memory
              Xmin ← r0
              Xmax
              Ymin
              Ymax
              Zmin
              Zmax
Single Point Accept/Reject
Program                                             ICycles Words
ori  #$80,ccr                 ;set accept bit          1     1
move y:(r0)+,d4.s             ;get window minimum      1     1
fcmp d4,d0    y:(r0)+,d4.s    ;x-Xmin                  1     1
fcmp d0,d4    y:(r0)+,d4.s    ;Xmax-x                  1     1
fcmp d4,d1    y:(r0)+,d4.s    ;y-Ymin                  1     1
fcmp d1,d4    y:(r0)+,d4.s    ;Ymax-y                  1     1
fcmp d4,d2 y:(r0)+,d4.
PAGE 580
X Memory            Y Memory
(n0=3) r0 → x0      Xmin ← r4
            y0      Xmax
            z0      Ymin
            x1      Ymax
            y1      Zmin
            z1      Zmax
PAGE 581
ori Program Words ;set accept/reject/overflow bits 1 #$e0,ccr move ICycles 1 x:(r0)+n0,d0.s y:(r4)+,d1.s ;get x0,Xmin 1 1 x:(r0)-n0,d0.s 1 1 1 1 1 1 fcmpg d0,d1 x:(r0)+n0,d0.s y:(r4)+,d1.s ;Xmax-x0, y0,Ymin 1 1 fcmp x:(r0)-n0,d0.s 1 1 1 1 1 1 fcmpg d0,d1 x:(r0)+n0,d0.s y:(r4)+,d1.s ;Ymax-y0, z0,Zmin 1 1 fcmp x:(r0)-n0,d0.s 1 1 1 1 ;Zmax-z1, get z0 1 1 ;Zmax-z0 1 1 --- --- 14 14 fcmp d1,d0 fcmpg d1,d0 fcmp d0,d1 d1,d0 y:(r4)+,d1.s ;x1-Xmin, Xmax x:(r0)+,d0.
PAGE 582
If the A bit is set, the line can be accepted. If the R bit is cleared, the line can be rejected. B.1.33.4 Four Point Polygon Accept/Reject This determines if the polygon consisting of the points (x0,y0,z0), (x1,y1,z1), (x2,y2,z2), (x3,y3,z3) is within a three-dimensional view cube. If the polygon is within the cube, the A (accept) bit of the CCR will be set. If the polygon is entirely outside of the cube, then the R bit will be cleared.
PAGE 583
fcmp d0,d1 fcmpg d0,d1 x:(r0)+,d0.s ;Ymax-y1, get y0 1 x:(r0)+n0,d0.s y:(r4)+,d1.s ;Ymax-y0, z0,Zmin 1 1 1 fcmp fcmp fcmp fcmpg fcmp fcmp fcmp fcmpg x:(r0)+n0,d0.s x:(r0)+n0,d0.s x:(r0)-n0,d0.s 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 --- --- 26 26 d1,d0 d1,d0 d1,d0 d1,d0 d0,d1 d0,d1 d0,d1 d0,d1 ;z0-Zmin, ;z1-Zmin, ;z2-Zmin, y:(r4)+,d1.s ;z3-Zmin, x:(r0)-n0,d0.s ;Zmax-z3, x:(r0)-n0,d0.s ;Zmax-z2, x:(r0)+,d0.
PAGE 584
[Figure: block diagram of one filter section with input x, output y, coefficients bi0, bi1, bi2, ai1, ai2 and delayed (Z-1) states w1, w2]
The filter equations are:
y  = x*bi0 + w1
w1 = x*bi1 + y*ai1 + w2
w2 = x*bi2 + y*ai2
Program Words ICycles
nsec equ 3
     org x:0
coef dc .93622314E-04 ;/* section 1 B0 */
     dc .18724463E-03 ;/* section 1 B1 */
     dc .19625904E+01 ;/* section 1 A1 */
     dc .93622314E-04 ;/* section 1 B2 */
     dc -.96296486E+00 ;/* section 1 A2 */
     dc .94089162E-04 ;/* section 2 B0 */
     dc .18817832E-03 ;/* section 2 B1 */
     dc .
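The three section equations can be exercised with a short sketch. It assumes that sections are cascaded, so the output y of one section is the input x of the next, and that each section's coefficients are stored in the order B0, B1, A1, B2, A2 as in the table above. The function name and data layout are illustrative.

```python
def iir_cascade(samples, sections):
    """Run samples through cascaded second order sections.

    sections: list of (b0, b1, a1, b2, a2) tuples, one per section.
    """
    state = [[0.0, 0.0] for _ in sections]   # [w1, w2] for each section
    out = []
    for x in samples:
        for k, (b0, b1, a1, b2, a2) in enumerate(sections):
            w1, w2 = state[k]
            y = x * b0 + w1                       # y  = x*bi0 + w1
            state[k][0] = x * b1 + y * a1 + w2    # w1 = x*bi1 + y*ai1 + w2
            state[k][1] = x * b2 + y * a2         # w2 = x*bi2 + y*ai2
            x = y                                 # assumption: feed the next section
        out.append(x)
    return out
```

Note that the feedback terms are added, which matches the sign convention of the coefficient table (A1 positive, A2 negative).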
PAGE 585
org y:0 w1 dsm nsec w2 dsm nsec org p:$100 move #coef,r0 move #5*nsec-1,m0 move #w1,r4 move #nsec-1,m4 move #w2,r5 move m4,m5 ; ; input in d7 ; move x:(r0)+,d4.s ;get b0 1 1 do #nsec,tran 2 3 fmpy d7,d4,d0 fadd.s d1,d2 x:(r0)+,d4.s y:(r4),d5.s 1 1 fmpy d7,d4,d1 fadd.s d5,d0 x:(r0)+,d4.s y:(r5),d6.s 1 1 fmpy d0,d4,d2 fadd.s d6,d1 x:(r0)+,d4.s d2.s,y:(r5)+ 1 1 fmpy d7,d4,d2 fadd.s d2,d1 x:(r0)+,d4.s d0.s,d7.s 1 1 fmpy.s d0,d4,d1 x:(r0)+,d4.s d1.
PAGE 586
R Direction vector of reflection of the point source from the object R={Rx,Ry,Rz} V Direction vector from the object to the viewpoint Ks Specular reflection constant 0<= Ks <= 1.0 It should be noted that all vectors are normalized to unit magnitude.
PAGE 587
3-D Graphics Illumination Program Words move #vec,r0 2 move #ktbl,r4 2 move x:(r0)+,d6.s y:,d7.s 1 fmpy.s d6,d7,d0 x:(r0)+,d6.s y:,d7.s 1 fmpy.s d6,d7,d1 x:(r0)+,d6.s y:,d7.s 1 fmpy d6,d7,d1 fadd.s d1,d0 x:(r0)+,d6.s y:,d7.s 1 fmpy d6,d7,d1 fadd.s d1,d0 x:(r4)+,d2.s 1 fmpy.s d2,d0,d0 x:(r4)+,n1 1 intrz d0 x:(r0)+,d6.s y:,d7.s 1 fmpy.s d6,d7,d0 d0.l,r1 1 move x:(r0)+,d6.s y:,d7.s 1 fmpy d6,d7,d0 fadd.s d0,d1 x:(r1+n1),d2.s 1 fadd.s d0,d1 x:(r4)+,d0.s 1 fmpy.s d0,d1,d1 x:(r4)+,d0.s 1 fadd.s d1,d2 x:(r4)+,d1.
PAGE 588
The resulting unsigned pseudorandom integer number is in d0.l. Reference: VAX/VMS Run-Time Library Routines Reference Manual, Volume 8C, p. RTL-433.
B.1.37 Bezier Cubic Polynomial Evaluation
Bezier polynomials are used to represent curves and surfaces in graphics. The Bezier form requires four points: two endpoints and two other points. The four points define (in two dimensions) a convex polygon. The curve is bounded by the edges of the polygon.
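For reference, a cubic Bezier curve through four control points can be evaluated with the standard Bernstein weights. The sketch below is a generic formulation, not a transcription of the DSP96002 routine; the function name and tuple-based points are illustrative.

```python
def bezier_cubic(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1].

    p0 and p3 are the endpoints; p1 and p2 are the two other control points.
    """
    u = 1.0 - t
    w0, w1, w2, w3 = u * u * u, 3.0 * u * u * t, 3.0 * u * t * t, t * t * t
    return tuple(w0 * a + w1 * b + w2 * c + w3 * d
                 for a, b, c, d in zip(p0, p1, p2, p3))
```

At t=0 the result is the first endpoint and at t=1 the second, and for intermediate t the weights are non-negative and sum to one, which is why the curve stays inside the polygon formed by the four points.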
PAGE 589
Bezier Cubic Evaluation move #Ptable+2,r0 move #2,n0 move #TK,r4 Program Words ICycles move x:(r0)-,d4.s 1 1 move x:(r4)+,d0.s y:,d5.s 1 1 x:(r0)-,d4.s d0.s,d5.s 1 1 y:(r4)-,d4.s 1 1 1 1 1 1 fmpy d4,d0,d1 fsub.s d5,d0 fmpy.s d4,d0,d2 fmpy d4,d5,d1 fsub.s d1,d2 fmpy.s d1,d2,d2 fmpy.s d0,d0,d1 x:(r0)+n0,d4.s 1 1 fmpy.s d1,d4,d1 d5.s,d4.s 1 1 1 1 1 1 1 1 fmpy.s d1,d5,d1 1 1 fadd.s d1,d2 1 1 --- --- 13 13 fmpy d4,d4,d1 fsub.s d1,d2 fmpy.s d0,d2,d2 fmpy.
PAGE 590
Four 8 Bit Packs
Program                        Words ICycles
joinb d0,d1   ;d1 = xxAB         1     1
joinb d2,d3   ;d3 = xxCD         1     1
join  d1,d3   ;d3 = ABCD         1     1
                                ---   ---
Totals:                          3     3
B.1.38.2 Pack Two 16 Bit Words Into a 32 Bit Word
The following packs two 16 bit words into a single 32 bit word. The words to be packed are right justified in two separate registers:
d0 = xY
d1 = xZ
Two 16 Bit Packs
join B.1.38.
PAGE 591
B.1.39 Nth Order Polynomial Evaluation for Two Points
;An Nth order polynomial c1*X**N + c2*X**(N-1) + ... + cN*X + cN+1 can be factored
;and represented as (...((c1*X + c2)*X + c3)*X + ...)*X + cN+1. This routine
;evaluates the polynomial at both X = s and X = t.
;
;Memory Map :    X     Y
;
;        r1 ->   s     t
;                .
;                .
;        r0 ->   c1
;                c2
;                c3
;                .
;                .
;                cN+1
; Setup
N equ order of polynomial
move #coef,r0
move #2_pts,r1
move x:(r1)+,d5.s y:,d4.s ; s, t
move x:(r0)+,d1.s ; c1
move d1.s,d0.
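The factored (Horner) form lends itself to evaluating both points in a single pass over the coefficients, since each coefficient is fetched once and used for both accumulators. A minimal sketch, with illustrative names:

```python
def horner_two_points(coeffs, s, t):
    """Evaluate c1*X**N + ... + cN*X + cN+1 at X = s and X = t.

    coeffs is [c1, c2, ..., cN+1], highest order coefficient first.
    """
    acc_s = acc_t = coeffs[0]
    for c in coeffs[1:]:
        acc_s = acc_s * s + c   # ((c1*s + c2)*s + c3)*s + ...
        acc_t = acc_t * t + c   # same recurrence for the second point
    return acc_s, acc_t
```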
PAGE 592
B.1.40.1 32 Bit Block Transfer 32 Bit Block Transfer BITBLT org x:0 source ds 1 ;source address dest ds 1 ;destination address offset ds 1 ;bit number start (0-31) count ds 1 ;number of 32 bit source words org p:$50 move move sub d0,d1 move move move lsl d1,d4 move do lsr lsl or move bitblt lsr lsr lsl or move x:offset,d0.l #32,d1.l x:source,r0 x:dest,r1 d1.l,d1.h y:(r1),d4.l d0.l,d0.
PAGE 593
B.1.40.2 64 Bit Block Transfer A more efficient implementation of BITBLT may be performed by transferring 64 bits at a time. Thus, the value of COUNT specifies the number of 64 bit transfers (two 32 bit words). 64 Bit Block Transfer BITBLT org x:0 source ds 1 ;source address dest ds 1 ;destination address offset ds 1 ;bit number start (0-31) count ds 1 ;number of 64 bit source words org p:$50 move x:offset,d0.l move x:offset,d0.l move #32,d1.l sub d0,d1 x:source,r0 move x:dest,r1 move d1.l,d1.
PAGE 594
B.1.41 64x64 Bit Unsigned Multiply This performs a double precision unsigned integer multiply. The 64 bit integer is formed by the concatenation of two 32 bit registers. Let X = A:B and Y = C:D, then X*Y can be written as: A B * C D -----------------+ B*D + A*D + B*C + A*C -----------------= W X Y Z 64x64 Bit Unsigned Multiply d3:d7:d6:d4 = d0:d1 * d2:d3 mpyu mpyu mpyu mpyu move add addc inc add addc inc MOTOROLA d0,d2,d7 d0,d3,d5 d1,d3,d4 d1,d2,d6 d0,d5 d1,d2 d3 d5,d6 d2,d7 d3 d7.h,d3.l d4.
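The partial product scheme can be checked with arbitrary-precision integers. The sketch below forms the four 32x32 products B*D, A*D, B*C and A*C, aligns them as in the diagram, and returns the 128 bit result as four 32 bit words W, X, Y, Z from most to least significant. The function name is illustrative, and no attempt is made to model the register allocation of the listing.

```python
MASK32 = 0xFFFFFFFF


def mul64x64(x, y):
    """Multiply two unsigned 64 bit integers; return (W, X, Y, Z) 32 bit words."""
    a, b = (x >> 32) & MASK32, x & MASK32   # X = A:B
    c, d = (y >> 32) & MASK32, y & MASK32   # Y = C:D
    product = (b * d) + ((a * d) << 32) + ((b * c) << 32) + ((a * c) << 64)
    return tuple((product >> shift) & MASK32 for shift in (96, 64, 32, 0))
```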
PAGE 595
B.1.42 Signed Reciprocal Generation This generates a fast approximation to 1/x. Approximation of 1/d1 16 Bit Accuracy fseedd d1,d6 fmpy.s d1,d6,d1 fsub.s fmpy.s Program Words 1 #2.0,d4.s 1 2 2 d1,d4 1 1 d6,d4,d1 1 1 --- --- 5 5 Totals: Approximation of 1/d1 32 Bit Accuracy fseedd d1,d6 Program Words 1 ICycles 1 fmpy.s d1,d6,d1 #2.0,d4.s 2 2 fsub.s d1,d4 d4.s,d3.s 1 1 fmpy.s d1,d4,d1 1 1 fmpy d6,d4,d1 1 1 fmpy.s d1,d3,d1 1 1 --- --- 7 7 fsub.
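Both routines follow the pattern of a Newton-Raphson refinement: start from a seed y close to 1/x and replace it with y*(2 - x*y), which roughly doubles the number of correct bits per step. The sketch below illustrates that recurrence only; the seed is passed in by the caller because the lookup performed by fseedd is not modeled, and the function name is illustrative.

```python
def reciprocal(x, seed, iterations=2):
    """Refine an approximation of 1/x with Newton-Raphson steps.

    seed: caller-supplied initial estimate of 1/x (stands in for fseedd).
    """
    y = seed
    for _ in range(iterations):
        y = y * (2.0 - x * y)   # the error 1 - x*y is squared on each step
    return y
```

For example, starting from a seed of 0.2 for x=4.0, one step gives 0.24 and further steps converge rapidly toward 0.25.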
PAGE 596
Program Words ICycles ; Calculate dx and dy fsub.s d6,d2 d2.s,d4.s 1 1 fsub.s d7,d3 d3.s,d5.s 1 1 ; Determine whether to increment x or y fcmpm d3,d2 1 1 fjge _inc_x 2 2 d2.s,d0.s 1 1 ftfr.s d4,d6 fflt 1 1 ftfr.s d5,d7 fflt 1 1 d3.s,d1.s 1 1 1 1 1 1 2 2 1 1 d9.s,d2.s 1 1 d0,d4,d0 fsub.s d5,d2 d2.s,d3.
PAGE 597
; Switch endpoints if necessary _inc_x ftst d2 d3.s,d0.s 1 1 ftfr.s d4,d6 fflt 1 1 ftfr.s d5,d7 fflt 1 1 d2.s,d1.s 1 1 1 1 iflt 1 1 _draw1_x 2 2 1 1 d9.s,d2.s 1 1 d0,d4,d0 fsub.s d5,d2 d2.s,d3.s 1 1 1 1 1 1 1 1 1 1 ; Fix x0 and dx int d6 int d1 neg d1 jeq ; Calculate dy/dx fseedd d2,d4 fmpy.s d2,d4,d5 fmpy fmpy.s d5,d2,d5 fmpy d2.s,d4.s d0,d4,d0 fsub.s d5,d3 fmpy.s d0,d3,d0 d7.s,d2.
PAGE 598
B.1.43.2 Integer Incremental Line Drawing Algorithm This implementation of line drawing uses Bresenham’s algorithm. This algorithm uses only integer operations to generate the points.
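For orientation, a compact textbook form of Bresenham's algorithm is sketched below. It is not a transcription of the DSP96002 routine, which handles the "increment x" and "increment y" cases in separate loops, but it uses the same idea: an integer error term decides when the minor axis steps. The function name is illustrative.

```python
def bresenham(x0, y0, x1, y1):
    """Return the integer points on the line from (x0, y0) to (x1, y1)."""
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx - dy                      # integer error term
    points = []
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 > -dy:                   # step along x
            err -= dy
            x0 += sx
        if e2 < dx:                    # step along y
            err += dx
            y0 += sy
    return points
```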
PAGE 599
neg d1 iflt neg d0 iflt tst d0 jlt _set_y_xn ; Increment y, dx positive case ; Set up registers _set_y_xp lsr d1 d1.l,d2.l dec d2 d2.l,d4.l move d1.l,r0 move d2.l,m0 move d0.l,n0 move d0.l,d5.l ; Draw first point jsr _draw_point ; Draw additional points do d4.l,_line_y_xp inc d7 r0,d2.l add d5,d2 (r0)+n0 cmp d4,d2 inc d6 jsr _draw_point ifge _line_y_xp rts ; Increment y, dx negative case ; Set up registers _set_y_xn lsr d1 d1.l,d2.l dec d2 d2.l,d4.
PAGE 600
cmp d4,d2 dec d6 jsr _draw_point ifge _line_y_xn rts ; Increment x case ; If dx is negative, switch endpoints and sign of dx and dy _inc_x tst d0 jeq _draw1 tfr d4,d6 iflt tfr d5,d7 iflt neg d0 iflt neg d1 iflt tst d1 jlt _set_x_yn ; Increment x, dy positive case ; Set up registers _set_x_yp lsr d0 d0.l,d2.l dec d2 d2.l,d4.l move d0.l,r0 move d2.l,m0 move d1.l,n0 move d1.l,d5.l ; Draw first point jsr _draw_point ; Draw additional points do d4.
PAGE 601
lsr d0 d0.l,d2.l dec d2 d2.l,d4.l neg d1 d0.l,r0 move d2.l,m0 move d1.l,n0 move d1.l,d5.l ; Draw first point jsr _draw_point ; Draw additional points do d4.l,_line_x_yn inc d6 r0,d2.l add d5,d2 (r0)+n0 cmp d4,d2 dec d7 ifge _draw1 jsr _draw_point _line_x_yn rts ; Draw a single point _draw_point move d6.l,x:(r1)+ d7.l,y: rts B.1.44 Wire-Frame Graphics Rendering WIRE-FRAME RENDITION OF A THREE DIMENSIONAL POLYLINE ON THE MOTOROLA DSP96002 Version 1.
PAGE 602
If the point is found to lie outside the viewing pyramid, an algorithm to clip a single point is performed and the program enters the trivial reject loop. The trivial reject loop assumes that the last displayed point was outside the viewing pyramid. It pulls a new point from the input list, converts it to clipping space and checks if the line joining the new point and the last point can be trivially rejected. Trivial rejection occurs when both points of a line lie outside of a clipping plane.
PAGE 603
OUTPUT Address register r5 should point to a display list data area when the polyline generator is called. Afterwards, the display list will be in the following format: Polygon1: X1,Y1 X2,Y2 X3,Y3 Xn,Yn Delimiter -1.0 Polygon2: X1,Y1 X2,Y2 Delimiter PolygonM: -1.0 X1,Y1 Xn,Yn -2.0 All coordinates are in IEEE single-precision floating-point format to speed up the DSP96002 floating-point incremental line drawing algorithm.
PAGE 604
The following memory map results: X Memory r0 → Y Memory Xobj0 n0=0.0 Yobj0 Zobj0 Xobj1 r1 → Xnew Znew n1=2 Ynew Wnew m1=3 Xold Zold Yold Wold r4 → n4=2 Matrix1,1 Matrix4,1 m4=13 Matrix2,1 Matrix3,1 Matrix1,2 Matrix4,2 Matrix2,2 Matrix3,2 Matrix1,3 Matrix4,3 Matrix2,3 Matrix3,3 Matrix1,4 Matrix4,4 Matrix2,4 Matrix3,4 Xscale Xoffset Yscale Yoffset Xout0 ← r5 Yout0 n5=-1.
PAGE 605
n0 = 0.0 for z limit test and double point clipping n5 = -1.0 for end of polyline marker TRIVIAL ACCEPT LOOP The transformation from object space to screen space is performed in lines 19-33. This is a {1x4}{4x4} matrix multiplication but because the W coordinate of the {1x4} input vector {X Y Z W} is always equal to one, four multiplications can be eliminated. Lines 39-47 determine if the point is within the viewing pyramid.
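The saving from W=1 can be seen in a short sketch of the {1x4}{4x4} product: the fourth matrix row is added directly instead of being multiplied, so each output coordinate needs three multiplications rather than four. The function name and the row-major list-of-lists matrix are illustrative.

```python
def transform_point(x, y, z, m):
    """Multiply the row vector {x y z 1} by a 4x4 matrix m (m[row][col]).

    Because W is always one, the m[3][j] terms are added without a multiply,
    which removes four of the sixteen multiplications.
    """
    return tuple(x * m[0][j] + y * m[1][j] + z * m[2][j] + m[3][j]
                 for j in range(4))
```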
PAGE 606
Substituting the value of t results in the determinant
       | y2  w2-y2 |
       | y1  w1-y1 |
y2 = -------------------
      (w1-x1) - (w2-x2)
The equations for z2 and w2 are analogous. Since w2 has the same denominator as x2, y2 and z2, and these will be divided by w2 in the perspective transformation, the division shown above does not need to be performed. Lines 151-162 determine which planes the point lies outside of and call the appropriate clipping routines.
PAGE 607
The reject loop single point clipping code is very similar to the analogous code in the accept loop. It calls the same clipping subroutines in lines 520-617. Then the point that was just calculated is transformed, scaled and translated and stored in the output list (lines 305-321). Finally, the new point (which was accepted) is transformed, scaled and translated (lines 327-345). Control is transferred to the accept loop.
PAGE 608
; wf3d Words ICycles move x:(r0)+,d0.s ;X 1 1 move x:(r0)+,d5.s y:(r4)+,d4.s ;Y M11 1 1 fmpy.s d4,d0,d2 x:(r4)+,d3.s y:,d4.s ;M41 M21 1 1 fmpy d4,d5,d3 fadd.s d3,d2 x:(r0)+,d6.s y:(r4)+,d4.s ;Z M31 1 1 fmpy d4,d6,d3 fadd.s d3,d2 x:(r1)+n1,d1.s y:(r4)+,d4.s ;r1+ M12 1 1 fmpy d4,d0,d1 fadd.s d3,d2 x:(r4)+,d3.s y:,d4.s 1 fmpy d4,d5,d3 fadd.s d3,d1 y:(r4)+,d4.s ; M32 1 1 fmpy d4,d6,d3 fadd.s d3,d1 d2.s,x:(r1)+ y:(r4)+,d4.s ;Xo M13 1 1 fmpy d4,d0,d2 fadd.
PAGE 609
; Multiply coordinates by 1/W, scale and add offset fmpy.s d0,d4,d2 ; fmpy.s d2,d1,d2 fmpy x:(r4)+,d4.s 1 1 1 1 ; 1 1 d2.s,y:(r5)+ ; 1 1 ; 1 1 1 1 M11 1 1 ;M41 M21 1 1 y:,d6.s d5,d4,d3 fadd.s d3,d2 fmpy.s d3,d1,d3 fadd.s d6,d3 x:(r0)+,d0.s dec d7 ;Ys d3.s,y:(r5)+ ; Yf Y1 ;--------------------------------------------------------; ; Accept loop ; ;--------------------------------------------------------- ; Transform point to clip space _accept_loop move x:(r0)+,d5.
PAGE 610
; Determine if point is within view volume fneg.s d1 d1.s,d2.s ; 1 1 ; 1 1 1 1 ori #$80,ccr fcmp d1,d0 fcmp d0,d2 x:(r1)-,d5.s ;Yn 1 1 fcmp d1,d5 n0,d4.s ; 1 1 fcmp d5,d2 1 1 fcmp d4,d6 d7.l,x:(r6) ; 1 1 fcmp d6,d2 d6.s,d7.s ; 1 1 jclr #7,sr,_accept_clip ; 2 3 ; 1 1 d9.s,d4.s ; 1 1 fsub.s d1,d4 d4.s,d3.s d2.s,y:(r1)- ; Wo 1 1 d7.s,y: Zo 1 1 1 1 1 1 1 1 1 1 ; 1 1 d2.s,y:(r5)+ ; 1 1 ; 1 1 1 1 d2.s,y:(r1) ; y:(r1)-,d6.
PAGE 611
;--------------------------------------------------------; ; Accept loop single-clip routine ; ;--------------------------------------------------------- ; Dispatch to single-plane clipping routines _accept_clip fsub.s d0,d2 fjslt d2.s,d1.s _clip1_xp fadd.s d0,d1 fjslt d1.s,d2.s _clip1_xn fsub.s d5,d2 fjslt d2.s,d1.s _clip1_yp fadd.s d5,d1 fjslt d1.s,d2.s _clip1_yn fsub.s d6,d2 d2.s,d1.
PAGE 612
fmpy.s d2,d1,d2 fmpy x:(r4)+,d4.s y:,d6.s ;Ys Yf 1 1 d5,d4,d3 fadd.s d3,d2 d0.s,x:(r1)+n1 d7.s,y: ;Xo Zo 1 1 1 1 1 1 1 1 ; -1.0 1 1 fmpy.s d3,d1,d3 x:(r6),d7.l fadd.s d6,d3 x:(r0)+,d0.s move ;Cnt d2.s,y:(r5)+ ;X d3.s,y:(r5)+ ; n5,y:(r5)+ Y1 dec d7 jne _reject_loop ; 2 2 jmp _end ; 2 2 1 1 x:(r0)+,d0.s y:(r4)+n4,d4.s ;X r4+2 1 1 move x:(r0)+,d5.s y:(r4)+,d4.s ;Y M11 1 1 fmpy.s d4,d0,d2 x:(r4)+,d3.s y:,d4.
PAGE 613
ori #$e0,ccr ; y:(r1)-,d2.s ; 1 1 1 fneg.s d1 d1.s,d5.s fneg.s d2 x:(r1)+n1,d6.s d2.s,d4.s ;Xo 1 1 fcmp d2,d6 x:(r1)-,d0.s ;Xn 1 1 fcmpg d1,d0 (r4)+n4 ;r4+2 1 1 fcmp d6,d4 ; 1 1 fcmpg d0,d5 x:(r1)+n1,d6.s ;Yo 1 1 fcmp d2,d6 x:(r1)+,d3.s ;Yn 1 1 fcmpg d1,d3 ; 1 1 fcmp d6,d4 ; 1 1 fcmpg d3,d5 y:(r1)+n1,d6.s ;Zo 1 1 fcmp d6,d4 y:(r1)+n1,d2.
PAGE 614
fcmp d0,d5 ; 1 1 fcmp d3,d5 ; 1 1 fcmp d2,d5 ; 1 1 jclr #7,sr,_r_clip2 ; 2 3 ;--------------------------------------------------------; ; Reject loop single-clip routine ; ;--------------------------------------------------------; Dispatch to clipping routines move x:(r1)+,d0.s y:,d6.s ;Xo Zo 1 1 move x:(r1)+n1,d5.s y:,d2.s ;Yo Wo 1 1 move d7.l,x:(r6) ;Cnt 1 1 fsub.s d0,d2 d2.s,d1.
PAGE 615
; Multiply coordinates by 1/W, scale and add offset (old point) fmpy.s d0,d4,d2 ; fmpy.s d2,d1,d2 fmpy x:(r4)-,d4.s y:,d6.s d5,d4,d3 fadd.s d3,d2 ;Ys Yf ; fmpy.s d3,d1,d3 d2.s,y:(r5)+ 1 1 1 1 1 1 ; X1 1 1 y:(r1)+n1,d2.s ; Wn 1 1 d3.s,y:(r5)+ Y1 1 1 ; 1 1 d9.s,d4.s ; 1 1 fsub.s d1,d4 d4.s,d3.s d2.s,y:(r1)+ ; Wo 1 1 ;Xn Zn 1 1 d2.s,y: ;Xo Zo 1 1 y:,d3.s ;Xs Xf 1 1 1 1 1 1 1 1 1 1 1 1 1 1 fadd.
PAGE 616
;--------------------------------------------------------; ; Double point clipping routine ; ;--------------------------------------------------------- ; Dispatch to old point clipping routines _r_clip2 move d7.l,x:(r6) move y:(r1)+,d1.l ;Cnt r1+ 1 1 y:(r1)-,d1.s ; 1 1 Wo move x:(r1)+,d5.s ;Xo 1 1 move n0,d7.s ; 1 1 ; 1 1 ; 2 2 ;Yo 1 1 ; 2 2 ; 1 1 ; 2 2 1 1 ; 2 2 ; 1 1 ; 2 2 ;Xn 1 1 ; 2 2 fsub.s d1,d5 d5.s,d6.s fjsgt _clip2_xop fadd.
PAGE 617
fjsgt _clip2_ynp ; fadd.s d1,d6 fjslt 2 2 1 1 ; 2 2 ; 1 1 y:(r1)+n1,d5.s ;Zn _clip2_ynn fsub.s d1,d5 d5.s,d6.s fjsgt _clip2_znp ; 2 2 ftst d6 ; 1 1 fjslt _clip2_znn ; 2 2 1 1 ; Check for rejection move x:(r1)+n1,d3.s y:(r6),d5.s ;Xo d7.s,d4.s ; 1 1 ; 2 2 ;Xn 1 1 ;Xo 1 1 fmpy.s d4,d6,d1 ; 1 1 fmpy ;Yn 1 1 ;Yo 1 1 fsub.s d3,d6 d6.s,x:(r1)+n1 d1.
PAGE 618
fmpy d5,d6,d2 fadd.s d3,d1 1 1 ;Wnd 1 1 ; 1 1 d9.s,d4.s ; 1 1 fsub.s d1,d4 d4.s,d3.s ; 1 1 ; r4-2 1 1 ; 1 1 1 1 1 1 Yf 1 1 y:(r1)+n1,d4.s ; Zn 1 1 d4.s,y:(r1)+n1 ; Zo 1 1 d2.s,y:(r5)+ ; X1 1 1 move y:(r6)-,d1.s ; Wnd 1 1 move d3.s,y:(r5)+ ; Y1 1 1 ; 1 1 ; 1 1 y:(r6)-,d5.s ; Ynd 1 1 y:(r6),d0.s ; Xnd 1 1 ; 1 1 1 1 fadd.s d3,d2 d4.s,y:(r1)+ ; d1.s,y:(r6) Wo ; Calculate reciprocal 1/W (old point) fseedd d2,d6 fmpy.s d2,d6,d1 fmpy.
PAGE 619
fmpy.s d0,d4,d2 ; fmpy.s d2,d1,d2 fmpy x:(r4)+,d4.s y:,d6.s d5,d4,d3 fadd.s d3,d2 fmpy.s d3,d1,d3 x:(r6),d7.l fadd.s d6,d3 x:(r0)+,d0.s move 1 1 1 1 ; 1 1 ; 1 1 ;Ys Yf d2.s,y:(r5)+ ;X X1 1 1 d3.s,y:(r5)+ ; Y1 1 1 ; -1.0 1 1 dec d7 n5,y:(r5)+ jne _reject_loop ; 2 2 jmp _end ; 2 2 1 1 ; Reject double-clipped line _clip2_reject move x:(r6),d7.l ; move x:(r1)+n1,d0.s y:,d1.s ;Xn Zn 1 1 move d0.s,x:(r1)- d1.s,y: ;Xo Zo 1 1 move x:(r1)+n1,d0.s y:,d1.
PAGE 620
;--------------------------------------------------------; ; Single point clipping routines ; ;--------------------------------------------------------; x = w boundary _clip1_xp move y:(r1)-,d4.s ;W1 1 1 d2.s,d7.s ;X1 1 1 ;Y1 1 1 fmpy.s d1,d4,d1 ; 1 1 fmpy ; 1 1 ; 1 1 1 1 fmpy.s d2,d4,d3 x:(r1)+,d0.s fsub.s d0,d4 x:(r1)-,d0.s d4,d5,d2 fsub.s d3,d1 d0.s,d5.s fmpy.s d5,d7,d3 fmpy d4,d6,d3 fsub.s d3,d2 fmpy.s d4,d7,d2 y:(r1)+,d4.s ;Z1 d2.s,d5.s ; 1 1 fsub.s d2,d3 d1.s,d0.
PAGE 621
; y = w boundary _clip1_yp move y:(r1),d4.s ;W1 1 1 d2.s,d7.s ;Y1 1 1 ;X1 1 1 fmpy.s d1,d4,d1 ; 1 1 fmpy ; 1 1 ; 1 1 1 1 fmpy.s d2,d4,d3 x:(r1)-,d5.s fsub.s d5,d4 x:(r1),d5.s d0,d4,d2 fsub.s d3,d1 fmpy.s d5,d7,d3 fmpy d4,d6,d3 fsub.s d3,d2 fmpy.s d4,d7,d2 y:(r1)+,d4.s ;Z1 d2.s,d0.s ; 1 1 fsub.s d2,d3 d1.s,d5.s ; 1 1 d3.s,d6.s ; 1 1 ; 2 2 y:(r1),d4.s ;W1 1 1 d1.s,d7.s ;Y1 1 1 ;X1 1 1 fmpy.
PAGE 622
fmpy.s d2,d4,d3 d2.s,d7.s y:(r1),d6.s ;Z1 1 1 ;X1 1 1 fmpy.s d1,d4,d1 ; 1 1 fmpy ; 1 1 fmpy.s d6,d7,d3 ; 1 1 fmpy ;Y1 1 1 d2.s,d0.s ; 1 1 fsub.s d2,d3 d1.s,d6.s ; 1 1 d3.s,d5.s ; 1 1 ; 2 2 fsub.s d6,d4 x:(r1)+,d6.s d0,d4,d2 fsub.s d3,d1 d4,d5,d3 fsub.s d3,d2 x:(r1),d4.s fmpy.s d4,d7,d2 move rts ; Clip at z = 0 boundary _clip1_zn move y:(r1)-,d2.s ;W1 1 1 fmpy.s d2,d6,d2 y:(r1),d4.
PAGE 623
move y:(r1)-,d3.s ;Wn 1 1 d5.s,d0.s ;Xn 1 1 ; 1 1 ; 1 1 d9.s,d2.s ; 1 1 d0,d4,d0 fsub.s d5,d2 d2.s,d3.s ; 1 1 ; 1 1 ; 1 1 fmpy.s d0,d3,d0 ; 1 1 fcmp ; 1 1 ; 1 1 ; 2 2 ; 1 1 1 1 fadd.s d3,d5 x:(r1)-,d3.s fsub.s d3,d5 fseedd d5,d4 fmpy.s d5,d4,d5 fmpy fmpy.s d5,d2,d5 fmpy d2.s,d4.s d0,d4,d0 fsub.s d5,d3 d7,d0 ftfr.s d0,d7 ffgt rts ; XOld = -WOld boundary _clip2_xon move (r1)- move y:(r1)-,d3.s ;Wn fsub.s d3,d6 x:(r1)+n1,d3.s d6.s,d0.s ;Xn 1 1 fsub.
PAGE 624
fsub.s d3,d5 ; 1 1 ; 1 1 d9.s,d2.s ; 1 1 d0,d4,d0 fsub.s d5,d2 d2.s,d3.s ; 1 1 ; 1 1 ; 1 1 fmpy.s d0,d3,d0 ; 1 1 fcmp ; 1 1 ; 1 1 ; 2 2 ; 1 1 y:(r1),d3.s ;Wn 1 1 d6.s,d0.s ;Yn 1 1 ; 1 1 ; 1 1 d9.s,d2.s ; 1 1 d0,d4,d0 fsub.s d6,d2 d2.s,d3.s ; 1 1 ; 1 1 ; 1 1 fmpy.s d0,d3,d0 ; 1 1 fcmp ; 1 1 ; 1 1 ; 2 2 ; 1 1 y:(r1)-,d3.s ;Wn 1 1 y:(r1),d3.s ;Zn 1 1 ; 1 1 ; 1 1 fseedd d5,d4 fmpy.s d5,d4,d5 fmpy fmpy.
PAGE 625
fmpy.s d5,d4,d5 d9.s,d2.s ; 1 1 d0,d4,d0 fsub.s d5,d2 d2.s,d3.s ; 1 1 ; 1 1 ; 1 1 fmpy.s d0,d3,d0 ; 1 1 fcmp ; 1 1 ; 1 1 ; 2 2 ; 1 1 1 1 ; 1 1 ; 1 1 d9.s,d2.s ; 1 1 d0,d4,d0 fsub.s d6,d2 d2.s,d3.s ; 1 1 ; 1 1 ; 1 1 fmpy.s d0,d3,d0 ; 1 1 fcmp ; 1 1 ; 1 1 ; 2 2 ; 1 1 y:(r1)-,d0.s ;Wo 1 1 ;Xo 1 1 fsub.s d2,d0 ; 1 1 fadd.s d0,d5 ; 1 1 ; 1 1 d9.s,d2.s ; 1 1 d0,d4,d0 fsub.s d5,d2 d2.s,d3.
PAGE 626
fmpy.s d0,d3,d0 ; 1 1 fcmp ; 1 1 ; 1 1 ; 2 2 ; 1 1 y:(r1)-,d3.s ;Wo 1 1 ;Xo 1 1 fadd.s d3,d2 ; 1 1 fsub.s d6,d2 d2.s,d0.s ; 1 1 ; 1 1 d9.s,d2.s ; 1 1 d0,d4,d0 fsub.s d6,d2 d2.s,d3.s ; 1 1 ; 1 1 ; 1 1 fmpy.s d0,d3,d0 ; 1 1 fcmp ; 1 1 ; 1 1 ; 2 2 ; 1 1 1 1 d7,d0 ftfr.s d0,d7 fflt rts ; XNew = -WNew boundary _clip2_xnn move (r1)- move move x:(r1)+n1,d2.s fseedd d2,d4 fmpy.s d2,d4,d6 fmpy fmpy.s d6,d2,d6 fmpy d2.s,d4.s d0,d4,d0 fsub.
PAGE 627
; YNew = -WNew boundary _clip2_ynn move (r1)+ move x:(r1)-,d2.s ; y:,d3.s ;Yo Wo 1 1 1 1 fadd.s d3,d2 ; 1 1 fsub.s d6,d2 d2.s,d0.s ; 1 1 ; 1 1 d9.s,d2.s ; 1 1 d0,d4,d0 fsub.s d6,d2 d2.s,d3.s ; 1 1 ; 1 1 ; 1 1 fmpy.s d0,d3,d0 ; 1 1 fcmp ; 1 1 ; 1 1 ; 2 2 ; 1 1 fseedd d2,d4 fmpy.s d2,d4,d6 fmpy fmpy.s d6,d2,d6 fmpy d2.s,d4.s d0,d4,d0 fsub.s d6,d3 d7,d0 ftfr.s d0,d7 fflt rts ; ZNew = WNew boundary _clip2_znp move (r1)+ move y:(r1)-,d0.
PAGE 628
_clip2_znn move d6.s,d0.s y:(r1),d6.s ;Zo 1 1 ; 1 1 ; 1 1 d9.s,d2.s ; 1 1 d0,d4,d0 fsub.s d6,d2 d2.s,d3.s ; 1 1 ; 1 1 ; 1 1 fmpy.s d0,d3,d0 ; 1 1 fcmp ; 1 1 ; 1 1 ; 2 2 fsub.s d0,d6 d6.s,d0.s fseedd d6,d4 fmpy.s d6,d4,d6 fmpy fmpy.s d6,d2,d6 fmpy d2.s,d4.s d0,d4,d0 fsub.s d6,d3 d7,d0 ftfr.s d0,d7 fflt rts B.1.45 Walsh-Hadamard Transforms The Walsh-Hadamard transform (WHT) is an orthogonal transform requiring only additions and subtractions.
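A minimal sketch of a fast Walsh-Hadamard transform, built entirely from add/subtract butterflies, is shown below. It produces the coefficients in natural (Hadamard) order, which may differ from the output ordering of the DSP96002 routines; the function name is illustrative.

```python
def fwht(data):
    """Fast Walsh-Hadamard transform of a sequence whose length is a power of two."""
    a = list(data)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:                                 # log2(n) stages
        for start in range(0, n, 2 * h):         # groups in this stage
            for i in range(start, start + h):    # butterflies in the group
                u, v = a[i], a[i + h]
                a[i], a[i + h] = u + v, u - v    # additions and subtractions only
        h *= 2
    return a
```

Applying the transform twice returns the original data scaled by the number of points, which gives a convenient self-check.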
PAGE 629
dc 0.0000000E+00 dc 2.000000 dc 3.000000 dc 8.000000 dc 9.000000 dc 12.00000 dc 15.00000 dc 19.00000 dc 20.00000 dc 22.00000 dc 23.00000 dc 24.00000 dc 25.00000 dc 26.00000 dc 27.00000 dc 28.00000 org p:$100 start B-110 move #1,d7.l ;number of groups move #n/4,d6.
PAGE 630
move d2.s,x:(r4)+ ;save lower 2, point to next _bfly move x:(r0)+n0,d0.s y:(r4)+n4,d1.s ;adjust r0,r4 _grp lsr d6 d6.l,n0 ;bflys/2, make old value new offset lsl d7 n0,n4 ;ngroups*2, move new offset lea (r0)+n0,r4 ;new lower leg pointer move #3,n0 ;offset between 2 butterflies-1 move n0,n4 ;same move (r4)+ ;point r4 to second bfly do #n/4,_laststage ;do last stage, 2 bflys at a time _stage move x:(r0)+,d0.s ;get upper of bfly 1 move x:(r0)-,d1.
PAGE 631
page 132,60,1,1 ; ; Implements the Walsh-Hadamard Transform ; iord equ 4 ;order of transform=log2(npoints) n equ 1<
PAGE 632
move d2.s,y:(r0)+ ;save dif 2 _firststage nop nop move #data,r0 ;point to data move #n/2-1,m0 ;mod n/2 move #n/4,n0 ;offset to next group move #data+n/4,r4 ;point to lower leg of half move #n/4,n4 ;offset to next group move #1,d8.l ;number of groups/stage move #n/8,d9.l ;number of bflys/group do #iord-2,_mid ;do middle part of transform move d8.l,d7.l ;get group count do d7.l,_grps ;do groups move d9.l,d7.l ;get bfly count do d7.l,_bfly ;do bflys move x:(r0)+,d0.
PAGE 633
_mid move #3,n0 ;new offset move n0,n4 ;copy move do (r4)+ ;point to second butterfly #n/8,_laststage ;do last stage, 4 bflys at a time move x:(r0)+,d0.s y:,d4.s ;upper x,y #1 move x:(r0)-,d1.s y:,d5.s ;lower x,1 #1 faddsub.s d0,d1 x:(r4)+,d2.s y:,d6.s ;upper x,y #2 faddsub.s d4,d5 x:(r4)-,d3.s y:,d7.s ;lower x,y #2 faddsub.s d2,d3 d1.s,x:(r0)+ d5.s,y: ;save upper x,y #1 faddsub.s d6,d7 d0.s,x:(r0)+n0 d4.s,y: ;save lower x,y #1 move d3.s,x:(r4)+ d7.
PAGE 634
B.1.46 Evaluation of LOG(x)
Floating-point evaluation of log2(x) can be performed by representing x as s*(2**e), where s is the significand and e is the unbiased exponent. Then log2(s*(2**e)) = log2(s) + e. After extracting the significand s, log2(s) can be evaluated with a polynomial; adding the unbiased exponent then gives log2(x). Different execution speeds and accuracies can be obtained by using polynomials of different order.
        page 132,60,1,1
        org x:0
polyc   dc 0.6681523e-02 ;**8
        dc -0.
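The decomposition log2(x) = log2(s) + e from B.1.46 can be illustrated with a small Python model. The manual's polynomial coefficients are truncated in this excerpt, so the sketch substitutes a different, well-known series (the atanh series for the logarithm); the function name `log2_split` and the `terms` parameter are assumptions made for illustration only.

```python
import math

def log2_split(x, terms=10):
    """log2(x) computed as log2(significand) + unbiased exponent."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    m, e = math.frexp(x)        # x = m * 2**e with 0.5 <= m < 1
    s, e = 2.0 * m, e - 1       # renormalize so that 1 <= s < 2
    t = (s - 1.0) / (s + 1.0)
    t2 = t * t
    acc = 0.0
    # Horner evaluation of sum(t2**k / (2k+1)), highest order first.
    for k in reversed(range(terms)):
        acc = acc * t2 + 1.0 / (2 * k + 1)
    log2_s = (2.0 / math.log(2.0)) * t * acc
    return e + log2_s
```

Raising or lowering `terms` trades speed for accuracy, which mirrors the manual's remark about polynomial order.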
PAGE 635
B.1.47 Evaluation of EXP2(x)
Floating-point evaluation of exp2(x) can be performed by representing x as i+f, where i is the greatest integer that does not exceed x and f is the fractional part. Then exp2(i+f) = exp2(f)*(2**i). After extracting the fractional part f, exp2(f) can be evaluated with a polynomial; scaling by 2**i then gives exp2(x). Different execution speeds and accuracies can be obtained by using polynomials of different order.
        page 132,60,1,1
        org x:0
polyc   dc -0.
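The split x = i + f from B.1.47 can be modeled the same way in Python. The manual's coefficient table is truncated here, so this sketch uses a plain Taylor polynomial for 2**f instead; `exp2_split` and `terms` are hypothetical names, and the scaling by 2**i is done with `math.ldexp` rather than a multiply.

```python
import math

def exp2_split(x, terms=14):
    """exp2(x) computed as exp2(fraction) scaled by 2**integer_part."""
    i = math.floor(x)           # greatest integer not exceeding x
    f = x - i                   # fractional part, 0 <= f < 1
    y = f * math.log(2.0)       # 2**f == e**(f * ln 2)
    acc = 1.0
    # Horner form of the Taylor series 1 + y/1 * (1 + y/2 * (1 + ...)).
    for k in range(terms - 1, 0, -1):
        acc = 1.0 + acc * y / k
    return math.ldexp(acc, i)   # multiply by 2**i without computing 2**i
```

For example, x = -1.5 splits into i = -2 and f = 0.5, so the result is sqrt(2)/4.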
PAGE 636
B.1.48 Vector Cross Product
The cross product of two vectors is always perpendicular to both vectors, which makes it useful for 3D graphics, shading, and illumination. The three-dimensional cross product a X b, where a and b are {1 x 3} vectors, can be written as the determinant:

    | i   j   k  |
    | ax  ay  az |
    | bx  by  bz |

where i, j and k are the unit vectors in the x, y and z directions respectively.
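Expanding the determinant along its first row gives the three components directly. A minimal Python sketch (the function name `cross` is an assumption for illustration):

```python
def cross(a, b):
    """Cross product of two 3-element vectors, from the determinant expansion."""
    ax, ay, az = a
    bx, by, bz = b
    return (
        ay * bz - az * by,   # i component
        az * bx - ax * bz,   # j component
        ax * by - ay * bx,   # k component
    )
```

The result is perpendicular to both inputs, so its dot product with either a or b is zero.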
PAGE 637
        fsub.s d2,d3 ; 1 1
        move d3.s,x:(r1)+ ;cz 1 1
        --- ---
Totals: 10 10
B.1.49 Power Function X**Y
Power Function X**Y
X = Single Precision Float, Y = 5 Bit Integer
Program Words ICycles
;
; d1.s = d4.s**d0.l
;
        andi #0,ccr ;clear ccr bits 1 1
        move sr,d3.l ;get sr 1 1
        or d0,d3 ;set ccr bits 2 2
        move d3.l,sr ;move power to CCR bits 1 1
        fmpy.x d1,d4,d1 ifcs ;bit 0, carry 1 1
        fmpy.x d4,d4,d4 ifal ;do multiply w/o ccr update 1 1
        fmpy.
PAGE 638
fmpy.x d4,d4,d4 ;scale power pwr Totals: 1 1 --- --- 7 100 Power Function X**Y X = Single Precision Float, Y = 32 Bit Unsigned Integer Program ICycles Words ; ; d1.s = d4.s**d0.l ; bfind d0,d0 #32,d2.l move d0.h,d3.l sub d3,d2 do d2.l,pwr lsr d0 fmpy.x d1,d4,d1 fmpy.x d4,d4,d4 #1.0,d1.
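As far as can be read from the truncated listing, the 32-bit version shifts the exponent right one bit per pass, multiplies the result by the current power when the shifted-out bit is set, and squares the power each time. That is ordinary right-to-left binary exponentiation, sketched here in Python with a hypothetical function name `power_uint`:

```python
def power_uint(x, n):
    """x**n for a non-negative integer n by repeated squaring."""
    if n < 0:
        raise ValueError("n must be non-negative")
    result = 1.0
    while n:
        if n & 1:          # bit shifted out is set: fold this power in
            result *= x
        x *= x             # square the running power
        n >>= 1
    return result
```

The loop runs once per significant bit of n, so a 32-bit exponent needs at most 32 passes.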
PAGE 639
dc 0.2093549e-02 ;**7 dc -.02777411e-02 ;**6 dc 0.3357901e-02 ;**5 dc 0.8940958e-02 ;**4 dc 0.5558203e-01 ;**3 dc 0.2402348e+00 ;**2 dc 0.6931450e+00 ;**1 dc 0.1000000e+01 ;**0 ; ; d2.s = d4.s**d0.s = exp2(d0 * log2(d4)) ; ; calculate d2=log2(d4) ; getexp d4,d7 #logc,r0 ;get exponent 2 2 fgetman d4,d4 ;get mantissa 1 1 fclr d2 1 1 do #9,_log ;do log2(man) 2 3 fmpy.x d2,d4,d2 ;sum*x 1 1 fadd.x d1,d2 1 1 float.x d7 ;float exponent 1 1 fadd.
PAGE 640
B.1.50 Cascaded Five Coefficient Biquad Filter Filter Section: ∑ b0 ∑ b1 ∑ b2 ∑ Z-1 a1 ∑ Z-1 a2 ∑ Program Words nsec equ org states ds ICycles 3 org x:0 2*nsec y:0 coef ; ; ; dc -.68461698E+00 ;/* section dc .16526726E+01 ;/* section dc .83384343E-02 ;/* section dc .16676869E-01 ;/* section 1 B2 */ 1 B1 */ dc .83384343E-02 ;/* section 1 B0 */ dc -.75893794E+00 ;/* section dc .17255842E+01 ;/* section 2 A2 */ 2 A1 */ dc .90060414E-02 ;/* section dc .
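A cascade of five-coefficient biquad sections can be modeled in Python as below. This is a sketch under stated assumptions: it uses the direct form II structure with two states per section, takes coefficients in the order A2, A1, B2, B1, B0 that the listing's comments suggest, and assumes the feedback coefficients are stored with the sign that makes them additive. The name `biquad_cascade` is hypothetical.

```python
def biquad_cascade(samples, sections):
    """Filter samples through cascaded direct form II biquads.

    sections: list of (a2, a1, b2, b1, b0) tuples, one per section.
    """
    states = [[0.0, 0.0] for _ in sections]   # [w(n-1), w(n-2)] per section
    output = []
    for x in samples:
        for (a2, a1, b2, b1, b0), st in zip(sections, states):
            w = x + a1 * st[0] + a2 * st[1]        # feedback path
            x = b0 * w + b1 * st[0] + b2 * st[1]   # feedforward path
            st[1] = st[0]
            st[0] = w
        output.append(x)
    return output
```

The output of each section feeds the next, so the inner loop reuses `x` as the running sample.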
PAGE 641
move nop fclr do fmpy fmpy fmpy fmpy.s fmpy #coef,r4 d1 #nsec,loop d4,d6,d0 fadd.s d5,d6,d1 fadd.s d6,d4,d1 fadd.s d6,d5,d2 d4,d0,d1 fadd.s x:(r0)+,d4.s d1,d2 d2,d0 d1,d0 d1,d2 x:(r0)-,d5.s d5.s,x:(r0)+ d0.s,x:(r0)+ x:(r0)+,d4.s y:(r4)+,d6.s 1 2 y:(r4)+,d6.s 1 y:(r4)+,d6.s 1 y:(r4)+,d6.s 1 y:(r4)+,d4.s 1 y:(r4)+,d6.s 1 1 3 1 1 1 1 1 loop move d2.s,y:output Totals: B.1.
PAGE 642
move fcmp fsub.x ftfr.x ; ; ; ; ; d7,d6 d6,d7 d6,d7 #90.0,d7.s #180.0,d7.s ffge fflt First quadrant CORDIC trig computation Input angle in d7 in degrees Output d1=sine, d0=cosine move #tantab,r0 fclr d1 #scale,d0.s fclr d5 #45.0,d6.s do #tabsize,_cordic fcmp d5,d7 x:(r0)+,d4.s fneg.x d4 fflt fsub.x d6,d5 fflt fadd.x d6,d5 ffge fmpy.x d1,d4,d2 fmpy d0,d4,d2 fsub.x d2,d0 fadd.x d2,d1 fscale.x #-1,d6 _cordic fcopys.s d3,d1 end Argument Reduction Quadrantizing CORDIC Algorithm B.1.
PAGE 643
org scale tantab tanarg scale tanarg org ; ; ; p:$100 #-180.0,d7.s #1.0/360.0,d5.s d7,d6 d5,d6,d6 d6,d5 d5,d6 #360.0,d5.s d5,d6,d6 d7,d6 ;get range min ;adjust to min, get range ;reduce range ;get int part ;get frac part, spread ;spread fraction part to range ;adjust to min d6 d7,d6 d6,d7 d6,d7 d3 #90.0,d7.s d6.s,d3.s #180.0,d7.
PAGE 644
fadd.x d2,d1 fscale.x #-1,d6 _cordic fcopys.s d3,d0 end ;y’=y+x*tan ;alp=alp/2 ;fix sign of cosine Program Words 10 8 16 ---Totals: 34 Argument Reduction Quadrantizing CORDIC Algorithm B.1.53 page opt tabsize org scale tantab tanarg scale tanarg org ; ; ; ICycles 10 8 8N+9 ------8N+27 Four Quadrant Trigonometric TANGENT (CORDIC Algorithm) 132,60,1,1 mex,cex equ 16 x:0 set set dup set dc set endm 1.0 45.0*3.14159/180.0 tabsize scale*@cos(tanarg) @tan(tanarg) tanarg/2.
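The first-quadrant CORDIC loop can be followed more easily in a Python model. The sketch mirrors what the listing's table-generation macro and loop comments indicate: rotation steps of 45, 22.5, 11.25, ... degrees, a table of their tangents, and a scale factor equal to the product of their cosines. The function name and the exact update order are assumptions, and `math.pi` replaces the manual's 3.14159.

```python
import math

def cordic_first_quadrant(angle_deg, tabsize=16):
    """Return (sine, cosine) of an angle in 0..90 degrees."""
    scale = 1.0
    step = math.radians(45.0)
    tangents = []
    for _ in range(tabsize):          # same recurrence as the dup/endm macro
        scale *= math.cos(step)
        tangents.append(math.tan(step))
        step /= 2.0
    x, y = scale, 0.0                 # cosine starts at scale, sine at 0
    reached, delta = 0.0, 45.0
    for tan_step in tangents:
        if angle_deg < reached:       # overshoot: rotate backwards
            tan_step = -tan_step
            reached -= delta
        else:
            reached += delta
        x, y = x - y * tan_step, y + x * tan_step   # x'=x-y*tan, y'=y+x*tan
        delta /= 2.0                  # alp=alp/2
    return y, x
```

With 16 table entries the remaining angle error is at most 45/2**15 degrees, so results agree with the true sine and cosine to roughly four decimal places.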
PAGE 645
; ; Input angle in d6 in degrees, -180 < d6 < 180 fabs.x move fcmp fsub.x ftfr.x fneg.x ; ; ; ; ; d6 d7,d6 d6,d7 d6,d7 d3 d6.s,d3.s #90.0,d7.s #180.0,d7.s ffge fflt ffge First quadrant CORDIC trig computation Input angle in d7 in degrees Output d1=sine, d0=cosine move #tantab,r0 fclr d1 #scale,d0.s fclr d5 #45.0,d6.s do #tabsize,_cordic fcmp d5,d7 x:(r0)+,d4.s fneg.x d4 fflt fsub.x d6,d5 fflt fadd.x d6,d5 ffge fmpy.x d1,d4,d2 fmpy d0,d4,d2 fsub.x d2,d0 fadd.x d2,d1 fscale.x #-1,d6 _cordic fcopys.
PAGE 646
Totals: 44 8N+37
B.1.54 [NxN] by [NxN] Matrix Multiplication (Modulo-Aligned)
;This routine performs an [NxN] by [NxN] matrix multiplication
;for the 96000 floating-point DSP chip. Sample data is given
;for N=4. The data for all matrices is stored in row major
;format. For example, take the matrix A:
;
;       A(1,1) ... A(1,N)
;         .    .     .
;         .      .   .
;       A(N,1) ... A(N,N)
;
;Matrix A’s elements are stored as such:
;amatrix dc A(1,1),A(1,2),...,A(1,N),A(2,1),A(2,2),...,A(2,N), ...
PAGE 647
dc .5,.5,.5,.5 dc .5,.5,.5,.5 dc .5,.5,.5,.5 org p:$100 move #amatrix,r0 1 1 move #N,n0 1 1 move #N_sqr-1,m0 ; modulo N-squared addressing 1 1 move #bmatrix,r4 1 1 move #cmatrix,r1 1 1 move n0,n4 1 1 move m0,m4 1 1 move n0,n1 1 1 move m0,m1 1 1 fclr d1 x:(r0)+,d0.s y:(r4)+n4,d4.s 1 1 fclr d3 d1.s,d7.s 1 1 do #N,endall 2 3 do #N,endcol 2 3 rep #N 1 2 fmpy d0,d4,d3 fadd.s d3,d1 x:(r0)+,d0.s y:(r4)+n4,d4.s 1 1 fadd.s d3,d1 d7.s,d3.s 1 1 fclr d1 d1.
PAGE 648
;amatrix dc A(1,1),A(1,2),...,A(1,N),A(2,1),A(2,2),...,A(2,N), ...
;
;Matrix A is in X memory, while matrices B and C are in Y memory.
;Since modulo N**2 addressing is used for all matrices, the first
;k least significant bits of the address of the beginning of any
;matrix storage area must be equal to zero, where 2**k >= N**2.
;
;This routine takes
; 15 + 4*18 = 87 instruction cycles to complete.
;
;
; Program ICycles
; Words
        page 132,60,1,1
N       equ 4
N_sqr   equ N*N
        org x:$0
amatrix dc .1,.2,.3,.4
        dc .5,.6,.7,.
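The row-major layout and the alignment rule for modulo N**2 addressing can be checked with a short Python model. The helper names are hypothetical, and taking k as the smallest value with 2**k >= N**2 is an assumption about how the rule is meant to be applied.

```python
def modulo_alignment_bits(n):
    """Smallest k with 2**k >= n*n."""
    k = 0
    while (1 << k) < n * n:
        k += 1
    return k

def is_valid_base(address, n):
    """True if the k least significant bits of the base address are zero."""
    mask = (1 << modulo_alignment_bits(n)) - 1
    return address & mask == 0

def matmul_row_major(a, b, n):
    """Multiply two n-by-n matrices stored as flat row-major lists."""
    c = [0.0] * (n * n)
    for row in range(n):
        for col in range(n):
            acc = 0.0
            for k in range(n):
                acc += a[row * n + k] * b[k * n + col]
            c[row * n + col] = acc
    return c
```

For N=4 this gives k=4, so a matrix may start at $0 or $10 but not at $18.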
PAGE 649
fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy x:(r0)+,d4.s d5.s,d2.s 1 x:(r0)+,d4.s d1.s,y:(r5)+n5 1 x:(r0)+,d4.s 1 x:(r0)+,d4.s 1 x:(r0)+,d4.s d5.s,d1.s 1 x:(r0)+,d4.s d2.s,y:(r5)+n5 1 x:(r0)+,d4.s 1 x:(r0)+,d4.s 1 x:(r0)+,d4.s d5.s,d2.s 1 x:(r0)+,d4.s d1.s,y:(r5)+n5 1 x:(r0)+,d4.s d5.s,d1.s 1 x:(r0)+,d4.s y:(r4)+,d0.s ;junk into d0.s 1 fadd.s d3,d2 y:(r4)+n4,d8.s 1 move d2.s,y:(r5)+n5 1 endall --Totals: 30 B.1.
PAGE 650
Words page 132,60,1,1 N equ 8 N_sqr equ N*N org x:$0 amatrix dc .1,.2,.3,.4,.1,.2,.3,.4 dc .5,.6,.7,.8,.5,.6,.7,.8 dc .9,.1,.2,.3,.9,.1,.2,.3 dc .4,.5,.6,.7,.4,.5,.6,.7 dc .1,.2,.3,.4,.1,.2,.3,.4 dc .5,.6,.7,.8,.5,.6,.7,.8 dc .9,.1,.2,.3,.9,.1,.2,.3 dc .4,.5,.6,.7,.4,.5,.6,.7 org y:$0 bmatrix dc .5,.5,.5,.5,.5,.5,.5,.5 dc .5,.5,.5,.5,.5,.5,.5,.5 dc .5,.5,.5,.5,.5,.5,.5,.5 dc .5,.5,.5,.5,.5,.5,.5,.5 dc .5,.5,.5,.5,.5,.5,.5,.5 dc .5,.5,.5,.5,.5,.5,.5,.5 dc .5,.5,.5,.5,.5,.5,.5,.5 dc .5,.5,.5,.5,.5,.5,.5,.
PAGE 651
move fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy move fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy move fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy move fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy move fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy move d1.s,y:(r5)+n5 d5.s,d2.s d4,d7,d3 fadd.s d3,d2 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d2 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d2 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d2 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d2 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d2 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d2 x:(r0)+,d4.
PAGE 652
fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy move fmpy fmpy fmpy fmpy fmpy fmpy fmpy d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s y:(r4)+n4,d7.s 1 d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s y:(r4)+n4,d7.s 1 d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s y:(r4)+n4,d7.s 1 d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s y:(r4)+n4,d7.s 1 d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s y:(r4)+n4,d7.s 1 d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s y:(r4)+n4,d7.s 1 d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s y:(r4)+n4,d7.s 1 d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s y:(r4)+n4,d7.s 1 d1.
PAGE 653
;This routine takes 15 + 16(18 + 14*17 + 18) = 4399 instruction cycles. ; ; ; Program ICycles Words page 132,60,1,1 N equ 16 N_sqr equ N*N org x:$0 amatrix dc .1,.2,.3,.4,.1,.2,.3,.4,1,1,1,1,1,1,1,1 dc .5,.6,.7,.8,.5,.6,.7,.8,1,1,1,1,1,1,1,1 dc .9,.1,.2,.3,.9,.1,.2,.3,1,1,1,1,1,1,1,1 dc .4,.5,.6,.7,.4,.5,.6,.7,1,1,1,1,1,1,1,1 dc .1,.2,.3,.4,.1,.2,.3,.4,1,1,1,1,1,1,1,1 dc .5,.6,.7,.8,.5,.6,.7,.8,1,1,1,1,1,1,1,1 dc .9,.1,.2,.3,.9,.1,.2,.3,1,1,1,1,1,1,1,1 dc .4,.5,.6,.7,.4,.5,.6,.7,1,1,1,1,1,1,1,1 dc .1,.2,.
PAGE 654
org p:$100 move #amatrix,r0 move #N,n4 move #N_sqr-1,m0 ; modulo N-squared addressing move #bmatrix,r4 move #cmatrix,r5 move m0,m4 move n4,n5 move m0,m5 fclr d1 x:(r0)+,d4.s fclr d5 y:(r4)+n4,d7.s 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 do #16,endall fmpy.s d4,d7,d3 x:(r0)+,d4.s fmpy d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s fmpy d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s fmpy d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s fmpy d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s fmpy d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s fmpy d4,d7,d3 fadd.
PAGE 655
fmpy move fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy move fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy move fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy d4,d7,d3 fadd.s d3,d2 x:(r0)+,d4.s d2.s,y:(r5)+n5 d5.s,d1.s d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.
PAGE 656
fmpy fmpy fmpy fmpy fmpy fmpy move fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy move fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy move fmpy fmpy fmpy fmpy fmpy d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s d1.s,y:(r5)+n5 d5.s,d2.s d4,d7,d3 fadd.s d3,d2 x:(r0)+,d4.
PAGE 657
fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy move fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy move fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy move d4,d7,d3 fadd.s d3,d2 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d2 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d2 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d2 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d2 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d2 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d2 x:(r0)+,d4.s d4,d7,d3 fadd.
PAGE 658
fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy move fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy move fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s d4,d7,d3 fadd.
PAGE 659
fmpy fmpy fmpy fmpy move fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy move fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy move fmpy fmpy fmpy fmpy fmpy fmpy fmpy d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d1 x:(r0)+,d4.s d1.s,y:(r5)+n5 d5.s,d2.s d4,d7,d3 fadd.s d3,d2 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d2 x:(r0)+,d4.s d4,d7,d3 fadd.s d3,d2 x:(r0)+,d4.
PAGE 660
fmpy fmpy fmpy fmpy fmpy fmpy fmpy fmpy x:(r0)+,d4.s y:(r4)+n4,d7.s 1 x:(r0)+,d4.s y:(r4)+n4,d7.s 1 x:(r0)+,d4.s y:(r4)+n4,d7.s 1 x:(r0)+,d4.s y:(r4)+n4,d7.s 1 x:(r0)+,d4.s y:(r4)+n4,d7.s 1 x:(r0)+,d4.s y:(r4)+n4,d7.s 1 x:(r0)+,d4.s y:(r4)+n4,d7.s 1 x:(r0)+,d4.s y:(r4)+,d1.s ;junk to d1.s 1 fadd.s d3,d2 y:(r4)+n4,d7.s 1 move d2.s,y:(r5)+n5 d5.s,d1.s 1 move (r5)+ 1 endall --Totals: 286 B.1.58 d4,d7,d3 d4,d7,d3 d4,d7,d3 d4,d7,d3 d4,d7,d3 d4,d7,d3 d4,d7,d3 d4,d7,d3 fadd.s fadd.s fadd.s fadd.s fadd.s fadd.
PAGE 661
        page 132,60,1,1
fs      equ 8000.0 ;sampling frequency
f0      equ 320.0 ;center frequency
scale   equ 2.0*@cos(2.0*3.14159*f0/fs)
mag     equ 1.0*@sin(2.0*3.14159*f0/fs)
output  equ $ffff
        org p:$100
        move #scale,d7.s ;init scale factor
        fclr d6 #mag,d5.s ;init magnitudes
        do #200,_gen ;generate 200 points
        fmpy.s d6,d7,d6 d6.s,d4.s
        fsub.s d5,d6 d4.s,d5.s
        move
_gen
fs f0 scale0 mag0 f1 scale1 mag1 output --2 --2 DTMF Generation page equ equ equ equ equ equ equ equ DTMF Generation 132,60,1,1 8000.
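The generator loop implements the two-term recurrence y(n) = scale*y(n-1) - y(n-2) with scale = 2*cos(w) and w = 2*pi*f0/fs. A Python model of that recurrence is sketched below; the initial state (previous output 0, older output sin(w)) follows my reading of the listing, the name `resonator` is hypothetical, and `math.pi` replaces the manual's 3.14159.

```python
import math

def resonator(fs, f0, count):
    """Generate `count` samples of a sinusoid with a second-order recurrence."""
    w = 2.0 * math.pi * f0 / fs
    scale = 2.0 * math.cos(w)
    prev, prev2 = 0.0, math.sin(w)    # y(n-1), y(n-2)
    samples = []
    for _ in range(count):
        current = scale * prev - prev2   # one multiply and one subtract
        prev2, prev = prev, current
        samples.append(current)
    return samples
```

With this initial state the sequence is -sin((n+1)*w) for n = 0, 1, 2, ...; a DTMF generator would run two such recurrences and add their outputs.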
PAGE 662
        fadd.s d4,d0
        move d0.s,y:output 1 1
_gen
        --- ---
Totals: 5 5
B.2 IEEE STANDARD CONFORMANCE FUNCTIONS
B.2.1 IEEE Remainder
B.2.2 IEEE floating-point Round to Integer
The IEEE standard section 5.5 specifies that it shall be possible to round a floating-point number to an integral valued floating point number in the same format. If the rounding mode is round to nearest, the rounded result is even if the difference between the rounded result and the unrounded operand is exactly one half.
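The round-to-nearest rule, including the tie case that must produce an even result, can be written out explicitly in Python. This is an illustrative model only; the function name is hypothetical and the other IEEE rounding modes are not covered.

```python
import math

def round_to_integral_nearest_even(x):
    """Round x to an integral-valued float, ties going to the even neighbor."""
    if math.isnan(x) or math.isinf(x):
        return x
    lower = math.floor(x)
    diff = x - lower
    if diff < 0.5:
        rounded = lower
    elif diff > 0.5:
        rounded = lower + 1
    else:
        # Exactly halfway: choose whichever neighbor is even.
        rounded = lower if lower % 2 == 0 else lower + 1
    # copysign keeps the sign of the operand, so -0.3 rounds to -0.0.
    return math.copysign(float(rounded), x)
```

Under this rule 0.5 and 2.5 round down to 0.0 and 2.0, while 1.5 rounds up to 2.0.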
PAGE 663
checking on the source and either jump to an error handling procedure or return a valid result. The programs provided may vary depending on the application. The following data types and abbreviations will be used:
I - Signed 32 bit integer
U - Unsigned 32 bit integer
SP - Single precision floating-point
All conversion examples assume that the value to be converted is in d0 if floating-point or in d0.l if fixed point.
PAGE 664
SP → I
Program Words ICycles
        int d0 ;convert to integer 1 1
        jset #20,sr,_error ;jump if invalid op set 2 3
        --- ---
        3 4
SP → U
Program Words ICycles
        intu d0 ;convert to integer 1 1
        jset #20,sr,_error ;jump if invalid op set 2 3
        --- ---
        3 4
B.3 IEEE RECOMMENDED FUNCTIONS AND PREDICATES
The following functions are recommended by the IEEE-754 standard but are not required.
PAGE 665
B.3.2 -x
The arithmetic form signals IOP if x is a signalling NaN. The non-arithmetic form copies x with its sign complemented.
Arithmetic Implementation Of -d0
Program Words ICycles
        fneg.s d0 ;change sign bit 1 1
        --- ---
Totals: 1 1
Non-Arithmetic Implementation Of -d0
Program Words ICycles
        bchg #31,d0.h ;change sign bit 1 2
        --- ---
Totals: 1 2
B.3.3 Scalb(y,N)
Scalb(y,N) returns y*(2**N) for integral values of N without computing 2**N. This is an arithmetic function.
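Both ideas have direct analogues that can be tried in Python. The sketch below models the non-arithmetic negation as a flip of bit 31 of the IEEE single-precision encoding, and models Scalb with `math.ldexp`, which adjusts the exponent without forming 2**N. The function names are assumptions; exception signalling (IOP) is not modeled.

```python
import math
import struct

def negate_by_sign_bit(x):
    """Complement the sign bit of x's single-precision encoding."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits ^ 0x80000000))[0]

def scalb(y, n):
    """Return y * 2**n by exponent adjustment, for integer n."""
    return math.ldexp(y, n)
```

Because only the sign bit changes, negating 0.0 this way yields -0.0 rather than 0.0.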
PAGE 666
        move #ninf,d0.s ;set -infinity result 2 2
        ori #2,er ;set DZ in ER 1 1
        ori #2,ier ;set DZ in IER 1 1
        jmp _done ;done 2 2
_notzero getexp d1,d0 #-126,d3.l ;get exponent 2 2
        cmp d3,d0 ;cmp to SP exp min 1 1
        tfr d3,d0 iflt ;limit if denorm 1 1
        float.s d0 ;convert to SP FP 1 1
_done
        --- ---
Totals: 18 *
* Execution Time:
Nan 4
Infinity 7
Zero 16
In-range 15
B.3.5 Nextafter(x,y)
Nextafter(x,y) returns the next representable neighbor of x in the direction toward y.
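For IEEE single precision, the neighbor of a nonzero finite value is found by adding or subtracting one from its 32-bit encoding. The Python sketch below illustrates that idea; it is not a transcription of the manual's routine, the function names are hypothetical, and inputs are assumed to be exactly representable in single precision.

```python
import math
import struct

def _to_bits(x):
    return struct.unpack("<I", struct.pack("<f", x))[0]

def _from_bits(bits):
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def nextafter_single(x, y):
    """Next single-precision neighbor of x in the direction of y."""
    if math.isnan(x) or math.isnan(y):
        return math.nan
    if x == y:
        return y
    if x == 0.0:
        # Step off zero to the smallest denormal with the sign of y.
        return _from_bits(0x00000001 if y > 0.0 else 0x80000001)
    bits = _to_bits(x)
    if (y > x) == (x > 0.0):
        bits += 1      # moving away from zero: magnitude grows
    else:
        bits -= 1      # moving toward zero: magnitude shrinks
    return _from_bits(bits)
```

Stepping up from the largest finite value gives the encoding of infinity, which matches the expected overflow behavior.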
PAGE 667
Implementation of nextafter(d0,d4) d0 for single precision numbers: ftst d4 ftfr.s d4,d0 ffun ftst d0 d0.s,d1.l fjor _not_nan move #$7fffffff,d0.s jmp _ok fjinf _ok bclr #31,d1.l neg d1 ifcs fcmp d0,d4 #$00800000,d3.s inc d1 ffgt dec d1 fflt tst d1 #$80000000,d2.l neg d1 ifmi or d2,d1 ifmi move d1.l,d0.
PAGE 668
B.3.6 Finite(x) Finite(x) returns the value TRUE if -inf
PAGE 669
When comparing two values, GL is true if the values are not equal and both values being compared are valid floating-point numbers. The GL condition is false if either number is a NaN even though the values are not equal.
B.3.9 Unordered(x,y) or x?y
Unordered(x,y), or x?y, returns the value TRUE if x is unordered with y, and returns FALSE otherwise. This is an arithmetic function.
d2=d0?d1 Program ICycles Words 2 2 fcmp d0,d1 #0,d2.
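The two predicates described here reduce to NaN checks. A small Python model, with hypothetical function names, makes the relationship explicit:

```python
import math

def unordered(x, y):
    """TRUE when either operand is a NaN, so no ordering exists."""
    return math.isnan(x) or math.isnan(y)

def greater_or_less(x, y):
    """GL: TRUE only when both operands are ordered and not equal."""
    return (not unordered(x, y)) and x != y
```

A NaN compared with 1.0 is unordered, so GL is false for that pair even though the values are not equal.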
PAGE 670
d1=class(d0) ftst fjor jset jmp _notnan fjne fjmi jmp _notz fjninf fjmi jmp _finite fjge jset jmp _pos jset jmp _tpinf inc _tpnorm inc _tpdnrm inc _tpzer inc _tmzer inc _tmdnrm inc _tmnorm inc _tminf inc _tqnan inc _tsnan Program ICycles Words 1 1 2 3 2 3 2 2 d0 _notnan #5,er,_tsnan _tqnan ;test d0 ;jump if ordered ;check signaling nan bit ;quiet nan _notz _tmzer _tpzer ;jump if not zero ;type is minus zero ;type is plus zero 2 2 2 3 3 3 _finite _tminf _tpinf ;jump if finite ;minus infinity ;plus i
PAGE 671
Execution Times:
Signaling not a number - 7
Quiet not a number - 10
Negative infinity - 15
Negative normalized nonzero - 21
Negative denormalized - 20
Negative zero - 15
Positive zero - 19
Positive denormalized - 23
Positive normalized nonzero - 26
Positive infinity - 25
Note the following code assignments:
Signaling not a number - 0
Quiet not a number - 1
Negative infinity - 2
Negative normalized nonzero - 3
Negative denormalized - 4
Negative zero - 5
Positive zero - 6
Positive
PAGE 672
; d4.h d4.l ; d5.h d5.l ; d6.l ; d7.l ; ; Alters Program Control Registers ; pc sr ; ; ; Version 1.
PAGE 673
_mant2 _inf1 _binf _minf _nan0 _nan1 and tst jne tst jne move move and cmp jne ; ; Check ; move and tst jne tst jne jclr ori jmp ; ; Check ; ftfr.x ori jmp ftst jmi ftst jmi ori jmp ftst jpl ori jmp ; ; Check ; jclr move move and cmp jne move and tst jne tst jeq jset d7,d2 d2 _nan0 d0 _nan0 #emsk,d7.l d1.h,d5.l d7,d5 d7,d5 _inf1 ; ; ; ; ; ; ; ; ; ; remove implied one bit check m0.high = zero jump if nan check m0.
PAGE 674
_inan _qnan _mant3 _mant4 _den0 _bden _ftz _den1 _tfr ori #$10,ier ; set invalid operation bit move #qnane,d1.h ; get QNaN exponent move #qnanmh,d1.m ; get QNaN mantissa high move #qnanml,d1.l ; get QNaN mantissa low ori #$20,ccr ; set Not-a-Number bit jmp _done ; result is a NaN ; ; Check if Addend 0 is a Denormalized Number ; tst d2 ; check mant0.high = zero jne _den0 ; jump if a0 is a denorm tst d0 ; check mant0.
PAGE 675
_tmov _bzero tst d3 dec d5 ifpl.u move d5.l,d1.h move d0.m,d1.m move d0.l,d1.l jmp _done ; ; Both Addends are Zero ; move d0.h,d4.l move d1.h,d5.l eor d4,d5 jclr #31,d5.l,_done bclr #31,d1.h jclr #22,sr,_done jset #21,sr,_done bset #31,d1.h jmp _done ; ; ; ; ; ; test mantr.
PAGE 676
jmp _add ; ; ; Set Sticky Bit for Shift > 55 Bits ; _setst0 move #0,d3.l ; get number for addition move #inum,d1.l ; " jmp _add ; ; ; *** Case: Exp1 > Exp0 *** ; ; ; Align Mantissas ; _pos sub d4,d5 d1.h,d6.l ; get shift, get expr move #55,d7.l ; get number of bits cmp d7,d5 ; check for shift > 55 jgt _setst1 ; jump if shift > 55 do d5.l,_end2 ; align mantissas lsr d2 ; shift right m0.h ror d0 ; shift right m0.l and GRS0 jclr #8,d0.l,_cclr2 ; jump if sticky bit clear move #1,d2.
PAGE 677
add jcc inc jmp d7,d1 _zchk d3 _zchk ; ; ; ; " " " ; ; *** Case: Addend 0 is Positive, ; Addend 1 is Negative *** ; _neg2 bclr #31,d6.l ; set result as positive sub d1,d0 ; subtract for case: a0+,a1subc d3,d2 ; " jcc _cclr3 ; jump if result is positive bset #31,d6.l ; set result as negative move #inum,d7.l ; get increment number not d0 ; get 2’s comp of result not d2 ; " add d7,d0 ; " jcc _cclr3 ; " inc d2 ; " _cclr3 move d0.l,d1.l ; get mantr.low and GRS bits move d2.l,d3.l ; get mantr.
PAGE 678
rol jset jclr jmp d3 #31,d3.l,_rnd #8,d1.l,_st0 _st1 ; ; ; ; shift mantr.h left jump if result normalized jump if sticky bit = 0 jump if sticky bit = 1 ; ; *** Cases: 1) Addend 0 is Negative, ; Addend 1 is Negative ; 2) Addend 0 is Positive, ; Addend 1 is Positive *** ; _nset bset #31,d6.l ; set result as negative _fadd add d0,d1 ; add for case: a0-,a1addc d2,d3 ; and case: a0+,a1+ jcc _rnd ; jump if number normalized lsr d3 ; shift right mantr.h ror d1 ; shift right mantr.low jclr #8,d1.
PAGE 679
move ori ori ori jmp d6.l,d1.h #$10,ccr #$09,ier #$09,er _done ; ; ; ; ; " set infinity bit set OVF and INX bits in IER set OVF and INX bits in ER result is infinity ; ; Begin Rounding the Result ; ; ; Check for Denormalized Numbers ; _rnd move #eden,d7.l ; get denorm exponent move d6.l,d5.l ; get expr move #emsk,d4.l ; get exponent mask and d4,d5 ; delete tags and sign cmp d7,d5 ; compare exponents jne _remst ; jump if not a denorm tst d3 ; test mantr.high dec d6 ifpl.
PAGE 680
; ; Round toward -infinity ; _rminf jclr #31,d6.l,_lmove ; no rounding if positive _addone move #$800,d7.l ; get increment number add d7,d1 ; add one to lsb jcc _acar ; jump if no carry inc d3 ; increment mantr.high _acar jcc _lmove ; jump if result normalized lsr d3 ; shift right mantr.high ror d1 ; shift right mantr.low inc d6 ; increment expr ; ; Check if Result is Infinity ; move #emsk,d7.l ; get exp mask move d6.l,d5.
PAGE 681
; d4.h d4.l ; d5.h d5.l ; d6.l ; d7.l ; ; Alters Program Control Registers ; pc sr ; ; ; Version 1.
PAGE 682
_mant2 _inf1 _binf _minf _nan0 _nan1 and tst jne tst jne move move and cmp jne ; ; Check ; move and tst jne tst jne jclr ori jmp ; ; Check ; ftfr.x ori jmp ftst jmi ftst jmi ori jmp ftst jpl ori jmp ; ; Check ; jclr move move and cmp jne move and tst jne tst jeq jset d7,d2 d2 _nan0 d0 _nan0 #emsk,d7.l d1.h,d5.l d7,d5 d7,d5 _inf1 ; ; ; ; ; ; ; ; ; ; remove implied one bit check m0.high = zero jump if nan check m0.
PAGE 683
_inan _qnan _mant3 _mant4 _den0 _bden _ftz _den1 _tfr ori #$10,ier ; set invalid operation bit move #qnane,d1.h ; get QNaN exponent move #qnanmh,d1.m ; get QNaN mantissa high move #qnanml,d1.l ; get QNaN mantissa low ori #$20,ccr ; set Not-a-Number bit jmp _done ; result is a NaN ; ; Check if Addend 0 is a Denormalized Number ; tst d2 ; check mant0.high = zero jne _den0 ; jump if a0 is a denorm tst d0 ; check mant0.
PAGE 684
_tmov _bzero tst d3 dec d5 ifpl.u move d5.l,d1.h move d0.m,d1.m move d0.l,d1.l jmp _done ; ; Both Addends are Zero ; move d0.h,d4.l move d1.h,d5.l eor d4,d5 jclr #31,d5.l,_done bclr #31,d1.h jclr #22,sr,_done jset #21,sr,_done bset #31,d1.h jmp _done ; ; ; ; ; ; test mantr.
PAGE 685
jmp _add ; ; ; Set Sticky Bit for Shift > 55 Bits ; _setst0 move #0,d3.l ; get number for addition move #inum,d1.l ; " jmp _add ; ; ; *** Case: Exp1 > Exp0 *** ; ; ; Align Mantissas ; _pos sub d4,d5 d1.h,d6.l ; get shift, get expr move #55,d7.l ; get number of bits cmp d7,d5 ; check for shift > 55 jgt _setst1 ; jump if shift > 55 do d5.l,_end2 ; align mantissas lsr d2 ; shift right m0.h ror d0 ; shift right m0.l and GRS0 jclr #8,d0.l,_cclr2 ; jump if sticky bit clear move #1,d2.
PAGE 686
add jcc inc jmp d7,d1 _zchk d3 _zchk ; ; ; ; " " " ; ; *** Case: Addend 0 is Positive, ; Addend 1 is Negative *** ; _neg2 bclr #31,d6.l ; set result as positive sub d1,d0 ; subtract for case: a0+,a1subc d3,d2 ; " jcc _cclr3 ; jump if result is positive bset #31,d6.l ; set result as negative move #inum,d7.l ; get increment number not d0 ; get 2’s comp of result not d2 ; " add d7,d0 ; " jcc _cclr3 ; " inc d2 ; " _cclr3 move d0.l,d1.l ; get mantr.low and GRS bits move d2.l,d3.l ; get mantr.
PAGE 687
rol jset jclr jmp d3 #31,d3.l,_rnd #8,d1.l,_st0 _st1 ; ; ; ; shift mantr.h left jump if result normalized jump if sticky bit = 0 jump if sticky bit = 1 ; ; *** Cases: 1) Addend 0 is Negative, ; Addend 1 is Negative ; 2) Addend 0 is Positive, ; Addend 1 is Positive *** ; _nset bset #31,d6.l ; set result as negative _fadd add d0,d1 ; add for case: a0-,a1addc d2,d3 ; and case: a0+,a1+ jcc _rnd ; jump if number normalized lsr d3 ; shift right mantr.h ror d1 ; shift right mantr.low jclr #8,d1.
PAGE 688
move ori ori ori jmp d6.l,d1.h #$10,ccr #$09,ier #$09,er _done ; ; ; ; ; " set infinity bit set OVF and INX bits in IER set OVF and INX bits in ER result is infinity ; ; Begin Rounding the Result ; ; ; Check for Denormalized Numbers ; _rnd move #eden,d7.l ; get denorm exponent move d6.l,d5.l ; get expr move #emsk,d4.l ; get exponent mask and d4,d5 ; delete tags and sign cmp d7,d5 ; compare exponents jne _remst ; jump if not a denorm tst d3 ; test mantr.high dec d6 ifpl.
PAGE 689
; ; Round toward -infinity ; _rminf jclr #31,d6.l,_lmove _addone move #$800,d7.l add d7,d1 jcc _acar inc d3 _acar jcc _lmove lsr d3 ror d1 inc d6 move #emsk,d7.l move d6.l,d5.l and d7,d5 cmp d7,d5 jne _lmove move #0,d1.l move #0,d1.m ori #$10,ccr ori #$09,ier ori #$09,er jmp _emove ; ; Get Result in D1 ; _lmove move d3.l,d1.m _emove move d6.l,d1.h _done nop nop nop rts endsec B.4.
PAGE 690
; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** SR_MASK EXP_MSK EBIAS EMAX EMIN EDEN MAX SMSK INUM sdptest f l g r s = = = = - fraction bits, initially bits in mantissas least significant fraction bit, initially in mantissas guard bit round bit sticky bit Routine Inputs: d6 - IEEE double extended precision operand 1 d7 - IEEE double extended precision op
PAGE 691
; tmp1 = ; ; tmp2 = ; ; ; Note that register uses the register uses the name of the form "Dn", and is a temporary var which lowest 32 bits of the register (Dn.L is destroyed) name of the form "Dn", and is a temporary var which lowest 32 bits of the register (Dn.L is destroyed) op, tmp1, and tmp2 must all be different registers. andi #$c3,ccr ; jclr ori #31,op.h,_chkrst #$8,ccr ; ; _chkrst move move and op.h,tmp1.l #EXP_MSK,tmp2.
PAGE 692
; ****** Flush DeNorms to 0 if Fast Mode ****** jclr move jset fclr and move _chkop2 jset fclr and move #27,sr,_chksgn #$80000000,d1.l #31,d6.m,_chkop2 d6 d6.h,d0.l d1,d0 d0.l,d6.h #31,d7.m,_chksgn d7 d7.h,d0.l d1,d0 d0.l,d7.h ; ; ; ; ; ; ; ; ; ; ; ****** Sign Bit Calculation ****** _chksgn move move eor move d6.h,d0.l d7.h,d1.l d1,d0 d0.l,d0.
PAGE 693
move and tst jne inc _ebias2 move sub move #EXP_MSK,d1.l d1,d0 d0 _ebias2 d0 #EBIAS,d1.l d1,d0 d0.l,d1.m ; ; ; ; ; ; ; ; ; ****** Extract Mantissas ****** move move #0,d6.h #0,d7.h ; ; ; ****** Normalize any Denorms ****** jset move tst jneq move move sub move move move jset #31,d6.m,_nrmop2 d6.m,d0.l d0 _op1nrm d1.m,d0.l #32,d1.l d1,d0 d0.l,d1.m d6.l,d6.m #0,d6.l #31,d6.m,_nrmop2 asl rol move move dec move jclr d6 d6.m,d0.l d0 d0.l,d6.m d1.m,d0.l d0 d0.l,d1.m #31,d6.
PAGE 694
move move dec move jclr d0.l,d7.m d0.m,d0.l d0 d0.l,d0.m #31,d7.m,_op2nrm ; ; ; ; ; _domul ; ****** Initial Exponent Processing ****** move move add inc move d0.m,d0.l d1.m,d1.l d0,d1 d1 d1.l,d0.m ; ; ; ; ; ; ****** Calculate Partial Products (A:B * C:D) ****** mpyu move mpyu move mpyu mpyu d6,d7,d2 d6.m,d0.l d0,d7,d3 d7.m,d1.l d1,d6,d4 d0,d1,d5 ; ; ; ; ; ; ; ****** Sum Partial Products ****** move tst jeq move #0,d1.h d2 d2.m,d0.l _addpps #1,d1.
PAGE 695
; d5.l = next most significant word, ; d3.l = next most significant word, ; and least significant word info ; is in the sticky bit. ; ; Upper 96 bits = d4.l:d5.l:d3.l, and ; the lowest 32 bits have been ORed ; into the sticky bit. ; ****** Continue Calculating Sticky Bit ****** _stlow move move and tst jeq move #SMSK,d0.l d5.l,d2.l d0,d2 d2 _stlow #1,d1.h ; ; ; ; ; ; tst jeq move d3 _post #1,d1.h ; ; ; ; ****** Post Normalization ****** _post _ptop jset asl rol rol move dec move jclr #31,d4.
PAGE 696
move abs move do lsr ror ror #EDEN,d0.m d0 d0.l,d1.m d1.m,_dnrmq d4 d5 d3 ; ; ; ; ; ; ; jset jset jset jmp #0,d1.h,_sundr #9,d5.l,_sundr #10,d5.l,_sundr _asml ; ; ; ; _dnrmq ; ****** Round ****** _rnd jset jset jset jmp #10,d5.l,_inex #9,d5.l,_inex #0,d1.h,_inex _endrnd ; ; ; ; ori ori #$04,er #$04,ier ori ori jclr jset jclr jmp #$01,er #$01,ier #22,sr,_nxt #21,sr,_pinf #31,d0.h,_endrnd _add1 jclr jmp jset jmp #21,sr,_rn _endrnd #31,d0.
PAGE 697
jmp _endrnd ; move move cmp jne jclr inc move jmp #EDEN,d1.l d0.m,d0.l d1,d0 _endrnd #31,d4.l,_asml d0 d0.l,d0.m _asml ; ; ; ; ; ; ; ; ori ori jset jclr jmp #$05,er #$05,ier #22,sr,_nxt1 #21,sr,_rn1 _ret0 ; Reaches here if value is too small ; to denormalize. ; ; ; ; ; _pinf1 jset jmp #31,d0.h,_ret0 _retsml ; ; _rn1 move cmp jle move cmp jeq lsr dec jmp move jclr jset jset jmp #-56,d1.l d1,d0 _grs0 #-53,d1.l d1,d0 _rnrnd d4 d0 _grsl #0,d4.l #31,d4.l,_ret0 #30,d4.l,_retsml #0,d1.
PAGE 698
move move cmp jle ori ori #EMAX,d1.l d0.m,d0.l d1,d0 _asml #$09,er #$09,ier ; ; ; ; ; ; jclr jset jset jmp #22,sr,_next #21,sr,_posinf #31,d0.h,_retinf _retlrg ; ; ; ; jclr jmp #31,d0.h,_retinf _retlrg jclr #21,sr,_retinf ; ; ; ; ; ; ; ; ; ; ; _posinf _next _retlrg move move move dec move jmp #$ffffffff,d5.m #$ffffffff,d5.l #MAX,d0.l d0 d0.l,d5.h _putsgn ; ****** Assemble Result into IEEE Format ****** _asml move d4.l,d5.m ; move move add move d0.m,d0.l #EBIAS,d1.l d1,d0 d0.l,d5.
PAGE 699
jset jset jmp _retinf move move move ori jmp #5,sr,_op1nan #4,sr,_operr _ret0 ; ; ; #0,d5.l d5.l,d5.m #MAX,d5.h #$10,ccr _putsgn ; ; ; ; ; _op1_0 depftst d7,d0,d1 jset #5,sr,_op2nan jset #4,sr,_operr ; ; ; _ret0 move move move bset jmp #0,d5.h d5.h,d5.m d5.m,d5.l #2,sr _putsgn ; ; ; ; ; _operr bset bset bset move move move jclr bset jmp #12,sr #20,sr #4,sr #$ffffffff,d5.l #$ffffffff,d5.m #$7ff,d5.h #31,d0.h,_done #31,d5.
PAGE 700
_snan2 _op2sn _op1sn _done B.5 dplib ; depftst jset jmp jset jmp d7,d0,d1 #5,sr,_snan2 _done #13,sr,_op2sn _done ftfr.x d7,d6 bset #30,d6.m ftfr.x d6,d5 bset #4,sr bset #13,sr bset #20,sr nop nop nop rts endsec ; ; ; ; ; ; ; ; ; ; ; ; ; end of subroutine NON-IEEE DOUBLE PRECISION USING SOFTWARE EMULATION ident 1,0 ; MOTOROLA DSP96002 DPLIB - VERSION 1.
PAGE 701
; extended precision number. ; ; Entry point: ieee2dplib: c(r0) ← convert(d0) ; ; Input: r0 contains the lowest address of the 4-word internal ; extended precision number ; d0 contains the DSP96002 floating-point number. ; The DSP96002 has the following floating-point formats: ; SP normalized (24 bit mantissa) ; SP denormalized ; SEP normalized (32 bit mantissa) ; SEP denormalized (encoded as DP normalized) ; DP normalized ; The SP denormalized is encoded using the U tag.
PAGE 702
page ; ; MOTOROLA DSP96002 DPLIB - VERSION 1.0 ; ; DPLIB2IEEE - Convert internal double precision format to a double ; precision format in d0. ; ; Entry point: dplib2ieee: d0 ← convert(c(r0)) ; ; Input: r0 contains the lowest address of the 4-word internal ; extended precision number ; ; Output: The returned format is DSP96002 extended precision ; floating-point format. Typical calling sequences: ; ; jsr dplib2ieee ;convert to register format ; move d0.
PAGE 703
; ; Input: r0 contains the lowest address of the 4-word internal ; extended precision number ; ; Output: r0 contains the lowest address of a 4-word internal ; extended precision number ; ; Alters: D0.l ; dp_abs clr d0.l move d0.l,x:(r0+sign) ;clear the sign word rts page ; ; MOTOROLA DSP96002 DPLIB - VERSION 1.0 ; ; DP_ADD - Add two double precision numbers.
PAGE 704
abig sub cmp jmi cmp jge sub clr lsr d5,d6 d6,d7 aequala d6,d7 dshiftx d7,d6 d2.l d0.h,d3 ;c(r0) exponent is greater ;is |r0(exp)-r1(exp)| > 63? ;yes, then c(r0) + c(r1) = c(r0) #32,d7.l ;is |r0(exp)-r1(exp)| > 31? ;no, shift both c(r1) words d2.l,d3.l ;yes, shift ms to ls d6.l,d0.h ;# of shifts to be performed ;align the mantissas #31,d7.l ; ; Add the two mantissas together ; addmant move x:(r0+sign),d6.l ;get c(r0) sign move x:(r1+sign),d7.
PAGE 705
inc jmp d4.l leave ;increment the exponent ;check for overflow ; ; Calculate the result assuming that c(r0) > 0 and X < 0 ; apos cmp d2,d0 ;compare mantissas jne decide ;if ms’s are equal, test ls’s cmp d3,d1 #1,d7.
PAGE 706
        jmp     echeck
;
; MOTOROLA DSP96002 DPLIB - VERSION 1.0
;
; DP_CLR - Set the double precision number to zero.
;
; Entry point: dp_clr: c(r0) = 0
;
; Inputs: r0 contains the lowest address of a 4-word internal
;         extended precision number
;
; Outputs: r0 contains the lowest address of a 4-word internal
;          extended precision number
;
; Alters: D0.L
;
dp_clr  clr     d0.l                    ;get a 0
        move    d0.l,x:(r0)
        move    d0.l,x:(r0+sign)
        move    d0.l,x:(r0+ms)
        move    d0.l,x:(r0+ls)
        rts
        page
;
; MOTOROLA DSP96002 DPLIB - VERSION 1.
PAGE 707
; GE - greater than or equal    N eor V = 0
; GT - greater than             Z + (N eor V) = 0
; LE - less than or equal       Z + (N eor V) = 1
; LT - less than                N eor V = 1
; NE - not equal                Z = 0
;
; Alters: D0.L,D1.L,D2.L
;
dp_cmp  move    x:(r0+sign),d0.l        ;get sign
        tst     d0      x:(r0),d1.l     ;get exponent
        jeq     _pos1                   ;positive
        bset    #31,d1.l                ;set sign bit
_pos1   move    x:(r1+sign),d0.l        ;get sign
        tst     d0      x:(r1),d2.l     ;get exponent
        jeq     _pos2                   ;positive
        bset    #31,d2.
PAGE 708
        move    d0.l,x:(r0+sign)        ;apply to destination
        rts
        page
;
; MOTOROLA DSP96002 DPLIB - VERSION 1.0
;
; DP_DIV - Divide two double precision numbers.
PAGE 709
dec d1 ;and adjust exponent d1.l,x:(r0) ;save new exponent _startdiv move ; ; ; unsigned fractional divide: d7:d6 / d5:d4 = d3:d2 do cmp jhi jlo cmp jhs _small andi jmp _big sub subc ori _q rol rol lsl rol _divloop move move move move eor move jmp ; #64,_divloop d5,d7 _big _small d4,d6 _big #$fe,ccr _q d4,d6 d5,d7 #$01,ccr d2 d3 d6 d7 d3.l,x:(r0+ms) d2.l,x:(r0+ls) x:(r0+sign),d0.l x:(r1+sign),d1.l d1,d0 d0.
PAGE 710
rndls and move rts sub move asr and move rts page d3,d0 d1.l,x:(r0+ls) ;truncate to an integer d0.l,x:(r0+ms) ;store the result d2,d4 x:(r0+ls),d1.l ;calculate # shifts, get ls d4.l,d0.h ;put # shifts in .h register d0,d3 ;create the truncation mask d3,d1 ;truncate to an integer d1.l,x:(r0+ls) ;store the result ; ; MOTOROLA DSP96002 DPLIB - VERSION 1.0 ; ; DP_MAC - Multiply two double precision numbers and ; accumulate the sum.
PAGE 711
dp_move move    x:(r1),d0.l             ;move exponent
        move    d0.l,x:(r0)
        move    x:(r1+sign),d0.l        ;move sign
        move    d0.l,x:(r0+sign)
        move    x:(r1+ms),d0.l          ;move ms
        move    d0.l,x:(r0+ms)
        move    x:(r1+ls),d0.l          ;move ls
        move    d0.l,x:(r0+ls)
        rts
;
; MOTOROLA DSP96002 DPLIB - VERSION 1.0
;
; DP_MPY - Multiply two double precision numbers.
PAGE 712
        eor     d7,d1   d6.l,x:(r0)
        move    d1.l,x:(r0+sign)
        jmp     echeck
        page
;
; Check for overflow and underflow
;
echeck  move    x:(r0),d0.l
        jset    #31,d0.l,uflow
        jset    #30,d0.l,oflow
        rts
oflow   move    #$3fffffff,d0.l
        move    d0.l,x:(r0)
        move    #$ffffffff,d0.l
        move    d0.l,x:(r0+ms)
        move    d0.l,x:(r0+ls)
        rts
uflow   clr     d0
        move    d0.l,x:(r0)
        move    d0.l,x:(r0+sign)
        move    d0.l,x:(r0+ms)
        move    d0.
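For readers who do not read DSP96002 assembly, the range check performed by echeck can be sketched in Python. This is only an illustration under stated assumptions, not part of DPLIB: the dictionary keys exp, sign, ms and ls are hypothetical stand-ins for the four memory words, and clearing ls on underflow is assumed because the listing is cut off at that point.

```python
MASK32 = 0xFFFFFFFF

def echeck(num):
    """Return a copy of num after an echeck-style exponent range check.

    num is a dict with 32-bit fields 'exp', 'sign', 'ms', 'ls'
    (hypothetical names mirroring the offsets used in the listing).
    """
    result = dict(num)
    exp = num["exp"] & MASK32
    if exp & (1 << 31):
        # Bit 31 set (tested first, like jset #31): underflow, flush to zero.
        result["exp"] = 0
        result["sign"] = 0
        result["ms"] = 0
        result["ls"] = 0  # assumed: the source listing is truncated here
    elif exp & (1 << 30):
        # Bit 30 set: overflow, saturate exponent and mantissa words.
        result["exp"] = 0x3FFFFFFF
        result["ms"] = MASK32
        result["ls"] = MASK32
    return result

print(echeck({"exp": 0x40000000, "sign": 1, "ms": 5, "ls": 6}))
```

Note that the overflow path leaves the sign word untouched, as in the listing, so a saturated result keeps the sign of the computed value.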
PAGE 713
; Entry point: dp_scale: c(r0) ← c(r0) * 2**r1
;
; Inputs: r0 contains the lowest address of a 4-word internal
;         extended precision number
;         r1 contains an integer number
;
; Outputs: r0 contains the lowest address of a 4-word internal
;          extended precision number
;
; Alters: D0.L,D1.L
;
; NOTE: r1 contains an integer. (It does NOT point to an address.)
;
dp_scale move   r1,d0.l                 ;put scale factor in data register
        move    x:(r0),d1.l             ;get exponent
        add     d0,d1   #$3fffffff,d0.
PAGE 714
inc add move move move clr clr do lsl rol rol rol _initshift do lsl rol lsl rol inc sub subc jcs lsr ror inc jmp _ofl lsr ror _next lsl rol rol rol lsl rol rol rol _sqrt lsl rol lsl rol move move rts ; d0 ifcs ;if odd exponent, use 2 bits d2,d1 ;restore exponent bias d1.l,x:(r0) ;store it x:(r0+ms),d7.l ;get ms x:(r0+ls),d6.l ;get ls d4 #0,d5.l ;clear RR d3 #0,d2.l ;clear DR d0.l,_initshift ;initial shift d6 ;shift 2 bits from d7:d6 (SQR) d7 d4 ;to d5:d4 (RR) d5 #62,_sqrt ;take root of SQR into DR d2 d4.
PAGE 715
; extended precision number
;
; Outputs: r0 contains the lowest address of the 4-word internal
;          extended precision number with the result
;
; Alters: D0.L,D1.L,D2.L,D3.L,D4.L,D5.L,D6.L,D7.L,D0.H,D1.H
;
dp_sub  jsr     dp_neg                  ;negate the operand
        jsr     dp_add                  ;add the numbers
        jmp     dp_neg                  ;negate the operand
;
; MOTOROLA DSP96002 DPLIB - VERSION 1.0
;
; DP_TST - Test a double precision operand.
; (The same as "TST.
PAGE 716
mszero tst rts move tst jne tst rts d0 ;set the correct flags x:(r0+ls),d0.l ;get ls d0 #0,d2.l ;check if ls = 0 sgntst ;if not, check sign d2 ;set the correct flags ; ; END OF DPLIB ; end Double precision FIR example ; ; ; ; ; "data" and "coef" are assumed to be in DPLIB format. Other variables are assumed to be in IEEE DP format.
PAGE 717
        move    d0.d,l:ieee_out         ;output as dp ieee number
        move    (r2)-n2                 ;delete last sample
        jmp     _loop
;
NxN by NxN Matrix Multiplication Example
;
; Multiply Two Matrices: AB = C
;
; ***NOTE: All numbers are assumed to be in DPLIB format.
PAGE 718
PAGE 719
B.6 STANDARD BENCHMARK SUMMARY 56000/1 Word Icyc Benchmark B.1.1 B.1.2 B.1.3 B.1.4 B.1.5 B.1.6 B.1.7 B.1.8 B.1.9 B.1.10 Real Multiply N Real Multiplies Real Update N Real Updates N Term Real Convolution (FIR) N Term Real*Complex Convolution Complex Multiply N Complex Multiplies Complex Update N Complex Updates B.1.11 B.1.12 B.1.13 B.1.14 B.1.15 B.1.
PAGE 720
DSP96000 Word Icyc Benchmark B.1.28 Bit Field Extraction/Insertion Static Field Extraction, Zero Extend Static Field Extraction, Sign Extend Dynamic Field Extraction, Zero Extend Dynamic Field Extraction, Sign Extend Static Field Insertion Dynamic Field Insertion Static Field Clear Static Field Set Dynamic Field Clear Dynamic Field Set B.1.29 Newton-Raphson Approximation of 1.0/SQRT(x) B.1.30 Newton-Raphson Approximation of SQRT(x) B.1.
PAGE 721
DSP96000 Word Icyc Benchmark B.1.47 Evaluation of EXP2(x) B.1.48 Vector Cross Product B.1.49 Power Function X**Y, X = Single Precision FP Y = 5 Bit Integer, Straight Y = 32 Bit Unsigned Integer, Looped Y = 32 Bit Unsigned Integer, Variable Loop Y = Single Precision FP B.1.50 Cascaded Five Coefficient Biquad Filter B.1.51 CORDIC Sine (4 Quadrant/Argument Reduction) B.1.52 CORDIC Cosine (4 Quadrant/Argument Reduction) B.1.53 CORDIC Tangent (4 Quadrant/Argument Reduction) B.1.
PAGE 722
IEEE Recommended Functions and Predicates Benchmark Copysign(x,y) Arithmetic Non-arithmetic B.3.2 -x Arithmetic Non-arithmetic B.3.3 Scalb(y,N) B.3.4 Logb(x) x = NaN x = Infinity x = Zero x = In-range B.3.5 Nextafter(x,y) Either operand a NaN X is signed infinity Result is normalized Result is denormalized Result overflowed B.3.6 Finite(x) B.3.7 Isnan(x) B.3.8 x<>y B.3.9 Unordered(x,y) B.3.
PAGE 723
IEEE Double Precision Using Software Emulation TYPICAL WORST CASE FULLY TESTED B.4.1 ADDITION B.4.2 SUBTRACTION B.4.3 MULTIPLICATION 6.86 us 7.01 us 13.58 us 29.1 us 29.2 us 39.5 us YES YES YES Non-IEEE Double Precision Using Software Emulation B.5.1 B.5.2 B.5.3 B.5.4 B.5.5 B.5.6 B.5.7 B.5.8 B.5.9 B.5.10 B.5.11 B.5.12 B.5.13 B.5.14 B.5.15 B.5.
PAGE 724
APPENDIX C IEEE ARITHMETIC C.1 FLOATING-POINT NUMBER STORAGE AND ARITHMETIC C.1.1 General The IEEE standard for binary floating point arithmetic provides for the compatibility of floating-point numbers across all implementations which use the standard by defining bit-level encoding of floating-point numbers. Maximum mathematical accuracy, with respect to roundoff errors, is achieved by optimally scaling floating-point numbers by using a normalized exponential notation.
PAGE 725
Examples of QNaNs are results of operations such as 0/0, ∞−∞, ∞/∞, etc. Encodings of QNaNs are intended to provide some kind of retrospective diagnostic information concerning the origin of the NaN. Since this information needs to remain available even after a large number of arithmetic operations, QNaNs "propagate" unchanged through arithmetic operations and format conversions. QNaNs can thus occur as operands of an arithmetic operation.
PAGE 726
Figure C-1. SP and DP IEEE Formats
Single Precision (SP): sign S (bit 31), 8-bit biased exponent (bits 30-23), 23-bit fraction (bits 22-0)
Double Precision (DP): sign S (bit 63), 11-bit biased exponent (bits 62-52), 52-bit fraction (bits 51-0)

Table C-1. Parameters for Numerical Formats
        p-1     bias    emin    emax    Emin    Emax
SP      23      127     +1      +254    -126    +127
DP      52      1023    +1      +2046   -1022   +1023

$f = .b_1 b_2 \cdots b_{p-1}$ There are 23 fractional bits (p=24) (bits 0 through 22) in the SP format, and 52 fractional bits (p=53) (bits 0 through 51) in the DP format.
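As an illustration of these field layouts, the following Python sketch splits an SP or DP word into its sign, biased exponent and fraction, and evaluates finite encodings. The function names and the FORMATS table are hypothetical helpers introduced here; the field widths and biases are the ones given in Figure C-1 and Table C-1, and the value formulas are the standard IEEE ones for normalized and denormalized numbers.

```python
import struct

# (exponent bits, fraction bits p-1, bias) per Figure C-1 and Table C-1
FORMATS = {"SP": (8, 23, 127), "DP": (11, 52, 1023)}

def decode(word, fmt):
    """Split an encoded word into (sign, biased exponent, fraction field)."""
    exp_bits, frac_bits, _bias = FORMATS[fmt]
    s = word >> (exp_bits + frac_bits)
    e = (word >> frac_bits) & ((1 << exp_bits) - 1)
    f = word & ((1 << frac_bits) - 1)
    return s, e, f

def value(word, fmt):
    """Numerical value of a zero, denormalized or normalized encoding."""
    exp_bits, frac_bits, bias = FORMATS[fmt]
    s, e, f = decode(word, fmt)
    if e == (1 << exp_bits) - 1:
        raise ValueError("infinity or NaN: no finite value")
    frac = f / (1 << frac_bits)
    if e == 0:
        # e = emin - 1: integer bit is 0, exponent is emin - bias
        magnitude = frac * 2.0 ** (1 - bias)
    else:
        # normalized: implicit integer bit is 1
        magnitude = (1 + frac) * 2.0 ** (e - bias)
    return -magnitude if s else magnitude

(sp_one,) = struct.unpack(">I", struct.pack(">f", 1.0))
print(decode(sp_one, "SP"))          # sign, biased exponent, fraction of 1.0
print(value(0x7F7FFFFF, "SP"))       # largest normalized SP number
```

The same two functions handle DP words by passing "DP", since only the field widths and the bias differ between the formats.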
PAGE 727
$x_{max,n} = (2 - 0.5^{p-1}) \cdot 2^{e_{max} - bias} = (2 - 0.5^{p-1}) \cdot 2^{E_{max}}$ For SP this equals approximately (using the values in Table C-1) $3.4 \cdot 10^{38}$. 2. Denormalized Numerical Values ($e = e_{min} - 1$, $f \neq 0$): When the exponent e equals the value $e_{min} - 1$ and the fraction field is non-zero the floating point number is called denormalized, and the implicit integer bit b0 is equal to zero. The numerical value of a denormalized number y is given by: $y = (-1)^s \cdot 0.f \cdot 2^{e_{min} - bias}$ = $(-1)^s \cdot 0.$
PAGE 728
31 30 S 23 22 0 0 0 Single Precision 63 62 S 52 51 0 0 0 Double Precision Figure C-2. Encodings for + and - Zero 31 30 23 22 0 S 11..................1 0 Single Precision 63 62 52 51 0 0 S 11...............................1 Double Precision Figure C-3. Encodings for + and - Infinity 31 30 23 22 X 11..................1 0 1111....................................1 Single Precision 63 62 52 51 X 11...............................1 0 11111111.........................................
PAGE 729
generated exclusively by the DSP96002 data ALU as a result of floating point arithmetic operations, is embedded in the DP format, and is thus stored implicitly as a DP number with zeros in the lower 21 bits of the fraction. C.1.3 DSP96002 Floating Point Storage Format in the Data ALU The data ALU is designed to accommodate mixed-precision operands in a common format. To this end, a common DP storage format is used internal to the data ALU.
PAGE 730
[Figure C-5. DP Format in the Data ALU. Dn.h (bits 95-64): S (bit 95), U (bit 94), V (bit 93), zeros (bits 92-75), exponent e (bits 74-64). Dn.m (bits 63-32): explicit integer bit i (bit 63), fraction MSBs (bits 62-32). Dn.l (bits 31-0): fraction LSBs (bits 31-11), zeros (bits 10-0). S: sign; U: single precision unnormalized tag; V: double precision unnormalized tag; i: explicit integer bit.]
[Figure C-6. Tiny Numbers on the Real Number Line: tiny SP numbers lie between +2**Emin and -2**Emin, that is, between -1.0 × 2**(-126) and +1.0 × 2**(-126), exclusive.]
specific operation to occur. The result of an invalid operation is a QNaN, as described above.
PAGE 731
result of a floating point operation (nonzero result with true exponent smaller than the minimum exponent, see Figure C-6) and (2) loss of accuracy is detected (delivered result differs from what would have been computed if the exponent range was unbounded – i. e., cannot be accurately represented as a denormalized number due to an insufficient number of bits or roundoff errors). Consider the case of floating point multiplication as an example. Let the first SP source operand have a mantissa of 1.
PAGE 732
[Figure C-8. The Data ALU Block Diagram. Blocks shown: X-Data Bus, Y-Data Bus, Automatic Format Conversion Unit, Control and Arbitration Unit, Register File (d0 through d9, each with .h, .m and .l portions), Add/Subtract Unit, Multiply Unit, Special Function Unit; the register file supplies operands and receives results.]

Infinite-precision result               Rounded result (to p=4 bits for example)
1.000 11100000....                      1.001 (round up)
1.000 01100000....                      1.000 (round down)
1.000 10000000.... (absolute tie)       1.000 (round down)
1.001 10000000.... (absolute tie)       1.
PAGE 733
algorithm. 5. Controller and arbitrator: A controller/arbitrator supplies all of the control signals necessary for the operation of the data ALU. The data ALU uses the SEP format for all of its operations: the results are automatically rounded to either SP or SEP. All of the rounding modes specified by the IEEE standard are supported. These rounding modes are: 1. Round to nearest (even): a convergent rounding mode, designed to deliver results without a rounding bias.
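The round-to-nearest (even) rule can be illustrated with a short Python sketch that rounds a binary significand string to p significant bits, in the spirit of the p=4 examples in this appendix. The function name is hypothetical, and the sketch assumes the input string has at least p binary digits and that p is at least 2; it models only the rounding decision, not the hardware.

```python
def round_nearest_even(bits, p):
    """Round a binary significand such as '1.000 11100000' to p bits.

    Ties are broken toward the result whose last kept bit is 0 (even).
    Returns a string with p-1 fractional bits; a carry out of the
    integer bit shows up as an extra leading digit (e.g. '10.000').
    """
    digits = bits.replace(" ", "").replace(".", "")
    keep, rest = digits[:p], digits[p:]
    value = int(keep, 2)
    if rest:
        half = 1 << (len(rest) - 1)      # weight of exactly one half ulp
        remainder = int(rest, 2)
        if remainder > half or (remainder == half and value & 1):
            value += 1                   # round up (away from the odd tie)
    out = format(value, "b").zfill(p)
    return out[:-(p - 1)] + "." + out[-(p - 1):]

for s in ("1.000 11100000", "1.000 01100000", "1.000 10000000"):
    print(s, "->", round_nearest_even(s, 4))
```

Because ties go to the even neighbor, an absolute tie is rounded down when the last kept bit is 0 and up when it is 1, which is what removes the statistical bias of always rounding ties in one direction.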
PAGE 734
31 30 29 e S 95 94 S 74 73 72 71 † † † † 64 e 23 22 63 62 i* Fraction 40 39 Fraction 31 30 29 S 0 X or Y Data Memory 32 31 (2) (1) 23 22 e Notes: * – i = 1 when normalized i = 0 when unnormalized † – When NaN, bits 71, 72, 73 = 1 When not NaN Bit 74 ↔ Bit 30 Bits 73, 72, 71 are complement of Bit 74. 11 10 0 (3) Dn 0 Fraction X or Y Data Memory (2) – Bits 11-31 are only nonzero when the register contains a DP floating point number.
PAGE 735
63 62 S 95 94 52 51 e 75 74 S 21 20 0 Fraction 64 e L Data Memory 63 62 i 32 31 Fraction 11 10 * 0 0 Dn i = 1 when normalized i = 0 when unnormalized 63 62 S 52 51 e 21 22 0 Fraction L Data Memory * – Bits 11-31 (in Dn) or 0-20 (in L memory) are zero when the register contains an SEP result. Figure C-10b.
PAGE 736
source is moved to the 52 bit fraction of the destination, and the implicit integer bit is made explicit. If the number is denormalized, the V tag is set. Again, extra cycles may be required when a denormalized number is used as an operand, depending on the FZ bit in the SR. The 11-bit exponent of the source is copied to the 11-bit exponent of the destination. When moving DP numbers from the data ALU to memory, the above process is reversed, as shown in Figure C-10b.
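The DP move into a data ALU register described above can be sketched in Python as follows. This is a hedged illustration rather than a model of the hardware: the function name is hypothetical, the bit positions (S at bit 95, V at bit 93, exponent in bits 74-64, integer bit at 63, fraction in bits 62-11) are taken from Figure C-5, and infinities and NaNs are deliberately left out because their register encoding is not covered by this passage.

```python
import struct

def dp_move_in(x):
    """Pack a finite Python float (IEEE DP) into a 96-bit register image."""
    (word,) = struct.unpack(">Q", struct.pack(">d", x))
    s = word >> 63
    e = (word >> 52) & 0x7FF
    f = word & ((1 << 52) - 1)
    if e == 0x7FF:
        raise ValueError("infinities and NaNs are not covered by this sketch")
    i = 1 if e != 0 else 0               # implicit integer bit made explicit
    v = 1 if (e == 0 and f != 0) else 0  # V tag marks a DP denormalized number
    # The U tag (bit 94) stays 0 for a DP move.
    return (s << 95) | (v << 93) | (e << 64) | (i << 63) | (f << 11)

print(hex(dp_move_in(1.0)))
print(hex(dp_move_in(2.0 ** -1024)))     # a DP denormalized number
```

Reversing the process for a DP move out amounts to dropping the integer bit and the tags and shifting the fraction back down by 11 bits.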
PAGE 737
C.1.5.1.1.2 SP Move Of A SP Denormalized Number This section describes what happens when a 32-bit denormalized, single precision number is written by a single precision floating-point move, into a Data ALU floating-point register D0-D9. Following the above operation, the Data ALU register will be read first by a single precision and then by a double precision floating-point move.
PAGE 738
C.1.5.1.1.3 Denormalized Numbers In Double Precision (DP) This section describes what happens when a 64-bit denormalized double precision number is written by a double precision floating-point move, into a Data ALU floating-point register D0-D9. Following the above operation, the Data ALU register will be read first by a single precision and then by a double precision floating-point move.
PAGE 739
SP move into the register 0 00000000 0100 ..... 00 \ / inv / 0 1 0 Zero \ 01110000000 0 0100 ............................. 00 SP read of the register 0 1 0 Zero 01110000000 0 0100 ............................. 00 \ / / \ 0 00000000 0100 ..... 00 Data read correctly (read as 2**(-128)) DP read of the register 0 1 0 Zero 01110000000 0 0100 ............................. 00 \ / / \ 0 01110000000 0100 ...... 00 Data read incorrectly (read as 1.01x2**( 127)) Figure C-12.
PAGE 740
DP move into the register 0 00000000000 0100 ..... 00 \ / / 0 0 1 Zero \ 00000000000 0 0100 ............................. 00 NOTE THAT THE V TAG IS SET IN THIS CASE SP read of the register 0 0 1 Zero 00000000000 0 0100 ............................. 00 \ / / \ 0 00000000 0100 ..... 00 Data read incorrectly (read as 2**(-128)) DP read of the register 0 0 1 Zero 00000000000 0 0100 ............................. 00 \ / / \ 0 00000000000 0100 .......... 00 C.1.5.1.1.
PAGE 741
. MOVE EXPONENT RANGE IN (UNBIASED) TYPE SP SP SP SP SP DP DP DP DP DP DP DP INPUT DATA U V MOVE OUT TYPE 0 SP CORRECT DP CORRECT SP CORRECT DP CORRECT SP CORRECT DP CORRECT SP CORRECT DP CORRECT SP CORRECT DP WRONG SP CORRECT DP CORRECT SP CORRECT DP CORRECT SP CORRECT DP CORRECT SP WRONG DP CORRECT SP TRUNC DP CORRECT SP WRONG DP CORRECT SP WRONG DP CORRECT SP WRONG DP CORRECT E= 128 Fraction= .0xx...
PAGE 742
Note 1 The xx...xx pattern for the signaling NaNs indicates any NON-ZERO bit pattern. Note 2 The xx...xx pattern for the non-signaling NaNs indicates any bit pattern. The DSP96002 generates all ones for QNaNs.
PAGE 743
C.1.5.1.2.1 Results Rounded To SP Data ALU results are rounded to SP when the instruction is specified with the .S suffix (FMPY.S, FADD.S, etc.). The rounding mode is programmed using the rounding mode bits in the status register. C.1.5.1.2.1.
PAGE 744
or truncation. If the register is read by a single precision move, completely incorrect data will be obtained; see the discussion in Section C.1.5.1.1.3. C.1.5.1.2.3 Data ALU Results/Move Compatibility Summary Figure C-16 summarizes what happens when Data ALU operation results of a certain range are stored in the destination register, and the register is read by a certain kind of move.
PAGE 745
C.1.5.2 Multiply unit The multiply unit consists of a hardware multiplier, an exponent adder, and a control unit, as shown in Figure C-17. The multiply unit accepts two 44 bit input operands for floating point multiplications, each consisting of a sign bit, eleven exponent bits, the explicit integer bit, and 31 fractional bits. Note that for full double precision operands, as obtained by double precision MOVEs, the least significant 8 bits of the fraction are simply truncated.
PAGE 746
ROUND TO SP EXPONENT RANGE BEFORE ROUND (UNBIASED) NaN operand or invalid op SP 127
PAGE 747
[Figure C-17. The Multiply Unit. Blocks shown: exponent inputs ES1 and ES2, Exponent Adder, exponent output ED; mantissa inputs MS1 and MS2, Multiplier Array, mantissa output MD; Control.]
positions (3 bit shift). The exponent comparator and update unit consists of an 11 bit subtracter, which compares the two exponents of the floating point operands, and delivers the difference to the barrel shifter for mantissa alignment. The larger of the two exponents is delivered to the exponent update unit.
PAGE 748
M1 M2 32 Bits 32 Bits Multiplier Array 64 Bits Round Rounding mode is determined by rounding bits in the MR. 32 Bits MD D C.1.5.4 Special Function Unit The special function unit (SFU) consists of a logic unit and a divide and square root unit. The logic unit is further described under the fixed point (integer) operations. The divide and square root unit supports execution of the divide and square root algorithms. These algorithms are iterative, and require an initial approximation or "seed".
PAGE 749
S1 S2 E1 E2 11 Bits 11 Bits Add exponents and subtract bias 11 Bits ED D put operand. These cycles are used to normalize the input operand. The original value of the operand in the source register is not affected. During the IEEE mode procedure all activity of the chip is suspended until the input operands have been normalized. When denormalized output results are detected, each denormalized output result is normalized (one additional instruction cycle).
PAGE 750
[Figure: Adder/Subtracter block diagram. Inputs ES1, ES2, MS1, MS2; blocks: Exponent Comparator/Update Unit, Barrel Shifter/Normalization Unit, Adder, Subtracter, Round; outputs MD1, ED1, ED2, MD2.]
[Figure: add/subtract datapath. Two 32-bit operands from pre-normalization feed the Adder and the Subtracter; each 32-bit result passes through rounding controlled by the MR and goes as 32 bits to post-normalization.]
PAGE 751
ES1 11 Bits ES2 11 Bits Exponent Comparator/Update Unit 11 Bits max(E1, E2) To PostNormalization 11 Bits E1-E2 To Mantissa Alignment can only be used in long (L) data memory space. 3. Unsigned Word Integer: 32 bits wide with unsigned magnitude representation. This storage format can be used in either X and/or Y data memory space. 4. Unsigned Long Word Integer: 64 bits wide with unsigned magnitude representation. This storage format can only be used in long (L) data memory space.
PAGE 752
tination register. 2. Multiplier: The multiplier in the multiply unit described in paragraph C.1.5.2 also performs the integer multiplications. It accepts two 32-bit operands in the low portion of the data ALU source registers, and delivers a 64-bit result in the low and middle portions of the destination register. Both signed and unsigned multiplications are supported. 3. Logic Unit: The logic unit is responsible for the logical operations AND, ANDC, OR, ORC, EOR, NOT, ROR.
PAGE 753
APPENDIX D D.1 FLOATING-POINT NUMBER STORAGE AND ARITHMETIC D.1.1 General The IEEE standard for binary floating point arithmetic provides for the compatibility of floating-point numbers across all implementations which use the standard by defining bit-level encoding of floating-point numbers. Maximum mathematical accuracy, with respect to roundoff errors, is achieved by optimally scaling floating-point numbers by using a normalized exponential notation.
PAGE 754
are intended to provide some kind of retrospective diagnostic information concerning the origin of the NaN. Since this information needs to remain available even after a large number of arithmetic operations, QNaNs "propagate" unchanged through arithmetic operations and format conversions. QNaNs can thus occur as operands of an arithmetic operation. If one or more QNaN occur as operands, the result is a quiet NaN, and no floating point exception is signaled. Hence the name "quiet" NaN.
PAGE 755
Figure D-1. SP and DP Formats
Single Precision: sign S (bit 31), 8-bit biased exponent (bits 30-23), 23-bit fraction (bits 22-0)
Double Precision: sign S (bit 63), 11-bit biased exponent (bits 62-52), 52-bit fraction (bits 51-0)

Table D-1. Parameters for Numerical Formats
        p-1     bias    emin    emax
SP      23      127     +1      +254
DP      52      1023    +1      +2046

$f = .b_1 b_2 \cdots b_{p-1}$ There are 23 fractional bits (p=24) (bits 0 through 22) in the SP format, and 52 fractional bits (p=53) (bits 0 through 51) in the DP format.
PAGE 756
$x_{max,n} = (2 - 0.5^{p-1}) \cdot 2^{e_{max} - bias}$ For SP this equals approximately (using the values in Table D-1) $3.4 \cdot 10^{38}$. 2. Denormalized Numerical Values ($e = e_{min} - 1$, $f \neq 0$): When the exponent e equals the value $e_{min} - 1$ and the fraction field is non-zero the floating point number is called denormalized, and the implicit integer bit b0 is equal to zero. The numerical value of a denormalized number y is given by: $y = (-1)^s \cdot 0.f \cdot 2^{e_{min} - bias}$
PAGE 757
31 30 S 23 22 0 0 0 Single Precision 63 62 52 51 0 0 S 0 Double Precision Figure D-2. Encodings for + and - Zero 31 30 S 23 22 0 0 11..................1 Single Precision 63 62 S 52 51 0 0 11...............................1 Double Precision Figure D-3. Encodings for + and - Infinity 31 30 23 22 X 11..................1 0 1111....................................1 Single Precision 63 62 X 11...............................1 52 51 0 11111111...........................................
PAGE 758
D.1.3 IEEE Floating Point Exceptions The IEEE standard defines five types of exceptions which must be signaled when detected. The DSP96002 implements the default "trap disabled" way of signaling exceptions: when an exception occurs, a flag is set and program execution continues. The flag remains set until cleared by the user. The different exceptions are: 1. Invalid operation: The invalid operation exception is signaled when an operand is invalid for the specific operation to occur.
PAGE 759
livered result is the correct SP denormalized number. 5. Inexact: The inexact exception is signaled if the delivered result differs from what would have been obtained with infinite-precision arithmetic. For instance, the examples of underflow shown above deliver numerically inexact results, and thus set the inexact flag. Another example is the case where floating point numbers are rounded up or down. D.1.
PAGE 760
95 94 93 92 S U V 75 74 O 64 E 63 62 32 31 Fraction (MSBs) i Dn.h 11 10 Fraction (LSBs) Dn.m 0 0 Dn.l S : sign U : single precision unnormalized tag V : double precision unnormalized i : explicit integer Figure D-5. DP Format in the Data ALU Emin Tiny SP Numbers between ±2 -1.0 × 2 -126 +1.0 × 2 -126 0 Figure D-6. Tiny Numbers register file consisting of 10 96-bit registers for storage of floating-point numbers is available for that purpose.
PAGE 761
X-Data Bus Y-Data Bus Automatic Format Conversion Unit Control and d0.h d0.m d0.l d0 d1 Arbitration Unit d2 d3 d4 d5 Register File d6 d7 d8 d9 Operands Results Add/Subtract Unit Multiply Unit Special Function Unit Figure D-8. The Data ALU 1. Round to nearest (even): a convergent rounding mode, designed to deliver results without a rounding bias. In this case the infinite-precision result is rounded to the finite-precision result which is closest.
PAGE 762
4. Round to minus infinity: results are always rounded in the direction of minus infinity, or "down". D.1.5.1 Register file and automatic format conversion unit The general-purpose register file consists of ten 96-bit registers named d0..d9, as shown in Figure D-9. Each 96-bit register accommodates the DP internal floating point storage format. Each 96-bit register is obInfinite-precision result 1.000 11100000.... 1.000 01100000.... 1.000 10000000....(absolute tie) 1.001 10000000....
PAGE 763
31 30 29 E S 95 94 S 74 73 72 71 † † † † 64 E 23 22 63 62 i* Fraction 40 39 Fraction 31 30 29 S 0 X or Y Data Memory 32 31 (2) (1) 23 22 E Notes: * – i = 1 when normalized i = 0 when unnormalized † – When NaN bits 71, 72, 73 = 1 When not NaN Bit 74 ↔ Bit 30 Bits 73, 72, 71 are complement of Bit 74. 11 10 (3) 0 Dn 0 Fraction X or Y Data Memory (2) – Bits 11-31 are only nonzero when the register contains a DP floating point number.
PAGE 764
63 62 S 95 94 52 51 E 75 74 S 0 Fraction 64 E 21 20 L Data Memory 63 62 32 31 Fraction i 11 10 * 0 0 Dn i = 1 when normalized i = 0 when unnormalized 63 62 S 52 51 E 21 22 0 Fraction L Data Memory * – Bits 11-31 (in Dn) or 0-20 (in L memory) are zero when the register contains an SEP result. Figure D-10b.
PAGE 765
ure D-10b. Note that the 52-bit fraction may actually consist of zeros (21 or 29) if the number in question was the result of a SEP arithmetic or a SP move. SEP arithmetic result precision can only be retained in memory by using DP moves. D.1.5.1.1 FLOATING-POINT MOVES TO/FROM DATA ALU REGISTERS The following sections deal with the case where a write (move in) is followed by a read (move out) without any floating-point operation being actually performed on the Data ALU register (save-restore procedure).
PAGE 766
One should notice that both single and double precision floating-point moves out of the register will produce correct results in this case. SP move into the register 0 01111111 0000 ..... 00 \ / inv / 0 0 0 Zero 01111111111 1 0000 ............................. 00 SP read of the register 0 0 0 Zero 01111111111 1 0000 ............................. 00 \ / 0 01111111 0000 ..... 00 Data read correctly (read as 1.0) DP read of the register 0 0 0 Zero 01111111111 1 0000 .............................
PAGE 767
lowing the above operation, the Data ALU register will be read first by a single precision and then by a double precision floating-point move.
PAGE 768
the register will yield the wrong data in this case. SP move into the register 0 00000000 0100 ..... 00 \ / inv / 0 1 0 Zero \ 01110000000 0 0100 ............................. 00 SP read of the register 0 1 0 Zero 01110000000 0 0100 ............................. 00 \ / / \ 0 00000000 0100 ..... 00 Data read correctly (read as 2**(-128)) DP read of the register 0 1 0 Zero 01110000000 0 0100 ............................. 00 \ / / \ 0 01110000000 0100 ...... 00 D.1.5.1.1.
PAGE 769
Following the above operation, the Data ALU register will be read first by a single precision and then by a double precision floating-point move. The denormalized double precision data is stored in the Data ALU register with the V tag set and the exponent set to $000 (always). The V-TAG set indicates that floating-point multiply operations will require extra cycles to wrap it ("normalize") before using it as operand.
PAGE 770
DP move into the register 0 00000000000 0100 ..... 00 \ / / 0 0 1 Zero \ 00000000000 0 0100 ............................. 00 NOTE THAT THE V TAG IS SET IN THIS CASE SP read of the register 0 0 1 Zero 00000000000 0 0100 ............................. 00 \ / / \ 0 00000000 0100 ..... 00 Data read incorrectly (read as 2**(-128)) DP read of the register 0 0 1 Zero 00000000000 0 0100 ............................. 00 \ / / \ 0 00000000000 0100 .......... 00 D.1.5.1.1.
PAGE 771
. MOVE EXPONENT RANGE IN (UNBIASED) TYPE SP SP SP SP SP DP DP DP DP DP DP DP DP INPUT DATA TAGS U V MOVE OUT TYPE 0 SP CORRECT DP CORRECT SP CORRECT DP CORRECT SP CORRECT DP CORRECT SP CORRECT DP CORRECT SP CORRECT DP WRONG SP CORRECT DP CORRECT SP CORRECT DP CORRECT SP CORRECT DP CORRECT SP WRONG DP CORRECT SP TRUNC DP CORRECT SP WRONG DP CORRECT SP WRONG DP CORRECT SP WRONG DP CORRECT e= 128 Fraction= .0xx...
PAGE 772
Note 2 The xx...xx pattern for the non-signaling NaNs indicates any bit pattern. Note 3 If a register is written with a SNAN using a double precision floating-point move and then the same register is read using single precision floating-point move the result will be a single precision SNAN (if the first 23 bits of the fraction are a non-zero pattern) or single precision infinity (if the first 23 bits of the fraction are a zero pattern).
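The behavior in Note 3 can be illustrated with a small Python sketch. It assumes that the single precision move simply keeps the sign, forces the exponent to all ones, and keeps the first (most significant) 23 of the 52 fraction bits; the function names are hypothetical, and the classification treats a set most significant fraction bit as "quiet", which is an assumption about the NaN encoding rather than something stated in this passage.

```python
def dp_nan_read_as_sp(dp_word):
    """SP image obtained when a DP NaN/infinity encoding is read as SP."""
    s = dp_word >> 63
    frac23 = (dp_word >> 29) & 0x7FFFFF      # first 23 of the 52 fraction bits
    return (s << 31) | (0xFF << 23) | frac23

def classify_sp(sp_word):
    """Classify an SP word (quiet bit = fraction MSB is an assumption)."""
    e = (sp_word >> 23) & 0xFF
    f = sp_word & 0x7FFFFF
    if e != 0xFF:
        return "finite"
    if f == 0:
        return "infinity"
    return "QNaN" if f >> 22 else "SNaN"

# DP SNaN whose only non-zero fraction bit lies below the first 23 bits
print(classify_sp(dp_nan_read_as_sp(0x7FF0000000000001)))
# DP SNaN with a non-zero pattern inside the first 23 fraction bits
print(classify_sp(dp_nan_read_as_sp(0x7FF4000000000000)))
```

The first case shows why the note matters: the information that made the DP value a NaN is lost in the truncation, so the SP read yields infinity.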
PAGE 773
D.1.5.1.2.2 Results Rounded To SP That Are Normalized If the Data ALU operation result was rounded to SP and the rounded result may be represented as a normalized single precision floating-point number, the result will be stored in normalized DP format that may be read out by single and double precision moves without errors or truncation. D.1.5.1.2.
PAGE 774
D.1.5.1.2.7 Data ALU Results/Move Compatibility Summary Figure C-3 summarizes what happens when Data ALU operation results of a certain range are stored in the destination register, and the register is read by a certain kind of move. All cases where "move out type"=SP and "move out result"=WRONG can be corrected by rounding in the instruction (using the .S option). The case where "move out type"=SP and "move out result"=TRUNC can also be corrected by using the .S option.
PAGE 775
ROUND TO SEP SEP SEP EXPONENT RANGE BEFORE ROUND (UNBIASED) DATA ALU OPERATION RESULT -150< e < -126 denormalized in SP normalized in SEP -1023< e < -149 zero in SP normalized in SEP -1054< e< -1022 zero in SP denormalized in SEP SEP e< -1053 zero in SP zero in SEP (underflow) TAGS U V MOVE OUT TYPE 0 0 SP WRONG DP CORRECT SP WRONG DP CORRECT SP WRONG DP CORRECT SP CORRECT DP CORRECT 0 0 0 0 1 0 MOVE OUT RESULT Figure C- 4.
PAGE 776
ES1 ES2 Exponent Adder MS1 MS2 Multiplier Array ED Control MD Figure D-11. The Multiply Unit in the barrel shifter and normalization unit, after which they are added in the add unit. The result is then rounded to 32-bits for SEP results, and to 24 bits for SP results, as indicated by the instruction opcode. The type of rounding implemented depends on the rounding mode bits in the MR register. The rounded result is stored in the middle portion (mantissa) of the destination register.
PAGE 777
M1 M2 32 Bits 32 Bits Multiplier Array 64 Bits Round Rounding mode is determined by rounding bits in the MR. 32 Bits MD D Figure D-12. The Multiply Unit this exponent for normalization of the result, after which the exponent (biased) is stored in the high portion of the destination register. This is depicted in Figure D-16. For example, if the mantissa of the first operand in a floating point addition is 1.010...0, with biased exponent of 10, and the mantissa of the second operand is 1.000...
PAGE 778
[Figure D-13. The Exponent Adder: two 11-bit inputs are added and the bias is subtracted, giving the 11-bit result ED.]
provide an initial approximation to 1/x and sqrt(1/x), as described in Appendix A. D.1.5.5 Controller and Arbitrator Unit The controller and arbitrator (CA) unit supplies control signals to the processing units of the data ALU and register file, and is responsible for the full implementation of the IEEE standard.
PAGE 779
ES1 ES2 MS1 Exponent Comparator/ Update Unit MS2 Barrel Shifter/ Normalization Unit Adder Subtracter Round MD1 ED1 ED2 MD2 Figure D-14. The Adder/Subtracter From Pre-normalization 32 Bits 32 Bits Adder MR 32 Bits 32 Bits Subtracter Rounding Rounding 32 Bits MR 32 Bits To Post-normalization Figure D-15.
PAGE 780
[Figure D-16. Exponent Comparator/Update Unit: inputs ES1 and ES2 (11 bits each); outputs max(E1, E2) (11 bits, to post-normalization) and E1-E2 (11 bits, to mantissa alignment).]
D.2 FIXED-POINT NUMBER STORAGE AND ARITHMETIC D.2.1 General Integer operand sizes are defined as follows: 1. Byte: 8 bits long 2. Short word: 16 bits long 3. Word: 32 bits long 4.
PAGE 781
4. Unsigned Long Word Integer: 64 bits wide with unsigned magnitude representation. This storage format can only be used in long (L) data memory space. Long type integers can be moved to and from the data ALU register file. However, long integers cannot be used directly as input operands to data ALU operations, although they can be results of data ALU operations. D.2.3 Integer Storage Format in the Data ALU There are thirty 32-bit registers which can contain integer words.
PAGE 782
Order this document by DSP96002UM/AD Motorola reserves the right to make changes without further notice to any products herein to improve reliability, function or design. Motorola does not assume any liability arising out of the application or use of any product or circuit described herein; neither does it convey any license under its patent rights nor the rights of others.
PAGE 783
MOTOROLA SEMICONDUCTOR TECHNICAL DATA Order this document by DSP96002UMAD/AD DSP96002 Addendum to DSP96002 Digital Signal Processor User Manual THE DSP96002 INSTRUCTION CACHE and 32-BIT TIMER/EVENT COUNTER FOREWORD This document is an addendum to the DSP96002 IEEE Floating-Point Dual-Port Processor User’s Manual (DSP96002UM/AD).
PAGE 784
Integer Mode The integer performance on the DSP96002 has been doubled with the introduction of the Integer Mode (IM).
PAGE 785
Block diagram of the DSP96002. Blocks named in the diagram: address generation unit (AGU); external address switches and bus control for Port A and Port B; internal switch and bit manipulation unit; X data memory (512x32 RAM); Y data memory (512x32 RAM); program memory (1024x32 RAM and 64x32 bootstrap ROM); two 512x32 ROMs; instruction cache; two 32-bit host interfaces; timers; dual channel DMA controller; external data bus switches.
PAGE 786
The DSP96002 instruction cache is a “real-time” cache and therefore it has no inherent penalty on a cache miss. In other words, if there is a cache hit, it takes exactly one bus cycle to fetch the instruction from the on-chip cache - like fetching any other data from an on-chip memory. If there is a cache miss, it behaves exactly as a “normal” instruction fetch, as if it were fetching any other data from that external memory.
PAGE 787
Figure 2 - Cache Controller Block Diagram (the 25 bit tag field of the 32 bit address is compared with the stored tag values to produce hit/miss; the remaining 7 bits select one of the 128 valid bits of the sector)
Since there are 8 sectors of 128 words each in the internal program RAM, the 32 bit address is divided into the following two fields:
• 7 LSBs for the word displacement or offset in the sector
• 25 MSBs for the tag
The sector placement algorithm is fully associative so that each external program memory sector could be placed in any of the 8 internal pr
PAGE 788
2.3 CACHE OPERATION During cache operation each instruction is fetched on demand, only when it is needed. When the core generates an address for an instruction fetch, the cache controller compares the tag field portion of the physical address to the tag values currently stored in the tag register file. The tag values are the memory sector’s 25 upper bits currently mapped into the cache. When a tag match occurs (i.e. sector hit), then the valid-bit of the corresponding word in that cache sector is checked.
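The lookup described above can be sketched in a few lines of Python. This is only an illustrative model, not the hardware implementation: the function names, the use of Python sets for the valid-bits, and the result labels "hit", "word miss" and "sector miss" are assumptions made for this example. The 7-bit offset and 25-bit tag come from the address split given in the text.

```python
OFFSET_BITS = 7                      # 128 words per sector
OFFSET_MASK = (1 << OFFSET_BITS) - 1


def split_address(addr):
    """Return (tag, offset) for a 32-bit program address."""
    addr &= 0xFFFFFFFF
    return addr >> OFFSET_BITS, addr & OFFSET_MASK


def lookup(tags, valid, addr):
    """Classify an instruction fetch (labels are hypothetical).

    tags  -- list of 8 stored tag values (None for an unused sector)
    valid -- list of 8 sets holding the word offsets whose valid-bit is set
    """
    tag, offset = split_address(addr)
    for sector, stored in enumerate(tags):
        if stored == tag:                 # tag match: sector hit
            if offset in valid[sector]:   # valid-bit of that word is set
                return "hit"
            return "word miss"
    return "sector miss"
```

For example, address $00000085 splits into tag 1 and offset 5, so it hits only if some sector holds tag 1 and the valid-bit for word 5 of that sector is set.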
PAGE 789
Cache Enable (CE) bit. When the CE bit is cleared (0) the DSP96002 is in PRAM mode. When the CE bit is set, the processor is in cache mode. The CE bit is cleared during reset.
OMR bit assignment (Cache Enable): bits 31-6 reserved, bit 5 SPM, bit 4 CE, bit 3 DE, bit 2 MC, bit 1 MB, bit 0 MA.
2.5 NEW INSTRUCTIONS
The DSP96002 instruction set features six new instructions, discussed in the following paragraphs, to support the instruction cache operation. APPENDIX A, starting on page 54, presents a full description of each of the new instructions. 2.5.
PAGE 790
will load the least recently used cache sector tag with the 25 most significant bits of the sum and then lock that cache sector. The instruction will update the LRU stack accordingly. The displacement is a 2’s complement 32-bit integer that represents the relative distance from the current PC to the address to be locked. Short Displacement, Long Displacement and Address Register PC Relative addressing modes may be used.
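As a rough illustration of the PC-relative arithmetic, the sketch below computes the target address and its 25-bit tag. The helper names are hypothetical, the 15-bit width of the short displacement is taken from the instruction field listing in Appendix A, and the exact PC value the hardware uses as the base (this instruction or the next) is not modelled here.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a 2's complement integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def relative_target(pc, displacement):
    """PC + displacement, wrapped to a 32-bit address."""
    return (pc + displacement) & 0xFFFFFFFF


def sector_tag(addr):
    """25 most significant bits of a 32-bit address."""
    return (addr & 0xFFFFFFFF) >> 7
```

With these helpers, a short displacement field of all ones decodes to -1, and a target of $00000080 belongs to the memory sector with tag 1.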
PAGE 791
2.6 CACHE OPERATING MODES There are two main operating modes for the DSP96002: cache mode and PRAM mode. They are both global, as they affect the internal program memory as a whole. When the processor is in cache mode, each separate sector could be in one of two operating modes: sector unlocked mode or sector locked mode. When the processor is in PRAM mode the PRAM itself could be in one of two modes: PRAM enabled or PRAM disabled.
PAGE 792
cache sector is unlocked. As a result of this sequence, the unlocked cache sector is placed at the top of the LRU stack, as it is the most recently used. Unlocking a locked cache sector using the PUNLOCK or PUNLOCKR instructions does not affect the sector’s contents, its tag, or its valid-bits.
PAGE 793
Locking a sector does not affect the contents of the cache sector (instructions already fetched into the cache sector storage area), the valid-bits or the tag register contents of that particular sector. 2.6.2 PRAM Mode In the PRAM mode the DSP96002 is fully compatible with the original DSP96002. The internal program RAM is either enabled or disabled, according to the OMR. DMA references to/from program memory, and the MOVEM instruction are fully enabled.
PAGE 794
The PFLUSH instruction is not performed automatically when switching from cache mode to PRAM mode to give the user full control of the cache. 2.7 SECTOR REPLACEMENT POLICY When a sector miss occurs, a cache sector must be selected to contain the new desired memory sector. The selected cache sector typically contains another memory sector. The sector replacement policy determines which sector would be flushed from the cache, and thus frees the cache sector for the new memory sector.
PAGE 795
since these will usually be locked, all further accesses to these locations would not cause a miss and therefore the external Program Memory would not be read. In this case the inconsistency would have no effect. On the other hand, a user who switches from PRAM mode to cache mode and doesn’t want the content to be kept should issue the PFLUSH instruction and thereby prevent this situation altogether.
PAGE 796
to does), the content of that word is changed in the internal Program Memory. This should be transparent to the user since, although the word content has been changed, its valid-bit remains cleared as it was, and therefore the content is meaningless. Nevertheless, if the user switches to PRAM mode without flushing the cache, the new word content could be meaningful. 2.
PAGE 797
be executed to set or clear OMR bit 4 without affecting other OMR bits, which could be changed safely three cycles later. 2.12.2 Change of OMR Bit 4 Relative to PLOCK/PUNLOCK The instruction that sets OMR bit 4 should appear at least three instruction cycles prior to a PLOCK or PUNLOCK instruction, otherwise an illegal instruction trap will be executed. 2.12.3 Fetches Following a PFLUSH Instruction When the processor is in cache mode, the first two words following a PFLUSH instruction are not cached.
PAGE 798
2.13 CACHE USE SCENARIO This section demonstrates a possible scenario of cache use in a real time system. 1. The DSP96002 leaves the hardware reset in PRAM mode as determined by the mode bits in the OMR. 2. To achieve “hit on first access” (especially important for the fast interrupt vectors), the user, while still in PRAM mode and using DMA, transfers the interrupt vectors and some critical routines into the lower PRAM addresses. The DMA transfers set the corresponding valid-bits.
PAGE 799
Notice that the code doesn’t fall within the critical sectors, but rather in the initialization code. PLOCK is the first instruction fetched in cache mode.
4. Now the cache is ready for normal operation with 2 sectors locked and 6 sectors in unlocked mode. Notice that a fetch from one of the locked sectors (addresses 0 to 200) will not cause a miss since the code for these sectors was brought into the cache while the processor was in PRAM mode.
5. The user can lock an additional sector dynamically.
PAGE 800
    ANDI #$ef, OMR    ; clear CE bit in OMR
    NOP               ; pipeline delay
    NOP               ; pipeline delay
    NOP               ; pipeline delay
    PFLUSH
    MOVEI #$04, OMR   ; bootstrap from Port A
    NOP               ; pipeline delay
    JMP #0            ; jump to bootstrap ROM
Notice that PFLUSH was fetched and executed in PRAM mode. It could have appeared one cycle earlier, in which case it would have been fetched in cache mode but executed in PRAM mode.
3 INTEGER MODE
The DSP96002’s integer performance has been doubled with the definition of the new integer mode.
PAGE 801
3.1 CHANGE TO THE PROGRAMMING MODEL (INTEGER MODE) To support the integer mode, bit 25 of the status register now features a new integer mode (IM) bit as shown in Figure 3. When the IM bit is cleared (0) the integer mode is disabled. When the IM bit is set, the processor is in integer mode. The IM bit is cleared during reset.
PAGE 802
operations that yield single-precision results, then the two register files are completely decoupled - thus effectively doubling the amount of registers available for the data ALU. 4.1 CHANGE TO THE PROGRAMMING MODEL (SINGLE PRECISION MODE) To support the single precision mode, bit 5 of the OMR supports a new single precision mode (SPM) bit. When OMR bit 5 is clear, the single precision mode is disabled. When OMR bit 5 is set, the processor is in the single precision mode.
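The OMR control bits introduced by this addendum can be expressed as simple masks. The sketch below is a host-side illustration only: the constant and function names are made up for this example, while the bit positions (CE at bit 4, SPM at bit 5) are those stated in the text.

```python
OMR_CE = 1 << 4    # Cache Enable, OMR bit 4
OMR_SPM = 1 << 5   # Single Precision Mode, OMR bit 5


def set_bits(omr, mask):
    """Return omr with the bits in mask set (32-bit register)."""
    return (omr | mask) & 0xFFFFFFFF


def clear_bits(omr, mask):
    """Return omr with the bits in mask cleared (32-bit register)."""
    return omr & ~mask & 0xFFFFFFFF
```

The CE mask is $10, matching the "ORI #$10, OMR" used elsewhere in this addendum to set CE, and its complement in the low byte is $ef, matching the "ANDI #$ef, OMR" used to clear it.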
PAGE 803
5 OnCE ENHANCEMENTS The OnCE has been enhanced to provide the user with fully non-intrusive system debug capability when the processor is in cache mode. When the processor is in debug mode, the OnCE offers the ability to observe the cache status, such as which memory sectors are currently mapped into cache sectors, which cache sectors are locked, and which cache sector is the least recently used by reading the tag registers contents, lock bits, and LRU bits serially.
PAGE 804
Table 1 Register Select Bits 4-0 (RS4-RS0)
RS4-RS0   Register Selected
00111     Breakpoint Program Memory Lower-Equal (OPLLR)
01000     Transfer Register (OGDBR)
01001     Program Data Bus Latch (OPDBR)
01010     Program Address Bus Latch for Fetch (OPABF)
01011     Program Instruction Latch (OPILR)
01100     Clear Program Breakpoint Counter
01101     Clear Data Breakpoint Counter
01110     Clear Trace Counter
01111     Reserved
10000     Reserved
10001     Program Address Bus FIFO and Increment Counter
10010     Tags Buffer
1001
PAGE 805
tors could be “least recently used” although they can not be replaced. Therefore, the “next to be replaced sector” is the only sector whose lru bit is set and lock bit cleared. The exception to this rule is the case where all of the eight sectors are locked and designated as “least recently used”, in which case there is no “next to be replaced sector” because no sector will be replaced until at least one sector is unlocked.
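The rule for finding the "next to be replaced sector" from the values read through the OnCE can be sketched as follows. This is an illustration under assumptions: the function name is hypothetical, and the packing of the eight lru bits and eight lock bits into two 8-bit integers with bit i belonging to sector i is chosen for this example, not taken from the manual.

```python
def next_to_be_replaced(lru_bits, lock_bits):
    """Return the index of the sector whose lru bit is set and whose lock
    bit is cleared, or None when no such sector exists (for example when
    all eight sectors are locked)."""
    candidates = lru_bits & ~lock_bits & 0xFF
    if candidates == 0:
        return None
    if candidates & (candidates - 1):
        # The text says there is only one such sector; more than one
        # means the inputs do not describe a consistent cache state.
        raise ValueError("more than one unlocked sector marked lru")
    return candidates.bit_length() - 1
```

For instance, with sectors 0 and 2 marked lru and sector 0 locked, the function returns 2; with all sectors locked it returns None.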
PAGE 806
5.4 USING THE OnCE FOR CACHE OBSERVABILITY
5.4.1 Displaying the tags, locks and LRU status
1. ACK
2. Save pipeline information:
   1. Send command READ PDB REGISTER
   2. ACK
   3. CLK
   4. Send command READ PIL REGISTER (instruction latch).
   5. ACK
   6. CLK
3. Read the 9 registers from the tags buffer:
   1. Send command READ TAGS BUFFER (read tag 0 and increment pointer).
   2. ACK
   3. CLK
   4. Send command READ TAGS BUFFER (read tag 1 and increment pointer).
   5. ACK
   6. CLK
   7.
PAGE 807
5.4.2 Displaying the Valid-bits of Specific Cache Locations Starting From Address xxx
This routine uses R0 as a pointer to cache addresses. Therefore this register has to be read before the routine, and has to be loaded with the value xxx. At the end of the routine, the value of R0 must be restored. See Section 10.12.3 in the DSP96002 User’s Manual (DSP96002UM/AD) for an example.
1. Send command WRITE PDB REGISTER and GO (no EX). (ODEC selects PDB as destination for serial data.)
2. ACK
3.
PAGE 808
1. Send command WRITE PDB REGISTER and GO (no EX). (ODEC selects PDB as destination for serial data.)
2. ACK
3. Send the 32-bit opcode: “ORI #$10, OMR” (After the 32 bits have been received, the PDB register drives the PDB. ODEC releases the chip from “halt” state and the ORI instruction is executed. This instruction sets the “CE” bit in the OMR register. The signal that marks the end of the instruction returns the chip to the “halt” state and an acknowledge is issued to the command controller.)
4.
PAGE 809
15. Send command WRITE PDB REGISTER (no GO, no EX). (ODEC selects PDB as destination for serial data.)
16. ACK
17. Send 32 bits of the target absolute address for the second processor ($xxxxxxxx).
18. ACK
The sequence of instructions described above will be repeated for the remaining processors in the system. Finally the command controller will select ALL the processors in the system and will issue the synchronous GO command in a broadcast manner.
19. Send command GO and EX with no register select.
PAGE 810
6 INTRODUCTION TO THE TIMER/EVENT COUNTER This section describes the two identical and independent timer/event counter modules now featured on the DSP96002. The timer can use internal or external clocking and can interrupt the processor after a number of events specified by a user program, or it can signal an external device after counting internal events. The timer can also be used to trigger DMA transfers after a specified number of events (clocks) occurs.
PAGE 811
ADDRESS BUS A aA0-aA31 Vcc Vss 32 32 (32) (2) (4) DATA BUS A aD0-aD31 Vcc Vss ADDRESS BUS B (32) bA0-bA31 (2) Vcc (4) Vss 32 32 (32) (2) (4) DATA BUS B (32) bD0-bD31 (2) Vcc (4) Vss PORT A BUS CONTROL PORT B BUS CONTROL aS1 aS0 bS1 bS0 aR/W aWR bR/W bWR DSP96002 223 PINS aBS bBS aBL bBL aTT bTT aTS bTS aTA bTA aAE bAE aDE bDE aHS aHA bHS bHA aHR bHR aBR bBR aBG bBG aBB bBB aBA bBA Vcc (1) (1) Vcc Vss (2) (2) Vss TIMER/EVENT COUNTER (2) INTERRUPT AND MODE C
PAGE 812
1 6 7 10 11 12 13 A BA23 BA27 BA29 BA31 IRQA ABB ABR TIO0 AR/W AS0 ATS AAE AA02 AA04 AA07 AA10 AA13 AA16 A B BA20 BA25 BA28 BA30 IRQB ABG ABA BTT AS1 ABS AA00 AA03 AA06 AA09 AA11 AA14 AA18 AA20 B C BA17 BA21 BA26 GNDN IRQC RES ABL ATT AWR AA01 AA05 AA08 AA12 AA15 AA17 AA19 AA21 AA23 C D BA15 BA18 BA24 GNDN GNDN GNDN VCCN VCCN VCCQ GNDQ VCCN GNDN GNDN GNDN AA22 AA25 AA26 D E BA13 BA16 BA22 GNDN GNDN AA24 AA28 AA29 E F BA12 BA14 BA19 GNDN GNDN AA27 AA30 AD31 F G BA
PAGE 813
The DSP96002 views each timer as a memory-mapped peripheral occupying two 32-bit words in the X data memory space, and may use each timer as a normal memory-mapped peripheral by using standard polled or interrupt programming techniques. The programming model is shown in Figure 5.
6.2 TIMER CONTROL/STATUS REGISTER (TCSR)
The 32-bit read/write TCSR controls the timer and reports its status. The TCSR can be accessed by normal move instructions and by bit manipulation instructions.
PAGE 814
READ/WRITE TIMER CONTROL/STATUS REGISTER (TCSR0) ADDRESS X:$FFFFFFE0
Bit    Name
31     TE (0)
30     TIE (0)
29     INV (0)
28     TC2 (0)
27     TC1 (0)
26     TC0 (0)
25     GPIO (0)
24     TS (0)
23     DIR (0)
22     DI (1)
21     DO (0)
20-0   **
** - reserved, read as zero, should be written with zero for future compatibility
The numbers in parentheses represent the bits’ reset va
PAGE 815
the status of the INV bit is crucial to the timer’s function. Change it only when the timer is disabled (TE=0). 6.2.4 Timer Control (TC2-TC0) Bits 28-26 The three TC bits control the source of the timer clock, the behavior of the TIO pin, and the timer mode of operation. Table 2 summarizes the functionality of the TC bits. A detailed description of the timer operating modes is given in Section 6.4 on page 35. The timer control bits are cleared by hardware RESET and software RESET (RESET instruction).
PAGE 816
6.2.6 Timer Status (TS) Bit 24 When the TS bit is set, it indicates that the counter has been decremented to zero. The TS bit is cleared when the TCSR is read. The bit is also cleared when the timer interrupt is serviced (timer interrupt acknowledge). TS is cleared by hardware and software resets. 6.2.7 Direction (DIR) Bit 23 The DIR bit determines the behavior of the TIO pin when TIO acts as general purpose IO. When DIR=0, the TIO pin acts as an input. When DIR=1, the TIO pin acts as an output.
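The TCSR fields can be packed and unpacked with ordinary masks. The sketch below is a host-side illustration, not device code: the function and parameter names are invented for this example, and the bit positions are read from the TCSR register figure and the bit descriptions in this section (TE 31, TIE 30, INV 29, TC2-TC0 28-26, GPIO 25, TS 24, DIR 23, DI 22, DO 21).

```python
TE, TIE, INV = 1 << 31, 1 << 30, 1 << 29
TC_SHIFT = 26
TC_MASK = 0x7 << TC_SHIFT
GPIO, TS, DIR, DI, DO = 1 << 25, 1 << 24, 1 << 23, 1 << 22, 1 << 21


def tcsr_value(mode=0, te=False, tie=False, inv=False,
               gpio=False, dir_out=False, do=False):
    """Build a 32-bit TCSR value; reserved bits are written as zero."""
    if not 0 <= mode <= 7:
        raise ValueError("timer mode must fit in TC2-TC0")
    value = mode << TC_SHIFT
    for flag, bit in ((te, TE), (tie, TIE), (inv, INV),
                      (gpio, GPIO), (dir_out, DIR), (do, DO)):
        if flag:
            value |= bit
    return value


def timer_mode(tcsr):
    """Extract the TC2-TC0 field."""
    return (tcsr & TC_MASK) >> TC_SHIFT
```

As a cross-check against the GPIO output routine in Section 6.7.2, setting GPIO and DIR gives $02800000, and additionally setting DO gives $02a00000.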
PAGE 817
In Timer Modes 4 and 5, however, the TCR will be loaded with the current value of the counter on the appropriate edge of the TIO input signal (rather than with a value specified by the user program). The value loaded to the TCR represents the width or the period of the signal coming in on the TIO pin, depending on the timer mode. See Sections 6.4.4 and 6.4.5 for detailed descriptions of Timer Modes 4 and 5. 6.4 TIMER MODES OF OPERATION This section gives the details of each of the timer modes of operation.
PAGE 818
Figure 10 - Timer Disabled (timing diagram: write to TCR (N), first event, stop counting; signals TE, Clock (CLK/2), TCR, Counter, TS, Interrupt)
Note: It is recommended that the GPIO input function of Mode 0 only be activated with the timer disabled. If the processor attempts to read the DI bit, it must read the entire TCSR register, which would clear the TS bit and, thus, clear a pending timer interrupt. 6.4.
PAGE 819
Figure 11 - Standard Timer Mode, Internal Clock, Output Pulse Enabled (INV=0) (timing diagram: write to TCR (N), first event, last event, new event; signals TE, Clock (CLK/2), TCR, Counter, Interrupt, TIO; TIO pulse width 2xCLK)
Figure 12 - Standard Timer Mode, Internal Clock, Output Pulse Enabled (INV=1) (same signals as Figure 11)
two (CLK/2).
PAGE 820
TS bit in TCSR is set and, if the TIE is set, an interrupt is generated. The counter is reloaded with the value contained in the TCR and the entire process is repeated until the timer is disabled (TE=0). Each time the counter reaches 0, the TIO output pin is toggled. The INV bit determines the polarity of the TIO output. Figure 13 illustrates Timer Mode 2.
PAGE 821
Figure 14 - Pulse Width Measurement Mode (INV=0) (timing diagram: start event, stop event, start event; signals TE, Clock, TCR, Counter, Interrupt, TIO)
Figure 15 - Pulse Width Measurement Mode (INV=1) (same signals as Figure 14)
PAGE 822
6.4.5 Timer Mode 5 (Period Measurement Mode)
Timer Mode 5 is defined by TC2-TC0 equal to 101. In Timer Mode 5, the counter is driven by a clock derived from the DSP’s internal clock divided by 2 (CLK/2). With the timer enabled (TE=1), the counter is loaded with the value contained in the TCR and starts incrementing. On each transition of the same polarity that occurs on TIO, the TS bit in TCSR is set and, if TIE is set, an interrupt is generated. The contents of the counter are loaded into the TCR.
PAGE 823
Figure 16 - Period Measurement Mode (INV=0) (timing diagram: periodic events; signals TE, Clock, TCR, Counter, Interrupt, TIO)
Figure 17 - Period Measurement Mode (INV=1) (same signals as Figure 16)
PAGE 824
Figure 18 - Event Counter Mode, External Clock (INV=0) (timing diagram: write to TCR (N), first event, last event; signals TE, TIO (Event), TCR, Counter, Interrupt)
Figure 19 - Event Counter Mode, External Clock (INV=1) (same signals as Figure 18)
PAGE 825
6.5 TIMER BEHAVIOR DURING WAIT and STOP During the execution of the WAIT instruction, the timer clocks are active and the timer activity continues undisturbed. If the timer interrupt is enabled when the final event occurs, an interrupt will be generated and serviced. It is recommended that the timer be disabled before executing the STOP instruction because, during the execution of the STOP instruction, the timer clocks are disabled and the timer activity will be stopped.
PAGE 826
6.7.2 General purpose IO output
The following routine can be used to write the TIO1 output pin:
    movep #$02800000,x:TCSR1 ;clear TC2-TC0, set GPIO
                             ;and set DIR for GPIO output, set TIO1 to 0
    movep #$02a00000,x:TCSR1 ; set TIO1 to 1
    movep #$02800000,x:TCSR1 ; set TIO1 to 0
This routine generates a pulse on the TIO1 pin with a duration equal to 8 CLK (assuming no wait states, no external bus conflict, etc.). 6.7.
PAGE 827
6.7.4 Pulse width measurement mode (mode 4) The following program illustrates the use of the timer module for input pulse width measurement. The width is measured in this example for the low active period of the input pulse on the TIO1 pin and is stored in a table (in multiples of the chip operating clock divided by 2).
PAGE 828
6.7.5 Period measurement mode (mode 5) The following program illustrates the usage of the timer module for input period measurement. The period is measured in this example between 0 to 1 transitions of the input signal on TIO0 and is stored in a table (in multiples of the chip operating clock divided by 2).
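Because the measurements in modes 4 and 5 are stored as counts of the CLK/2 timer clock, converting a table entry to time on a host is a one-line calculation. The sketch below is illustrative only: the function names are hypothetical and the 40 MHz clock in the example is an arbitrary value, not a device specification.

```python
def ticks_to_seconds(ticks, clk_hz):
    """Convert a count of CLK/2 timer ticks to seconds.

    Each tick of the timer clock lasts two periods of the chip
    operating clock, so the duration is ticks * 2 / clk_hz.
    """
    return ticks * 2.0 / clk_hz


def ticks_to_hertz(ticks, clk_hz):
    """Frequency of a signal whose period was measured as `ticks`."""
    return clk_hz / (2.0 * ticks)
```

With an assumed 40 MHz operating clock, a measured period of 1000 ticks corresponds to 50 microseconds, that is, a 20 kHz input signal.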
PAGE 829
7 ADDITIONAL CHANGES This section presents various other changes to the DSP96002 to support the addition of the Timer/Event Counter modules. Specifically, two new DMA mask bits (M7 and M8) were added to the DMA Control/Status Register. Figure 20 and Figure 20 indicate the changed DMA Controller Programming Models. Table 3 indicates the DMA Request Mask Bits functions.
PAGE 830
DMA Source Modifier Register        DSM1   addr X:$FFFFFFD7
DMA Source Address Register         DSR1   addr X:$FFFFFFD6
DMA Source Offset Register          DSN1   addr X:$FFFFFFD5
DMA Destination Modifier Register   DDM1   addr X:$FFFFFFD3
DMA Destination Address Register    DDR1   addr X:$FFFFFFD2
DMA Destination Offset Register     DDN1   addr X:$FFFFFFD1
DMA Counter                         DCO1   addr X:$FFFFFFD4
(each register spans bits 31-0)
31 30 29 28 27 DIE * DTD * 22 21 20 19 18 17 16 DCP * * * * * * M8 15 14 13 12 11 10 9 8 M7 M6 M5 M4 M3 M2 M1 M0 7
PAGE 831
Each requesting device input is first individually ANDed with its respective mask bit (M0,M1,etc) and then all AND outputs are ORed together. The OR output goes to the edge-triggered latch whose output initiates the DMA transfer. If an input is unmasked, asserting that input will set the latch and initiate a DMA transfer. The DMA state machine clears the latch when accessing the DMA source address.
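The request logic described above (mask, OR, edge-triggered latch) can be modelled in software as follows. This is a behavioural sketch under assumptions: the class and method names are invented for this example, request inputs and mask bits are packed into integers with bit i standing for input i and mask bit Mi, and the latch is assumed to respond to a rising edge of the OR output.

```python
class DmaRequestLatch:
    """Behavioural model of the masked DMA request latch."""

    def __init__(self, mask=0):
        self.mask = mask          # M0, M1, ... as bits 0, 1, ...
        self._prev_or = False     # previous level of the OR output
        self.latched = False      # latch output: DMA transfer requested

    def sample(self, requests):
        """Apply the current request inputs and return the latch state."""
        or_output = (requests & self.mask) != 0   # AND per input, then OR
        if or_output and not self._prev_or:       # assumed rising edge
            self.latched = True
        self._prev_or = or_output
        return self.latched

    def source_accessed(self):
        """The DMA state machine accessed the source address."""
        self.latched = False
```

In this model a masked input never sets the latch, and an unmasked input that simply stays asserted does not set it again after the latch has been cleared; a new edge is required.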
PAGE 832
Table 4 Internal I/O Memory Map of the X Data Memory Space
ADDRESS     REGISTER
$FFFFFFFF   IPR - Interrupt Priority Register
$FFFFFFFE   BCRA - Port A Bus Control Register
$FFFFFFFD   BCRB - Port B Bus Control Register
$FFFFFFFC   PSR - Port Select Register
$FFFFFFE0   TCSR0 - Timer Control Status Register 0
$FFFFFFDF   DSM0 - DMA CH0 Source Modifier Register
$FFFFFFDE   DSR0 - DMA CH0 Source Address Register
$FFFFFFDD   DSN0 - DMA CH0 Source Offset Register
$FFFFFFDC   DCO0 - DMA CH0 Counter Register
:
PAGE 833
Interrupt Starting Address   Interrupt Source
$FFFFFFFE                    Hardware RESET
$00000000                    Hardware RESET
$00000002                    Stack Error
$00000004                    Illegal Instruction
$00000006                    (F)TRAPcc (default)
$00000008                    IRQA
$0000000A                    IRQB
$0000000C                    IRQC
$0000000E                    Reserved
$00000010                    DMA Channel 1
$00000012                    DMA Channel 2
$00000014                    Timer 0
$00000016                    Timer 1
$00000018                    Reserved
$0000001A                    Reserved
$0000001C                    Host A Command (default)
$0000001E                    Host B Command (default)
$00000020                    Host A Receive Data
$00000022                    H
PAGE 834
7.3 Exception Priorities within an IPL If more than one exception is pending when an instruction is executed, the interrupt with the highest priority level is serviced first. Within a given interrupt priority level, a second priority structure determines which interrupt is serviced when multiple interrupt requests with the same IPL are pending.
PAGE 835
7.4 Interrupt Priority Register (IPR) The Interrupt Priority Register supports the timer module with the addition of the Timer0 and Timer1 priority level bits. Figure 21 shows the revised IPR with the new bits indicated in bold characters.
PAGE 836
7.4.1 Reserved bits (Bits 12-15, 28-31)
These reserved bits read as zero and should be written with zero for future compatibility.
7.4.2 Timer 0 Interrupt Priority Level - T0L1-T0L0 (Bits 24-25)
The Timer 0 Interrupt Priority Level (T0L1-T0L0) bits are used to enable and specify the priority level of the Timer 0 interrupt.
T0L1   T0L0   Enabled   Int. Priority Level (IPL)
0      0      no        -
0      1      yes       0
1      0      yes       1
1      1      yes       2
7.4.
PAGE 837
MPYS//ADD Integer Signed Multiply and Add MPYS//ADD
Operation:
    S1.L * S2.L → D1.M:D1.L
    S3.L + D2.L → D2.L
    (parallel data bus move)
Assembler Syntax:
    MPYS S1,S2,D1 ADD S3,D2 (move syntax - see the MOVE instruction description.)
    MPYS S2,S1,D1 ADD S3,D2 (move syntax - see the MOVE instruction description.)
Description: Multiply the two signed operands S1 and S2 and store the product in the specified destination register D1.
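The arithmetic of the two parallel operations can be illustrated with a small Python model. It is a sketch under assumptions, not a description of the hardware: the function names are hypothetical, D1.M is assumed to hold the upper 32 bits of the 64-bit product and D1.L the lower 32 bits, the addition is assumed to wrap modulo 2^32, and condition codes and the parallel move are ignored.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(x):
    """Interpret a 32-bit pattern as a 2's complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mpys_add(s1, s2, s3, d2):
    """Return (D1.M, D1.L, D2.L) for MPYS S1,S2,D1 ADD S3,D2."""
    product = (to_signed32(s1) * to_signed32(s2)) & MASK64
    d1_m = product >> 32           # assumed: upper half of the product
    d1_l = product & MASK32        # assumed: lower half of the product
    d2_l = (s3 + d2) & MASK32      # S3.L + D2.L, wrapped to 32 bits
    return d1_m, d1_l, d2_l
```

For example, multiplying $FFFFFFFF (-1) by 2 gives the 64-bit pattern $FFFFFFFF:FFFFFFFE, that is, -2 sign-extended across both halves.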
PAGE 838
Instruction Format: MPYS S1,S2,D1 ADD S3,D2 (move syntax - see the MOVE instruction description.
PAGE 839
MPYS//SUB Integer Signed Multiply and Subtract MPYS//SUB
Operation:
    S1.L * S2.L → D1.M:D1.L
    D2.L - S3.L → D2.L
    (parallel data bus move)
Assembler Syntax:
    MPYS S1,S2,D1 SUB S3,D2 (move syntax - see the MOVE instruction description.)
    MPYS S2,S1,D1 SUB S3,D2 (move syntax - see the MOVE instruction description.)
Description: Multiply the two signed operands S1 and S2 and store the product in the specified destination register D1.
PAGE 840
Instruction Format: MPYS S1,S2,D1 SUB S3,D2 (move syntax - see the MOVE instruction description.
PAGE 841
MPYU//ADD Integer Unsigned Multiply and Add MPYU//ADD
Operation:
    S1.L * S2.L → D1.M:D1.L
    S3.L + D2.L → D2.L
    (parallel data bus move)
Assembler Syntax:
    MPYU S1,S2,D1 ADD S3,D2 (move syntax - see the MOVE instruction description.)
    MPYU S2,S1,D1 ADD S3,D2 (move syntax - see the MOVE instruction description.)
Description: Multiply the two unsigned operands S1 and S2 and store the product in the specified destination register D1.
PAGE 842
Instruction Format: MPYU S1,S2,D1 ADD S3,D2 (move syntax - see the MOVE instruction description.
PAGE 843
MPYU//SUB Integer Unsigned Multiply and Subtract MPYU//SUB
Operation:
    S1.L * S2.L → D1.M:D1.L
    D2.L - S3.L → D2.L
    (parallel data bus move)
Assembler Syntax:
    MPYU S1,S2,D1 SUB S3,D2 (move syntax - see the MOVE instruction description.)
    MPYU S2,S1,D1 SUB S3,D2 (move syntax - see the MOVE instruction description.)
Description: Multiply the two unsigned operands S1 and S2 and store the product in the specified destination register D1.
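For the unsigned variant, the only arithmetic difference is that the operands are treated as values in the range 0 to 2^32 - 1. The following Python sketch makes the same assumptions as any host-side model would need: the function name is hypothetical, D1.M is assumed to be the upper half of the product, the subtraction is assumed to wrap modulo 2^32, and condition codes are not modelled.

```python
MASK32 = 0xFFFFFFFF


def mpyu_sub(s1, s2, s3, d2):
    """Return (D1.M, D1.L, D2.L) for MPYU S1,S2,D1 SUB S3,D2."""
    product = (s1 & MASK32) * (s2 & MASK32)   # unsigned 32 x 32 -> 64 bits
    d1_m = product >> 32                      # assumed: upper half
    d1_l = product & MASK32                   # assumed: lower half
    d2_l = (d2 - s3) & MASK32                 # D2.L - S3.L, wrapped
    return d1_m, d1_l, d2_l
```

Here $FFFFFFFF times 2 yields $00000001:FFFFFFFE, because $FFFFFFFF is read as 4294967295 rather than as -1.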
PAGE 844
Instruction Format: MPYU S1,S2,D1 SUB S3,D2 (move syntax - see the MOVE instruction description.
PAGE 845
PFLUSH Program-Cache Flush Operation: Assembler Syntax: Flush instruction cache PFLUSH PFLUSH Description: Flush the whole instruction cache, unlock all cache sectors, set the LRU stack and tag registers to their default values. The PFLUSH instruction is enabled both in Cache Mode and PRAM Mode. CCR Condition Codes: Not affected. ER Status Bits: Not affected. IER Flags: Not affected.
PAGE 846
PFREE Program-Cache Global Unlock Operation: Assembler Syntax: Unlock all locked sectors PFREE PFREE Description: Unlock all the locked cache sectors in the instruction cache. The PFREE instruction is enabled both in Cache Mode and PRAM Mode. CCR Condition Codes: Not affected. ER Status Bits: Not affected. IER Flags: Not affected.
PAGE 847
PLOCK Program-Cache-Sector Lock Operation: Assembler Syntax: Lock sector by ea PLOCK PLOCK ea Description: Lock the cache sector to which the specified effective address belongs. If the specified effective address does not belong to any cache sector, then load the least recently used cache sector tag with the 25 most significant bits of the specified address and then lock that cache sector. Update the LRU stack accordingly.
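The effect of PLOCK on the tags, lock bits and LRU stack can be sketched with a small software model. Everything about the model's structure is an assumption made for illustration: the class and attribute names are hypothetical, the least recently used sector is chosen among unlocked sectors only, a locked or newly loaded sector is treated as most recently used, and the case where all sectors are already locked is reported as an error because the text does not say what happens then.

```python
class SectorCache:
    """Toy model of sector tags, lock bits and an LRU stack."""

    def __init__(self, sectors=8):
        self.tags = [None] * sectors
        self.locked = [False] * sectors
        self.lru = list(range(sectors))   # lru[0] is least recently used

    def _touch(self, sector):
        """Mark a sector as most recently used (assumed LRU update)."""
        self.lru.remove(sector)
        self.lru.append(sector)

    def plock(self, addr):
        """Lock the sector containing addr and return its index."""
        tag = (addr & 0xFFFFFFFF) >> 7    # 25 most significant bits
        if tag in self.tags:
            sector = self.tags.index(tag)
        else:
            sector = next((s for s in self.lru if not self.locked[s]), None)
            if sector is None:
                raise RuntimeError("all sectors locked (not modelled)")
            self.tags[sector] = tag       # load the LRU sector's tag
        self.locked[sector] = True
        self._touch(sector)
        return sector
```

In this model, locking addresses $00000000 and $00000080 on an empty cache occupies two different sectors with tags 0 and 1, while locking $0000007F after $00000000 reuses the sector that already holds tag 0.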
PAGE 848
PLOCKR Program-Cache-Sector Relative Lock Operation: Assembler Syntax: Lock sector by PC + xx PLOCKR label Lock sector by PC + xxxx PLOCKR Rn PLOCKR Lock sector by PC + Rn Description: Lock the cache sector to which the sum PC + specified displacement belongs. If the sum does not belong to any cache sector, then load the least recently used cache sector tag with the 25 most significant bits of the sum and then lock that cache sector. Update the LRU stack accordingly.
PAGE 849
Instruction Fields:
Rn - R0-R7
Long PC Relative Displacement - 32 bits
Short PC Relative Displacement - aaaaaaaaaaaaaaa (15 bits)
Timing: 4 + ea oscillator clock cycles
Memory: 1 + ea program words
PAGE 850
PUNLOCK Program-Cache-Sector Unlock Operation: Assembler Syntax: Unlock sector by ea PUNLOCK PUNLOCK ea Description: Unlock the cache sector to which the specified effective address belongs. If the specified effective address does not belong to any cache sector, and is therefore definitely unlocked, nevertheless, load the least recently used cache sector tag with the 25 most significant bits of the specified address. Update the LRU stack accordingly.
PAGE 851
PUNLOCKR Program-Cache-Sector Relative Unlock Operation: Assembler Syntax: Unlock sector by PC + xx PUNLOCKR label Unlock sector by PC + xxxx PUNLOCKR Rn PUNLOCKR Unlock sector by PC + Rn Description: Unlock the cache sector to which the sum PC + specified displacement belongs. If the sum does not belong to any cache sector, and is therefore definitely unlocked, nevertheless, load the least recently used cache sector tag with the 25 most significant bits of the sum.
PAGE 852
Instruction Fields:
Rn - R0-R7
Long PC Relative Displacement - 32 bits
Short PC Relative Displacement - aaaaaaaaaaaaaaa (15 bits)
Timing: 4 + ea oscillator clock cycles
Memory: 1 + ea program words
PAGE 853
MOTOROLA SEMICONDUCTOR TECHNICAL DATA DSP96002 Addendum to the DSP96002 Digital Signal Processor Instruction Set found in the DSP96002 Digital Signal Processor User’s Manual and the DSP96002 CLAS Documentation FOREWORD The following ten instructions have been added to the DSP96002 instruction set. These instructions are available only on versions of the DSP96002 that have an instruction cache.
PAGE 854
MPYS//ADD Integer Signed Multiply and Add MPYS//ADD
Operation:
    S1.L * S2.L → D1.M:D1.L
    S3.L + D2.L → D2.L
    (parallel data bus move)
Assembler Syntax:
    MPYS S1,S2,D1 ADD S3,D2 (move syntax - see the MOVE instruction description.)
    MPYS S2,S1,D1 ADD S3,D2 (move syntax - see the MOVE instruction description.)
Description: Multiply the two signed operands S1 and S2 and store the product in the specified destination register D1.
PAGE 855
Instruction Format: MPYS S1,S2,D1 ADD S3,D2 (move syntax - see the MOVE instruction description.
PAGE 856
MPYS//SUB Integer Signed Multiply and Subtract MPYS//SUB
Operation:
    S1.L * S2.L → D1.M:D1.L
    D2.L - S3.L → D2.L
    (parallel data bus move)
Assembler Syntax:
    MPYS S1,S2,D1 SUB S3,D2 (move syntax - see the MOVE instruction description.)
    MPYS S2,S1,D1 SUB S3,D2 (move syntax - see the MOVE instruction description.)
Description: Multiply the two signed operands S1 and S2 and store the product in the specified destination register D1.
PAGE 857
Instruction Format: MPYS S1,S2,D1 SUB S3,D2 (move syntax - see the MOVE instruction description.
PAGE 858
MPYU//ADD Integer Unsigned Multiply and Add MPYU//ADD
Operation:
    S1.L * S2.L → D1.M:D1.L
    S3.L + D2.L → D2.L
    (parallel data bus move)
Assembler Syntax:
    MPYU S1,S2,D1 ADD S3,D2 (move syntax - see the MOVE instruction description.)
    MPYU S2,S1,D1 ADD S3,D2 (move syntax - see the MOVE instruction description.)
Description: Multiply the two unsigned operands S1 and S2 and store the product in the specified destination register D1.
PAGE 859
Instruction Format: MPYU S1,S2,D1 ADD S3,D2 (move syntax - see the MOVE instruction description.
PAGE 860
MPYU//SUB Integer Unsigned Multiply and Subtract MPYU//SUB
Operation:
    S1.L * S2.L → D1.M:D1.L
    D2.L - S3.L → D2.L
    (parallel data bus move)
Assembler Syntax:
    MPYU S1,S2,D1 SUB S3,D2 (move syntax - see the MOVE instruction description.)
    MPYU S2,S1,D1 SUB S3,D2 (move syntax - see the MOVE instruction description.)
Description: Multiply the two unsigned operands S1 and S2 and store the product in the specified destination register D1.
PAGE 861
Instruction Format: MPYU S1,S2,D1 SUB S3,D2 (move syntax - see the MOVE instruction description.
PAGE 862
PFLUSH Program-Cache Flush Operation: Assembler Syntax: Flush instruction cache PFLUSH PFLUSH Description: Flush the whole instruction cache, unlock all cache sectors, set the LRU stack and tag registers to their default values. The PFLUSH instruction is enabled both in Cache Mode and PRAM Mode. CCR Condition Codes: Not affected. ER Status Bits: Not affected. IER Flags: Not affected.
PAGE 863
PFREE Program-Cache Global Unlock Operation: Assembler Syntax: Unlock all locked sectors PFREE PFREE Description: Unlock all the locked cache sectors in the instruction cache. The PFREE instruction is enabled both in Cache Mode and PRAM Mode. CCR Condition Codes: Not affected. ER Status Bits: Not affected. IER Flags: Not affected.
PAGE 864
PLOCK Program-Cache-Sector Lock
Operation: Lock sector by ea
Assembler Syntax: PLOCK ea
Description: Lock the cache sector to which the specified effective address belongs. If the specified effective address does not belong to any cache sector, then load the least recently used cache sector tag with the 25 most significant bits of the specified address and then lock that cache sector. Update the LRU stack accordingly.
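The PLOCK behaviour described above (tag match, otherwise replace the least recently used sector, then lock and update the LRU stack) can be sketched as a small Python model. This is a sketch under stated assumptions, not the hardware algorithm: the class name `SectorCache`, the default of 8 sectors, choosing the least recently used sector among those not already locked, raising an error when every sector is locked, and moving the touched sector to the most recently used position are all assumptions. Only the 25-bit tag of a 32-bit address comes from the text.

```python
TAG_SHIFT = 7  # 32 address bits minus 25 tag bits leaves 7 offset bits


class SectorCache:
    def __init__(self, num_sectors=8):       # sector count is an assumption
        self.tags = [None] * num_sectors     # None means no tag loaded yet
        self.locked = [False] * num_sectors
        self.lru = list(range(num_sectors))  # lru[0] = least recently used

    def _touch(self, idx):
        """Move sector idx to the most recently used position (assumed)."""
        self.lru.remove(idx)
        self.lru.append(idx)

    def plock(self, ea):
        """Lock the sector containing ea, loading a tag on a miss."""
        tag = (ea & 0xFFFFFFFF) >> TAG_SHIFT  # 25 most significant bits
        if tag in self.tags:
            idx = self.tags.index(tag)        # hit: sector already holds ea
        else:
            # Miss: take the least recently used sector that is not locked.
            candidates = [i for i in self.lru if not self.locked[i]]
            if not candidates:
                raise RuntimeError("all sectors locked (behaviour not specified)")
            idx = candidates[0]
            self.tags[idx] = tag
        self.locked[idx] = True
        self._touch(idx)
        return idx
```

In a two-sector instance, locking address 0x100 claims sector 0, a second lock at 0x17F hits the same sector because both addresses share the tag 2, and a lock at 0x180 claims sector 1.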
PAGE 865
PLOCKR Program-Cache-Sector Relative Lock
Operation:
Lock sector by PC + xx
Lock sector by PC + xxxx
Lock sector by PC + Rn
Assembler Syntax:
PLOCKR label
PLOCKR Rn
Description: Lock the cache sector to which the sum PC + specified displacement belongs. If the sum does not belong to any cache sector, then load the least recently used cache sector tag with the 25 most significant bits of the sum and then lock that cache sector. Update the LRU stack accordingly.
PAGE 866
Instruction Fields:
Rn - R0-R7
Long PC Relative Displacement - 32 bits
Short PC Relative Displacement - aaaaaaaaaaaaaaa (15 bits)
Timing: 4 + ea oscillator clock cycles
Memory: 1 + ea program words
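For the PC-relative form, the locked sector is selected by the sum of the PC and a displacement. The Python sketch below shows one way that sum and its 25-bit tag could be computed. The helper names are hypothetical, and two points are assumptions rather than facts from the field list above: that the 15-bit short displacement is sign-extended, and that the sum wraps at 32 bits. Which PC value the hardware uses (current or next instruction) is also not stated here, so the caller supplies it.

```python
def sign_extend(value, bits):
    """Sign-extend the low `bits` bits of value to a Python int."""
    sign = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ sign) - sign


def relative_target(pc, displacement, short=False):
    """Return PC + displacement as a 32-bit address.

    short=True treats displacement as a 15-bit field (assumed signed);
    otherwise it is treated as a 32-bit two's-complement value.
    """
    disp = sign_extend(displacement, 15 if short else 32)
    return (pc + disp) & 0xFFFFFFFF


def sector_tag(address):
    """Return the 25 most significant bits of a 32-bit address."""
    return (address & 0xFFFFFFFF) >> 7
```

For instance, `relative_target(0x1000, 0x7FFF, short=True)` reads the all-ones 15-bit field as -1 and yields 0xFFF, whose sector tag is 31.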
PAGE 867
PUNLOCK Program-Cache-Sector Unlock
Operation: Unlock sector by ea
Assembler Syntax: PUNLOCK ea
Description: Unlock the cache sector to which the specified effective address belongs. If the specified effective address does not belong to any cache sector (and is therefore already unlocked), the least recently used cache sector tag is nevertheless loaded with the 25 most significant bits of the specified address. Update the LRU stack accordingly.
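PUNLOCK has a side effect that is easy to miss: on a miss it still loads a new tag. A minimal Python sketch of that behaviour follows. The function name `punlock`, the list-based state, picking the least recently used sector among those not locked, doing nothing when every sector is locked, and moving the touched sector to the most recently used position are assumptions for illustration only.

```python
def punlock(tags, locked, lru, ea):
    """Unlock the sector containing ea; on a miss, load its tag anyway.

    tags, locked and lru are lists mutated in place; lru[0] is the least
    recently used sector index. Returns the affected sector index, or
    None if no sector could be chosen.
    """
    tag = (ea & 0xFFFFFFFF) >> 7              # 25 most significant bits
    if tag in tags:
        idx = tags.index(tag)                 # hit: unlock that sector
    else:
        # Miss: the address cannot be locked, but a tag is still loaded.
        candidates = [i for i in lru if not locked[i]]
        if not candidates:
            return None                       # assumed: state left unchanged
        idx = candidates[0]
        tags[idx] = tag
    locked[idx] = False
    lru.remove(idx)                           # assumed LRU update:
    lru.append(idx)                           # touched sector becomes MRU
    return idx
```

A practical consequence in this model is that unlocking an address that was never cached can evict the tag of an unlocked sector.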
PAGE 868
PUNLOCKR Program-Cache-Sector Relative Unlock
Operation:
Unlock sector by PC + xx
Unlock sector by PC + xxxx
Unlock sector by PC + Rn
Assembler Syntax:
PUNLOCKR label
PUNLOCKR Rn
Description: Unlock the cache sector to which the sum PC + specified displacement belongs. If the sum does not belong to any cache sector (and is therefore already unlocked), the least recently used cache sector tag is nevertheless loaded with the 25 most significant bits of the sum.
PAGE 869
Instruction Fields:
Rn - R0-R7
Long PC Relative Displacement - 32 bits
Short PC Relative Displacement - aaaaaaaaaaaaaaa (15 bits)
Timing: 4 + ea oscillator clock cycles
Memory: 1 + ea program words
PAGE 870
Motorola reserves the right to make changes without further notice to any products herein. Motorola makes no warranty, representation or guarantee regarding the suitability of its products for any particular purpose, nor does Motorola assume any liability arising out of the application or use of any product or circuit, and specifically disclaims any and all liability, including without limitation consequential or incidental damages. “Typical” parameters can and do vary in different applications.
PAGE 871
MOTOROLA SEMICONDUCTOR TECHNICAL DATA Addendum MOTOROLA INC.
PAGE 872
MOTOROLA SEMICONDUCTOR TECHNICAL DATA
DSP96002
Addendum to the DSP96002 Digital Signal Processor Instruction Set found in the DSP96002 Digital Signal Processor User’s Manual and the DSP96002 CLAS Documentation
FOREWORD
The following ten instructions have been added to the DSP96002 instruction set. These instructions are available only on versions of the DSP96002 that have an instruction cache.
PAGE 873
MPYS//ADD Integer Signed Multiply and Add
Operation:
S1.L * S2.L → D1.M:D1.L (parallel data bus move)
S3.L + D2.L → D2.L
Assembler Syntax:
MPYS S1,S2,D1 ADD S3,D2 (move syntax - see the MOVE instruction description.)
MPYS S2,S1,D1 ADD S3,D2 (move syntax - see the MOVE instruction description.)
Description: Multiply the two signed operands S1 and S2 and store the product in the specified destination register D1.
PAGE 874
Instruction Format: MPYS S1,S2,D1 ADD S3,D2 (move syntax - see the MOVE instruction description.)
PAGE 875
MPYS//SUB Integer Signed Multiply and Subtract
Operation:
S1.L * S2.L → D1.M:D1.L (parallel data bus move)
D2.L - S3.L → D2.L
Assembler Syntax:
MPYS S1,S2,D1 SUB S3,D2 (move syntax - see the MOVE instruction description.)
MPYS S2,S1,D1 SUB S3,D2 (move syntax - see the MOVE instruction description.)
Description: Multiply the two signed operands S1 and S2 and store the product in the specified destination register D1.
PAGE 876
Instruction Format: MPYS S1,S2,D1 SUB S3,D2 (move syntax - see the MOVE instruction description.)
PAGE 877
MPYU//ADD Integer Unsigned Multiply and Add
Operation:
S1.L * S2.L → D1.M:D1.L (parallel data bus move)
S3.L + D2.L → D2.L
Assembler Syntax:
MPYU S1,S2,D1 ADD S3,D2 (move syntax - see the MOVE instruction description.)
MPYU S2,S1,D1 ADD S3,D2 (move syntax - see the MOVE instruction description.)
Description: Multiply the two unsigned operands S1 and S2 and store the product in the specified destination register D1.
PAGE 878
Instruction Format: MPYU S1,S2,D1 ADD S3,D2 (move syntax - see the MOVE instruction description.)
PAGE 879
MPYU//SUB Integer Unsigned Multiply and Subtract
Operation:
S1.L * S2.L → D1.M:D1.L (parallel data bus move)
D2.L - S3.L → D2.L
Assembler Syntax:
MPYU S1,S2,D1 SUB S3,D2 (move syntax - see the MOVE instruction description.)
MPYU S2,S1,D1 SUB S3,D2 (move syntax - see the MOVE instruction description.)
Description: Multiply the two unsigned operands S1 and S2 and store the product in the specified destination register D1.
PAGE 880
Instruction Format: MPYU S1,S2,D1 SUB S3,D2 (move syntax - see the MOVE instruction description.)
PAGE 881
PFLUSH Program-Cache Flush
Operation: Flush instruction cache
Assembler Syntax: PFLUSH
Description: Flush the whole instruction cache, unlock all cache sectors, and set the LRU stack and tag registers to their default values. The PFLUSH instruction is enabled in both Cache Mode and PRAM Mode.
CCR Condition Codes: Not affected.
ER Status Bits: Not affected.
IER Flags: Not affected.
PAGE 882
PFREE Program-Cache Global Unlock
Operation: Unlock all locked sectors
Assembler Syntax: PFREE
Description: Unlock all the locked cache sectors in the instruction cache. The PFREE instruction is enabled in both Cache Mode and PRAM Mode.
CCR Condition Codes: Not affected.
ER Status Bits: Not affected.
IER Flags: Not affected.
PAGE 883
PLOCK Program-Cache-Sector Lock
Operation: Lock sector by ea
Assembler Syntax: PLOCK ea
Description: Lock the cache sector to which the specified effective address belongs. If the specified effective address does not belong to any cache sector, then load the least recently used cache sector tag with the 25 most significant bits of the specified address and then lock that cache sector. Update the LRU stack accordingly.
PAGE 884
PLOCKR Program-Cache-Sector Relative Lock
Operation:
Lock sector by PC + xx
Lock sector by PC + xxxx
Lock sector by PC + Rn
Assembler Syntax:
PLOCKR label
PLOCKR Rn
Description: Lock the cache sector to which the sum PC + specified displacement belongs. If the sum does not belong to any cache sector, then load the least recently used cache sector tag with the 25 most significant bits of the sum and then lock that cache sector. Update the LRU stack accordingly.
PAGE 885
Instruction Fields:
Rn - R0-R7
Long PC Relative Displacement - 32 bits
Short PC Relative Displacement - aaaaaaaaaaaaaaa (15 bits)
Timing: 4 + ea oscillator clock cycles
Memory: 1 + ea program words
PAGE 886
PUNLOCK Program-Cache-Sector Unlock
Operation: Unlock sector by ea
Assembler Syntax: PUNLOCK ea
Description: Unlock the cache sector to which the specified effective address belongs. If the specified effective address does not belong to any cache sector (and is therefore already unlocked), the least recently used cache sector tag is nevertheless loaded with the 25 most significant bits of the specified address. Update the LRU stack accordingly.
PAGE 887
PUNLOCKR Program-Cache-Sector Relative Unlock
Operation:
Unlock sector by PC + xx
Unlock sector by PC + xxxx
Unlock sector by PC + Rn
Assembler Syntax:
PUNLOCKR label
PUNLOCKR Rn
Description: Unlock the cache sector to which the sum PC + specified displacement belongs. If the sum does not belong to any cache sector (and is therefore already unlocked), the least recently used cache sector tag is nevertheless loaded with the 25 most significant bits of the sum.
PAGE 888
INDEX
PAGE 889
PAGE 890
INDEX
- A -
A Law, 8-17
A/D Comb Filter Transfer Function, 6-12
A/D Converter, 6-3
A/D Decimation DSP Filter, 6-32, 6-40, 648, 6-56
A/D Section, 6-5
A/D Section DC Gain, 6-12
A/D Section Frequency Response and DC Gain, 6-12
Address Registers
PAGE 891
Index (Continued)
Codec Status Register (COSR), 6-6, 6-9, 49
Codec Transmit Data Register, 6-6
Comb Filter, 6-3
Command Vector Register, 5-7
Command Vector Register (CVR), 55
Companding/Expanding Hardware, 8-17
Compare Interrupt Enable (CIE) Bit 10, 7-7
Condition Code Register, 1-21
Conditional Program Controller Instructions, 38
Control Register (PBC)
PAGE 892
Dual Read Instructions, 32
- E -
Effective Address Update, 34
Event Select (ES) Bit 8, 7-6
Exception Priorities within an IPL, 1-12
- F -
Fractional Arithmetic, 1-8
Frequency Multiplier, 9-4
- G -
G Bus Data, 1-30
GDB, 1-7
Global Data Bus
PAGE 893
ISR Receive Data Register Full (RXDF) Bit 0, 5-16
ISR Transmit Data Register Empty (TXDE) Bit 1, 5-16
ISR Transmitter Ready (TRDY) Bit 2, 5-16
IVR Host Interface Interrupts, 5-18
- J -
Jump/Branch Instructions, 35
- L -
Linear, 1-9
LMS Instruction, 32
Logical Immediate Instructions
PAGE 894
Port B Control Register (PBC), 4-6
Port B Data Direction Register, 4-6
Port B Data Register, 4-6
Port C, 4-6
Port C Control Register, 4-6
Port C Data Direction Register, 4-6
Port C Data Register, 4-6
Port C Data Register (PCD), 4-6
Port Registers, 4-4
Programming Models
PAGE 895
Timer Control Register (TCR), 1-17, 7-6, 47
Timer Count Register (TCR), 7-3
Timer Count Register (TCTR), 1-17, 7-3, 48
Timer Enable (TE) Bit 15, 7-8
Timer Functional Description, 7-8
Timer Preload Register, 7-3
Timer Preload Register (TPR), 1-17, 7-4, 48
Timer Resolution, 7-8
TOUT Enable (TO2-TO0) Bit 11-13, 7-7, 78
Transfer with Parallel Move Instruction
PAGE 896
Instruction Fields:
Rn - R0-R7
Long PC Relative Displacement - 32 bits
Short PC Relative Displacement - aaaaaaaaaaaaaaa (15 bits)
Timing: 4 + ea oscillator clock cycles
Memory: 1 + ea program words
PAGE 897
Motorola reserves the right to make changes without further notice to any products herein. Motorola makes no warranty, representation or guarantee regarding the suitability of its products for any particular purpose, nor does Motorola assume any liability arising out of the application or use of any product or circuit, and specifically disclaims any and all liability, including without limitation consequential or incidental damages. “Typical” parameters can and do vary in different applications.