IBM PowerPC 750GX and 750GL RISC Microprocessor User’s Manual Version 1.
Copyright and Disclaimer
© Copyright International Business Machines Corporation 2004, 2006. All Rights Reserved. Printed in the United States of America March 2006.
The following are trademarks of International Business Machines Corporation in the United States, or other countries, or both: IBM, IBM Logo, POWER, PowerPC, PowerPC 750, PowerPC Architecture, PowerPC Logo.
IEEE is a registered trademark in the United States, owned by the Institute of Electrical and Electronics Engineers.
User’s Manual
IBM PowerPC 750GX and 750GL RISC Microprocessor

List of Figures ..... 13
List of Tables ..... 15
About This Manual ..... 19
Who Should Read This Manual
2. Programming Model ..... 57
2.1 PowerPC 750GX Processor Register Set ..... 57
2.1.1 Register Set ..... 57
2.1.2 PowerPC 750GX-Specific Registers
2.3.6.1 System Linkage Instructions—OEA
2.3.6.2 Processor Control Instructions—OEA
2.3.6.3 Memory Control Instructions—OEA
2.3.7 Recommended Simplified Mnemonics
4.2 Exception Recognition and Priorities
4.3 Exception Processing
4.3.1 Machine Status Save/Restore Register 0 (SRR0)
4.3.2 Machine Status Save/Restore Register 1 (SRR1)
5.1.8 MMU Instructions and Register Summary
5.2 Real-Addressing Mode
5.3 Block-Address Translation
5.4 Memory Segment Model
6.6.1.3 Completion-Unit Resource Requirements ..... 237
6.7 Instruction Latency Summary ..... 238
7. Signal Descriptions ..... 249
7.1 Signal Configuration
7.2.11.4 Time Base Enable (TBEN)—Input
7.2.11.5 TLB Invalidate Synchronize (TLBISYNC)—Input
7.2.12 Processor Mode Selection Signals
7.2.13 I/O Voltage Select Signals
8.6.2 No-DRTRY Mode
8.7 Processor State Signals
8.7.1 Support for the lwarx and stwcx. Instruction Pair
8.7.2 TLBISYNC Input
11.1 Performance-Monitor Interrupt
11.2 Special-Purpose Registers Used by Performance Monitor
11.2.1 Performance-Monitor Registers
11.2.1.1 Monitor Mode Control Register 0 (MMCR0)
List of Figures
Figure 1-1. 750GX Microprocessor Block Diagram ..... 25
Figure 1-2. L1 Cache Organization ..... 34
Figure 1-3. System Interface
Figure 8-5. First Level Address Pipelining ..... 287
Figure 8-6. Address-Bus Arbitration ..... 290
Figure 8-7. Address-Bus Arbitration Showing Bus Parking ..... 291
Figure 8-8.
List of Tables
Table 1-1. Architecture-Defined Registers (Excluding SPRs) ..... 42
Table 1-2. Architecture-Defined SPRs Implemented ..... 43
Table 1-3. Implementation-Specific Registers ..... 44
Table 1-4.
Table 2-34. SPR Encodings for 750GX-Defined Registers (mfspr) ..... 112
Table 2-35. Memory Synchronization Instructions—UISA ..... 113
Table 2-36. Move-from Time Base Instruction ..... 114
Table 2-37. Memory Synchronization Instructions—VEA
Table 5-7. Table-Search Operations to Update History Bits—TLB Hit Case ..... 197
Table 5-8. Model for Guaranteed R and C Bit Settings ..... 198
Table 6-1. Notation Conventions for Instruction Timing ..... 214
Table 6-2. Performance Effects of Memory Operand Placement
Table 11-7. HID2 Checkstop Control Bits ..... 362
Table 11-8. L2CR Checkstop Control Bits ..... 362
About This Manual
This user’s manual defines the functionality of the PowerPC® 750GX and 750GL RISC microprocessors. It describes features of the 750GX and 750GL that are not defined by the architecture. This book is intended as a companion to the PowerPC Microprocessor Family: The Programming Environments (referred to as The Programming Environments Manual).
Conventions Used in This Manual

Notational Conventions
mnemonics  Instruction mnemonics are shown in lowercase bold.
italics    Italics indicate variable command parameters. For example: bcctrx. Book titles in text are set in italics.
0x0        Prefix to denote a hexadecimal number.
0b0        Prefix to denote a binary number.
crfD       Instruction syntax used to identify a destination Condition Register (CR) field.
Terminology Conventions
The following table describes terminology conventions used in this manual and the equivalent terminology used in the PowerPC Architecture specification.
Using This Manual with the Programming Environments Manual
Because the PowerPC Architecture is designed to be flexible enough to support a broad range of processors, the PowerPC Microprocessor Family: The Programming Environments Manual provides a general description of features that are common to PowerPC processors and indicates those features that are optional or that might be implemented differently in the design of each processor.
1. PowerPC 750GX Overview
The IBM PowerPC 750GX reduced instruction set computer (RISC) microprocessor is an implementation of the PowerPC Architecture™ with enhancements based on the IBM PowerPC 750™, 750CXe, and 750FX RISC microprocessor designs. This chapter provides an overview of the PowerPC 750GX microprocessor features, including a block diagram that shows the major functional components.
and data block-address-translation (IBAT and DBAT) arrays, defined by the PowerPC Architecture. During block translation, effective addresses are compared simultaneously with all eight block-address-translation (BAT) entries. For information about the L1 cache, see Chapter 3, Instruction-Cache and Data-Cache Operation, on page 121.
Figure 1-1.
made available from the instruction cache. Typically, if a fetch access hits the BTIC, it provides the first two instructions in the target stream, effectively yielding a zero-cycle branch.
• 512-entry branch history table (BHT) with two bits per entry for four levels of prediction: not-taken, strongly not-taken, taken, strongly taken.
– Retires as many as two instructions per clock.
• Separate on-chip L1 instruction and data caches (Harvard architecture).
– 32-KB, 8-way set-associative instruction and data caches.
– Pseudo least-recently-used (PLRU) replacement algorithm.
– 32-byte (8-word) cache block.
– Physically indexed/physical tags.
Note: The PowerPC Architecture refers to physical address space as real address space.
• TLBs are hardware-reloadable (the page table search is performed by hardware).
• Bus interface features:
– Enhanced 60x bus that pipelines back-to-back reads to a depth of four. A dedicated snoop queue allows snoop copybacks to pipeline alongside the maximum of four reads. Enveloped write transactions are supported with the assertion of DBWO.
– Selectable bus-to-core clock frequency ratios of 2x, 2.5x, 3x, 3.5x, 4x, 4.5x, 5x, 5.5x, 6x, 6.
1.2.1 Instruction Flow
As shown in Figure 1-1, 750GX Microprocessor Block Diagram, on page 25, the 750GX instruction control unit provides centralized control of instruction flow to the execution units. The instruction unit contains a sequential instruction fetcher (Ifetch), 6-entry instruction queue (IQ), dispatch unit, and BPU.
are flushed from the processor, and instruction fetching resumes along the correct path. The 750GX allows a second branch instruction to be predicted; instructions from the second predicted branch instruction stream can be fetched but cannot be dispatched. These instructions are held in the instruction queue. Dynamic prediction is implemented using a 512-entry BHT.
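The four prediction levels of a BHT entry map naturally onto a 2-bit counter. The following is a minimal sketch of a generic 2-bit saturating-counter predictor, not the documented 750GX update logic: the state encoding, the update rule, the initial state, and the use of low-order word-address bits to select one of the 512 entries are all assumptions made for illustration.

```python
BHT_ENTRIES = 512  # table size stated in this manual

# Assumed encoding of the four prediction levels (not taken from the manual).
STRONGLY_NOT_TAKEN, NOT_TAKEN, TAKEN, STRONGLY_TAKEN = range(4)


class BranchHistoryTable:
    """Generic 2-bit saturating-counter predictor (illustrative model only)."""

    def __init__(self):
        # Assumed reset state: every entry weakly predicts not-taken.
        self.counters = [NOT_TAKEN] * BHT_ENTRIES

    @staticmethod
    def index(branch_address):
        # Hypothetical indexing: instructions are word aligned, so drop the
        # two low address bits and wrap to one of the 512 entries.
        return (branch_address >> 2) % BHT_ENTRIES

    def predict(self, branch_address):
        """Return True if the branch is predicted taken."""
        return self.counters[self.index(branch_address)] >= TAKEN

    def update(self, branch_address, taken):
        """Move the counter one step toward the resolved outcome, saturating."""
        i = self.index(branch_address)
        if taken:
            self.counters[i] = min(self.counters[i] + 1, STRONGLY_TAKEN)
        else:
            self.counters[i] = max(self.counters[i] - 1, STRONGLY_NOT_TAKEN)
```

In this model a single mispredicted iteration of a loop branch moves a strongly-taken entry to taken rather than flipping the prediction, which is the usual motivation for keeping two bits of history per entry.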
For a more detailed discussion of instruction completion, see Section 6.6.1, Branch, Dispatch, and Completion-Unit Resource Requirements, on page 237.
1.2.2 Independent Execution Units
In addition to the BPU, the 750GX has the following five execution units:
• Two integer units (IUs)
• Floating-point unit (FPU)
• Load/store unit (LSU)
• System register unit (SRU)
1.2.2.
1.2.2.3 Load/Store Unit (LSU)
The LSU executes all load and store instructions and provides the data-transfer interface between the GPRs, FPRs, and the data-cache/memory subsystem. The LSU functions as a 2-stage pipelined unit, which calculates effective addresses in the first stage. In the second stage, the address is translated, the cache is accessed, and the data is aligned if necessary.
The 750GX supports the following types of memory translation:
Real-addressing mode
In this mode, translation is disabled (control bit MSR[IR] = 0 for instructions and control bit MSR[DR] = 0 for data). The effective address is used as the physical address to access memory.
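The real-addressing check can be sketched as a test of two MSR bits. This is an illustrative model only: the bit positions (IR = bit 26, DR = bit 27, with bit 0 as the most-significant bit of the 32-bit register) come from the PowerPC OEA definition of the MSR rather than from this section, and the function names are hypothetical.

```python
# MSR bit positions use PowerPC big-endian numbering (bit 0 is the MSB of
# the 32-bit register). IR = bit 26 and DR = bit 27 are taken from the
# PowerPC OEA definition of the MSR, not from this section.
MSR_IR_BIT = 26
MSR_DR_BIT = 27


def msr_bit(msr, bit):
    """Return bit `bit` (big-endian numbering) of a 32-bit MSR value."""
    return (msr >> (31 - bit)) & 1


def real_addressing(msr):
    """Report whether instruction and data accesses bypass translation."""
    return {
        "instruction": msr_bit(msr, MSR_IR_BIT) == 0,
        "data": msr_bit(msr, MSR_DR_BIT) == 0,
    }
```

Because the two bits are independent, instruction fetches can run untranslated while data accesses are translated, or the reverse.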
written into an 8-word buffer. Subsequent double words are fetched from either the L2 cache or the system memory and written into the buffer. Once the total block is in the buffer, the line is written into the L1 cache in a single cycle. This minimizes write cycles into the L1 cache, leaving more read/write cycles available to the LSU. The L1 is nonblocking and supports hits under misses during this block reload sequence.
instruction-cache flash invalidate bit (HID0[ICFI]). The instruction cache can be locked by setting HID0[ILOCK]. The instruction cache supports only the valid and invalid states, and requires software to maintain coherency if the underlying program changes. The 750GX also implements a 64-entry (16-set, 4-way set-associative) branch target instruction cache (BTIC).
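The cache geometries quoted in this chapter can be cross-checked with a little arithmetic. The sketch below derives the number of sets for the 32-KB, 8-way L1 caches with 32-byte blocks and for the 64-entry, 4-way BTIC, and splits a physical address into tag, set index, and byte offset. The choice of address bits used for the set index is the conventional arrangement and is an assumption here, not something this section specifies; the function names are hypothetical.

```python
def cache_sets(total_bytes, ways, block_bytes):
    """Number of sets in a set-associative cache."""
    sets, remainder = divmod(total_bytes, ways * block_bytes)
    if remainder:
        raise ValueError("geometry does not divide evenly")
    return sets


def split_physical_address(addr, sets, block_bytes):
    """Split a 32-bit physical address into (tag, set index, byte offset).

    Assumes the conventional layout: offset in the low bits, set index in
    the bits immediately above, tag in the remaining high bits.
    """
    offset_bits = block_bytes.bit_length() - 1   # log2 for powers of two
    index_bits = sets.bit_length() - 1
    offset = addr & (block_bytes - 1)
    index = (addr >> offset_bits) & (sets - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


L1_SETS = cache_sets(32 * 1024, ways=8, block_bytes=32)   # 128 sets
BTIC_SETS = 64 // 4                                        # 16 sets of 4 entries
```

With 128 sets and 32-byte blocks, 5 offset bits and 7 index bits leave a 20-bit tag for a 32-bit physical address under this assumed layout.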
The address and data buses operate independently. Address and data tenures of a memory access are decoupled to provide more flexible control of bus traffic. The primary activity of the system interface is transferring data and instructions between the processor and system memory. There are two types of memory accesses:
Single-beat transfers
Allow transfer sizes of 8, 16, 24, 32, or 64 bits in one bus clock cycle.
Figure 1-3. System Interface
[Figure: the 750GX system interface signal groups: address arbitration, address start, address transfer, transfer attribute, address termination, clocks, data arbitration, data transfer, data termination, test and control, interrupt, processor status/control, VDD, and VDD (I/O).]
The system interface supports address pipelining, which allows the address tenure of one transaction to overlap the data tenure of another.
Interrupt
These signals include the interrupt signal, checkstop signals, and both soft reset and hard reset signals. These signals are used to generate interrupt exceptions and, under various conditions, to reset the processor.
Processor status/control
These signals are used to indicate miscellaneous bus functions.
Clocks
These signals determine the system clock frequency. These signals can also be used to synchronize multiprocessor systems.
Figure 1-4.
1.2.9 Clocking
The 750GX requires a single system clock input, SYSCLK, that represents the bus interface frequency. Internally, the processor uses a phase-locked loop (PLL) circuit to generate a master core clock that is frequency-multiplied and phase-locked to the SYSCLK input. This core frequency is used to operate the internal circuitry.
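The relationship between SYSCLK and the core clock is a simple multiplication by the selected ratio. The sketch below uses only the ratios that are legible in this manual's feature list (2x through 6x in half steps); the list continues in the source, so the tuple is deliberately incomplete. The function name and the SYSCLK values in the usage note are hypothetical; consult the datasheet for the supported PLL configurations and frequency limits.

```python
from fractions import Fraction

# Ratios legible in this manual's feature list: 2x .. 6x in 0.5 steps.
# The source list continues past 6x, so this tuple is deliberately incomplete.
LISTED_RATIOS = tuple(Fraction(n, 2) for n in range(4, 13))


def core_clock_hz(sysclk_hz, ratio):
    """Core frequency obtained by multiplying SYSCLK by the PLL ratio."""
    ratio = Fraction(ratio)
    if ratio not in LISTED_RATIOS:
        raise ValueError(f"ratio {ratio} is not among the listed ratios")
    return sysclk_hz * ratio
```

For example, a hypothetical 200-MHz SYSCLK with a 5x ratio gives `core_clock_hz(200_000_000, 5)`, a 1-GHz core clock.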
Exception model
Section 1.7, Exception Model, on page 48 describes the exception model of the PowerPC operating environment architecture and the differences in the 750GX exception model. The information in this section is described more fully in Chapter 4, Exceptions, on page 151.
Memory management
Section 1.8, Memory Management, on page 51 describes in general terms the conventions for memory management among the PowerPC processors.
1.4 PowerPC Registers and Programming Model
The PowerPC Architecture defines register-to-register operations for most computational instructions. Source operands for these instructions are accessed from the registers or are provided as immediate values embedded in the instruction itself. The 3-register instruction format allows specification of a target register distinct from the two source operands.
The OEA defines numerous Special-Purpose Registers that serve a variety of functions, such as providing controls, indicating status, configuring the processor, and performing special operations. During normal execution, a program can access the registers shown in Figure 2-1 on page 58, depending on the program’s access privilege (supervisor or user, determined by the privilege-level (PR) bit in the MSR).
Table 1-2. Architecture-Defined SPRs Implemented (Page 2 of 2)

Register     Level                               Function
SPRG0–SPRG3  Supervisor                          The general-purpose SPRs (SPRG0–SPRG3) are provided for operating system use.
TB           User: read; Supervisor: read/write  The Time Base Register (TB) is a 64-bit register that maintains the time and date variable. The TB consists of two 32-bit fields: time-base upper (TBU) and time-base lower (TBL).
1.5 Instruction Set
All PowerPC instructions are encoded as single-word (32-bit) instructions. Instruction formats are consistent among all instruction types (the primary operation code is always 6 bits, register operands are always specified in the same bit fields in the instruction), permitting efficient decoding to occur in parallel with operand accesses.
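The fixed field positions make decoding a matter of shifts and masks. The sketch below extracts the 6-bit primary opcode and, as one example format, the fields of a D-form instruction. The D-form layout (rD in bits 6:10, rA in bits 11:15, a signed 16-bit immediate in bits 16:31) is taken from the PowerPC UISA rather than from this section, and the function names are hypothetical.

```python
def primary_opcode(instruction):
    """Bits 0:5 (big-endian numbering) of a 32-bit instruction word."""
    return (instruction >> 26) & 0x3F


def d_form_fields(instruction):
    """Decode a D-form word into (opcode, rD, rA, signed immediate)."""
    rd = (instruction >> 21) & 0x1F      # bits 6:10
    ra = (instruction >> 16) & 0x1F      # bits 11:15
    simm = instruction & 0xFFFF          # bits 16:31
    if simm & 0x8000:                    # sign-extend the 16-bit immediate
        simm -= 0x10000
    return primary_opcode(instruction), rd, ra, simm
```

For instance, the word 0x38600001 decodes to primary opcode 14 with rD = 3, rA = 0, and an immediate of 1.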
– Translation-lookaside-buffer management instructions
These categories do not indicate the execution unit that executes a particular instruction or group of instructions. Integer instructions operate on byte, half-word, and word operands. Floating-point instructions operate on single-precision (one word) and double-precision (two words) floating-point operands.
1.5.2 750GX Microprocessor Instruction Set
The 750GX instruction set is defined as follows.
• The 750GX provides hardware support for all PowerPC instructions.
• The 750GX implements the following instructions, which are optional in the PowerPC Architecture.
– External Control In Word Indexed (eciwx).
– External Control Out Word Indexed (ecowx).
– Floating Select (fsel).
– Floating Reciprocal Estimate Single-Precision (fres).
1.7 Exception Model
The following sections describe the PowerPC exception model and the 750GX implementation. A detailed description of the 750GX exception model is provided in Chapter 4, Exceptions, on page 151 in this manual.
1.7.
The PowerPC Architecture supports four types of exceptions:
Synchronous, precise
These are caused by instructions. All instruction-caused exceptions are handled precisely. That is, the machine state at the time the exception occurs is known and can be completely restored.
Table 1-5. Exceptions and Conditions

Exception Type  Vector Offset (hex)  Causing Conditions
Reserved        00000                -
System reset    00100                Assertion of either HRESET or SRESET or a power-on reset.
Machine check   00200                Assertion of the transfer error acknowledge (TEA) during a data-bus transaction, assertion of a machine-check interrupt (MCP), or an address, data, or L2 double-bit error. MSR[ME] must be set.
1.8 Memory Management
The following subsections describe the memory-management features of the PowerPC Architecture and the 750GX implementation. A detailed description of the 750GX MMU implementation is provided in Chapter 5, Memory Management, on page 179.
1.8.
1.8.2 750GX Microprocessor Memory-Management Implementation
The 750GX implements separate MMUs for instructions and data. It implements a copy of the Segment Registers in the instruction MMU. However, read and write accesses (Move-from Segment Register [mfsr] and Move-to Segment Register [mtsr]) are handled through the Segment Registers implemented as part of the data MMU. The 750GX MMU is described in Section 1.2.
Figure 1-5. Pipeline Diagram
[Figure: pipeline stages. Fetch (maximum 4-instruction fetch per clock cycle) feeds the BPU and Dispatch (maximum 3-instruction dispatch per clock cycle, including one branch instruction). The execute stage shows FPU1, FPU2, FPU3, SRU, LSU1, LSU2, IU1, and IU2, followed by Complete (Write-Back) with a maximum 2-instruction completion per clock cycle.]
Note: Figure 1-5 does not show features such as reservation stations and rename buffers that reduce stalls and improve instruction throughput.
• The execution units process instructions from their reservation stations using the operands provided from dispatch, and notify the completion stage when the instruction has finished execution. With the exception of multiply and divide, integer instructions complete execution in a single cycle. The FPU has three stages (multiply, add, and normalize) for processing floating-point arithmetic.
Nap
The nap mode further reduces power consumption by disabling bus snooping, leaving only the Time Base Register and the PLL in a powered state. The 750GX returns to the full-power state upon receipt of an external asynchronous interrupt, a system management interrupt, a decrementer exception, a hard or soft reset, or a machine-check interrupt (MCP). A return to full-power state from nap state takes only a few processor clock cycles.
The TAU is controlled through the privileged mtspr and mfspr instructions to the four SPRs provided for configuring and controlling the sensor control logic. The SPRs function as follows.
• THRM1 and THRM2 provide the ability to compare the junction temperature against two user-provided thresholds. Having dual thresholds gives the thermal-management software finer control of the junction temperature.
2. Programming Model
This chapter describes the 750GX programming model, emphasizing those features specific to the 750GX processor and summarizing those that are common to PowerPC processors. It consists of three major sections, which describe the following topics.
Figure 2-1.
The PowerPC UISA registers are user-level. General Purpose Registers (GPRs) and Floating Point Registers (FPRs) are accessed through instruction operands. Access to registers can be explicit (by using instructions for that purpose such as mtspr and mfspr instructions) or implicit as part of the execution of an instruction. Some registers are accessed both explicitly and implicitly.
“PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual.
• User-level registers (VEA): The PowerPC VEA defines the time-base facility (TB), which consists of two 32-bit registers, Time Base Upper (TBU) and Time Base Lower (TBL). The Time Base Registers can be written to only by supervisor-level instructions, but can be read by both user-level and supervisor-level software.
– Memory-management registers
• Block-Address Translation (BAT) Registers. The PowerPC OEA includes an array of Block Address Translation Registers that can be used to specify eight blocks of instruction space and eight blocks of data space. The BAT registers are implemented in pairs: eight pairs of instruction BATs (IBAT0U–IBAT7U and IBAT0L–IBAT7L) and eight pairs of data BATs (DBAT0U–DBAT7U and DBAT0L–DBAT7L).
Register 1 (SRR1)” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual for more information.
Note: When a machine-check exception occurs, the 750GX sets one or more error bits in SRR1. Table 2-2 describes the SRR1 bits that the 750GX implements but that the PowerPC Architecture does not require.
Table 2-2.
– Hardware-Implementation-Dependent Register 0 (HID0): This register controls various functions, such as enabling checkstop conditions, and locking, enabling, and invalidating the instruction and data caches, power modes, miss-under-miss, and others.
– Hardware-Implementation-Dependent Register 1 (HID1): This register reflects the state of PLL_CFG[0:4] clock signals, and phase-locked loop (PLL) selection and range bits.
2.1.2 PowerPC 750GX-Specific Registers
This section describes registers that are defined for the 750GX but are not included in the PowerPC Architecture.
2.1.2.1 Instruction Address Breakpoint Register (IABR)
The Instruction Address Breakpoint Register (IABR) supports the instruction address breakpoint exception. When this exception is enabled, instruction fetch addresses are compared with an effective address stored in the IABR.
2.1.2.
Bits  Field Name  Description
9     NAP         Nap mode enable. Operates in conjunction with MSR[POW].
                  0 Nap mode disabled.
                  1 Nap mode enabled. Nap mode is invoked by setting MSR[POW] while this bit is set. In nap mode, the PLL and the time base remain active.
10    SLEEP²      Sleep mode enable. Operates in conjunction with MSR[POW].
                  0 Sleep mode disabled.
                  1 Sleep mode enabled. Sleep mode is invoked by setting MSR[POW] while this bit is set.
Bits  Field Name  Description
18    ILOCK       Instruction-cache lock
                  0 Normal operation.
                  1 Instruction cache is locked. A locked cache supplies data normally on a hit, but is treated as a cache-inhibited transaction on a miss. On a miss, the transaction to the bus or the L2 cache is single-beat. However, CI still reflects the original state as determined by address translation independent of cache locked or disabled status.
19
20
21
Bits  Field Name  Description
22    SPD         Speculative cache access disable
                  0 Speculative bus accesses to nonguarded space (G = 0) from both the instruction and data caches are enabled.
                  1 Speculative bus accesses to nonguarded space in both caches are disabled.
23    IFEM        Enable M bit on bus for instruction fetches.
                  0 M bit disabled. Instruction fetches are treated as nonglobal on the bus.
Bits  Field Name  Description
29    BHT         Branch history table enable
                  0 BHT disabled. The 750GX uses static branch prediction as defined by the PowerPC User Instruction Set Architecture (UISA) for those branch instructions the BHT would have otherwise used to predict (that is, those that use the CR as the only mechanism to determine direction).
2.1.2.3 Hardware-Implementation-Dependent Register 1 (HID1)
[Register layout figure, bits 0 to 31; the field labels recoverable from the figure are PC0, PR0, PC1, and PR1.]

Bits  Field Name  Description
0:4   PCE         PLL external configuration bits (read-only).
5:6   PRE         PLL external range bits (read-only).
7     PSTAT1      PLL status.
8     ECLK
2.1.2.4 Hardware-Implementation-Dependent Register 2 (HID2)
The Hardware-Implementation-Dependent Register 2 (HID2) enables parity. The status bits (25:27) are set when a parity error is detected and cleared by writing '0' to each bit. See the IBM PowerPC 750GX RISC Microprocessor Datasheet for details.
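Because the manual numbers register bits from the most-significant end (bit 0 is the MSB), status bits 25:27 of a 32-bit register correspond to the mask 0x00000070. The sketch below builds such masks and shows the value software would write back to clear the status bits while leaving the other bits unchanged. It models only the bit arithmetic; the helper names are hypothetical, and the actual register access would use mfspr and mtspr in supervisor mode.

```python
def field_mask(first_bit, last_bit, width=32):
    """Mask covering bits first_bit:last_bit, numbering bit 0 as the MSB."""
    nbits = last_bit - first_bit + 1
    return ((1 << nbits) - 1) << (width - 1 - last_bit)


HID2_STATUS_MASK = field_mask(25, 27)   # 0x00000070


def hid2_parity_status(hid2):
    """Return status bits 25:27 as an integer in the range 0..7."""
    return (hid2 & HID2_STATUS_MASK) >> (31 - 27)


def hid2_with_status_cleared(hid2):
    """Value to write back so that each status bit receives a 0."""
    return hid2 & ~HID2_STATUS_MASK & 0xFFFFFFFF
```

A read-modify-write of this kind preserves the enable bits elsewhere in the register, which a plain write of zero would not.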
2.1.2.5 Performance-Monitor Registers
This section describes the registers used by the performance monitor, which is described in Chapter 11, Performance Monitor and System Related Features, on page 349.
Monitor Mode Control Register 0 (MMCR0)
The Monitor Mode Control Register 0 (MMCR0) is a 32-bit SPR provided to specify events to be counted and recorded. The MMCR0 can be accessed only in supervisor mode.
Bits  Field Name  Description
6     DISCOUNT    Disables counting of PMCn when a performance-monitor interrupt is signaled (that is, ((PMCnINTCONTROL = '1') & (PMCn[0] = '1') & (ENINT = '1'))) or when an enabled timebase transition occurs with ((INTONBITTRANS = '1') & (ENINT = '1')).
                  0 Signaling a performance-monitor interrupt does not affect the counting status of PMCn.
7:8
Monitor Mode Control Register 1 (MMCR1)
The Monitor Mode Control Register 1 (MMCR1) functions as an event selector for Performance-Monitor Counter Registers 3 and 4 (PMC3 and PMC4). The events corresponding to the MMCR1 bits are described in Performance-Monitor Counter Registers (PMCn). MMCR1 can be accessed with mtspr and mfspr using SPR 956.
The following tables list the selectable events and their encodings:
• Table 11-2, PMC1 Events—MMCR0[19:25] Select Encodings, on page 352.
• Table 11-3, PMC2 Events—MMCR0[26:31] Select Encodings, on page 352.
• Table 11-4, PMC3 Events—MMCR1[0:4] Select Encodings, on page 353.
• Table 11-5, PMC4 Events—MMCR1[5:9] Select Encodings, on page 354.
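The bit ranges in these table titles identify where each counter's event-select code lives. The sketch below extracts the four select fields from MMCR0 and MMCR1 values, using the manual's convention that bit 0 is the most-significant bit. The function names are hypothetical, and the register values would in practice be obtained with mfspr in supervisor mode.

```python
def field(value, first_bit, last_bit, width=32):
    """Extract bits first_bit:last_bit, numbering bit 0 as the MSB."""
    nbits = last_bit - first_bit + 1
    return (value >> (width - 1 - last_bit)) & ((1 << nbits) - 1)


def pmc_event_selects(mmcr0, mmcr1):
    """Return the event-select code programmed for each counter."""
    return {
        "PMC1": field(mmcr0, 19, 25),   # 7-bit select
        "PMC2": field(mmcr0, 26, 31),   # 6-bit select
        "PMC3": field(mmcr1, 0, 4),     # 5-bit select
        "PMC4": field(mmcr1, 5, 9),     # 5-bit select
    }
```

The returned codes are the values to look up in the corresponding select-encoding tables.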
User Sampled Instruction Address Register (USIA)
The contents of SIA are reflected to USIA, which can be read by user-level software. USIA can be accessed with the mfspr instruction using SPR 939.
Sampled Data Address Register (SDA) and User Sampled Data Address Register (USDA)
The 750GX does not implement the Sampled Data Address Register (SDA) or the user-level, read-only USDA registers.
2.1.3 Instruction Cache Throttling Control Register (ICTC)
Reducing the rate of instruction fetching can control junction temperature without the complexity and overhead of dynamic clock control. System software can control instruction forwarding by writing a nonzero value to the supervisor-level ICTC register.
2.1.
Table 2-3. Valid THRM1/THRM2 Bit Settings

TIN¹  TIV¹  TID  TIE  V   Description
x     x     x    x    0   Invalid entry. The threshold in the SPR is not used for comparison.
x     x     x    0    1   Disable thermal-management interrupt assertion.
x     x     0    x    1   Set TIN and assert thermal-management interrupt if TIE = 1 and the junction temperature exceeds the threshold. If TIE = 0, then no interrupt will be taken when the threshold is achieved.
2.1.4.3 Thermal-Management Register 4 (THRM4)

Because of process and thermal-sensor variations, a temperature offset is provided that can be read from THRM4 with an mfspr instruction. The TOFFSET field is an 8-bit signed integer that represents the measured temperature offset; it is burned into the THRM4 Register at the factory to allow for enhanced accuracy.
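Because TOFFSET is a signed quantity, software has to sign-extend the 8-bit field before using it. The following is a minimal sketch that assumes a two's-complement encoding and assumes the caller has already isolated the 8-bit field from the THRM4 image (the field's bit position is not shown in this excerpt); the function name toffset_to_int is hypothetical.

```python
def toffset_to_int(raw_field: int) -> int:
    """Interpret an 8-bit field as a two's-complement signed integer.

    Assumption: TOFFSET uses two's-complement encoding, and raw_field
    holds only the 8 TOFFSET bits, already shifted down to bits 7..0.
    """
    raw_field &= 0xFF
    return raw_field - 0x100 if raw_field & 0x80 else raw_field
```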
2.1.5 L2 Cache Control Register (L2CR)

The L2 Cache Control Register is a supervisor-level, implementation-specific SPR used to configure and operate the L2 cache. It is cleared by a hard reset or power-on reset.
2.2 Operand Conventions

This section describes the operand conventions as they are represented in two levels of the PowerPC Architecture: UISA and VEA. Detailed descriptions of conventions used for storing values in registers and memory, accessing PowerPC registers, and representing data in these registers can be found in Chapter 3, “Operand Conventions” in the PowerPC Microprocessor Family: The Programming Environments Manual.

2.2.
2.2.3 Floating-Point Operand and Execution Models—UISA

The IEEE 754-1985 standard defines conventions for 64-bit and 32-bit arithmetic. The standard requires that single-precision arithmetic be provided for single-precision operands.
2.2.3.3 Time-Critical Floating-Point Operation

For time-critical applications where deterministic floating-point performance is required, the FPSCR bits must be set as follows: non-IEEE mode enabled, floating-point exceptions masked, and all sticky bits set to one. With these settings, the 750GX neither causes exceptions nor generates denormalized numbers, either of which slows performance.

2.2.3.
Table 2-5.
2.3 Instruction Set Summary

This section describes instructions and addressing modes defined for the 750GX. These instructions are divided into the following functional categories:
• Integer. These include arithmetic and logical instructions. For more information, see Section 2.3.4.1 on page 92.
that the architecture specification refers to simplified mnemonics as extended mnemonics. Programs written to be portable across the various assemblers for the PowerPC Architecture should not assume the existence of mnemonics not described in that document.

2.3.1 Classes of Instructions

The 750GX instructions belong to one of the following three classes.
2.3.1.3 Illegal Instruction Class

Illegal instructions can be grouped into the following categories:
• Instructions not defined in the PowerPC Architecture. The following primary opcodes are defined as illegal, but might be defined to perform new functions in future extensions to the architecture: 1, 4, 5, 6, 9, 22, 56, 60, 61
• Instructions defined in the PowerPC Architecture but not implemented in a specific PowerPC implementation.
2.3.1.4 Reserved Instruction Class

Reserved instructions are allocated to specific implementation-dependent purposes not defined by the PowerPC Architecture. Attempting to execute an unimplemented reserved instruction invokes the illegal instruction error handler (a program exception). See Section 4.5.7 on page 170 for information about illegal and invalid instruction exceptions.
2.3.2.3 Effective Address Calculation

An effective address is the 32-bit sum computed by the processor when executing a memory-access or branch instruction or when fetching the next sequential instruction.
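Because the effective address is a 32-bit sum, any carry out of the most-significant bit is discarded and the address wraps around. The following is a minimal sketch of that arithmetic for a base-plus-displacement access; the function names are hypothetical, and the 16-bit sign-extended displacement is an assumption that matches the D-form load/store instructions.

```python
MASK32 = 0xFFFF_FFFF


def sign_extend16(value: int) -> int:
    """Sign-extend a 16-bit immediate to a Python integer."""
    value &= 0xFFFF
    return value - 0x1_0000 if value & 0x8000 else value


def effective_address(base: int, displacement: int) -> int:
    """Return the 32-bit sum of base and displacement, ignoring carry-out."""
    return (base + displacement) & MASK32
```

For example, effective_address(0xFFFF_FFFC, 8) wraps to 0x0000_0004 rather than producing a 33-bit value.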
For example, if an mtmsr instruction sets the MSR[PR] bit and an isync does not immediately follow it, a privileged instruction could be executed or a privileged access could be performed without causing an exception, even though the MSR[PR] bit indicates user mode.
Summary” in the PowerPC Microprocessor Family: The Programming Environments Manual. These categorizations are somewhat arbitrary, are provided for the convenience of the programmer, and do not necessarily reflect the PowerPC Architecture specification.

Note that some instructions have the following optional features:
• CR Update: The dot (.) suffix on the mnemonic enables the update of the CR.
Table 2-7. Integer Arithmetic Instructions (Page 2 of 2)

Name                           Mnemonic                             Syntax
Add to Zero Extended           addze (addze. addzeo addzeo.)        rD,rA
Subtract from Zero Extended    subfze (subfze. subfzeo subfzeo.)    rD,rA
Negate                         neg (neg. nego nego.)                rD,rA
Multiply Low Immediate         mulli                                rD,rA,SIMM
Multiply Low                   mullw (mullw. mullwo mullwo.)        rD,rA,rB
Multiply High Word             mulhw (mulhw.
Multiply High Word Unsigned
Divide Word
Divide Word Unsigned
Integer Logical Instructions

The logical instructions shown in Table 2-9 on page 94 perform bit-parallel operations on the specified operands. Logical instructions with CR updating enabled (using the dot suffix) and the AND Immediate (andi.) and AND Immediate Shifted (andis.) instructions set the CR[CR0] field to characterize the result of the logical operation. Logical instructions do not affect XER[SO], XER[OV], or XER[CA].
The integer rotate instructions are summarized in Table 2-10. For more information, see the PowerPC Microprocessor Family: The Programming Environments Manual.

Table 2-10. Integer Rotate Instructions

Name                                             Mnemonic            Syntax
Rotate Left Word Immediate then AND with Mask    rlwinm (rlwinm.)    rA,rS,SH,MB,ME
Rotate Left Word then AND with Mask              rlwnm (rlwnm.)      rA,rS,rB,MB,ME
Rotate Left Word Immediate then Mask Insert      rlwimi (rlwimi.
Table 2-12. Floating-Point Arithmetic Instructions

Name                                              Mnemonic       Syntax
Floating Add (Double-Precision)                   fadd (fadd.
Floating Add Single
Floating Subtract (Double-Precision)
Floating Subtract Single
Floating Multiply (Double-Precision)
Floating Multiply Single
Floating Divide (Double-Precision)
Floating Divide Single
Floating Reciprocal Estimate Single¹
Floating Reciprocal Square Root Estimate¹
Floating Select¹
Examples of uses of these instructions to perform various conversions can be found in Appendix D, “Floating-Point Models,” in the PowerPC Microprocessor Family: The Programming Environments Manual.

Table 2-14. Floating-Point Rounding and Conversion Instructions

Name                                                       Mnemonic        Syntax
Floating Round to Single                                   frsp (frsp.)    frD,frB
Floating Convert to Integer Word                           fctiw (fctiw.
Floating Convert to Integer Word with Round toward Zero
Note: The PowerPC Architecture states that, in some implementations, the move-to FPSCR fields (mtfsf) instruction might perform more slowly when only some of the fields are updated as opposed to all of the fields. In the 750GX, there is no degradation of performance.

Floating-Point Move Instructions

Floating-point move instructions copy data from one FPR to another. The floating-point move instructions do not modify the FPSCR.
Little Endian Misaligned Accesses

The 750GX supports misaligned single register load-and-store accesses in little-endian mode without causing an alignment exception. However, execution of a load/store multiple or string instruction causes an alignment exception.
Table 2-18.
Integer Store Instructions

For integer store instructions, the contents of the source register (rS) are stored into the byte, half word, or word in memory addressed by the EA. Many store instructions have an update form, in which rA is updated with the EA. For these forms, the following rules apply:
• If rA ≠ 0, the effective address is placed into rA.
If store gathering is enabled and the stores do not fall under the above categories, then an Enforce In-Order Execution of I/O (eieio) or Synchronize (sync) instruction must be used to prevent two stores from being gathered. Store gathering is also not done when the MMU is busy doing a hardware table walk.

Integer Load-and-Store with Byte-Reverse Instructions

Table 2-20 describes integer load-and-store with byte-reverse instructions.
Integer Load-and-Store String Instructions

The integer load-and-store string instructions allow movement of data from memory to registers, or from registers to memory, without concern for alignment. These instructions can be used for a short move between arbitrary memory locations or to initiate a long move between misaligned memory fields.
For software compatibility, the other two mode encodings, imprecise-nonrecoverable mode and imprecise-recoverable mode, default to the precise mode.

Note: For the 750GX, the ignore-exceptions mode allows floating-point instructions to complete earlier and, thus, might provide better performance than the precise-exception mode.
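The way the two imprecise encodings collapse onto precise mode can be written down as a small lookup. The following is a minimal sketch based on the MSR[FE0] and MSR[FE1] encodings in Table 4-4; the dictionary and function names are hypothetical.

```python
# Architected meaning of the (FE0, FE1) pair, as listed in Table 4-4.
ARCHITECTED_MODE = {
    (0, 0): "exceptions disabled",
    (0, 1): "imprecise nonrecoverable",
    (1, 0): "imprecise recoverable",
    (1, 1): "precise",
}


def effective_mode_750gx(fe0: int, fe1: int) -> str:
    """Return the mode the 750GX actually operates in for a given FE0/FE1."""
    if (fe0, fe1) not in ARCHITECTED_MODE:
        raise ValueError("FE0 and FE1 must each be 0 or 1")
    # Both imprecise encodings default to precise mode on the 750GX.
    return "exceptions disabled" if (fe0, fe1) == (0, 0) else "precise"
```

Only the (0, 0) encoding avoids precise mode, which is why the note above singles out the ignore-exceptions setting as the potentially faster one.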
Table 2-24 summarizes the single-precision and double-precision floating-point store and stfiwx instructions.

Table 2-24.
Table 2-26. Store Floating-Point Double Behavior (Page 2 of 2)

FPR Precision   Data Type               Action
Single          SNaN                    Store
Double          Normalized              Store
Double          Denormalized            Store
Double          Zero, infinity, QNaN    Store
Double          SNaN                    Store

Architecturally, all single-precision and double-precision floating-point numbers are represented in double-precision format within the 750GX.
speculatively executed instructions and restore the machine state to immediately after the branch. This correction can be done immediately upon resolution of the Condition Register bits.

Branch Instructions

Table 2-27 lists the branch instructions provided by the PowerPC processors.
Trap Instructions

The trap instructions shown in Table 2-29 are provided to test for a specified set of conditions. If any of the conditions tested by a trap instruction are met, the system trap type of program exception is taken. For more information, see Section 4.5.7 on page 170. If the tested conditions are not met, instruction execution continues normally.

Table 2-29.
Implementation Note: The PowerPC Architecture indicates that in some implementations the Move-to Condition Register Fields (mtcrf) instruction might perform more slowly when only a portion of the fields are updated as opposed to all of the fields. The Condition Register access latency for the 750GX is the same in both cases.

Move-to/Move-from Special-Purpose Register Instructions (UISA)

Table 2-32 lists the mtspr and mfspr instructions.
Table 2-33.
Table 2-33. PowerPC Encodings (Page 3 of 3)

                 SPR¹
Register Name    Decimal   SPR[5–9]   SPR[0–4]   Access              mfspr/mtspr
TBL²             268       01000      01100      User (VEA)          mfspr
                 284       01000      11100      Supervisor (OEA)    mtspr
TBU²             269       01000      01101      User (VEA)          mfspr
                 285       01000      11101      Supervisor (OEA)    mtspr
XER              1         00000      00001      User (UISA)         Both

Note:
1. The order of the two 5-bit halves of the SPR number is reversed compared with actual instruction coding.
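The note on the reversed 5-bit halves is easy to get wrong when instructions are encoded by hand. The following is a minimal sketch of the split shown in the table columns and of the swap that the instruction coding applies; the function names are hypothetical, and the layout of the 10-bit field (low-order half first) follows the note above rather than any code in this manual.

```python
def spr_halves(spr: int) -> tuple:
    """Return (SPR[5-9], SPR[0-4]) as the table lists them.

    SPR[5-9] is the high-order half of the 10-bit SPR number and
    SPR[0-4] is the low-order half.
    """
    if not 0 <= spr <= 0x3FF:
        raise ValueError("SPR number must fit in 10 bits")
    return (spr >> 5) & 0x1F, spr & 0x1F


def spr_instruction_field(spr: int) -> int:
    """Return the 10-bit spr field as coded in mfspr/mtspr (halves swapped)."""
    high, low = spr_halves(spr)
    return (low << 5) | high
```

For SPR 268 (TBL), spr_halves returns the pair 01000 and 01100 from the table, while the field placed in the instruction is 01100 followed by 01000.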
Encodings for the 750GX-specific SPRs are listed in Table 2-34.

Table 2-34.
2.3.4.7 Memory Synchronization Instructions—UISA

Memory synchronization instructions control the order in which memory operations are completed with respect to asynchronous events, and the order in which memory operations are seen by other processors or memory-access mechanisms.
Table 2-36 shows the mftb instruction.

Table 2-36. Move-from Time Base Instruction

Name                   Mnemonic   Syntax
Move-from Time Base    mftb       rD, TBR

Simplified mnemonics are provided for the mftb instruction so it can be coded with the TBR name as part of the mnemonic rather than requiring it to be coded as an operand.
Table 2-37. Memory Synchronization Instructions—VEA

Name: Enforce In-Order Execution of I/O
Mnemonic: eieio
Syntax: —
Implementation Notes: The eieio instruction is dispatched to the LSU and executes after all previous cache-inhibited or write-through accesses are performed. All subsequent instructions that generate such accesses execute after eieio.

Name: Instruction Synchronize
Mnemonic: isync
Table 2-38 summarizes the cache instructions defined by the VEA. Note that these instructions are accessible to user-level programs.

Table 2-38. User-Level Cache Instructions (Page 1 of 2)

Name: Data Cache Block Touch¹
Mnemonic: dcbt
Syntax: rA,rB
Implementation Notes: The VEA defines this instruction to allow for potential system performance enhancements through the use of software-initiated prefetch hints.
Table 2-38. User-Level Cache Instructions (Page 2 of 2)

Name: Data Cache Block Store
Mnemonic: dcbst
Syntax: rA,rB
Implementation Notes: The EA is computed, translated, and checked for protection violations.
• For cache hits with the tag marked exclusive unmodified (E), no further action is taken.
• For cache hits with the tag marked M, the cache block is written back to memory and marked exclusive unmodified (E).
output the 4-bit resource ID (RID) field located in the EAR. The eciwx instruction also loads a word from the data bus that is output by the special device. For more information about the relationship between these instructions and the system interface, see Chapter 7, Signal Descriptions, on page 249.

2.3.
2.3.6.3 Memory Control Instructions—OEA

Memory control instructions include the following:
• Cache-management instructions (supervisor-level and user-level).
• Segment register manipulation instructions.
• Translation-lookaside-buffer management instructions.

This section describes supervisor-level memory control instructions. Section 2.3.5.3, Memory Control Instructions—VEA, on page 115 describes user-level memory control instructions.
Translation Lookaside Buffer Management Instructions—(OEA)

The address-translation mechanism is defined in terms of the segment descriptors and page table entries (PTEs) that PowerPC processors use to locate the logical-to-physical address mapping for a particular access. These segment descriptors and PTEs reside in Segment Registers and page tables in memory, respectively.
3. Instruction-Cache and Data-Cache Operation

The 750GX microprocessor contains separate 32-KB, 8-way set-associative instruction and data caches to allow the execution units and registers rapid access to instructions and data.
Figure 3-1.
3.1 Data-Cache Organization

The data cache is organized as 128 sets of eight ways as shown in Figure 3-2. Each way consists of 32 bytes, two state bits, and an address tag. Note that in the PowerPC Architecture, the term ‘cache block,’ or simply ‘block,’ when used in the context of cache implementations, refers to the unit of memory at which coherency is maintained. For the 750GX, this is the 8-word (32-byte) cache line.
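The geometry above (128 sets, eight ways, 32-byte blocks) fixes how a 32-bit address divides into a tag, a set index, and a byte offset. The following is a minimal sketch of that split; the function name is hypothetical, and it assumes that the set index is taken from the seven address bits directly above the 5-bit block offset.

```python
LINE_BYTES = 32   # bytes per cache block (8 words)
NUM_SETS = 128
NUM_WAYS = 8
OFFSET_BITS = 5   # log2(LINE_BYTES)
INDEX_BITS = 7    # log2(NUM_SETS)


def split_cache_address(addr: int) -> tuple:
    """Return (tag, set_index, byte_offset) for a 32-bit address."""
    addr &= 0xFFFF_FFFF
    byte_offset = addr & (LINE_BYTES - 1)
    set_index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)   # remaining 20 high-order bits
    return tag, set_index, byte_offset
```

As a consistency check, NUM_SETS * NUM_WAYS * LINE_BYTES equals 32 KB, which matches the stated cache size.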
3.2 Instruction-Cache Organization

The instruction cache also consists of 128 sets of eight ways, as shown in Figure 3-3 on page 125. Each way consists of 32 bytes, a single state bit, and an address tag. As with the data cache, each instruction-cache block contains eight contiguous words from memory that are loaded from an 8-word boundary (that is, bits A[27–31] of the logical [effective] addresses are zero).
Figure 3-3. Instruction-Cache Organization
[Figure: 128 sets; each set holds Way 0 through Way 7, and each way holds an address tag, a valid bit, and words [0–7] (8 words per block).]

3.
These bits allow both uniprocessor and multiprocessor system designs to exploit numerous system-level performance optimizations. The WIMG attributes are programmed by the operating system for each page and block. The write-through (W) and caching-inhibited (I) attributes control how the processor performing an access uses its own cache.
Table 3-1. MEI State Definitions

MEI State      Definition
Modified (M)   The addressed cache block is present in the cache, and is modified with respect to system memory. That is, the modified data in the cache block has not been written back to memory. The cache block might be present in 750GX’s L2 cache, but it is not present in any other coherent cache.
Figure 3-4. MEI Cache-Coherency Protocol—State Diagram (WIM = 001)
[Figure: state diagram with the states Invalid, Modified, and Exclusive; transitions are labeled SH/CRW, WM, RM, WH, RH, SH, and SH/CIR; the bus transactions shown are Snoop Push and Cache Block Fill.]

SH = Snoop Hit
RH = Read Hit
RM = Read Miss
WH = Write Hit
WM = Write Miss
SH/CRW = Snoop Hit, Cacheable Read/Write

Section 3.
Another consideration is page table aliasing. If a store hits a modified cache block but the page table entry is marked write-through (WIMG = 1xxx), then the page has probably been aliased through another page table entry that is marked write-back (WIMG = 0xxx). If this occurs, the 750GX ignores the modified bit in the cache tag. The cache block is updated during the write-through operation, and the block remains in the modified state.
3.3.5 PowerPC 750GX-Initiated Load/Store Operations

Load-and-store operations are assumed to be weakly ordered on the 750GX. The load/store unit (LSU) can perform load operations that occur later in the program ahead of store operations, even when the data cache is disabled (see Section 3.3.5.2).
atomic access to noncoherent memory. For detailed information on these instructions, see Chapter 2, Programming Model, on page 57.

The lwarx instruction performs a load word from memory operation and creates a reservation for the 32-byte section of memory that contains the accessed word. The reservation granularity is 32 bytes.
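A practical consequence of the 32-byte granularity is that two words in the same 32-byte section share one reservation. The following is a minimal sketch of that address arithmetic; the function names are hypothetical, and the sketch models only the granule calculation, not the reservation hardware itself.

```python
RESERVATION_GRANULE = 32  # bytes, as stated above


def reservation_granule(addr: int) -> int:
    """Return the base address of the 32-byte section that contains addr."""
    return addr & 0xFFFF_FFFF & ~(RESERVATION_GRANULE - 1)


def same_reservation(addr_a: int, addr_b: int) -> bool:
    """True if both addresses fall in the same reservation granule."""
    return reservation_granule(addr_a) == reservation_granule(addr_b)
```

For example, addresses 0x1000 and 0x101C fall in the same granule, whereas 0x101C and 0x1020 do not.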
3.4.1.1 Data-Cache Flash Invalidation

The data cache is automatically invalidated when the 750GX is powered up and during a hard reset. However, a soft reset does not automatically invalidate the data cache. Software must use the HID0 data-cache flash invalidate bit (HID0[DCFI]) if data-cache invalidation is desired after a soft reset.
3.4.1.4 Instruction-Cache Flash Invalidation

The instruction cache is automatically invalidated when the 750GX is powered up and during a hard reset. However, a soft reset does not automatically invalidate the instruction cache. Software must use the HID0 instruction-cache flash invalidate bit (HID0[ICFI]) if instruction-cache invalidation is desired after a soft reset.
are not broadcast, unless broadcast is enabled through the HID0[ABE] configuration bit. Note that dcbi, dcbf, dcbst, and dcbz do broadcast to the 750GX’s L2 cache, regardless of HID0[ABE]. The icbi instruction is never broadcast.

3.4.2.1 Data Cache Block Touch (dcbt) and Data Cache Block Touch for Store (dcbtst)

The dcbt and dcbtst instructions provide potential system performance improvement through the use of software-initiated prefetch hints.
For this reason, avoid using dcbz for data that is shared in real time and that is not protected during writing through higher-level software synchronization protocols (such as semaphores). Use of dcbz must be avoided for managing semaphores themselves. An alternative solution could be to prevent dcbz from hitting in the L1 cache by performing a dcbf to that address beforehand.

3.4.2.
3.4.2.6 Instruction Cache Block Invalidate (icbi)

For the icbi instruction, the effective address is not computed or translated, so it cannot generate a protection violation or exception. This instruction performs a virtual lookup into the instruction cache (index only). All ways of the selected instruction cache set are invalidated. The icbi instruction is not broadcast on the 60x bus.
Figure 3-5. PLRU Replacement Algorithm
[Figure: flowchart. Ways L0 through L7 are checked in turn, and the first invalid way is allocated. If all eight ways are valid, the PLRU bits select the way to replace, starting with B0; for example, B0 = 0, B1 = 0, and B3 = 0 selects L0 for replacement.]
Table 3-2. PLRU Bit Update Rules

If the current      Then the PLRU bits are changed to:¹
access is to:       B0   B1   B2   B3   B4   B5   B6
L0                  1    1    x    1    x    x    x
L1                  1    1    x    0    x    x    x
L2                  1    0    x    x    1    x    x
L3                  1    0    x    x    0    x    x
L4                  0    x    1    x    x    1    x
L5                  0    x    1    x    x    0    x
L6                  0    x    0    x    x    x    1
L7                  0    x    0    x    x    x    0

1.
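The update rules in Table 3-2 can be exercised with a small model. The following is a minimal sketch in which the table is transcribed as data; the victim function is an assumption inferred from the visible part of Figure 3-5 (B0 = 0, B1 = 0, B3 = 0 replaces L0) and from the observation that each access sets its bits so that they point away from the way just used. The names PLRU_UPDATE, touch, and victim are hypothetical.

```python
# Table 3-2 as data: way -> {PLRU bit index: new value}.
# Bits shown as x in the table are left unchanged.
PLRU_UPDATE = {
    0: {0: 1, 1: 1, 3: 1},
    1: {0: 1, 1: 1, 3: 0},
    2: {0: 1, 1: 0, 4: 1},
    3: {0: 1, 1: 0, 4: 0},
    4: {0: 0, 2: 1, 5: 1},
    5: {0: 0, 2: 1, 5: 0},
    6: {0: 0, 2: 0, 6: 1},
    7: {0: 0, 2: 0, 6: 0},
}


def touch(bits: list, way: int) -> list:
    """Return a new B0..B6 list after an access to the given way."""
    updated = list(bits)
    for index, value in PLRU_UPDATE[way].items():
        updated[index] = value
    return updated


def victim(bits: list) -> int:
    """Pick the way to replace when all ways are valid (inferred tree walk)."""
    if bits[0] == 0:
        if bits[1] == 0:
            return 0 if bits[3] == 0 else 1
        return 2 if bits[4] == 0 else 3
    if bits[2] == 0:
        return 4 if bits[5] == 0 else 5
    return 6 if bits[6] == 0 else 7
```

In this model, an access to a way never leaves that same way as the next victim, which is the property a pseudo-LRU policy is meant to provide.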
The data-cache flush assist bit, HID0[DCFA], simplifies the software flushing process. When set, HID0[DCFA] forces the PLRU replacement algorithm to ignore the invalid entries and follow the replacement sequence defined by the PLRU bits. This reduces the series of uniquely addressed load or dcbz instructions to eight per set.
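With eight accesses per set and 128 sets, a complete flush sequence touches 1,024 cache blocks, which is exactly one contiguous 32-KB region. The following is a minimal sketch of the address arithmetic only; the function names and the base address used in the test are hypothetical, and the sketch assumes that the set index comes from the address bits directly above the 5-bit block offset.

```python
from collections import Counter

LINE_BYTES = 32
NUM_SETS = 128
NUM_WAYS = 8


def flush_addresses(base: int) -> list:
    """Return one address per cache block over a contiguous 32-KB region."""
    if base % LINE_BYTES:
        raise ValueError("base must be aligned to a 32-byte cache block")
    return [base + i * LINE_BYTES for i in range(NUM_SETS * NUM_WAYS)]


def accesses_per_set(addresses: list) -> Counter:
    """Count how many of the addresses map to each of the 128 sets."""
    return Counter((a // LINE_BYTES) % NUM_SETS for a in addresses)
```

Every set receives exactly eight uniquely addressed accesses, which matches the count given above for a flush performed with HID0[DCFA] set.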
Burst transactions on the 750GX always transfer eight words of data at a time, and are aligned to a doubleword boundary. The 750GX transfer burst (TBST) output signal indicates to the system whether the current transaction is a single-beat transaction or a 4-beat burst transfer. Burst transactions have an assumed address order.
3.6.2 Bus Operations Caused by Cache-Control Instructions

The cache-control, TLB management, and synchronization instructions supported by the 750GX can affect or be affected by the operation of the 60x bus. The operation of the instructions can also indirectly cause bus transactions to be performed, or their completion can be linked to the bus.
3.6.3 Snooping

The 750GX maintains data-cache coherency in hardware by coordinating activity between the data cache, the bus interface logic, the L2 cache, and the memory system. The 750GX has a copy-back cache that relies on bus snooping to maintain cache coherency with other caches in the system. For the 750GX, the coherency size of the bus is the size of a cache block, 32 bytes.
the data transactions to memory in order). Note also that all burst writes by the 750GX are performed as nonglobal, and hence do not normally enable snooping, even for address collision purposes. (Snooping might still occur for reservation cancelling purposes.)

3.6.4 Snoop Response to 60x Bus Transactions

There are several bus transaction types defined for the 60x bus.
Table 3-5. Response to Snooped Bus Transactions (Page 2 of 3)

Snooped Transaction:
TT[0–4]: 00110
750GX Response: A write-with-kill operation is a burst transaction initiated due to a castout, caching-enabled push, or snoop copy-back.
• If the address hits in the cache, the cache block is placed in the invalid (I) state (killing modified data that might have been in the block).
• If the address misses in the cache, no action is taken.
Table 3-5. Response to Snooped Bus Transactions (Page 3 of 3)

Snooped Transaction: Read-with-no-intent-to-cache (RWNITC)
TT[0–4]: 01011
750GX Response: A RWNITC operation is issued to acquire exclusive use of a memory location with no intention of modifying the location.
• If the addressed cache block is in the exclusive (E) state, the cache block remains in the exclusive (E) state.
Table 3-6.
3.7 MEI State Transactions

Table 3-7 shows MEI state transitions for various operations. Bus operations are described in Table 3-4 on page 141.

Table 3-7. MEI State Transitions (Page 1 of 3) Operation Load (T = 0) Cache Operation Bus Sync WIM Current Cache State Next Cache State Read No x0x I Same Cache Actions Bus Operation Cast out of modified block (as required). Write-with-kill Pass 4-beat read to memory queue.
Table 3-7. MEI State Transitions (Page 2 of 3) Operation dcbst Cache Operation Data-cacheblock store Bus Sync No WIM Current Cache State Next Cache State I,E Same xxx Same Same Cache Actions Bus Operation dcbst. — Pass clean. Clean No action. — dcbst Data-cacheblock store No xxx M E Push block to write queue. Write-with-kill dcbz Data-cacheblock set to zero No x1x x x Alignment trap.
Table 3-7. MEI State Transitions (Page 3 of 3) Operation tlbie sync Cache Operation Bus Sync WIM Current Cache State Next Cache State TLB invalidate No xxx x x Synchronization No xxx x Cache Actions Bus Operation Pass TLBI. — No action. — Pass sync. — No action. — x

Note: Single-beat writes are not snooped in the write queue.
4. Exceptions

The operating environment architecture (OEA) portion of the PowerPC Architecture defines the mechanism by which PowerPC processors implement exceptions (referred to as interrupts in the architecture specification). Exception conditions can be defined at other levels of the architecture.
Note: The PowerPC Architecture documentation refers to exceptions as interrupts. In this book, the term “interrupt” is reserved to refer to asynchronous exceptions and sometimes to the event that causes the exception. The PowerPC Architecture also uses the word “exception” to refer to IEEE-defined floating-point exception conditions that can cause a program exception to be taken (see Section 4.5.7, Program Exception (0x00700), on page 170).
Table 4-2. Exceptions and Conditions (Page 2 of 2)

Exception Type               Vector Offset (hex)   Causing Conditions
Program                      00700                 As defined by the PowerPC Architecture (for example, an instruction opcode error).
Floating-point unavailable   00800                 As defined by the PowerPC Architecture. MSR[FP] = 0, and a floating-point instruction is executed.
• Exceptions caused by asynchronous events (interrupts). These exceptions are further distinguished by whether they are maskable and recoverable.
  – Asynchronous, nonmaskable, nonrecoverable
    System reset for assertion of HRESET: Has highest priority and is taken immediately regardless of other pending exceptions or recoverability (includes power-on reset).
Table 4-3.
System reset and machine-check exceptions can occur at any time and are not delayed even if an exception is being handled. As a result, state information for an interrupted exception might be lost. Therefore, these exceptions are typically nonrecoverable. An exception might not be taken immediately when it is recognized.

4.
4.3.2 Machine Status Save/Restore Register 1 (SRR1)

SRR1 is used to save machine status (selected MSR bits and possibly other status bits as well) on exceptions and to restore those values when a Return from Interrupt (rfi) instruction is executed.
[Figure: MSR bit layout. Bits 0:12 Reserved, 13 POW, 14 Reserved, 15 ILE, 16 EE, 17 PR, 18 FP, 19 ME, 20 FE0, 21 SE, 22 BE, 23 FE1, 24 Reserved, 25 IP, 26 IR, 27 DR, 28 Reserved, 29 PM, 30 RI, 31 LE.]

Bits   Field Name
0:12   Reserved

4.3.
Bits   Field Name   Description
22     BE           Branch trace enable
                    0  The processor executes branch instructions normally.
                    1  The processor generates a branch-type trace exception when a branch instruction executes successfully.
23     FE1          IEEE floating-point exception mode 1 (see Table 4-4 on page 160).
24     Reserved     Reserved.
       IP           Exception prefix. The setting of this bit specifies whether an exception vector offset is prefaced with Fs or 0s.
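Because MSR bit numbers count from the most-significant bit, the mask for bit n of the 32-bit register is 1 << (31 - n). The following is a minimal sketch that turns the field positions from the MSR layout into masks and decodes a register image; the names MSR_BITS, msr_mask, and decode_msr are hypothetical.

```python
# MSR field positions, using the PowerPC convention that bit 0 is the MSB.
MSR_BITS = {
    "POW": 13, "ILE": 15, "EE": 16, "PR": 17, "FP": 18, "ME": 19,
    "FE0": 20, "SE": 21, "BE": 22, "FE1": 23, "IP": 25, "IR": 26,
    "DR": 27, "PM": 29, "RI": 30, "LE": 31,
}


def msr_mask(name: str) -> int:
    """Return the 32-bit mask that selects the named MSR field."""
    return 1 << (31 - MSR_BITS[name])


def decode_msr(msr: int) -> list:
    """Return the names of all MSR fields that are set, in bit order."""
    return [name for name in MSR_BITS if msr & msr_mask(name)]
```

For example, decode_msr(0x00008040) returns the list containing EE and IP, because EE is bit 16 (mask 0x8000) and IP is bit 25 (mask 0x40).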
Table 4-4. IEEE Floating-Point Exception Mode Bits

FE0   FE1   Mode
0     0     Floating-point exceptions disabled.
0     1     Imprecise nonrecoverable. For this setting, the 750GX operates in floating-point precise mode.
1     0     Imprecise recoverable. For this setting, the 750GX operates in floating-point precise mode.
1     1     Floating-point precise mode.

4.3.
0x000n_nnnn. If IP is set, exceptions are vectored to the physical address 0xFFFn_nnnn. For a machine-check exception that occurs when MSR[ME] = 0 (machine-check exceptions are disabled), the checkstop state is entered (the machine stops executing instructions).

4.3.6 Setting MSR[RI]

The RI bit in the MSR was designed to indicate to the exception handler whether the exception is recoverable.
4.4 Process Switching
The following instructions are useful for restoring proper context during process switching:
• The Synchronization (sync) instruction orders the effects of instruction execution. All instructions previously initiated appear to have completed before the sync instruction completes, and no subsequent instructions appear to be initiated until the sync instruction completes.
Table 4-5. MSR Setting Due to Exception (Page 2 of 2)
                     MSR Bit (2)
Exception Type       POW  ILE  EE  PR  FP  ME  FE0  SE  BE  FE1  IP  IR  DR  PM  RI  LE
System management    0    —    0   0   0   —   0    0   0   0    —   0   0   0   0   ILE
Performance monitor  0    —    0   0   0   —   0    0   0   0    —   0   0   0   0   ILE
Thermal management   0    —    0   0   0   —   0    0   0   0    —   0   0   0   0   ILE
Note:
1. A zero indicates that the bit is cleared.
4.5.1.1 Soft Reset
If SRESET is asserted, the processor is first put in a recoverable state. To do this, the 750GX allows any instruction at the point of completion to either complete or take an exception, blocks completion of any subsequent instructions, and allows the completion queue to drain.
The hard reset exception is a nonrecoverable, nonmaskable, asynchronous exception. When HRESET is asserted or at power-on reset (POR), the 750GX immediately branches to 0xFFF0_0100 without attempting to reach a recoverable state. A hard reset has the highest priority of any exception. It is always nonrecoverable.
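The hard-reset vector 0xFFF0_0100 is one instance of the general rule that the prefix selected by MSR[IP] (0x000 or 0xFFF) is combined with a per-exception offset. The sketch below illustrates that arithmetic only. The offsets are the ones quoted in this chapter's section titles and text, the reset offset 0x00100 is read off the hard-reset vector above, and the table and function names are hypothetical.

```python
# Vector offsets quoted in this chapter; the dictionary keys are illustrative.
EXCEPTION_OFFSETS = {
    "system_reset": 0x00100,
    "machine_check": 0x00200,
    "fp_unavailable": 0x00800,
    "performance_monitor": 0x00F00,
    "instruction_address_breakpoint": 0x01300,
}


def vector_address(exception, ip):
    """Combine the MSR[IP] prefix with the offset of the named exception."""
    base = 0xFFF00000 if ip else 0x00000000
    return base | EXCEPTION_OFFSETS[exception]


print(hex(vector_address("system_reset", ip=1)))   # reset vector with IP set
print(hex(vector_address("machine_check", ip=0)))  # machine check with IP clear
```

Keeping the prefix and the offset separate mirrors how the manual describes the vector: the offset identifies the exception, and MSR[IP] alone decides whether the handlers live at the bottom or the top of the physical address space.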
Table 4-7. Settings Caused by Hard Reset
Register                                  Setting
BATs                                      Unknown
Cache, instruction cache, and data cache  All blocks are unchanged from before HRESET.
CR                                        All zeros
CTR                                       00000000
DABR                                      Breakpoint is disabled. Address is unknown.
The following is also true after a hard reset operation:
• External checkstops are enabled.
• The on-chip test interface has given control of the I/Os to the rest of the chip for functional use.
• Since the reset exception has data and instruction translation disabled (MSR[DR] and MSR[IR] both cleared), the chip operates in direct address-translation mode (referred to as the real-addressing mode in the architecture specification).
A TEA indication on the bus can result from any load or store operation initiated by the processor. In general, TEA is expected to be used by a memory controller to indicate that a memory parity error or an uncorrectable memory ECC error has occurred. Note that the resulting machine-check exception is imprecise and unordered with respect to the instruction that originated the bus operation.
When a machine-check exception is taken, instruction fetching resumes at offset 0x00200 from the physical base address indicated by MSR[IP].
4.5.2.2 Checkstop State (MSR[ME] = 0)
If MSR[ME] = 0 and a machine check occurs, the processor enters the checkstop state. The 750GX processor can also be forced into the checkstop state by the assertion of checkstop input (CKSTP_IN), the primary input signal.
stops dispatching and waits for all pending instructions to complete. This allows any instructions in progress that need to take an exception to do so before the external interrupt is taken. After all instructions have vacated the completion buffer, the 750GX takes the external interrupt exception as defined in the PowerPC Architecture (OEA).
4.5.8 Floating-Point Unavailable Exception (0x00800)
The floating-point unavailable exception is implemented as defined in the PowerPC Architecture. A floating-point unavailable exception occurs when no higher-priority exception exists, an attempt is made to execute a floating-point instruction (including floating-point load, store, or move instructions), and the floating-point available bit in the MSR is disabled (MSR[FP] = 0).
4.5.13 Performance-Monitor Interrupt (0x00F00)
The 750GX microprocessor provides a performance-monitor facility to monitor and count predefined events such as processor clocks, misses in either the instruction cache or the data cache, instructions dispatched to a particular execution unit, mispredicted branches, and other occurrences. The count of such events can be used to trigger the performance-monitor exception.
4.5.14 Instruction Address Breakpoint Exception (0x01300)
An instruction address breakpoint interrupt occurs when the following conditions are met:
• The instruction breakpoint address IABR[0:29] matches EA[0:29] of the next instruction to complete in program order. The instruction that triggers the instruction address breakpoint exception is not executed before the exception handler is invoked.
Table 4-12. System Management Interrupt Exception—Register Settings
Register  Setting Description
SRR0      Set to the effective address of the instruction that the processor would have attempted to execute next if no exception conditions were present.
SRR1      0      Loaded with equivalent MSR bits.
          1:4    Cleared.
          5:9    Loaded with equivalent MSR bits.
          10:15  Cleared.
          16:31  Loaded with equivalent MSR bits.
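The SRR1 row above amounts to copying the MSR through a fixed mask. The following sketch derives that mask from the listed bit ranges, assuming the manual's numbering in which bit 0 is the most-significant bit; `bit_range_mask` and `srr1_from_msr` are illustrative names, and the sketch covers only the SRR1 value, not the MSR changes that follow the exception.

```python
def bit_range_mask(first, last):
    """Mask covering bits first..last, where bit 0 is the most-significant bit."""
    width = last - first + 1
    return ((1 << width) - 1) << (31 - last)


# Bits 0, 5:9, and 16:31 are loaded from the MSR; bits 1:4 and 10:15 are cleared.
SRR1_MSR_MASK = bit_range_mask(0, 0) | bit_range_mask(5, 9) | bit_range_mask(16, 31)


def srr1_from_msr(msr):
    """Return the SRR1 value saved for a given 32-bit MSR value."""
    return msr & SRR1_MSR_MASK


print(hex(SRR1_MSR_MASK))
print(hex(srr1_from_msr(0xFFFFFFFF)))
```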
The thermal-management interrupt is similar to the system management and external interrupt. The 750GX requires the next instruction in program order to complete or take an exception, blocks completion of any following instructions, and allows the completed store queue to drain.
4.5.19 Exception Latencies
Latencies for taking various exceptions are variable based on the state of the machine when conditions to produce an exception occur. The shortest latency possible is one cycle. In this case, an exception is signaled in the cycle following the appearance of the conditions that generated that exception. In most cases, a hard reset or machine check has a single-cycle latency to exception.
Table 4-14. Front-End Exception Handling Summary (Page 2 of 2)
Exception Type  Specific Exception  Description
ISI                                 Once this type of exception is detected, dispatch is halted and the current instruction stream is allowed to drain out of the machine. If completing any of the instructions in this stream causes an exception, that exception is taken and the instruction fetch exception is forgotten.
5. Memory Management
This chapter describes the 750GX microprocessor’s implementation of the memory management unit (MMU) specifications provided by the operating environment architecture (OEA) for PowerPC processors.
Basic features of the 750GX MMU implementation defined by the OEA are as follows:
• Support for real-addressing mode—Effective-to-physical address translation can be disabled separately for data and instruction accesses.
Table 5-1.
the memory subsystem. The MMUs record whether the translation is for an instruction or data access, whether the processor is in user or supervisor mode, and for data accesses, whether the access is a load or a store operation. The MMUs use this information to appropriately direct the address translation and to enforce the protection hierarchy programmed by the operating system. (Section 4.
Figure 5-1.
Figure 5-2.
Figure 5-3. 750GX Microprocessor DMMU Block Diagram
[Block diagram: the load/store unit supplies EA[0–19] to the DMMU, which contains the Segment Registers (0–15, selected by EA[0–3]), the DBAT array (DBAT0U/DBAT0L through DBAT7U/DBAT7L, compared against EA[0–14]), the DTLB (entries 0–63, looked up with EA[4–19]), and the page table search logic with SDR1 (SPR 25). The resulting PA[0–19] is compared with the data cache tags (sets 0–127, selected by A[20–26]) to produce the data cache hit/miss indication and PA[0–31].]
5.1.3 Address-Translation Mechanisms
PowerPC processors support the following three types of address translation:
Page address: Translates the page frame address for a 4-KB page size.
Block address: Translates the block number for blocks that range in size from 128 KB to 256 MB.
Real-addressing mode address: When address translation is disabled, the physical address is identical to the effective address.
Figure 5-4. Address-Translation Types
[Flow diagram: with address translation disabled (MSR[IR] = 0, or MSR[DR] = 0), real addressing mode is used and the effective address equals the physical address. An effective address that matches the BAT registers uses block address translation (see Section 5.3 on page 196). Otherwise the segment descriptor is located and selects page address translation through a virtual address (T = 0) or direct-store interface translation (T = 1).]
Table 5-2.
5.1.6 General Flow of MMU Address Translation
The following sections describe the general flow used by PowerPC processors to translate effective addresses to virtual and then physical addresses.
5.1.6.
5.1.6.2 Page-Address-Translation Selection
If address translation is enabled and the effective address information does not match a BAT array entry, then the segment descriptor must be located. When the segment descriptor is located, the T bit in the segment descriptor selects whether the translation is to a page or to a direct-store segment as shown in Figure 5-6, General Flow of Page and Direct-Store Interface Address Translation, on page 191.
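The selection order described above (translation disabled, then BAT match, then the segment descriptor's T bit) can be summarized as a small decision function. This is a sketch of the ordering only, assuming the three conditions have already been evaluated elsewhere; the function name and the returned strings are illustrative.

```python
def select_translation(translation_enabled, bat_hit, segment_t_bit):
    """Return the translation type chosen for one access.

    translation_enabled: MSR[IR] for instruction fetches, MSR[DR] for data.
    bat_hit: True if the effective address matches a BAT array entry.
    segment_t_bit: T bit of the located segment descriptor (0 or 1).
    """
    if not translation_enabled:
        return "real addressing mode"
    if bat_hit:
        return "block address translation"
    if segment_t_bit:
        return "direct-store segment"
    return "page address translation"


for case in [(False, False, 0), (True, True, 0), (True, False, 0), (True, False, 1)]:
    print(case, "->", select_translation(*case))
```

Note that the T bit is consulted only after the BAT comparison fails, which is why a BAT hit returns block address translation regardless of the segment descriptor.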
Figure 5-6.
If the T bit in the Segment Register is cleared (SR[T] = 0), then page-address translation is selected. The information in the segment descriptor is then used to generate the 52-bit virtual address. The virtual address is used to identify the page-address-translation information (stored as page table entries [PTEs] in a page table in memory). For increased performance, the 750GX has two on-chip TLBs that cache recently used translations.
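As an illustration of how the 52-bit virtual address is assembled, the sketch below concatenates a 24-bit virtual segment ID with the 16-bit page index and 12-bit byte offset taken from the effective address. The field widths follow the PowerPC OEA for 32-bit implementations rather than anything stated on this page, and the segment register list and function name are hypothetical.

```python
def virtual_address(ea, segment_registers):
    """Build the 52-bit virtual address for a 32-bit effective address.

    segment_registers: list of sixteen 32-bit segment register values;
    the low 24 bits of each are taken as the VSID (an OEA assumption).
    """
    sr_index = (ea >> 28) & 0xF        # EA[0:3] selects the segment register
    vsid = segment_registers[sr_index] & 0x00FFFFFF
    page_index = (ea >> 12) & 0xFFFF   # EA[4:19]
    byte_offset = ea & 0xFFF           # EA[20:31]
    return (vsid << 28) | (page_index << 12) | byte_offset


# Hypothetical segment register contents: only SR3 is populated.
srs = [0] * 16
srs[3] = 0x00ABCDEF
va = virtual_address(0x30012345, srs)
print(hex(va), va.bit_length())
```

Shifting the result right by 12 leaves the 40-bit virtual page number (VSID plus page index) that a TLB entry is matched against.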
Table 5-3.
Table 5-4.
Table 5-5. 750GX Microprocessor Instruction Summary—Control MMUs (Page 2 of 2)
Instruction        Description
tlbie rB (note 1)  TLB Invalidate Entry
                   For effective address specified by rB, TLB[V]←0
                   The tlbie instruction invalidates all TLB entries indexed by the EA, and operates on both the instruction and data TLBs simultaneously, invalidating four TLB entries. The index corresponds to bits 14–19 of the EA.
For information on the synchronization requirements for changes to MSR[IR] and MSR[DR], see Section 2.3.2.4, Synchronization, on page 90 in this manual and “Synchronization Requirements for Special Registers and for Lookaside Buffers” in Chapter 2 of the PowerPC Microprocessor Family: The Programming Environments Manual. 5.
page-address translation and not for translations made with the BAT mechanism or for accesses that correspond to direct-store (T = 1) segments. Furthermore, R and C bits are maintained only for accesses made while address translation is enabled (MSR[IR] = 1 or MSR[DR] = 1). In the 750GX, the referenced and changed bits are updated as follows.
• For TLB hits, the C bit is updated according to Table 5-7.
• Accesses that cause exceptions and are not completed.
5.4.1.2 Changed Bit
The changed bit of a page is located both in the PTE in the page table and in the copy of the PTE loaded into the TLB (if a TLB is implemented, as in the 750GX). Whenever a data store instruction is executed successfully, if the TLB search (for page-address translation) results in a hit, then the changed bit in the matching TLB entry is checked.
Table 5-8. Model for Guaranteed R and C Bit Settings (Page 2 of 2)
Columns: Priority; Scenario; Causes Setting of R Bit (OEA, 750GX); Causes Setting of C Bit (OEA, 750GX)
Priority 4, Scenario: Out-of-order store operation. Required by the sequential execution model in the absence of system-caused or imprecise exceptions, or of floating-point assist exception for instructions that would cause no other kind of precise exception.
Each TLB contains 128 entries organized as a 2-way set-associative array with 64 sets as shown in Figure 5-7 for the DTLB (the ITLB organization is the same). When an address is being translated, a set of two TLB entries is indexed in parallel with the access to a Segment Register. If the address in one of the two TLB entries is valid and matches the 40-bit virtual page number, that TLB entry contains the translation.
To uniquely identify a TLB entry as the required PTE, each TLB entry contains, in addition to the PTE, an additional 4-bit field called the Extended Page Index (EPI). The EPI contains bits 10–13 of the EA. Software cannot access the TLB arrays directly, except to invalidate an entry with the tlbie instruction. Each set of TLB entries has one associated LRU bit.
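The TLB set index (EA bits 14–19) and the Extended Page Index (EA bits 10–13) are both plain bit fields of the effective address. The sketch below shows one way to extract them, assuming the manual's numbering in which bit 0 is the most-significant bit of the 32-bit EA; the helper names are illustrative.

```python
def ea_bits(ea, first, last):
    """Extract EA[first:last], where bit 0 is the most-significant bit."""
    width = last - first + 1
    return (ea >> (31 - last)) & ((1 << width) - 1)


def tlb_set_index(ea):
    """Six-bit index selecting one of the 64 TLB sets (EA bits 14-19)."""
    return ea_bits(ea, 14, 19)


def tlb_epi(ea):
    """Four-bit Extended Page Index stored with the entry (EA bits 10-13)."""
    return ea_bits(ea, 10, 13)


# Two hypothetical effective addresses 256 KB apart (64 pages of 4 KB).
for ea in (0x00001000, 0x00041000):
    print(hex(ea), "set", tlb_set_index(ea), "EPI", tlb_epi(ea))
```

The two example addresses select the same set but carry different EPI values, which is the situation the EPI field exists to disambiguate.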
Other than the possible TLB miss on the next instruction prefetch, the tlbie instruction does not affect the instruction fetch operation—that is, the prefetch buffer is not purged and does not cause these instructions to be refetched.
5.4.4 Page-Address-Translation Summary
Figure 5-8 on page 203 provides the detailed flow for the page-address-translation mechanism.
Figure 5-8.
5.4.5 Page Table-Search Operation
If the translation is not found in the TLBs (a TLB miss), the 750GX initiates a table-search operation, which is described in this section. Formats for the PTE are given in “PTE Format for 32-Bit Implementations,” in Chapter 7, “Memory Management” of the PowerPC Microprocessor Family: The Programming Environments Manual. The following is a summary of the page-table-search process performed by the 750GX.
1.
Figure 5-9.
Figure 5-10.
5.4.6 Page Table Updates
When TLBs are implemented (as in the 750GX), they are defined as noncoherent caches of the page tables. TLB entries must be flushed explicitly with the TLB invalidate entry instruction (tlbie) whenever the corresponding PTE is modified. As the 750GX is intended primarily for uniprocessor environments, it does not provide coherency of TLBs between multiple processors.
6. Instruction Timing
This chapter describes how the PowerPC 750GX microprocessor fetches, dispatches, and executes instructions and how it reports the results of instruction execution. It gives detailed descriptions of how the 750GX’s execution units work, and how those units interact with other parts of the processor, such as the instruction-fetching mechanism, register files, and caches.
Fetch: The process of bringing instructions from the system memory (such as a cache or the main memory) into the instruction queue.
Folding (branch folding): On the 750GX, a branch is expunged from (folded out of) the instruction queue via the dispatch mechanism, without being either passed to an execution unit or given a position in the completion queue.
Stage: The processing of instructions in the 750GX is done in stages. They are fetch, decode/dispatch, execute, complete, and retirement. The fetch unit brings instructions from the memory system into the instruction queue. Once an instruction is in the instruction queue, the dispatch unit must do a partial decode on it to determine its type. If the instruction is an integer instruction, it is passed to the integer execution unit.
• 64-bit floating-point unit (FPU)
• Load/store unit (LSU)
• System register unit (SRU)
Figure 6-1 represents a generic pipelined execution unit.
Figure 6-1. Pipelined Execution Unit
         Stage 1        Stage 2        Stage 3
Clock 0  Instruction A  —              —
Clock 1  Instruction B  Instruction A  —
Clock 2  Instruction C  Instruction B  Instruction A
Clock 3  Instruction D  Instruction C  Instruction B
The 750GX can retire two instructions in every clock cycle.
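The occupancy pattern of Figure 6-1 can be reproduced with a few lines of simulation. This is a sketch of a generic, stall-free pipeline under the assumption that one instruction enters stage 1 on every clock; it is not a model of any specific 750GX execution unit, and the function name is illustrative.

```python
def pipeline_occupancy(instructions, stages=3, clocks=4):
    """Return, per clock, which instruction occupies each pipeline stage."""
    pending = list(instructions)
    pipe = [None] * stages
    rows = []
    for _ in range(clocks):
        incoming = pending.pop(0) if pending else None
        # Every instruction advances one stage; the oldest one leaves the pipe.
        pipe = [incoming] + pipe[:-1]
        rows.append(list(pipe))
    return rows


for clock, row in enumerate(pipeline_occupancy(["A", "B", "C", "D"])):
    print("Clock", clock, [name or "-" for name in row])
```

From clock 2 onward all three stages are busy, so one instruction leaves the pipeline on every clock even though each instruction spends three clocks in it.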
The instruction pipeline stages are described as follows:
• The instruction fetch stage includes the clock cycles necessary to request instructions from the memory system and the time the memory system takes to respond to the request. Instruction fetch timing depends on many variables, such as whether the instruction is in the branch target instruction cache, the L1 instruction cache, or the L2 cache.
The notation conventions used in the instruction timing examples are as follows:
Table 6-1. Notation Conventions for Instruction Timing
Symbol  Description
        Fetch. The fetch stage includes the time between when an instruction is requested and when it is brought into the instruction queue.
6.3 Timing Considerations
The 750GX is a superscalar processor; as many as three instructions can be issued to the execution units (one branch instruction to the branch processing unit, and two instructions issued from the dispatch queue to the other execution units) during each clock cycle. Only one instruction can be dispatched to each execution unit.
The 750GX’s instruction-cache throttling feature, managed through the Instruction Cache Throttling Control (ICTC) register, can lower the processor’s overall junction temperature by slowing the instruction fetch rate. See Chapter 10, Power and Thermal Management, on page 335 for more information. Branch instructions are identified by the fetcher, and forwarded to the BPU directly, bypassing the dispatch queue.
6.3.2.1 Cache Arbitration
When the instruction fetcher requests instructions from the instruction cache, two things might happen. If the instruction cache is idle and the requested instructions are present, they are provided on the next clock cycle. However, if the instruction cache is busy due to a cache-line-reload operation, instructions cannot be fetched until that operation completes.
6.3.2.
Figure 6-4.
Figure 6-5 on page 220 shows a simple example of instruction fetching that hits in the L1 cache. This example uses a series of integer add and double-precision floating-point add instructions to show how the number of instructions to be fetched is determined, how program order is maintained by the instruction and completion queues, how instructions are dispatched and retired in pairs (maximum), and how the FPU, IU1, and IU2 pipelines function.
Figure 6-5.
The instruction timing for this example is described cycle-by-cycle as follows:
1. In cycle 0, instructions 0–3 are fetched from the instruction cache. Instructions 0 and 1 are placed in the two entries in the instruction queue from which they can be dispatched on the next clock cycle.
2. In cycle 1, instructions 0 and 1 are dispatched to the IU2 and FPU, respectively.
10. In cycle 9, instruction 11 completes, instruction 12 continues through the FPU pipeline, and instructions 13 and 14 are dispatched. One new instruction, 18, can be fetched on this cycle because the instruction queue had one opening on the previous clock cycle.
6.3.2.3 Cache Miss
Figure 6-6 on page 223 shows an instruction fetch that misses both the L1 cache and L2 cache. A processor/bus clock ratio of 1:2 is used.
Figure 6-6.
6.3.2.4 L2 Cache Access Timing Considerations
If an instruction fetch misses both the BTIC and the L1 instruction cache, the 750GX next looks in the L2 cache. If the requested instructions are there, they are burst into the 750GX in much the same way as shown in Figure 6-6 on page 223. An instruction fetch from the L2 cache has a latency of five cycles.
6.3.2.
When the dispatch unit dispatches an instruction to its execution unit, it allocates a Rename Register (or registers) for the results of that instruction. If an instruction is dispatched to a reservation station associated with an execution unit due to a data dependency, the dispatcher also provides a tag to the execution unit identifying the Rename Register that forwards the required data at completion.
Performance features such as branch folding, BTIC, dynamic branch prediction (implemented in the BHT), 2-level branch prediction, and the implementation of nonblocking caches minimize the penalties associated with flow-control operations on the 750GX.
Figure 6-7.
Figure 6-9. Branch Completion
[Figure: Branch Completion (LR/CTR Write-Back). Contents of the instruction queue (IQ0–IQ5) and completion queue (CQ0–CQ5) over clock cycles 0–3 for a stream of add instructions containing one bc instruction.]
In this example, the Branch Conditional (bc) instruction is encoded to decrement the CTR. It is predicted as not-taken in clock cycle 0.
does not write back its results to the architected registers. Instead, it stalls in the completion queue. Of course, when the completion queue is full, no additional instructions can be dispatched, even if an execution unit is idle. In the case of a misprediction, the 750GX can easily redirect the instruction stream because the programming model has not been updated.
Predicted Branch Timing Examples
Figure 6-10 on page 231 shows cases where branch instructions are predicted. It shows how both taken and not-taken branches are handled, and how the 750GX handles both correct and incorrect predictions.
Figure 6-10.
2. In clock cycle 1, instructions 2 and 3 enter the dispatch entries in the IQ. Instruction 4 (a second bc instruction) and 5 are fetched. The second bc instruction is predicted as taken. It can be folded, but it cannot be resolved until instruction 3 writes back.
3. In clock cycle 2, instruction 4 has been folded and instruction 5 has been flushed from the IQ.
6.4.5 Load/Store Unit Execution Timing
The execution of most load-and-store instructions is pipelined. The LSU has two pipeline stages. The first is for effective address calculation and MMU translation, and the second is for accessing data in the cache. Load-and-store instructions have a 2-cycle latency and 1-cycle throughput.
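The difference between latency and throughput can be made concrete with a short calculation. The sketch below assumes an ideal stream of independent, pipelined load or store instructions that all hit in the cache and never stall; the function name is illustrative.

```python
def pipelined_cycles(count, latency=2, throughput=1):
    """Cycles needed to finish `count` back-to-back pipelined instructions.

    The first instruction finishes after `latency` cycles; each later one
    finishes `throughput` cycles after its predecessor.
    """
    if count <= 0:
        return 0
    return latency + (count - 1) * throughput


for n in (1, 2, 4, 8):
    print(n, "loads ->", pipelined_cycles(n), "cycles")
```

Under these assumptions, eight loads finish in nine cycles rather than sixteen, because the second pipeline stage of one instruction overlaps the first stage of the next.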
Table 6-2. Performance Effects of Memory Operand Placement (Page 2 of 2)
Operand Size    Byte Alignment  Boundary Crossing
                                None     8 Byte  Cache Block  Protection Boundary
Floating-Point
8 byte          8               Optimal  —       —            —
                4               —        Good    Good         Good
                <4              —        Poor    Poor         Poor
4 byte          4               Optimal  —       —            —
                <4              Poor     Poor    Poor         Poor
Note:
1. Optimal means one EA calculation occurs.
6.5 Memory Performance Considerations
Because the 750GX can have a maximum instruction throughput of three instructions per clock cycle, lack of memory bandwidth can affect performance. For the 750GX to maximize performance, it must be able to read and write data efficiently.
6.5.2 Effect of TLB Miss
If a page-address translation is not in a translation lookaside buffer (TLB), the 750GX hardware searches the page tables and updates the TLB when a translation is found. Table 6-3 shows the estimated latency for the hardware TLB load for different cache configurations and conditions.
Table 6-3.
6.6.1 Branch, Dispatch, and Completion-Unit Resource Requirements
This section describes the specific resources required to avoid stalls during branch resolution, instruction dispatching, and instruction completion.
6.6.1.1 Branch-Resolution Resource Requirements
The following branch instructions and resources are required to avoid stalling the fetch unit in the course of branch resolution:
• The bclr instruction requires LR availability.
• Requirements for completing an instruction from CQ1:
– Instruction in CQ0 must complete in same cycle.
– Instruction in CQ1 must be finished.
– Instruction in CQ1 must not follow an unresolved predicted branch.
– Instruction in CQ1 must not cause an exception.
– Instruction in CQ1 must be an integer or load instruction.
– Number of CR updates from both CQ0 and CQ1 must not exceed two.
Table 6-5.
Table 6-6 lists condition register logical instruction latencies.
Table 6-6.
Table 6-7. Integer Instructions (Page 2 of 3)
Instruction                Mnemonic  Primary Opcode  Extended Opcode  Unit     Cycles  Serialization
AND Immediate Shifted      andis.    29              —                IU1/IU2  1       —
AND                        and[.]    31              28               IU1/IU2  1       —
Compare                    cmp       31              0                IU1/IU2  1       —
Compare Immediate          cmpi      11              —                IU1/IU2  1       —
Compare Logical            cmpl      31              32               IU1/IU2  1       —
Compare Logical Immediate  cmpli     10              —                IU1/IU2  1       —
Count Leading Zeros Word   cntlzw[.
Table 6-7. Integer Instructions (Page 3 of 3)
Instruction                       Mnemonic      Primary Opcode  Extended Opcode  Unit     Cycles  Serialization
Subtract From Carrying            subfc[o][.]   31              8                IU1/IU2  1       —
Subtract From Extended            subfe[o][.]   31              136              IU1/IU2  1       Execution
Subtract From Immediate Carrying  subfic        8               —                IU1/IU2  1       —
Subtract From Minus One Extended  subfme[o][.]  31              232              IU1/IU2  1       Execution
Subtract From Zero Extended       subfze[o][.
Table 6-8. Floating-Point Instructions (Page 2 of 2)
Instruction                        Mnemonic   Primary Opcode  Extended Opcode  Unit  Cycles  Serialization
Floating Multiply-Subtract Single  fmsubs[.]  59              28               FPU   1-1-1   —
Floating Multiply-Subtract         fmsub[.]   63              28               FPU   2-1-1   —
Floating Multiply Single           fmuls[.]   59              25               FPU   1-1-1   —
Floating Multiply                  fmul[.]    63              25               FPU   2-1-1   —
Floating Negative Absolute Value   fnabs[.
Table 6-9 shows load-and-store instruction latencies. Pipelined load/store instructions are shown with cycles of total latency and throughput cycles separated by a colon.
Table 6-9.
Table 6-9.
Table 6-9.
Table 6-9. Load-and-Store Instructions (Page 4 of 4)
Instruction                     Mnemonic  Primary Opcode  Extended Opcode  Unit  Cycles          Serialization
Store Word with Update Indexed  stwux     31              183              LSU   2:1             —
Store Word Indexed              stwx      31              151              LSU   2:1             —
TLB Invalidate Entry            tlbie     31              306              LSU   3:4 (note 1)    Execution
1.
7. Signal Descriptions
This chapter describes the 750GX microprocessor’s external signals. It contains a concise description of individual signals, showing behavior when the signal is asserted and negated and when the signal is an input and an output.
Note: A bar over a signal name indicates that the signal is active low—for example, ARTRY (address retry) and TS (transfer start).
7.1 Signal Configuration
Figure 7-1 illustrates the 750GX’s signal configuration, showing how the signals are grouped. A pinout showing pin numbers is included in the PowerPC 750GX RISC Microprocessor Datasheet.
Figure 7-1.
7.2 Signal Descriptions
This section summarizes the functions of individual signals on the 750GX, grouped according to Figure 7-1. Chapter 8, Bus Interface Operation, on page 279 describes many of these signals in greater detail, both with respect to how individual signals function and to how the groups of signals interact. The information in the remainder of this chapter applies to the basic transfer protocol of the 60x bus.
7.2.1.2 Bus Grant (BG)—Input
State
  Asserted: Indicates that the 750GX may, with proper qualification, assume mastership of the address bus. A qualified bus grant occurs when BG is asserted and ABB and ARTRY are not asserted on the bus cycle following the assertion of AACK. Note that the assertion of BR is not required for a qualified bus grant (to allow bus parking).
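The qualification rule for a bus grant reduces to a simple predicate over three signals. The sketch below expresses it with logical "asserted" flags rather than electrical levels (the signals themselves are active low), and it assumes the caller samples ABB and ARTRY at the point the manual describes; the function name is illustrative.

```python
def qualified_bus_grant(bg_asserted, abb_asserted, artry_asserted):
    """True when BG is asserted while ABB and ARTRY are both not asserted.

    BR does not appear here: a grant can be qualified without a prior
    bus request, which is what allows bus parking.
    """
    return bg_asserted and not abb_asserted and not artry_asserted


print(qualified_bus_grant(True, False, False))  # grant can be used
print(qualified_bus_grant(True, True, False))   # another master owns the bus
```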
Address Bus Busy (ABB)—Input
State
  Asserted: Indicates that another master is the current address-bus owner.
  Negated: Indicates that the address bus might be available for use by the 750GX (see BG). The 750GX will also track the state of ABB on the bus from the TS and AACK inputs. (See Section 8.3.1, Address-Bus Arbitration, on page 290.)
Timing
  Assertion: Must occur whenever the 750GX must be prevented from using the address bus.
7.2.3 Address Transfer Signals
The address transfer signals are used to transmit the address and to generate and monitor parity for the address transfer. For a detailed description of how these signals interact, see Section 8.3.2, Address Transfer, on page 292.
7.2.3.1 Address Bus (A[0–31])
The address bus (A[0–31]) consists of 32 signals that are both input and output signals.
7.2.3.2 Address-Bus Parity (AP[0–3])
The address-bus parity (AP[0–3]) signals are both input and output signals reflecting 1 bit of odd-byte parity for each of the 4 bytes of address when a valid address is on the bus.
Address-Bus Parity (AP[0–3])—Output
State
  Asserted/Negated: Represents odd parity for each of the 4 bytes of the physical address for a transaction.
7.2.4.1 Transfer Type (TT[0–4])
The transfer type (TT[0–4]) signals consist of five input/output signals on the 750GX. For a complete description of TT[0–4] signals and for transfer type encodings, see Table 7-1.
Transfer Type (TT[0–4])—Output
State
  Asserted/Negated: Indicates the type of transfer in progress.
Timing
  Assertion/Negation/High Impedance: The same as A[0–31].
Table 7-1.
Table 7-2.
Timing
  Assertion/Negation/High Impedance: The same as A[0–31].
Table 7-3. Data-Transfer Size
  TBST      TSIZ[0–2]  Transfer Size
  Asserted  010        Burst (32 bytes)
  Negated   000        8 bytes
  Negated   001        1 byte
  Negated   010        2 bytes
  Negated   011        3 bytes
  Negated   100        4 bytes
  Negated   101        5 bytes (1)
  Negated   110        6 bytes (1)
  Negated   111        7 bytes (1)
  1. Not generated by the 750GX.
7.2.4.
7.2.4.4 Cache Inhibit (CI)—Output
The cache inhibit (CI) signal is an output signal on the 750GX.
State
  Asserted: Indicates that a single-beat transfer will not be cached, reflecting the setting of the I bit for the block or page that contains the address of the current transaction.
  Negated: Indicates that a burst transfer will allocate the 750GX data-cache block.
Timing
  Assertion/Negation/High Impedance: The same as A[0–31].
7.2.4.
7.2.4.6 Global (GBL)
The global (GBL) signal is an input/output signal on the 750GX.
Global (GBL)—Output
State
  Asserted: Indicates that the transaction is global and should be snooped by other masters. GBL reflects the M bit (WIMG bits) from the memory management unit (MMU) except during certain transactions. Copybacks are always nonglobal.
7.2.5 Address Transfer Termination Signals
The address transfer termination signals are used to indicate either that the address phase of the transaction has completed successfully or must be repeated, and when it should be terminated. For detailed information about how these signals interact, see Chapter 8, Bus Interface Operation, on page 279.
7.2.5.
7.2.5.2 Address Retry (ARTRY)
The address retry (ARTRY) signal is both an input and output signal on the 750GX.
Address Retry (ARTRY)—Output
State
  Asserted: The 750GX as snooper indicates that the 750GX requires the snooped transaction to be rerun.
Address Retry (ARTRY)—Input
State
  Asserted: If the 750GX is the address-bus master, ARTRY indicates that the 750GX must retry the preceding address tenure and immediately negate BR (if asserted). If the associated data tenure has already started, the 750GX also cancels the data tenure immediately, even if the burst data has been received.
State
  Negated: Indicates that the 750GX is not granted next data-bus ownership.
Timing
  Assertion: Might occur on any cycle; not recognized until the cycle TS is asserted, or later.
  Negation: Might occur on any cycle to indicate the 750GX cannot assume data-bus ownership.
7.2.6.2 Data-Bus Write-Only (DBWO)
The data-bus write-only (DBWO) signal is an input-only signal on the 750GX.
Timing
  Assertion: Occurs the cycle following a qualified DBG. Remains asserted for the duration of the data tenure.
  Negation: Negates for a fraction of a bus cycle (one-half minimum, depends on clock mode) starting the cycle following the final assertion of the transfer acknowledge (TA) signal, or following the transfer error acknowledge (TEA) signal or certain ARTRY cases. Then releases to the high impedance state.
Data Bus (DH[0–31], DL[0–31])—Output
State
  Asserted/Negated: Represents the state of data during a data write. For single-beat (cache inhibited or write through) writes, byte lanes not selected for data transfer will not supply valid data (no data mirroring).
Timing
  Assertion/Negation: First or only beat begins on the cycle of DBB assertion and, for bursts, transitions on the cycle following each initially qualified assertion of TA.
Data-Bus Parity (DP[0–7])—Input
State
  Asserted/Negated: Represents odd parity for each byte of read data. Parity is checked on all data byte lanes, regardless of the size of the transfer. Detected even parity causes a checkstop if data-parity errors are enabled in the HID0 register.
Timing
  Assertion/Negation: The same as DL[0–31].
7.2.7.
Timing
  Assertion: Might occur on any cycle during the normal or extended data-bus tenure for the 750GX (see DBB and DRTRY). Must not occur two cycles or more before ARTRY assertion if ARTRY cancellation is to be used.
  Negation: For a burst, must occur the cycle after the assertion of TA unless another assertion of TA is immediately required for the next data beat.
Timing
  Assertion/Negation: Assertion might occur on any cycle during the normal or extended data-bus tenure for the 750GX (during DBB, and the cycle after TA during reads). Assertion should occur for one cycle only. It is the responsibility of the system to ensure that TEA is negated by the start of the next data-bus tenure.
7.2.
7.2.9.3 Machine-Check Interrupt (MCP)—Input
State
  Asserted: The 750GX initiates a machine-check interrupt operation if MSR[ME] and HID0[EMCP] are set. If MSR[ME] is cleared and HID0[EMCP] is set, the 750GX must terminate operation by internally gating off all clocks, and releasing all outputs (except CKSTP_OUT) to the high-impedance state. If HID0[EMCP] is cleared, the 750GX ignores the interrupt condition.
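The three outcomes described for an asserted MCP can be summarized as a small decision function. This is a minimal sketch for illustration only; the function name and the returned strings are hypothetical labels, not register values or anything defined by the 750GX.

```python
def mcp_response(msr_me: bool, hid0_emcp: bool) -> str:
    """Classify the response to an asserted MCP input.

    msr_me    -- state of MSR[ME] (machine-check enable)
    hid0_emcp -- state of HID0[EMCP] (enable MCP input)
    The returned labels are illustrative names only.
    """
    if not hid0_emcp:
        # HID0[EMCP] cleared: the interrupt condition is ignored.
        return "ignored"
    if msr_me:
        # Both bits set: a machine-check interrupt operation is initiated.
        return "machine_check_interrupt"
    # EMCP set but ME cleared: clocks are gated off and outputs released.
    return "checkstop"
```

The check on HID0[EMCP] comes first because, per the description above, MSR[ME] only matters once the MCP input is enabled.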
7.2.10 Reset Signals
There are two reset signals on the 750GX—hard reset (HRESET) and soft reset (SRESET). Descriptions of the reset signals follow.
7.2.10.1 Hard Reset (HRESET)—Input
The hard reset (HRESET) signal must be used at power-on in conjunction with the test reset (TRST) signal to properly reset the processor.
State
  Asserted: Initiates a complete hard reset operation when this input transitions from asserted to negated.
7.2.11 Processor Status Signals
Processor status signals indicate the state of the processor. They include the memory reservation signal, machine quiesce control signals, time-base enable signal, and TLB Invalidate Synchronize (TLBISYNC) signal.
7.2.11.
7.2.11.4 Time Base Enable (TBEN)—Input
State
  Asserted: Indicates that the time base and decrementer should continue clocking. This signal is essentially a “count enable” control for the time base and decrementer counter.
  Negated: Indicates that the time base and decrementer should stop clocking.
Timing
  Assertion/Negation: May occur on any cycle. The sampling of this signal is synchronous with SYSCLK.
7.2.11.
7.2.13 I/O Voltage Select Signals
Table 7-7 shows the settings for the I/O voltage signals.
Table 7-7. Bus Voltage Selection Settings
OVDD Select #1 BVSEL OVDD Select #2 Reserved 0 0 1.8 V 0 1 2.5 V 1 1 3.3 V 1 0 Voltage Selection L1TSTCLK
7.2.14 Test Interface Signals
The processor provides two sets of pins for controlling JTAG and level-sensitive scan design (LSSD) testing.
7.2.14.1 IEEE 1149.
7.2.14.3 L1_TSTCLK
State: LSSD test clock in test mode, and bus voltage select in functional mode. See Table 7-7, Bus Voltage Selection Settings, on page 275.
Timing
  Assertion/Negation: Signal should be held to a constant value for I/O voltage selection.
7.2.14.4 L2_TSTCLK
State: Reserved pin that must be negated for system operation.
Timing
  Assertion/Negation: Must be held constant for system operation.
7.2.15.1 System Clock (SYSCLK)—Input
The 750GX requires a single system clock (SYSCLK) input. This input sets the frequency of operation for the bus interface. Internally, the 750GX uses a PLL circuit to generate a master clock for all of the CPU circuitry (including the bus interface circuitry) which is phase-locked to the SYSCLK input.
State
  Asserted/Negated: The primary clock input for the 750GX.
7.2.15.4 PLL Range (PLL_RNG[0:1])—Input
State
  Asserted/Negated: Configures the PLL operating-frequency range. Internal core clock frequency must be within the specified range.
Timing
  Asserted/Negated: Must remain stable during normal operation; should only be changed during the assertion of HRESET. These bits are readable through bits PRE[5:6] in the HID1.
7.2.
8. Bus Interface Operation
This chapter describes the PowerPC 750GX microprocessor’s bus interface and its operation. It shows how the 750GX signals, defined in Chapter 7, Signal Descriptions, on page 249, interact to perform address and data transfers. The bus interface buffers bus requests from the instruction and data caches, and executes the requests per the 60x bus protocol.
Figure 8-1. Bus Interface Address Buffers
[Block diagram labels: instruction cache; data cache; bus interface unit (BIU) control; instruction-cache load address; L2 castout; data-cache load address; data-cache castout/store address; data-cache snoop address; reservation address buffer; snoop control; address and data paths to the L2 or system bus]
8.
In addition to the loads, stores, and instruction fetches, the 750GX performs hardware table-search operations following translation lookaside buffer (TLB) misses, L2 cache castout operations when the least-recently used (LRU) cache lines are written to memory after a cache miss, and cache-line snoop push-out operations when a modified cache line experiences a snoop hit from another bus master.
Cache lines are selected for replacement based on a pseudo least-recently-used (PLRU) algorithm. Each time a cache line is accessed, it is tagged as the most-recently-used line of the set. When a miss occurs, and all eight lines in the set are marked as valid, the least recently used line is replaced with the new data.
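The replacement policy described above can be illustrated with a generic binary-tree PLRU for an eight-way set. This is a sketch under stated assumptions: the seven-bit tree layout, the class name TreePLRU8, and the bit polarity are choices made for this example and are not taken from the 750GX bit definitions, which this section does not give.

```python
class TreePLRU8:
    """Generic tree-based pseudo-LRU for one 8-way set (illustrative only).

    Seven bits form a binary tree: node 0 is the root and the children of
    node i are nodes 2*i + 1 (left) and 2*i + 2 (right). A bit value of 1
    means the next victim lies in the right subtree, 0 in the left subtree.
    """

    def __init__(self):
        self.bits = [0] * 7

    def touch(self, way: int) -> None:
        """Mark `way` (0-7) as most recently used."""
        node = 0
        for level in (2, 1, 0):
            go_right = (way >> level) & 1
            # Point this node away from the way that was just used.
            self.bits[node] = 0 if go_right else 1
            node = 2 * node + 1 + go_right

    def victim(self) -> int:
        """Return the way the tree currently selects for replacement."""
        node = 0
        way = 0
        for _ in range(3):
            go_right = self.bits[node]
            way = (way << 1) | go_right
            node = 2 * node + 1 + go_right
        return way
```

For example, after ways 0 through 7 are touched in order and way 0 is then touched again, this tree selects way 4 rather than way 1, which is what makes the policy "pseudo" LRU: seven bits cannot record the full access order of eight ways.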
one, two, or eight beats depending on the size of the program transaction and the cache mode for the address. For additional information about 32-bit data bus mode, see Section 8.6.1, 32-Bit Data Bus Mode, on page 316.
8.1.5 Direct-Store Accesses
The 750GX does not support the extended transfer protocol for accesses to the direct-store storage space.
8.2 Memory-Access Protocol
Memory accesses are divided into address and data tenures. Each tenure has three phases—bus arbitration, transfer, and termination. The 750GX also supports address-only transactions. Note that address and data tenures can overlap, as shown in Figure 8-3. Figure 8-3 shows that the address and data tenures are distinct from one another and that both consist of three phases—arbitration, transfer, and termination.
Data tenure:
  Arbitration: To begin the data tenure, the 750GX arbitrates for mastership of the data bus.
  Transfer: After the 750GX is the data-bus master, it samples the data bus for read operations or drives the data bus for write operations. The data parity and data-parity error signals ensure the integrity of the data transfer.
  Termination: Data termination signals are required after each data beat in a data transfer.
DBWO (data-bus write-only): Assertion indicates that the 750GX might perform the data-bus tenure for an outstanding write address even if a read address is pipelined before the write address. If DBWO is asserted, the 750GX will assume data-bus mastership for a pending data-bus write operation. The 750GX will take the data bus for a pending read operation if this input is asserted along with DBG and no write is pending.
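The effect of DBWO on the order of pending data tenures can be sketched as a selection rule. This is a simplified model, assuming the pending tenures are held in a list in address-tenure order and that a qualified DBG has already been received; the function name and the 'read'/'write' labels are hypothetical.

```python
def next_data_tenure(pending, dbwo_asserted):
    """Pick which pending data tenure runs on a qualified data-bus grant.

    pending       -- list of "read"/"write" strings in address-tenure order
    dbwo_asserted -- True if DBWO is asserted together with DBG
    Returns the index into `pending`, or None if nothing is pending.
    """
    if not pending:
        return None
    if dbwo_asserted and "write" in pending:
        # DBWO lets the first pending write run ahead of earlier reads.
        return pending.index("write")
    # Otherwise tenures run in order; this also covers DBWO asserted
    # with no write pending, where the pending read takes the bus.
    return 0
```

With DBWO negated the function always returns index 0, which corresponds to strict in-order data tenures.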
data cache. If there is a miss in the L2 cache, then the request is passed on to the bus interface unit (BIU) via three additional L2-to-BIU reload-request queues. Data returned from the bus is loaded into the data-cache reload buffer, one of the L2 reload buffers, and the critical word is forwarded to the load/store unit. A dedicated snoop copyback queue has been added, which enables a fifth transaction to pipeline on the bus.
The BIU has both AR buffers and a 4-deep reload-request queue. So, the BIU operation for the MuM support is not dependent on the LSU queue, as it has enough buffers and queue depth to manage the outstanding transactions. The LSU has no additional queues for MuM. MuM just uses what is already there.
Load multiple and load string instructions allow one MuM (two outstanding miss requests) to pipeline on the 60x bus.
6. A load is aliased to a store in the store queue, which means it references a byte to the same index and word. Loads are normally allowed to bypass stores in the 3-deep store queue. However, a load that aliases a store must allow the store to proceed ahead of it (in program order).
8.2.2.2 Speculative Loads and Conditional Branches
Loads that are dispatched before a preceding conditional branch is resolved are speculative. Mispredicted branches cause the speculative loads to be canceled. Normally, the cancellation is confined to the load/store unit, and no additional cycles are wasted. However, this is not the case when MuM is enabled. The speculative loads might be MuM requests that have started on the 60x bus.
External arbiters must allow only one device at a time to be the address-bus master. For implementations in which no other device can be a master, BG can be grounded (always asserted) to continually grant mastership of the address bus to the 750GX. Note: Arbiter designs must ensure that no more than one address-bus master can be granted the bus at one time (that is, bus grants must be mutually exclusive).
System designers should note that it is possible to ignore the ABB signal, and regenerate the state of ABB locally within each device by monitoring the TS and AACK input signals. The 750GX allows this operation by using both the ABB input signal and a locally regenerated version of ABB to determine if a qualified bus grant state exists (both sources are internally ORed together).
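The local regeneration of ABB and its use in qualifying a bus grant can be modeled cycle by cycle. This is a sketch under assumptions: signals are treated as logical booleans (True means asserted, regardless of electrical polarity), and the regenerated busy state is assumed to start on the TS cycle and to last through the AACK cycle. The class and function names are hypothetical, and exact cycle boundaries should be taken from the arbitration timing figures rather than from this model.

```python
class LocalABB:
    """Regenerate an address-bus-busy indication from TS and AACK."""

    def __init__(self):
        self.busy = False

    def clock(self, ts: bool, aack: bool) -> bool:
        """Sample TS and AACK for one bus cycle; return busy for this cycle."""
        if ts:
            self.busy = True        # an address tenure has started
        current = self.busy
        if aack:
            self.busy = False       # assumed: tenure ends after the AACK cycle
        return current


def qualified_bus_grant(bg: bool, abb_pin: bool, abb_local: bool,
                        artry: bool) -> bool:
    """BG asserted while ABB (pin ORed with local copy) and ARTRY are negated."""
    return bg and not (abb_pin or abb_local) and not artry
```

A system that ignores the ABB pin would pass abb_pin=False on every cycle and rely on the LocalABB output alone.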
Figure 8-8. Address-Bus Transfer
[Timing diagram, bus cycles 0–4; signals shown: qual BG, TS, ABB, ADDR+, aack, artry_in]
8.3.2.1 Address-Bus Parity
The 750GX always generates 1 bit of correct odd-byte parity for each of the 4 bytes of address when a valid address is on the bus. The calculated values are placed on the AP[0–3] outputs when the 750GX is the address-bus master. If the 750GX is not the master and TS and GBL are asserted together (qualified condition for snooping memory operations), the calculated values are compared with the AP[0–3] inputs.
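Odd byte parity means that each parity bit is chosen so that the byte plus its parity bit contains an odd number of ones. The following minimal sketch computes the four bits for a 32-bit address. It assumes that AP0 covers A[0–7], the most significant byte under PowerPC bit numbering, through AP3 covering A[24–31]; the function names are hypothetical.

```python
def odd_parity_bit(byte: int) -> int:
    """Return the parity bit that makes (byte + parity) hold an odd count of 1s."""
    ones = bin(byte & 0xFF).count("1")
    return 0 if ones % 2 else 1


def address_parity(addr: int) -> list:
    """Return [AP0, AP1, AP2, AP3] for a 32-bit address.

    Assumption: AP0 covers the most significant address byte (A[0-7]).
    """
    return [odd_parity_bit((addr >> shift) & 0xFF) for shift in (24, 16, 8, 0)]


def parity_matches(addr: int, ap_inputs: list) -> bool:
    """Compare locally calculated parity with sampled AP[0-3] inputs."""
    return address_parity(addr) == list(ap_inputs)
```

An all-zero byte therefore carries a parity bit of 1, so an idle-looking address of 0x00000000 still yields AP[0–3] = 1111 in this model.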
The basic coherency size of the bus is defined to be 32 bytes (corresponding to one cache line). Data transfers that cross an aligned, 32-byte boundary either must present a new address onto the bus at that boundary (for coherency consideration) or must operate as noncoherent data with respect to the 750GX. The 750GX never generates a bus transaction with a transfer size of 5 bytes, 6 bytes, or 7 bytes.
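The transfer-size encoding of Table 7-3 can be expressed as a small decoder, which also makes the statement about 5-, 6-, and 7-byte transfers concrete. This is a sketch only: the function names are hypothetical, TBST is passed as a logical "asserted" flag, and rejecting a burst with any TSIZ value other than 010 is an assumption of this example, since the table lists no other burst encoding.

```python
def transfer_size(tbst_asserted: bool, tsiz: int) -> int:
    """Decode TBST and TSIZ[0-2] (as an integer 0-7) into a size in bytes."""
    if not 0 <= tsiz <= 7:
        raise ValueError("TSIZ[0-2] is a 3-bit field")
    if tbst_asserted:
        if tsiz != 0b010:
            # Assumption: only the encoding listed in Table 7-3 is accepted.
            raise ValueError("burst transfers use TSIZ[0-2] = 010")
        return 32
    # With TBST negated, 000 means 8 bytes; other values are the byte count.
    return 8 if tsiz == 0 else tsiz


def generated_by_750gx(tbst_asserted: bool, tsiz: int) -> bool:
    """True unless the size is one the 750GX never generates (5, 6, or 7 bytes)."""
    return transfer_size(tbst_asserted, tsiz) not in (5, 6, 7)
```

A snooping device could still observe the 5-, 6-, and 7-byte encodings from other masters, which is why the decoder accepts them while generated_by_750gx reports False.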
Table 8-3.
Table 8-4. Aligned Data Transfers (Page 2 of 2)
  Transfer Size  TSIZ0 TSIZ1 TSIZ2  A[29–31]  Data-Bus Byte Lane(s)
                                              0 1 2 3 4 5 6 7
  Word           1     0     0      000       x x x x — — — —
  Word           1     0     0      100       — — — — x x x x
  Double word    0     0     0      000       x x x x x x x x
Note: The entries with an “x” indicate the byte portions of the requested operand that are read or written during a bus transaction.
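The byte-lane pattern in the rows above follows directly from the transfer size and the low-order address bits. The sketch below reproduces the word and double-word rows shown here and assumes the same rule extends to aligned byte and halfword transfers; the function name and the use of "-" for an unused lane are choices made for this example.

```python
def byte_lanes(size_bytes: int, a29_31: int) -> str:
    """Return an 8-character lane map for an aligned transfer.

    size_bytes -- 1, 2, 4, or 8
    a29_31     -- value of address bits A[29-31] (0-7)
    'x' marks a lane carrying operand data and '-' an unused lane,
    with lane 0 as the leftmost character.
    """
    if size_bytes not in (1, 2, 4, 8):
        raise ValueError("aligned sizes are 1, 2, 4, or 8 bytes")
    if not 0 <= a29_31 <= 7:
        raise ValueError("A[29-31] is a 3-bit field")
    if a29_31 % size_bytes:
        raise ValueError("operand is not aligned to its size")
    return "".join(
        "x" if a29_31 <= lane < a29_31 + size_bytes else "-"
        for lane in range(8)
    )
```

For example, byte_lanes(4, 0b100) returns "----xxxx", matching the second word row of the table.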
Table 8-5.
Table 8-6.
8.3.2.5 Alignment of External Control Instructions
The size of the data transfer associated with the eciwx and ecowx instructions is always 4 bytes. If the eciwx or ecowx instruction is misaligned and crosses any word boundary, the 750GX will generate an alignment exception.
8.3.3 Address Transfer Termination
The address tenure of a bus operation is terminated when completed with the assertion of AACK, or retried with the assertion of ARTRY.
address tenures occur until the current snoop push from the 750GX is completed. Snoop push delays can also be avoided by operating the L2 cache in write-through mode so no snoop pushes are required by the L2 cache.
Figure 8-9. Snooped Address Cycle with ARTRY
[Timing diagram, bus cycles 1–8; signals shown: ts, abb, addr, aack, ARTRY, BR, qualBG, ABB]
8.
Figure 8-10. Data-Bus Arbitration
[Timing diagram, bus cycles 0–3; signals shown: TS, dbg, dbb, drtry, qual DBG, DBB]
A qualified data-bus grant can be expressed as the following:
  QDBG = DBG asserted while DBB, DRTRY, and ARTRY (associated with the data-bus operation) are negated.
When a data tenure overlaps with its associated address tenure, a qualified ARTRY assertion coincident with a data-bus grant signal does not result in data-bus mastership (DBB is not asserted).
8.4.2 Data-Bus Write-Only
As a result of address pipelining, the 750GX can have up to two data tenures queued to perform when it receives a qualified DBG. Generally, the data tenures should be performed in strict order (the same order as their address tenures were performed). The 750GX, however, also supports a limited out-of-order capability with the data-bus write-only (DBWO) input.
(or only) data beat, the 750GX negates DBB but still considers the data beat active and waits for another assertion of TA. DRTRY is ignored on write operations. TEA indicates a nonrecoverable bus error event. Upon receiving a final (or only) termination condition, the 750GX always negates DBB for one cycle.
Figure 8-12. Normal Single-Beat Write Termination
[Timing diagram, bus cycles 0–3; signals shown: TS, qual DBG, DBB, data, ta, drtry, AACK]
Normal termination of a burst transfer occurs when TA is asserted for four bus clock cycles, as shown in Figure 8-13. The bus clock cycles in which TA is asserted need not be consecutive, thus allowing pacing of the data-transfer beats. For read bursts to terminate successfully, TEA and DRTRY must remain negated during the transfer.
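Because the four TA assertions of a burst need not be consecutive, a burst completes on whichever cycle the fourth assertion is sampled. The following sketch counts TA assertions over a per-cycle trace. It is a simplified model that assumes TEA and DRTRY stay negated throughout, and the function name and trace representation are hypothetical.

```python
def burst_completion_cycle(ta_samples, beats=4):
    """Return the cycle index on which the final data beat is acknowledged.

    ta_samples -- iterable of booleans (or 0/1), one per bus clock cycle,
                  True where TA is asserted
    beats      -- number of data beats in the burst (four on the 64-bit bus)
    Returns None if the trace ends before the burst completes.
    """
    count = 0
    for cycle, ta in enumerate(ta_samples):
        if ta:
            count += 1
            if count == beats:
                return cycle
    return None
```

A slave that inserts one wait state before the second beat and another before the fourth would produce the trace [1, 0, 1, 1, 0, 1], and the burst would complete on cycle 5.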
For read bursts, DRTRY can be asserted one bus clock cycle after TA is asserted to signal that the data presented with TA is invalid and that the processor must wait for the negation of DRTRY before forwarding data to the processor (see Figure 8-14). Thus, a data beat can be speculatively terminated with TA, and then one bus clock cycle later confirmed with the negation of DRTRY. The DRTRY signal is valid only for read transactions.
Figure 8-15 shows the effect of using DRTRY during a burst read. It also shows the effect of using TA to pace the data-transfer rate. Notice that in bus clock cycle 3 of Figure 8-15, TA is negated for the second data beat. The 750GX data pipeline does not proceed until bus clock cycle 4 when the TA is reasserted.
Figure 8-15.
Note: TEA generates a machine-check exception depending on MSR[ME]. Clearing the machine-check-exception enable control bits leads to a true checkstop condition (instruction execution halted and processor clock stopped).
8.4.5 Memory Coherency—MEI Protocol
The 750GX provides dedicated hardware to provide memory coherency by snooping bus transactions.
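A three-state MEI (modified, exclusive, invalid) protocol can be written as a transition table keyed by the current state and the event. The sketch below uses the event abbreviations from the state diagram legend (RH, RM, WH, WM, SH/CRW). Two points are assumptions of this example rather than statements from the text: SH/CIR is read as a snoop hit by a cache-inhibited read, and the Modified-to-Exclusive edge on SH/CIR is one interpretation of a partly legible figure.

```python
# (state, event) -> (next_state, bus_transaction or None)
MEI_TRANSITIONS = {
    ("I", "RM"): ("E", "cache block fill"),
    ("I", "WM"): ("M", "cache block fill"),
    ("E", "RH"): ("E", None),
    ("E", "WH"): ("M", None),
    ("E", "SH/CRW"): ("I", None),
    ("E", "SH/CIR"): ("E", None),
    ("M", "RH"): ("M", None),
    ("M", "WH"): ("M", None),
    ("M", "SH/CRW"): ("I", "snoop push"),
    ("M", "SH/CIR"): ("E", "snoop push"),   # assumed reading of the figure
}


def mei_step(state: str, event: str):
    """Return (next_state, bus_transaction) for one cache-line event."""
    try:
        return MEI_TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition for state {state!r} on {event!r}")
```

The table has no shared state, so a snoop hit by a cacheable read or write from another master always leaves the line invalid in this cache.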
Figure 8-16. MEI Cache-Coherency Protocol—State Diagram (WIM = 001)
[State diagram with states Invalid, Exclusive, and Modified; edge labels include SH/CRW, WM, RM, WH, RH, SH, and SH/CIR. Bus transactions shown: Snoop Push, Cache Block Fill.]
  SH = Snoop Hit
  RH = Read Hit
  RM = Read Miss
  WH = Write Hit
  WM = Write Miss
  SH/CRW = Snoop Hit, Cacheable Read/Write
8.5 Timing Examples
This section shows timing diagrams for various scenarios.
Figure 8-17. Fastest Single-Beat Reads
[Timing diagram, bus cycles 1–12; signals shown: BR, BG, ABB, TS, A[0–31] (CPU A), TT[0–4] (Read), TBST, GBL, AACK, ARTRY, DBG, DBB, D[0–63] (In), TA, DRTRY, TEA]
Figure 8-18 illustrates the fastest single-beat writes supported by the 750GX. All bidirectional signals are tristated between bus tenures.
Figure 8-18. Fastest Single-Beat Writes
[Timing diagram, bus cycles 1–12; signals shown: BR, BG, ABB, TS, A[0–31] (CPU A), TT[0–4] (SBW), TBST, GBL, AACK, ARTRY, DBG, DBB, D[0–63] (Out), TA, DRTRY, TEA]
Figure 8-19 shows three ways to delay single-beat reads using data-delay controls:
• The TA signal can remain negated to insert wait states in clock cycles 3 and 4.
• For the second access, DBG could have been asserted in clock cycle 6.
• In the third access, DRTRY is asserted in clock cycle 11 to flush the previous data.
Note: All bidirectional signals are tristated between bus tenures.
Figure 8-20 shows data-delay controls in a single-beat write operation. Note that all bidirectional signals are tristated between bus tenures. Data transfers are delayed in the following ways:
• The TA signal is held negated to insert wait states in clocks 3 and 4.
• In clock 6, DBG is held negated, delaying the start of the data tenure.
The last access is not delayed (DRTRY is valid only for read operations).
Figure 8-20.
Figure 8-21 shows the use of data-delay controls with burst transfers. Note that all bidirectional signals are tristated between bus tenures. Also note:
• The first data beat of burst read data (clock 0) is the critical quadword.
• The write burst shows the use of TA signal negation to delay the third data beat.
• The final read burst shows the use of DRTRY on the third data beat.
Figure 8-22 shows the use of the TEA signal. Note that all bidirectional signals are tristated between bus tenures. Also note:
• The first data beat of the read burst (in clock 0) is the critical quadword.
• The TEA signal truncates the burst write transfer on the third data beat.
• The 750GX eventually causes an exception to be taken on the TEA event.
Figure 8-22.
8.6 Optional Bus Configuration
The 750GX supports optional bus configurations that are selected during the negation of the HRESET signal. The operation and selection of the optional bus configuration are described in the following sections.
8.6.1 32-Bit Data Bus Mode
The 750GX supports an optional 32-bit data bus mode.
Figure 8-23. 32-Bit Data-Bus Transfer (8-Beat Burst)
[Timing diagram; signals shown: TS, ABB, ADDR, TBST, AACK, ARTRY, DBB, DH[0–31] (beats 0–7), TA, DRTRY, TEA]
An example of a two-beat data transfer (with DRTRY asserted during each data tenure) is shown in Figure 8-24.
Figure 8-24. 32-Bit Data-Bus Transfer (2-Beat Burst with DRTRY)
[Timing diagram; signals shown: TS, ABB, ADDR, TBST, AACK, ARTRY, DBB, DH[0–31] (beats 0–1), TA, DRTRY, TEA]
The 750GX selects 64-bit or 32-bit data bus mode at startup by sampling the state of the TLBISYNC signal at the negation of HRESET. If the TLBISYNC signal is negated at the negation of HRESET, the 750GX enters 64-bit data mode. If TLBISYNC is asserted at the negation of HRESET, the 750GX enters 32-bit data mode. Table 8-3 on page 296 describes the burst ordering when the 750GX is in 32-bit mode.
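The mode selection and its effect on the number of data beats can be sketched as two small functions. This is an illustration under assumptions: TLBISYNC is passed as a logical "asserted" flag rather than a pin level, transfers are assumed not to straddle a bus-width boundary, and the function names are hypothetical.

```python
def data_bus_width_bits(tlbisync_asserted_at_hreset_negation: bool) -> int:
    """Return the data bus width selected when HRESET is negated."""
    return 32 if tlbisync_asserted_at_hreset_negation else 64


def data_beats(size_bytes: int, bus_width_bits: int) -> int:
    """Return the number of data beats needed for a transfer.

    Assumes the operand does not straddle a bus-width boundary.
    """
    bytes_per_beat = bus_width_bits // 8
    # Ceiling division, with a minimum of one beat.
    return max(1, -(-size_bytes // bytes_per_beat))
```

In 32-bit mode this gives one beat for transfers of up to 4 bytes, two beats for an 8-byte transfer, and eight beats for a 32-byte burst, consistent with the one-, two-, or eight-beat behavior described for that mode.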
8.7 Processor State Signals
This section describes the 750GX's support for atomic memory update through the use of the lwarx and stwcx. opcode pair, and includes a description of the TLB Invalidate Synchronize (TLBISYNC) input.
8.7.1 Support for the lwarx and stwcx. Instruction Pair
The Load Word and Reserve Indexed (lwarx) and the Store Word Conditional Indexed (stwcx.) instructions provide a means for atomic memory updating.
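The load-and-reserve, store-conditional pattern can be illustrated with a toy software model. This is a deliberately simplified sketch and not a description of the 750GX reservation hardware: it tracks one reservation per model instance, compares full addresses rather than a reservation granule, and uses hypothetical class and method names.

```python
class ReservationModel:
    """Toy model of a single processor's lwarx/stwcx. reservation."""

    def __init__(self):
        self.memory = {}
        self.reserved_addr = None

    def lwarx(self, addr: int) -> int:
        """Load a word and establish a reservation on its address."""
        self.reserved_addr = addr
        return self.memory.get(addr, 0)

    def stwcx(self, addr: int, value: int) -> bool:
        """Store only if the reservation is still held; always clear it."""
        ok = self.reserved_addr == addr
        self.reserved_addr = None
        if ok:
            self.memory[addr] = value
        return ok

    def snooped_store(self, addr: int, value: int) -> None:
        """Model another master's store, which cancels a matching reservation."""
        self.memory[addr] = value
        if self.reserved_addr == addr:
            self.reserved_addr = None


def atomic_increment(cpu: ReservationModel, addr: int) -> int:
    """Retry the load-reserve/store-conditional pair until the store succeeds."""
    while True:
        old = cpu.lwarx(addr)
        if cpu.stwcx(addr, old + 1):
            return old + 1
```

The retry loop in atomic_increment mirrors the usual software idiom: if another master writes the location between the two instructions, the conditional store fails and the sequence starts over with a fresh load.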
Figure 8-25. IEEE 1149.1a-1993 Compliant Boundary-Scan Interface
[Diagram signals: TDI (Test Data Input), TMS (Test Mode Select), TCK (Test Clock Input), TDO (Test Data Output), TRST (Test Reset)]
8.9 Using Data-Bus Write-Only
The 750GX supports split-transaction pipelined transactions. It supports a limited out-of-order capability for its own pipelined transactions through the data-bus write-only (DBWO) signal.
Note that although the 750GX can pipeline any write transaction behind the read transaction, special care should be used when using the enveloped write feature. It is envisioned that most system implementations will not need this capability; for these applications, DBWO should remain negated. In systems where this capability is needed, DBWO should be asserted under the following scenario:
1.
9. L2 Cache
This chapter describes the 750GX microprocessor's implementation of the 1-MB L2 cache.
Note: The L2 cache is initially disabled following a power-on or hard reset. Before enabling the L2 cache, configuration parameters must be set in the L2 Cache Control Register (L2CR), and the L2 tags must be globally invalidated. The L2 cache should be initialized during system start-up (see Section 9.4 on page 329).
9.
If multiple read requests from the L1 caches are pending, the L2 cache can perform hit-under-miss operations, supplying the available instruction or data while a bus transaction for previous L2 cache misses is being performed. The L2 cache also supports miss-under-miss operation. Up to four outstanding misses are supported: one miss from the instruction cache and three from the data cache, or four data-cache misses.
Whenever a way in the set is referenced, the LRU bits are updated. The new value of the LRU bits depends on the old value, which way is currently being accessed, and whether the operation is an invalidation or a load/store. Table 9-2 shows the new value of the LRU bits for the various combinations of these variables. An ‘x’ indicates don’t care, while a ‘-’ indicates no change from previous value.
Table 9-2.
Table 9-3. Effect of Locked Ways on LRU Interpretation (Page 2 of 2)
  LRU Bits  Lock Bits  LRU Way
  110       x011       1
  1x1       xxx0       3
  1x1       xx01       2
  101       0x11       0
  111       x011       1
Figure 9-1. L2 Cache
[Block diagram labels: L1 data cache castout and single-beat stores; L1 data cache reload; load/store critical word; instruction cache reload; store queue (ST0, ST1, SNP); L2 castout/snoop queue; L2 reload queue; queue sizes of 3, 5, and 2 lines; 64-bit, 72-bit, and 256-bit data paths with ECC; L2 SRAM (1 MB); data-out request; 60x bus]
The execution of the Store Word Conditional Indexed (stwcx.) instruction results in single-beat writes from the L1 data cache. These single-beat writes are processed by the L2 cache according to hit/miss status, L1 and L2 write-through configuration, and reservation-active status. If the address associated with the stwcx. instruction misses in the L2 cache, or if the reservation is no longer active, the stwcx.
9.3 L2 Cache Control Register (L2CR)
The L2 Cache Control Register is used to configure and enable the L2 cache. The L2CR is a supervisor-level read/write, implementation-specific register that is accessed as Special Purpose Register (SPR) 1017. The contents of the L2CR are cleared during power-on reset. For a full description of L2CR and its bits, see Section 2.1.5, L2 Cache Control Register (L2CR), on page 81.
9.
9.6 L2 Cache Used as On-Chip Memory
The L2 cache can be configured to be unlocked, partially locked, or completely locked. When configured to be unlocked, the L2 cache is 4-way set-associative, with 32 bytes per sector, two sectors per block. When configured to be completely locked, the L2 cache is a 1-MB on-chip memory (OCM) that is explicitly managed by software.
9.6.1.1 Loading the Locked L2 Cache
Contents are loaded into the L2 cache simply by executing load instructions to cacheable addresses that miss in the L1. Note that instructions to be locked in the L2 cache are loaded as data. Only one access to each 32-byte cache block is needed to allocate the entire block in the cache.
Note: While lines are being allocated in way 0 using this procedure, the cache behaves as a direct-mapped cache.
The dcbz instruction has no effect on the L2-cache state, whether the state is locked or not. The dcbi instruction causes invalidation of the block in the case of an L2 hit, for both normal and locked caches.
9.7 Data-Only and Instruction-Only Modes
The 750GX microprocessor supports a data-only mode of L2 operation that can be used for test (as described in Section 9.8.
9.8.2 L2 Cache Testing

A typical test for verifying the proper operation of the 750GX microprocessor’s L2-cache memory follows this sequence:

1. Initialize the L2 test sequence by disabling address translation to invoke the default WIMG setting (0011). Set L2CR[DO] and L2CR[TS], and perform a global invalidation of the L1 data cache and the L2 cache. The L1 instruction cache can remain enabled to improve execution efficiency.
2.
10. Power and Thermal Management

The 750GX microprocessor is specifically designed for low-power operation. It provides both automatic and program-controlled power reduction modes for progressive reduction of power consumption. It also provides a thermal assist unit (TAU) for on-chip thermal measurement, which allows sophisticated thermal management for high-performance portable systems.
Figure 10-1. 750GX Power States

[Figure: state diagram of the Full On, Doze, Nap, and Sleep states connected by transitions T1 through T8; the diagram carries the annotation "allow snoop".]

T1: HID0(Doze) = 1 and MSR(POW) 0 → 1
T2: HRESET, SRESET, INT, SMI, MCP, DEC, PFM, machine-check interrupts, thermal-management interrupt
T3: HID0(Nap) = 1 and MSR(POW) 0 → 1
T4: HRESET, SRESET, INT, SMI, MCP, DEC
T5: HID0(Sleep) = 1 and MSR(POW) 0 → 1
T6: HRESET, SRESET, INT, SMI, MCP
T7: QACK 0 → 1
T8: QACK 1 → 0

Table 10-1.
10.2.1 Power Management Modes

The following sections describe the characteristics of the 750GX’s power management modes, the requirements for entering and exiting the various modes, and the system capabilities provided by the 750GX while the power management modes are active.
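The entry and exit conditions listed with Figure 10-1 can be sketched as a small C model. This is a reading of the flattened figure, not a definitive description: it assumes T1, T3, and T5 lead from Full On into doze, nap, and sleep, and that T2, T4, and T6 lead back to Full On. The QACK transitions (T7 and T8) are not modeled, and the machine-check interrupts listed for T2 are folded into the doze case. All type and function names are hypothetical.

```c
#include <stdbool.h>

typedef enum { FULL_ON, DOZE, NAP, SLEEP } power_state;

/* Wake sources in the order the T2 list gives them. The comparisons in
 * wakes() rely on this ordering. */
typedef enum {
    SRC_HRESET, SRC_SRESET, SRC_INT, SRC_SMI, SRC_MCP,
    SRC_DEC, SRC_PFM, SRC_THERMAL
} wake_source;

/* T1/T3/T5: a power-saving mode is entered only from Full On, when the
 * matching HID0 bit is set and MSR(POW) changes from 0 to 1. */
power_state enter_mode(power_state current, power_state requested)
{
    return (current == FULL_ON) ? requested : current;
}

/* T2/T4/T6: returns true if the source returns the processor to Full On. */
bool wakes(power_state s, wake_source src)
{
    switch (s) {
    case DOZE:  return true;              /* T2: every listed source    */
    case NAP:   return src <= SRC_DEC;    /* T4: HRESET through DEC     */
    case SLEEP: return src <= SRC_MCP;    /* T6: HRESET through MCP     */
    default:    return false;             /* already Full On            */
    }
}
```

The model makes the progression visible: each deeper mode responds to a shorter list of wake sources.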
750GX will then be able to respond to a snoop cycle. Assertion of QACK following the snoop cycle will again disable the 750GX’s snoop capability. The 750GX’s power dissipation while in nap mode with QACK deasserted is the same as the power dissipation while in doze mode. The 750GX also allows dynamic switching between nap and doze modes to allow the use of nap mode without sacrificing hardware snoop coherency.
10.2.1.4 Sleep Mode

Sleep mode consumes the least amount of power of the four modes since all functional units are disabled. To conserve the maximum amount of power, the PLL can be disabled by placing the PLL_CFG signals in the PLL bypass mode, and disabling SYSCLK.
10.2.2 Power Management Software Considerations

Since the 750GX is a dual-issue processor with out-of-order execution capability, care must be taken in how the power management mode is entered. Furthermore, nap and sleep modes require all outstanding bus operations to be completed before these power management modes are entered.
Note: If the PLL software configuration is used, sufficient time must be allowed for the chosen PLL to lock. See the PowerPC 750GX RISC Microprocessor Datasheet for more information.

The following sequence can be used to change processor clock frequency. Assume PLL0 is currently the source for the processor clock.
10.3.3 Dual PLL Implementation

Switching between the two PLLs on the 750GX is intended to be a seamless, 3-cycle operation. As shown in Figure 10-2, the two PLL outputs feed a multiplexer (MUX), controlled by a signal from the PLL select logic.

Figure 10-2.
Figure 10-3. Dual PLL Switching Example, 3X to 4X

[Figure: timing diagram showing SYSCLK, the 3X and 4X PLL outputs, and GCLK, with the interval in which the clock is blocked.]

10.4 Thermal Assist Unit

With the increasing power dissipation of high-performance processors and operating conditions that span a wider range of temperatures than desktop systems, thermal management becomes an essential part of system design to ensure reliable operation of portable systems.
Figure 10-4. Thermal Assist Unit Block Diagram

[Figure: block diagram containing the thermal sensor, DAC, thermal sensor control logic, decoder, latch, interrupt control, and the THRM1, THRM2, THRM3, and THRM4 registers, with a thermal interrupt request output (0x1700).]

The TAU provides thermal control by periodically comparing the 750GX’s junction temperature against user-programmed thresholds, and generating a thermal-management interrupt if the threshold values are crossed.
10.4.2.1 TAU Single-Threshold Mode

When the TAU is configured for single-threshold mode, either THRM1 or THRM2 can be used to contain the threshold value, and a thermal-management interrupt is generated when the threshold value is crossed.
Table 10-3. Valid THRM1 and THRM2 Bit Settings (Page 2 of 2)

TIN1  TIV1  TID  TIE  V  Description
0     1     0    x    1  The junction temperature is less than the threshold, and, as a result, the thermal-management interrupt is not generated for TIE = 1.
1     1     0    x    1  The junction temperature is greater than the threshold, and, as a result, the thermal-management interrupt is generated if TIE = 1.
10.4.2.4 Power Saving Modes and TAU Operation

The static power saving modes provided by the 750GX (the nap, doze, and sleep modes) allow the temperature of the processor to be lowered quickly, and can be invoked through the use of the TAU and associated thermal-management interrupt. The TAU remains operational in the nap and doze modes, and in sleep mode as long as the SYSCLK signal input remains active.
The bit field settings of the ICTC SPR are shown in Table 10-4 on page 348.

Table 10-4. ICTC Bit Field Settings

Bits   Name      Description
0–22   Reserved  Bits reserved for future use. The system software should always write zeros to these bits when writing to the THRM SPRs.
23–30  FI        Instruction forwarding interval expressed in processor clocks.
                 0x00  0 clock cycle
                 0x01  1 clock cycle
                 . .
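As a worked example of the FI field layout in Table 10-4, the following C sketch places an 8-bit forwarding interval into bits 23–30 of a 32-bit ICTC image. It assumes the usual PowerPC numbering in which bit 0 is the most-significant bit, so bits 23–30 sit one position above the least-significant bit. The macro and function names are hypothetical, and bit 31, which this excerpt does not describe, is left untouched.

```c
#include <stdint.h>

/* FI occupies bits 23-30 (bit 0 = MSB), i.e. an 8-bit field shifted
 * left by one position in conventional LSB-0 terms. */
#define ICTC_FI_SHIFT 1u
#define ICTC_FI_MASK  (0xFFu << ICTC_FI_SHIFT)

/* Returns the ICTC image with FI replaced and all other bits preserved. */
uint32_t ictc_set_fi(uint32_t ictc, uint8_t fi)
{
    return (ictc & ~ICTC_FI_MASK) | ((uint32_t)fi << ICTC_FI_SHIFT);
}

/* Extracts the FI field from an ICTC image. */
uint8_t ictc_get_fi(uint32_t ictc)
{
    return (uint8_t)((ictc & ICTC_FI_MASK) >> ICTC_FI_SHIFT);
}
```

For instance, an interval of one clock (FI = 0x01) corresponds to the register image 0x00000002 when every other bit is zero.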
11. Performance Monitor and System Related Features

The performance-monitor facility provides the ability to monitor and count predefined events such as processor clocks, misses in the instruction cache, data cache, or L2 cache, types of instructions dispatched, mispredicted branches, and other occurrences. The count of such events (which might be an approximation) can be used to trigger the performance-monitor exception.
The action taken when a performance-monitor exception occurs depends on the programmable events. To help track which part of the code was being executed when an exception was signaled, the address of the last completed instruction during that cycle is saved in the Sampled Instruction Address (SIA) register. The SIA is not updated if no instruction completed in the cycle in which the exception was taken.
11.2.1 Performance-Monitor Registers

This section describes the registers used by the performance monitor.

11.2.1.1 Monitor Mode Control Register 0 (MMCR0)

The Monitor Mode Control Register 0 (MMCR0) is a 32-bit SPR provided to specify events to be counted and recorded. MMCR0 can be written to only in supervisor mode. User-level software can read the contents of MMCR0 by issuing an mfspr instruction to UMMCR0, described in Section 11.2.1.
Software is expected to use the mtspr instruction to explicitly set PMC to nonoverflowed values. Setting an overflowed value might cause an erroneous exception. For example, if both MMCR0[ENINT] and either PMC1INTCONTROL or PMCINTCONTROL are set and the mtspr instruction loads an overflow value, an interrupt signal might be generated without event counting having taken place. The event to be monitored can be chosen by setting MMCR0[19:31].
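A short C sketch can illustrate how software might choose a nonoverflowed starting value. It rests on an assumption that this excerpt does not state: that a counter counts as overflowed when its most-significant bit (bit 0) is set, as on other PowerPC 750-family parts. The names pmc_is_overflowed and pmc_preload_for are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the counter is "overflowed" once its top bit is set. */
#define PMC_OVERFLOW_BIT 0x80000000u

/* Returns nonzero if the value would already be in the overflowed state. */
int pmc_is_overflowed(uint32_t value)
{
    return (value & PMC_OVERFLOW_BIT) != 0;
}

/* Returns a starting value that reaches the overflowed state after
 * exactly `events` increments. `events` must be in 1..0x80000000 so
 * that the result itself is nonoverflowed. */
uint32_t pmc_preload_for(uint32_t events)
{
    assert(events >= 1u && events <= PMC_OVERFLOW_BIT);
    return PMC_OVERFLOW_BIT - events;
}
```

Checking a candidate value with pmc_is_overflowed before issuing mtspr guards against the erroneous exception described above.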
Table 11-3. PMC2 Events—MMCR0[26:31] Select Encodings (Page 2 of 2)

Encoding  Description
00 0101   Counts L1 instruction-cache misses.
00 0110   Counts ITLB misses.
00 0111   Counts L2 instruction misses.
00 1000   Counts branches predicted or resolved not taken.
00 1001   Reserved
00 1010   Counts times a reserved load operation completes.
00 1011   Counts completed load-and-store instructions.
00 1100   Counts snoops to the L1 and the L2.
Table 11-4. PMC3 Events—MMCR1[0:4] Select Encodings (Page 2 of 2)

Encoding    Description
1 0000      Number of branches in the second speculative stream that resolve correctly.
1 0001      Number of cycles the BPU stalls due to LR or CR unresolved dependencies.
All others  Reserved. Might be used in a later revision.

Bits MMCR1[5:9] specify events associated with PMC4, as shown in Table 11-5.

Table 11-5.
11.2.1.7 Sampled Instruction Address Register (SIA)

The Sampled Instruction Address Register (SIA) is a supervisor-level register that contains the effective address of an instruction executing at or around the time that the processor signals the performance-monitor interrupt condition. The SIA is shown in Sampled Instruction Address Register (SIA) on page 75.
11.4 Event Selection

Event selection is handled through MMCR0 and MMCR1.

• The four event-select fields in MMCR0 and MMCR1 are:
  – MMCR0[19:25] PMC1SELECT: PMC1 input selector. 128 events selectable; 25 defined. See Table 11-2 on page 352.
  – MMCR0[26:31] PMC2SELECT: PMC2 input selector. 64 events selectable; 21 defined. See Table 11-3 on page 352.
  – MMCR1[0:4] PMC3SELECT: PMC3 input selector. 32 events selectable and defined.
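The field positions above can be turned into register values with a small helper. The following C sketch assumes PowerPC bit numbering (bit 0 is the most-significant bit of the 32-bit register) and uses hypothetical function names; it illustrates the arithmetic only and does not access the SPRs.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering bits first..last, where bit 0 is the MSB and bit 31 the LSB. */
uint32_t ibm_field_mask(unsigned first, unsigned last)
{
    assert(first <= last && last <= 31u);
    unsigned width = last - first + 1u;
    uint32_t ones = (width == 32u) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return ones << (31u - last);
}

/* Returns reg with bits first..last replaced by value; other bits are kept. */
uint32_t ibm_field_set(uint32_t reg, unsigned first, unsigned last,
                       uint32_t value)
{
    uint32_t mask = ibm_field_mask(first, last);
    return (reg & ~mask) | ((value << (31u - last)) & mask);
}
```

For example, selecting the PMC2 encoding 00 0101 (L1 instruction-cache misses in Table 11-3) is ibm_field_set(mmcr0, 26, 31, 0x05), and the PMC3 encoding 1 0000 is ibm_field_set(mmcr1, 0, 4, 0x10).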
11.6 Debug Support

11.6.1 Overview

The 750GX provides the following debug support features:

• Branch trace
• Single-step instruction trace
• Instruction-address breakpoint
• Data-address breakpoint
• Externally triggered soft stop

The trace mode allows either a single step trace if MSR[SE] = 1 or a branch trace if MSR[BE] = 1.
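The trace-mode selection can be expressed as a small decode of an MSR image. In the C sketch below, the bit positions (SE at bit 21 and BE at bit 22, counting from the most-significant bit) come from the 32-bit PowerPC architecture definition of the MSR rather than from this section, the names are hypothetical, and giving single-step precedence when both bits are set is an assumption made only for illustration.

```c
#include <stdint.h>

#define MSR_SE 0x00000400u   /* MSR bit 21: single-step trace enable */
#define MSR_BE 0x00000200u   /* MSR bit 22: branch trace enable      */

typedef enum { TRACE_NONE, TRACE_SINGLE_STEP, TRACE_BRANCH } trace_mode;

/* Decodes the trace mode requested by an MSR image. */
trace_mode trace_mode_from_msr(uint32_t msr)
{
    if (msr & MSR_SE)
        return TRACE_SINGLE_STEP;  /* assumed to win if both bits are set */
    if (msr & MSR_BE)
        return TRACE_BRANCH;
    return TRACE_NONE;
}
```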
• Internal registers (such as the general-purpose, floating-point, and processor version registers)
• Data cache
• Instruction cache
• L2 cache
• L2 tag
• Data tag
• Instruction tag
• Data translation lookaside buffer (TLB)
• Data Segment Registers
• Instruction TLB
• Instruction Segment Registers
• Instruction Block-Address-Translation (BAT) Registers
• External memory

INTMEM will allow reading and writing the above arrays while accessing a c
11.8 Resets

The 750GX supports two types of resets: a hard and a soft reset.

11.8.1 Hard Reset

The hard reset is triggered by the assertion of the hard reset pin, HRESET.
11.8.3 Reset Sequence

Figure 11-2. Reset Sequence

[Figure: flowchart with entry points "Hard Reset" and "Soft Reset"; steps "Scan in 0s > 255 clocks", "Stop Chip Clks", "Perform RISCWatch Functions", "Chip Clks Running", and "System Reset Interrupt Routine"; decisions "Hard Reset?", "JTAG_IR=FFRZ?", and "RISCWatch cmd = RESUME?"; annotation "Hard Reset = 0xFFF00100".]
11.9 Checkstops

A checkstop causes the processor to halt and assert the checkstop output pin, CKSTP_OUT. Once the 750GX enters a checkstop state, only a hard reset can clear the processor.

11.9.1 Checkstop Sources

Following is the list of checkstop sources:

• Machine Check with MSR[ME] = 0. If MSR[ME] = 0 when a machine-check interrupt occurs, then the checkstop state is entered.
Table 11-7 shows the control bits for HID2.

Table 11-7. HID2 Checkstop Control Bits

Bits  Field Name  Description                                                      Hard Reset State
29    ICPE        Enable L1 instruction-cache or instruction-tag parity checking.
30    DCPE        Enable L1 data-cache or data-tag parity checking.
31    L2PE        Enable L2 Tag parity checking.

Table 11-8.
11.10 750GX Parity

Parity is implemented for the following arrays: instruction cache, instruction tag, data cache, data tag, and L2 tag. All parity errors, when parity is enabled, result in either a machine-check or checkstop interrupt that is not recoverable. For all of the following arrays, the parity bit for a given set of data is 1 if the data contains an odd number of ones (even parity).
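The even-parity rule stated above can be illustrated with a few lines of C. This is a software sketch of the definition only; the function name is hypothetical, and the number of data bits covered by each parity bit in the actual arrays is not given in this excerpt, so a 32-bit word is used purely as an example.

```c
#include <stdint.h>

/*
 * Returns the even-parity bit for `data`: 1 when the data contains an
 * odd number of ones, so that data plus parity together always hold an
 * even number of ones.
 */
unsigned even_parity_bit(uint32_t data)
{
    unsigned parity = 0;

    while (data != 0) {
        parity ^= 1u;        /* toggle once per set bit      */
        data &= data - 1u;   /* clear the lowest set bit     */
    }
    return parity;
}
```

A checker would recompute this bit on each read and flag an error when it differs from the stored parity bit.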
11.10.1 Parity Control and Status

Parity is enabled with the Hardware-Implementation-Dependent Register 2 (HID2). For a diagram of this register and a description of its fields, see Hardware-Implementation-Dependent Register 2 (HID2) on page 71. The HID2 SPR number is 1016 decimal (spr[5-9] = 11111, spr[0-4] = 11000). The status bits (25:27) are set when a parity error is detected and cleared when the HID2 Register is written.

11.10.
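The split of the HID2 SPR number given in Section 11.10.1 (1016 decimal, spr[5-9] = 11111, spr[0-4] = 11000) can be checked with a short C sketch. It follows the labeling used in that section, in which spr[0-4] holds the low-order five bits of the SPR number and spr[5-9] the high-order five bits. The type and function names are hypothetical.

```c
/* The two 5-bit halves of a 10-bit SPR number. */
typedef struct {
    unsigned spr_0_4;   /* low-order five bits of the SPR number  */
    unsigned spr_5_9;   /* high-order five bits of the SPR number */
} spr_halves;

/* Splits a 10-bit SPR number into its two 5-bit halves. */
spr_halves split_spr(unsigned spr_number)
{
    spr_halves h;

    h.spr_0_4 = spr_number & 0x1Fu;
    h.spr_5_9 = (spr_number >> 5) & 0x1Fu;
    return h;
}
```

For 1016 = 31 × 32 + 24, the halves are 31 (binary 11111) and 24 (binary 11000), matching the values quoted for HID2.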
Acronyms and Abbreviations

BAT    block-address translation
BHT    branch history table
BIST   built-in self test
BIU    bus interface unit
BPU    branch processing unit
BSDL   Boundary-Scan Description Language
BTIC   branch target instruction cache
BUID   bus unit ID
CMOS   complementary metal-oxide semiconductor
COP    common on-chip processor
CQ     completion queue
CR     Condition Register
CTR    Count Register
DABR   Data Address Breakpoint Register
FPR    Floating Point Register
FPSCR  Floating-Point Status and Control Register
FPU    floating-point unit
GPR    General Purpose Register
HIDn   Hardware-Implementation-Dependent Register
IABR   Instruction Address Breakpoint Register
IBAT   instruction BAT
ICTC   Instruction Cache Throttling Control Register
IEEE   Institute of Electrical and Electronics Engineers
IMMU   instruction MMU
IQ     instruction queue
ITLB   instruction translation look
NaN    not a number
no-op  no operation
OEA    operating environment architecture
PID    processor identification tag
PLL    phase-locked loop
PLRU   pseudo least recently used
PMCn   Performance-Monitor Counter Registers
POR    power-on reset
POWER  Performance Optimized with Enhanced RISC architecture
PTE    page table entry
PTEG   page-table-entry group
PVR    Processor Version Register
RAW    read-after-write
RISC   reduced instruction set computin
THRMn   Thermal-Management Registers
TLB     translation lookaside buffer
TTL     transistor-to-transistor logic
UIMM    unsigned immediate value
UISA    user instruction set architecture
UMMCRn  User Monitor Mode Control Registers
UPMCn   User Performance-Monitor Counter Registers
USIA    User Sampled Instruction Address Register
VEA     virtual environment architecture
WAR     write-after-read
WAW     write-after-write
WIMG    write-through/caching-inhibit
User’s Manual IBM PowerPC 750GX and 750GL RISC Microprocessor Index A AACK (address acknowledge) signal, 262 ABB (address bus busy) signal, 285 Address bus address tenure, 284 address transfer An, 254 APE, 294 address transfer attribute CI, 260 GBL, 261 TBST, 259 , 294 TSIZn, 258 , 294 TTn, 256 , 294 WT, 260 address transfer start TS, 253 , 292 address transfer termination AACK, 262 ARTRY, 263 terminating address transfer, 300 arbitration signals, 251 , 285 bus parking, 291 Address translation, see Memory
User’s Manual IBM PowerPC 750GX and 750GL RISC Microprocessor L2 interface cache global invalidation, 329 cache initialization, 329 cache testing, 333 dcbi, 328 eieio, 328 operation, 323 stwcx. execution, 328 sync, 328 load/store operations, processor initiated, 130 miss, 222 operations cache block push operations, 328 data cache transactions, 140 instruction cache block fill, 139 snoop response to bus transactions, 143 PLRU replacement, 137 stwcx.
User’s Manual IBM PowerPC 750GX and 750GL RISC Microprocessor register settings MSR, 162 SRR0/SRR1, 156 reset exception, 163 returning from an exception handler, 161 summary table, 152 system call exception, 171 terminology, 151 thermal management interrupt exception, 174 Execution synchronization, 90 Execution unit timing examples, 225 Execution units, 31 External control instructions, 117 , 300 F Features, list, 25 Finish cycle, definition, 210 Floating-Point Execution Models—UISA, 83 Floating-point mode
User’s Manual IBM PowerPC 750GX and 750GL RISC Microprocessor integer, 99 byte reverse instructions, 102 floating-point move, 98 floating-point store, 104 integer load, 99 integer multiple, 102 integer store, 101 memory synchronization, 113 , 114 string instructions, 103 memory control instructions, 115 , 119 memory synchronization instructions, 113 , 114 processor control instructions, 108 , 113 , 118 reserved instructions, 89 rfi, 161 stwcx., 162 support for lwarx/stwcx.
User’s Manual IBM PowerPC 750GX and 750GL RISC Microprocessor O OEA exception mechanism, 151 memory management specifications, 179 registers, 60 Operand conventions, 82 Operand placement and performance, 233 Operating environment architecture (OEA), 41 Operations bus operations caused by cache control instructions, 141 instruction cache block fill, 139 read operation, 140 response to snooped bus transactions, 143 single-beat write operations, 311 Overview, 23 PowerPC architecture operating environment arc
User’s Manual IBM PowerPC 750GX and 750GL RISC Microprocessor DABR, 62 DAR, 61 DEC, 62 DSISR, 61 EAR, 62 HID0, 65 , 337 HID1, 70 IABR, 64 ICTC, 77 , 348 L2CR, 81 , 329 MMCR0, 72 , 172 , 351 MMCR1, 74 , 172 , 351 MSR, 60 PMC1 and PMC2, 44 PMCn, 74 , 172 PVR, 60 SDR1, 61 SIA, 75 , 172 , 355 SPRGn, 61 SPRs for performance monitor, 349 SRn, 61 SRR0/SRR1, 61 THRMn, 78 time base (TB), 62 user-level CR, 59 CTR, 59 FPRn, 59 FPSCR, 59 GPRn, 59 LR, 59 time base (TB), 60 , 62 UMMCR0, 73 UMMCR1, 74 UPMCn, 75 USIA, 76 ,
Stall, definition, 211
Static branch prediction, 216, 229
stwcx.
Revision Log

Revision Date: February 27, 2004
Contents of Modification: Initial release (version 1.0)

Revision Date: September 30, 2004
Contents of Modification: (version 1.1) On page 26, added the following to the list under "2-stage load/store unit (LSU)": "4-entry load queue." On page 32, added a third paragraph under Section 1.2.2.3 Load/Store Unit.

Revision Date: March 27, 2006