e300 Processor Core Overview
MPC8308 PowerQUICC II Pro Processor Reference Manual, Rev. 1
7-36 Freescale Semiconductor
an interim 52-bit virtual address and hashed page tables for generating 32-bit physical addresses. The
MMUs in the e300 core rely on the interrupt processing mechanism for the implementation of the paged
virtual memory environment and for enforcing protection of designated memory areas.
Instruction and data TLBs provide address translation in parallel with the on-chip cache access, incurring
no additional time penalty in the event of a TLB hit. A TLB is a cache of the most recently used page table
entries. Software is responsible for maintaining the consistency of the TLB with memory. The core TLBs
are 64-entry, two-way, set-associative caches that contain instruction and data address translations. The
core provides hardware assist for software table search operations through the hashed page table on TLB
misses. Supervisor software can invalidate TLB entries selectively.
For instructions and data that correspond to block address translation, the e300 core provides independent
eight-entry BAT arrays. These entries define blocks that can vary from 128 Kbytes to 256 Mbytes. The
BAT arrays are maintained by system software. The HID2[HBE] bit is added to the e300 to enable or disable the four additional pairs of BAT registers. Regardless of the setting of HID2[HBE], however, these BATs remain accessible through the mfspr and mtspr instructions.
As specified by the PowerPC architecture, the hashed page table is a variable-sized data structure that
defines the mapping between virtual page numbers and physical page numbers. The page table size is a
power of two, and its starting address is a multiple of its size.
Also as specified by the PowerPC architecture, the page table contains a number of PTEGs. A PTEG
contains 8 PTEs of 8 bytes each; therefore, each PTEG is 64 bytes long. PTEG addresses are entry points
for table search operations.
7.4.6 Instruction Timing
The e300 core is a pipelined superscalar processor core. Because instruction processing is divided into a series of stages, an instruction does not require all of the resources of an execution unit at the same time. For example, after an instruction completes the decode stage, it can pass on to the next stage while the subsequent instruction advances into the decode stage. This improves the throughput of the instruction flow. A single floating-point instruction may take three cycles to execute, but if there are no stalls in the floating-point pipeline, a series of floating-point instructions can sustain a throughput of one instruction per cycle.
The core instruction pipeline has four major pipeline stages, described as follows:
• The fetch pipeline stage primarily involves retrieving instructions from the memory system and
determining the location of the next instruction fetch. Additionally, if possible, the BPU decodes
branches during the fetch stage and folds out branch instructions before the dispatch stage.
• The dispatch pipeline stage is responsible for decoding the instructions supplied by the instruction
fetch stage and determining which of the instructions are eligible to be dispatched in the current
cycle. In addition, the source operands of the instructions are read from the appropriate register file
and dispatched with the instruction to the execute pipeline stage. At the end of the dispatch pipeline
stage, the dispatched instructions and their operands are latched by the appropriate execution unit.
• In the execute pipeline stage, each execution unit with an instruction executes the selected
instruction (perhaps over multiple cycles), writes the instruction's result into the appropriate
rename register, and notifies the completion stage when the execution has finished. In the case of