Specifications
E PENTIUM® PRO PROCESSOR AT 150, 166, 180, and 200 MHz
[Figure 4 diagram: the Reservation Station (RS) exchanges µops with the
Instruction Pool (ROB) and dispatches them through five ports:
Port 0 (FEU, IEU), Port 1 (JEU, IEU), Port 2 (AGU, Load), and
Ports 3,4 (AGU, Store).]
RS - Reservation Station
EU - Execution Unit
FEU - Floating Point EU
IEU - Integer EU
JEU - Jump EU
AGU - Address Generation Unit
ROB - ReOrder Buffer
Figure 4. Inside the Dispatch/Execute Unit
Note that the actual algorithm employed by this
execution-scheduling process is vitally important to
performance. If only one µop per resource becomes
data-ready per clock cycle, then there is no choice to
make. But if several are available, the scheduler must
choose among them. The Pentium Pro processor uses a
pseudo First In, First Out (FIFO) scheduling algorithm
favoring back-to-back µops.
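
The selection problem described above can be illustrated with a small sketch. It is not the processor's actual scheduler, which the text only characterizes as "pseudo FIFO": it simply picks the oldest data-ready µop for each port and does not model the preference for back-to-back µops. The `Uop` fields and the `select_for_dispatch` function are hypothetical names introduced for illustration, and the sketch assumes each µop can use exactly one port.

```python
from dataclasses import dataclass


@dataclass
class Uop:
    seq: int     # allocation order in the reservation station (lower = older)
    port: int    # assumption: the single dispatch port this uop can use
    ready: bool  # True once all source operands are available


def select_for_dispatch(station):
    """Pick at most one data-ready uop per port, oldest first."""
    chosen = {}
    for uop in sorted(station, key=lambda u: u.seq):
        # The first ready uop seen for a port is the oldest one for that port.
        if uop.ready and uop.port not in chosen:
            chosen[uop.port] = uop
    return chosen


station = [
    Uop(seq=0, port=0, ready=False),  # oldest, but still waiting on operands
    Uop(seq=1, port=0, ready=True),
    Uop(seq=2, port=0, ready=True),   # ready, but loses Port 0 to seq 1
    Uop(seq=3, port=2, ready=True),
]
picked = select_for_dispatch(station)
print({port: uop.seq for port, uop in picked.items()})  # {0: 1, 2: 3}
```

When only one µop per port is ready, the loop has nothing to arbitrate; the ordering only matters when several ready µops compete for the same port, as seq 1 and seq 2 do here.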
Note that many of the µops are branches. The BTB
will correctly predict most of these branches but it
can’t correctly predict them all. Consider a BTB that
is correctly predicting the backward branch at the
bottom of a loop; eventually that loop is going to
terminate, and when it does, that branch will be
mispredicted. Branch µops are tagged (in the in-order
pipeline) with their fall-through address and the
destination that was predicted for them. When the
branch executes, what the branch actually did is
compared against what the prediction hardware said
it would do. If those coincide, then the branch
eventually retires, and most of the speculatively
executed work behind it in the instruction pool is
good.
But if they do not coincide, then the Jump Execution
Unit (JEU) changes the status of all of the µops
behind the branch to remove them from the
instruction pool. In that case the proper branch
destination is provided to the BTB which restarts the
whole pipeline from the new target address.
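
The check performed when a branch executes can be sketched as follows. This is a simplified model under stated assumptions, not the hardware's implementation: `BranchUop`, `resolve_branch`, and the list standing in for the µops behind the branch are hypothetical, and `predicted_target` is assumed to hold the address the front end actually followed (equal to the fall-through address when the branch was predicted not taken).

```python
from dataclasses import dataclass


@dataclass
class BranchUop:
    fall_through: int      # address of the next sequential instruction
    predicted_target: int  # address the predictor sent the front end to


def resolve_branch(branch, taken, actual_target, uops_behind):
    """Compare the outcome with the prediction.

    Returns (restart_address, surviving_uops). restart_address is None
    when the prediction was correct and no pipeline restart is needed.
    """
    actual_next = actual_target if taken else branch.fall_through
    if actual_next == branch.predicted_target:
        return None, uops_behind  # speculative work behind the branch stays
    return actual_next, []        # mispredicted: discard it and restart


# Backward branch at the bottom of a loop, predicted taken to 0x100.
loop_branch = BranchUop(fall_through=0x120, predicted_target=0x100)

# While the loop is still iterating, the prediction holds.
print(resolve_branch(loop_branch, True, 0x100, ["u1", "u2"]))
# (None, ['u1', 'u2'])

# On loop exit the branch falls through, so the prediction was wrong.
print(resolve_branch(loop_branch, False, 0x100, ["u1", "u2"]))
# (288, [])
```

The second call mirrors the loop-termination case in the text: the restart address is the fall-through address (0x120, printed as 288), and everything fetched down the predicted path is dropped.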
2.2.3. THE RETIRE UNIT
Figure 5 shows a more detailed view of the Retire
Unit.
The retire unit also checks the status of µops in
the instruction pool, looking for µops that have
executed and can be removed from the pool. Once they
are removed, the original architectural target of the
µops is written as per the original IA instruction. The
retirement unit must not only notice which µops are
complete, it must also reimpose the original program
order on them. It must also do this in the face of
interrupts, traps, faults, breakpoints and
mispredictions.
The retirement unit must first read the instruction pool
to find the potential candidates for retirement and
determine which of these candidates are next in the
original program order. Then it writes the results of
this cycle’s retirements to both the Instruction Pool
and the Retirement Register File (RRF). The
retirement unit is capable of retiring 3 µops per clock.
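
The in-order retirement rule can be sketched in a few lines. This is a minimal illustration, assuming the instruction pool is a queue held in original program order and the RRF is a simple register-to-value map; the `retire_cycle` function and the dictionary layout are hypothetical, and interrupts, traps, faults, and mispredictions are not modeled.

```python
from collections import deque

RETIRE_WIDTH = 3  # the text states that up to 3 uops retire per clock


def retire_cycle(pool, rrf, width=RETIRE_WIDTH):
    """Retire up to `width` completed uops from the head of `pool`.

    `pool` is a deque of dicts in original program order (oldest first);
    `rrf` maps register names to committed values.
    """
    retired = []
    # Stop at the first uop that has not finished, so younger completed
    # uops wait and the original program order is preserved.
    while pool and len(retired) < width and pool[0]["done"]:
        uop = pool.popleft()
        rrf[uop["dest"]] = uop["value"]
        retired.append(uop["dest"])
    return retired


pool = deque([
    {"dest": "eax", "value": 1, "done": True},
    {"dest": "ebx", "value": 2, "done": True},
    {"dest": "ecx", "value": 3, "done": False},  # still executing
    {"dest": "edx", "value": 4, "done": True},   # done, but must wait for ecx
])
rrf = {}
print(retire_cycle(pool, rrf))  # ['eax', 'ebx']
print(rrf)                      # {'eax': 1, 'ebx': 2}
```

Although the µop writing `edx` has finished, it stays in the pool until the older `ecx` µop completes, which is how out-of-order completion is turned back into in-order architectural updates.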
2.2.4. THE BUS INTERFACE UNIT
Figure 6 shows a detailed view of the Bus Interface
Unit.