Specifications
PENTIUM® PRO PROCESSOR AT 150, 166, 180, and 200 MHz
[Figure 3: block diagram. Blocks and labels shown: From BIU, ICache,
Next_IP, BTB, ID (x3), MIS, RAT, Allocate, To Instruction Pool (ROB).]
BIU - Bus Interface Unit
ID - Instruction Decoder
BTB - Branch Target Buffer
MIS - Microcode Instruction Sequencer
RAT - Register Alias Table
ROB - ReOrder Buffer
Figure 3. Inside the Fetch/Decode Unit
The ICache fetches the cache line corresponding to
the index from the Next_IP, and the next line, and
presents 16 aligned bytes to the decoder. The
prefetched bytes are rotated so that they are justified
for the Instruction Decoders (ID). The beginning and
end of the IA instructions are marked.
Three parallel decoders accept this stream of marked
bytes, and proceed to find and decode the IA
instructions contained therein. The decoder converts
the IA instructions into triadic µops (two logical
sources, one logical destination per µop). Most IA
instructions are converted directly into single µops;
some instructions are decoded into one to four µops;
and the complex instructions require microcode (the
box labeled MIS in Figure 3). This microcode is just a
set of preprogrammed sequences of normal µops.
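
The triadic µop format described above can be pictured with a small sketch. This is only an illustration: the `Uop` record, the `tmp0` temporary, the toy decode table, and the `decode_path` helper are hypothetical names, and the decodings shown are not Intel's actual µop encodings. The rule that anything needing more than four µops goes to the MIS is an assumption read from the paragraph above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Uop:
    """One triadic µop: up to two logical sources, one logical destination."""
    op: str
    src1: Optional[str]
    src2: Optional[str]
    dest: str


# Hypothetical decodings, for illustration only.
TOY_DECODE_TABLE = {
    # Register-to-register add: a single µop.
    "add eax, ebx": [Uop("add", "eax", "ebx", "eax")],
    # Add with a memory operand: a load µop into a temporary, then the add.
    "add eax, [ebx]": [
        Uop("load", "ebx", None, "tmp0"),
        Uop("add", "eax", "tmp0", "eax"),
    ],
}

# Assumed limit on µops produced directly by a decoder (from "one to four").
MAX_DECODER_UOPS = 4


def decode_path(uop_count: int) -> str:
    """Classify an instruction by how many µops it needs."""
    if uop_count < 1:
        raise ValueError("an instruction needs at least one µop")
    if uop_count == 1:
        return "single"
    if uop_count <= MAX_DECODER_UOPS:
        return "multi"
    return "microcode"
```

For example, `decode_path(len(TOY_DECODE_TABLE["add eax, [ebx]"]))` classifies the memory-operand add as a multi-µop instruction in this toy model.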
The µops are queued, and sent to the Register Alias
Table (RAT) unit, where the logical IA-based register
references are converted into Pentium Pro processor
physical register references, and to the Allocator
stage, which adds status information to the µops and
enters them into the instruction pool. The instruction
pool is implemented as an array of Content
Addressable Memory called the ReOrder Buffer
(ROB).
This is the end of the in-order pipe.
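
The register renaming performed by the RAT can be sketched as follows. This is a minimal model under stated assumptions, not the processor's actual mechanism: the `rename` function, the `p0`, `p1`, ... physical register names, and the `num_physical` limit are all hypothetical, and a source that has not yet been renamed is simply passed through under its logical name.

```python
def rename(uops, num_physical=8):
    """Map logical registers in (op, src1, src2, dest) tuples to physical ones.

    Each destination gets a fresh physical register; each source reads the
    most recent mapping of its logical register. num_physical is a
    hypothetical limit chosen for this sketch.
    """
    alias = {}       # logical register name -> current physical register name
    next_free = 0    # index of the next unused physical register
    renamed = []
    for op, src1, src2, dest in uops:
        # Sources are looked up before the destination is remapped, so a
        # µop that reads and writes the same register sees the old value.
        p1 = alias.get(src1, src1)
        p2 = alias.get(src2, src2)
        if next_free >= num_physical:
            raise RuntimeError("no free physical register (allocation stall)")
        pd = f"p{next_free}"
        next_free += 1
        alias[dest] = pd
        renamed.append((op, p1, p2, pd))
    return renamed


program = [
    ("add", "eax", "ebx", "eax"),
    ("sub", "eax", "ecx", "edx"),
    ("mov", "esi", None, "eax"),
]
renamed_program = rename(program)
```

In this sketch the `sub` reads `p0` (the result of the `add`), while the later `mov` writes a different physical register, `p2`, so it no longer conflicts with the earlier uses of `eax`.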
2.2.2. THE DISPATCH/EXECUTE UNIT
The dispatch unit selects µops from the instruction
pool depending upon their status. If the status
indicates that a µop has all of its operands then the
dispatch unit checks to see if the execution resource
needed by that µop is also available. If both are true,
the Reservation Station removes that µop and
sends it to the resource where it is executed. The
results of the µop are later returned to the pool. There
are five ports on the Reservation Station, and the
multiple resources are accessed as shown in
Figure 4.
The Pentium Pro processor can schedule at a peak
rate of 5 µops per clock, one to each resource port,
but a sustained rate of 3 µops per clock is typical.
This scheduling activity is the out-of-order
process; µops are dispatched to the execution
resources strictly according to dataflow constraints
and resource availability, without regard to the
original ordering of the program.
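
The dispatch rule described in this section (a µop is sent for execution once its operands are available and its port is free, at most one µop per port per clock) can be sketched as a toy simulation. Several things here are assumptions made for the sketch: the `PoolEntry` record, the oldest-first selection among ready µops, the one-clock execution latency, and the port numbers, which are arbitrary and do not reflect the assignments in Figure 4.

```python
from dataclasses import dataclass

NUM_PORTS = 5  # five ports on the Reservation Station, per the text


@dataclass(frozen=True)
class PoolEntry:
    name: str        # unique label for the µop
    port: int        # port (0..NUM_PORTS-1) of the resource it needs
    sources: tuple   # names of µops whose results it consumes


def dispatch_cycle(pool, completed):
    """Pick the µops to dispatch this clock: ready operands and a free port."""
    busy_ports = set()
    selected = []
    for entry in pool:  # pool is kept in program order (assumed oldest-first)
        ready = all(src in completed for src in entry.sources)
        if ready and entry.port not in busy_ports:
            busy_ports.add(entry.port)
            selected.append(entry)
    return selected


def run(program):
    """Return, per clock, the names of the µops dispatched in that clock."""
    pool = list(program)
    completed = set()
    schedule = []
    while pool:
        selected = dispatch_cycle(pool, completed)
        if not selected:
            raise RuntimeError("no µop can dispatch: unresolved dependency")
        for entry in selected:
            pool.remove(entry)
        # Assumed one-clock latency: results are visible on the next clock.
        completed.update(entry.name for entry in selected)
        schedule.append([entry.name for entry in selected])
    return schedule


program = [
    PoolEntry("a", 2, ()),
    PoolEntry("b", 0, ("a",)),   # must wait for a
    PoolEntry("c", 0, ()),       # independent; same port as b
    PoolEntry("d", 1, ("c",)),   # must wait for c
]
schedule = run(program)
```

In this toy run, `c` is dispatched in the first clock alongside `a`, ahead of `b`, even though `b` comes earlier in program order, because `b` is still waiting on its operand. Since each port accepts one µop per clock, no clock in the model can dispatch more than five µops.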