46 IBM Power 770 and 780 Technical Overview and Introduction
2.1.6 On-chip L3 cache innovation and Intelligent Cache
A breakthrough in material engineering and microprocessor fabrication has enabled IBM to
implement the L3 cache in eDRAM and place it on the POWER7 processor die. L3 cache is
critical to a balanced design, as is the ability to provide good signaling between the L3 cache
and other elements of the hierarchy, such as the L2 cache or SMP interconnect.
The on-chip L3 cache is organized into separate areas with differing latency characteristics.
Each processor core is associated with a Fast Local Region of L3 cache (FLR-L3) but also
has access to other L3 cache regions as shared L3 cache. Additionally, each core can
negotiate to use the FLR-L3 cache associated with another core, depending on reference
patterns. Data can also be cloned to be stored in more than one core's FLR-L3 cache, again
depending on reference patterns. This Intelligent Cache management enables the POWER7
processor to optimize the access to L3 cache lines and minimize overall cache latencies.
Figure 2-7 shows the FLR-L3 cache regions for each of the cores on the POWER7
processor die.
Figure 2-7 Fast local regions of L3 cache on the POWER7 processor
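The region behavior described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not a description of the actual POWER7 hardware policy: the class name L3Model, its methods, the relative cost values, and the four-core configuration are all hypothetical, chosen only to show that a line held in a core's own fast local region is cheaper to reach than a line held in another core's region, and that cloning a line brings it into the local region.

```python
# Toy model of per-core fast local L3 regions (FLR-L3).
# All costs are hypothetical relative units; the source gives no cycle counts.
LOCAL_COST = 1    # line is in the requesting core's own region
REMOTE_COST = 3   # line is in a region associated with another core
MISS_COST = 10    # line is not in any L3 region


class L3Model:
    """Hypothetical model: one set of cached line addresses per core."""

    def __init__(self, num_cores):
        self.regions = {core: set() for core in range(num_cores)}

    def place(self, core, line):
        """Store a line in the region associated with the given core."""
        self.regions[core].add(line)

    def access_cost(self, core, line):
        """Return the relative cost for this core to reach the line."""
        if line in self.regions[core]:
            return LOCAL_COST
        for other, lines in self.regions.items():
            if other != core and line in lines:
                return REMOTE_COST
        return MISS_COST

    def clone(self, core, line):
        """Copy a line held in another core's region into this core's region."""
        if self.access_cost(core, line) == REMOTE_COST:
            self.regions[core].add(line)


model = L3Model(num_cores=4)       # core count is arbitrary for the sketch
model.place(0, 0x100)              # core 0 holds line 0x100 locally
before = model.access_cost(1, 0x100)
model.clone(1, 0x100)              # core 1 also references it often
after = model.access_cost(1, 0x100)
print(before, after)
```

In this sketch, the cost for core 1 drops from the remote value to the local value once the line is cloned, while core 0 still holds its own copy. Real hardware decides when to clone or borrow regions based on reference patterns, which this model does not attempt to reproduce.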
The innovation of using eDRAM on the POWER7 processor die is significant for
several reasons:
Latency improvement
Moving the L3 cache on-chip yields a six-to-one latency improvement compared with L3
accesses to an external (on-ceramic) ASIC.
Bandwidth improvement
The on-chip interconnect provides a 2x bandwidth improvement. Both the frequency and the
bus sizes to and from each core are increased.