
Two Integrated Memory Controllers and an Intel® Scalable Memory Interconnect
Communication channels between the processor cores and main
memory have been dramatically improved. Each processor has two
integrated memory controllers that provide peak memory bandwidth
of up to 34 GB/s, up to six times the bandwidth of the
previous-generation processor. The new processor also includes
an Intel® Scalable Memory Interconnect (Intel® SMI), which connects
to the Intel® 7500 Scalable Memory Buffer to support larger physical
memory configurations using industry-standard DDR3 RDIMMs.
The combination of larger physical memory and higher memory bandwidth
will deliver higher performance in many environments
and will make it easier and more cost-effective for organizations to
add memory as workloads grow.
Directory-based Cache Coherency
Cache coherency ensures that data in memory and cache remain
synchronized, so the most current data is used in every transaction.
The previous-generation Intel Itanium processor used snoopy-based
coherency mechanisms in which every processor on the Front Side Bus
had to be “snooped” for every memory transaction. With this approach,
coherency traffic is roughly proportional to the square of the number
of processors in the system and can become a bottleneck in large
multiprocessor servers unless auxiliary coherency (node) controllers
are employed.
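The quadratic growth described above can be illustrated with a minimal sketch. It assumes a deliberately simplified model in which every processor issues the same number of memory requests and each request snoops every other processor exactly once; the function name and parameters are hypothetical and do not describe the actual Front Side Bus protocol.

```python
def snoop_messages(num_processors: int, requests_per_processor: int = 1) -> int:
    """Count snoop messages in a simplified broadcast-snooping model.

    Assumption: each request from one processor snoops each of the
    other (num_processors - 1) processors exactly once.
    """
    if num_processors < 1:
        raise ValueError("num_processors must be at least 1")
    total_requests = num_processors * requests_per_processor
    return total_requests * (num_processors - 1)


if __name__ == "__main__":
    # Doubling the processor count roughly quadruples the snoop traffic.
    for n in (2, 4, 8, 16):
        print(f"{n:2d} processors -> {snoop_messages(n)} snoop messages")
```

In this model the total is N × (N − 1) messages per round of requests, which is why the traffic grows roughly with the square of the processor count.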
The Intel Itanium processor 9300 series implements true directory-based
coherency. The home agent for each memory controller keeps
track of all owners and sharers for each line of cache using information
stored in main memory. Since the number of owners and sharers for
a given cache line is typically much less than the number of caching
agents in the system, coherency traffic tends to scale sub-linearly
with increasing system size. This improves cache efficiency and
reduces coherency-related traffic that would otherwise contend
for available bandwidth. It also helps to sustain low-latency data
access times as workloads increase in large multiprocessor systems.
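A minimal sketch of the directory idea follows. It is an illustration under stated assumptions, not the processor's actual protocol: the `HomeAgent` and `DirectoryEntry` names, the single-owner or many-sharers state model, and the message counting are all hypothetical, and a Python dictionary stands in for the directory information that the paper says is stored in main memory.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class DirectoryEntry:
    """Tracks who holds one cache line (hypothetical state model)."""
    owner: Optional[int] = None                      # agent with exclusive access
    sharers: Set[int] = field(default_factory=set)   # agents with read-only copies


class HomeAgent:
    """Toy home agent: contacts only the agents listed in the directory."""

    def __init__(self) -> None:
        self.directory: Dict[int, DirectoryEntry] = {}
        self.messages_sent = 0

    def read(self, line: int, requester: int) -> None:
        entry = self.directory.setdefault(line, DirectoryEntry())
        if entry.owner is not None and entry.owner != requester:
            # One targeted message: the current owner is downgraded to a sharer.
            self.messages_sent += 1
            entry.sharers.add(entry.owner)
            entry.owner = None
        entry.sharers.add(requester)

    def write(self, line: int, requester: int) -> Set[int]:
        entry = self.directory.setdefault(line, DirectoryEntry())
        targets = set(entry.sharers)
        if entry.owner is not None:
            targets.add(entry.owner)
        targets.discard(requester)
        # Invalidate only the recorded holders, not every agent in the system.
        self.messages_sent += len(targets)
        entry.sharers.clear()
        entry.owner = requester
        return targets


agent = HomeAgent()
agent.read(0x40, requester=0)
agent.read(0x40, requester=3)
invalidated = agent.write(0x40, requester=5)
print(sorted(invalidated), agent.messages_sent)  # [0, 3] 2
```

In this example the write triggers two invalidations, one per recorded sharer, regardless of how many caching agents the system contains. A broadcast scheme in an eight-socket system would instead contact all seven other agents.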
Glueless System Designs Up to Eight Sockets
Systems based on the previous-generation Intel Itanium processor
were limited to four processor loads per Front Side Bus due to the
electrical challenges of the shared bus design. The Intel Itanium processor
9300 series supports up to eight-socket glueless systems, with
“glueless” meaning that no additional node controller is required
(Figure 1). This enables simpler system designs with lower chip
counts and smaller board footprints.² It also provides exceptionally
fast processor-to-processor communications.
Larger systems can be built using one or more node controllers,
which are designed independently by the various server vendors.
These node controllers are used to aggregate multiple, multi-socket
Intel QuickPath Interconnect “nodes” into larger symmetric
multiprocessing (SMP) systems.³
Figure 1. The quad-core Intel® Itanium® processor supports up to eight-socket glueless system designs. Larger symmetric multiprocessing (SMP)
systems are built using node controllers designed by the individual system vendors.
[Figure 1 diagram, titled “Glueless System Designs Up to Eight Sockets”: two panels. “Four CPU Topology” shows four CPUs with attached memory and IOHs; “Eight CPU Topology” shows eight CPUs with IOHs (memory not shown).]
White Paper: The Intel® Itanium® Processor 9300 Series