Family paper
Overview of Key Architectural Advancements
The new Intel Itanium processor 9300 series is the world’s first processor
with more than two billion transistors, and Intel engineers have used this
abundance of resources to integrate a variety of fundamental
architectural improvements (Table 1). The most obvious advancement is
the addition of two more processing cores. Just as important, however,
are improvements that enable data and instructions to be delivered
to these cores much faster than on previous-generation processors,
which help to sustain high levels of utilization for all four cores. Based
on internal tests by Intel engineers, the Intel Itanium processor 9300
series can be expected to deliver more than double the performance
of its predecessor for many applications.¹
Since Itanium®-based servers are often deployed in mission-critical
computing environments, data integrity and high availability are essential.
A great deal of the design effort went into extending and enhancing
the mainframe-class reliability, availability and serviceability (RAS)
features that were implemented in the previous-generation processor.
Improvements are implemented across many levels, from silicon-level
advancements that make individual logic gates more resistant to errors,
to system-level advancements that help organizations add, remove and
allocate resources more effectively among running partitions. These
and other architectural improvements are discussed in more detail in
the following sections.
Higher Performance through Enhanced Thread-Level Parallelism (TLP)
With four multi-threaded cores per processor, the Intel Itanium
processor 9300 series can handle twice as many simultaneous
software threads as its predecessor. This can substantially improve
performance and scalability for heavily threaded software code, such
as database and decision support applications. The benefits can be
equally compelling for consolidated environments in which multiple
applications and operating systems are hosted on a single system.
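The thread-count claim above is simple arithmetic on the Table 1 figures: cores per processor multiplied by software threads per core. A minimal Python sketch of that calculation follows. The class, method, and constant names are hypothetical, chosen only for illustration, and the multi-socket figure assumes every socket holds the same processor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessorSpec:
    """Core and thread counts as listed in Table 1 (names are illustrative)."""
    name: str
    cores: int
    threads_per_core: int

    def hardware_threads(self, sockets: int = 1) -> int:
        # Simultaneous software threads = cores x threads per core x sockets.
        return self.cores * self.threads_per_core * sockets


# Figures taken from Table 1 of this paper.
ITANIUM_9100 = ProcessorSpec("9100 series", cores=2, threads_per_core=2)
ITANIUM_9300 = ProcessorSpec("9300 series", cores=4, threads_per_core=2)

for spec in (ITANIUM_9100, ITANIUM_9300):
    per_processor = spec.hardware_threads()
    four_socket = spec.hardware_threads(sockets=4)
    print(f"{spec.name}: {per_processor} threads per processor, "
          f"{four_socket} in a 4-socket system")
```

Under these assumptions the 9300 series handles 8 threads per processor against 4 for the 9100 series, which is the doubling described above.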
Thread management has also been improved. In the previous generation,
a processor core would switch to another thread whenever the
active thread was stalled due to a high latency event, such as waiting
White Paper: The Intel® Itanium® Processor 9300 Series
Table 1. Architectural Enhancements in the Intel® Itanium® processor 9300 series

Key Characteristics / Intel® Itanium® processor 9100 series / Intel® Itanium® processor 9300 series

Cores
  9100 series: 2
  9300 series: 4

Total On-Die Cache (a)
  9100 series: 27.5 MB
  9300 series: 30 MB

Software Threads per Core
  9100 series: 2
  9300 series: 2 (with enhanced thread management)

System Interconnect (bandwidth per processor for a 3-load or 2-socket system)
  9100 series: Front Side Bus
    • Peak bandwidth per processor: 5 GB/s
  9300 series: Intel® QuickPath Interconnect Technology
    • Peak bandwidth: 48 GB/s (up to 9x improvement)
    • Enhanced RAS
    • Enables common Input/Output Hubs (IOHs) with next-generation Intel® Xeon® processors

Memory Interconnect (bandwidth per processor for a 3-load or 2-socket system)
  9100 series: Front Side Bus
    • Peak bandwidth per processor: 5 GB/s
  9300 series: Dual Integrated Memory Controllers
    • Peak bandwidth: 34 GB/s (up to 6x improvement)

Memory Capacity (4-socket system)
  9100 series: 128-384 GB
  9300 series: 1 TB (using 16 GB RDIMMs), up to 8x improvement

Partitioning and Virtualization
  9100 series: Intel® Virtualization Technology (Intel® VT-i)
  9300 series: Intel® VT-i2 (includes extensions to improve performance in both processing and I/O efficiency)

RAS
  9100 series: Mainframe Class
  9300 series: Enhanced
    • Advanced Machine Check Architecture error recovery for increased uptime
    • Error detection, correction and avoidance extended and/or improved across all key components (processor, memory, interconnect)
    • Improved support for partitioning, virtualization and resource management (including component hot add/remove/replace if supported by OS)

Energy Efficiency
  9100 series: Demand Based Switching (DBS)
  9300 series:
    • Enhanced DBS (voltage modulation in addition to frequency)
    • Intel® Turbo Boost Technology
    • Advanced CPU and Memory Thermal Management

SMP Scalability
• 64-bit Virtual Addressability
• 50-bit Physical Addressability
• Home snoop coherency
• 64-bit Virtual Addressability
• 50-bit Physical Addressability
• Directory coherency for better performance in large SMP configurations
• Up to 8-socket glueless systems (higher scalability with OEM chipsets)
(a) Size includes all on-die cache arrays, which comprise the cache tag and data arrays for the three-level cache hierarchy (9100 and 9300 series) as well as the directory cache arrays (9300 series only).
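The "up to Nx" figures in Table 1 can be checked with a few lines of arithmetic. The Python sketch below recomputes them from the bandwidth and capacity values in the table, and also works out the address space implied by 50-bit physical addressing. The variable and function names are hypothetical. Treating 1 TB as 1024 GB is an assumption made here because it reproduces the stated 8x figure; the table itself does not specify binary or decimal units.

```python
def improvement(new: float, old: float) -> float:
    """Ratio of a new figure to the old one (illustrative helper)."""
    return new / old


# Peak bandwidth per processor, in GB/s, from Table 1.
FSB_GBPS = 5.0    # 9100 series Front Side Bus (shared by I/O and memory traffic)
QPI_GBPS = 48.0   # 9300 series Intel QuickPath Interconnect
IMC_GBPS = 34.0   # 9300 series dual integrated memory controllers

interconnect_ratio = improvement(QPI_GBPS, FSB_GBPS)
memory_ratio = improvement(IMC_GBPS, FSB_GBPS)

# Memory capacity for a 4-socket system, in GB, from Table 1.
# Assumption: 1 TB is counted as 1024 GB.
OLD_CAPACITY_MIN_GB, OLD_CAPACITY_MAX_GB = 128, 384
NEW_CAPACITY_GB = 1024

capacity_ratio_best = improvement(NEW_CAPACITY_GB, OLD_CAPACITY_MIN_GB)
capacity_ratio_worst = improvement(NEW_CAPACITY_GB, OLD_CAPACITY_MAX_GB)

# 50-bit physical addressing spans 2**50 bytes, expressed here in TiB.
PHYSICAL_ADDRESS_BITS = 50
physical_space_tib = 2**PHYSICAL_ADDRESS_BITS // 2**40

print(f"System interconnect: {interconnect_ratio:.1f}x")
print(f"Memory interconnect: {memory_ratio:.1f}x")
print(f"Memory capacity: {capacity_ratio_worst:.1f}x to {capacity_ratio_best:.1f}x")
print(f"Physical address space: {physical_space_tib} TiB")
```

The table's "up to 9x" and "up to 6x" correspond to these ratios rounded down, and the "up to 8x" capacity figure applies only against the low end of the 128-384 GB range.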