AMD64 Technology 24592—Rev. 3.15—November 2009
result (speculation), and it can reorder reads ahead of writes. In the case of writes, multiple writes to
memory locations in close proximity to each other can be combined into a single write or a burst of
multiple writes. Writes can also be delayed, or buffered, by the processor.
Application software that needs to force memory ordering to memory-mapped I/O devices can do so
using the read/write barrier instructions: LFENCE, SFENCE, and MFENCE. These instructions are
described in “Forcing Memory Order” on page 94. Serializing instructions, I/O instructions, and
locked instructions can also be used as read/write barriers, but they modify program state and are an
inferior method for enforcing strong-memory ordering.
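As a minimal sketch of the barrier usage described above, the following C fragment writes a data word to a memory-mapped device, rings a doorbell register, and then reads back a status register, with fences between the steps. The device_regs layout and the device_submit name are hypothetical and are not taken from this manual. The sketch also assumes a GCC- or Clang-style compiler that provides the _mm_sfence and _mm_mfence intrinsics in <immintrin.h> on x86-64; on other targets it falls back to the generic __sync_synchronize() builtin so that it still compiles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__)
#include <immintrin.h>
#define STORE_FENCE() _mm_sfence()          /* emits SFENCE */
#define FULL_FENCE()  _mm_mfence()          /* emits MFENCE */
#else
/* Fallback for non-x86 builds (assumption: GCC/Clang builtin). */
#define STORE_FENCE() __sync_synchronize()
#define FULL_FENCE()  __sync_synchronize()
#endif

/* Hypothetical register layout of a memory-mapped device. */
struct device_regs {
    volatile uint32_t data;      /* payload register      */
    volatile uint32_t doorbell;  /* write 1 to start work */
    volatile uint32_t status;    /* device status         */
};

/* Write the payload, then ring the doorbell, then read the status.
 * The fences keep the processor from combining, delaying, or
 * reordering these accesses relative to each other. */
static inline uint32_t device_submit(struct device_regs *dev, uint32_t value)
{
    assert(dev != NULL);
    dev->data = value;
    STORE_FENCE();      /* payload write is ordered before the doorbell write */
    dev->doorbell = 1;
    FULL_FENCE();       /* both writes are ordered before the status read */
    return dev->status;
}
```

In a real driver, dev would point into a device region mapped by system software. For experimentation, an ordinary struct device_regs in normal memory can stand in for the device.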
Typically, the operating system controls access to memory-mapped I/O devices. The AMD64
architecture provides facilities for system software to specify the types of accesses and their ordering
for entire regions of memory. These facilities are also used to manage the cacheability of memory
regions. See “System-Management Instructions” in Volume 2 for further information.
3.8.3 Protected-Mode I/O
In protected mode, access to the I/O-address space is governed by the I/O privilege level (IOPL) field
in the rFLAGS register, and the I/O-permission bitmap in the current task-state segment (TSS).
I/O-Privilege Level. RFLAGS.IOPL governs access to IOPL-sensitive instructions. All of the I/O
instructions (IN, INS, OUT, and OUTS) are IOPL-sensitive. IOPL-sensitive instructions cannot be
executed by a program unless the program’s current-privilege level (CPL) is numerically less than or
equal to the RFLAGS.IOPL field (that is, the program is at least as privileged as the IOPL);
otherwise, a general-protection exception (#GP) occurs.
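The comparison above can be sketched in C as a simple model. The function and macro names are hypothetical. The sketch relies on the IOPL field occupying bits 13:12 of RFLAGS, and it models only the IOPL test, not the I/O-permission bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RFLAGS_IOPL_SHIFT 12u
#define RFLAGS_IOPL_MASK  (UINT64_C(3) << RFLAGS_IOPL_SHIFT)  /* bits 13:12 */

/* Extract the 2-bit IOPL field from an RFLAGS image. */
static inline unsigned rflags_iopl(uint64_t rflags)
{
    return (unsigned)((rflags & RFLAGS_IOPL_MASK) >> RFLAGS_IOPL_SHIFT);
}

/* Returns true if an IOPL-sensitive instruction is permitted by IOPL
 * alone; false means the access would need the I/O-permission bitmap
 * or would raise #GP. */
static inline bool iopl_permits(unsigned cpl, uint64_t rflags)
{
    assert(cpl <= 3);
    return cpl <= rflags_iopl(rflags);
}
```

For example, with IOPL = 0 only CPL = 0 code passes the check, while with IOPL = 3 code at every privilege level passes.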
Only software running at CPL = 0 can change the RFLAGS.IOPL field. Two instructions, POPF and
IRET, can be used to change the field. If application software (or any software running at CPL > 0)
attempts to change RFLAGS.IOPL, the attempt is ignored.
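The "attempt is ignored" behavior can be illustrated with a small C model of a flags load such as POPF. This is a sketch under stated assumptions: load_flags_model is a hypothetical name, only the IOPL field (bits 13:12) is modeled, and the separate rules that apply to other RFLAGS fields are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define RFLAGS_IOPL_MASK (UINT64_C(3) << 12)  /* bits 13:12 */

/* Model of how a flags load treats the IOPL field only.
 * current: RFLAGS before the load
 * popped:  value being loaded
 * cpl:     current privilege level of the executing code
 * At CPL = 0 the new IOPL is taken from the popped value; at CPL > 0
 * the existing IOPL is silently preserved. */
static inline uint64_t load_flags_model(uint64_t current, uint64_t popped,
                                        unsigned cpl)
{
    assert(cpl <= 3);
    if (cpl == 0)
        return popped;
    return (popped & ~RFLAGS_IOPL_MASK) | (current & RFLAGS_IOPL_MASK);
}
```

Note that the model raises no exception in the CPL > 0 case; the IOPL bits in the loaded value are simply discarded, which matches the behavior described above.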
System software uses RFLAGS.IOPL to control the privilege level required to access I/O-address
space devices. Access can be granted on a program-by-program basis using different copies of
RFLAGS for every program, each with a different IOPL. RFLAGS.IOPL acts as a global control over
a program’s access to I/O-address space devices. System software can grant less-privileged programs
access to individual I/O devices (overriding RFLAGS.IOPL) by using the I/O-permission bitmap
stored in a program’s TSS. For details about the I/O-permission bitmap, see “I/O-Permission Bitmap”
in Volume 2.
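The interaction between RFLAGS.IOPL and the I/O-permission bitmap can be sketched as a C model. The details here are assumptions based on general x86 behavior rather than on this section: one bit per I/O port, a clear bit meaning access is allowed, every byte of a multi-byte access being checked, and ports beyond the end of the bitmap being denied. The function name and parameters are hypothetical; Volume 2 remains the authoritative description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if the modeled I/O access would be allowed, false if it
 * would raise #GP.
 * bitmap:      I/O-permission bitmap bytes (bit set = access denied)
 * bitmap_bits: number of ports the bitmap covers
 * port, width: first port and access size in bytes (1, 2, or 4) */
static bool io_access_permitted(unsigned cpl, unsigned iopl,
                                const uint8_t *bitmap, size_t bitmap_bits,
                                uint16_t port, unsigned width)
{
    assert(cpl <= 3 && iopl <= 3);
    assert(width == 1 || width == 2 || width == 4);

    if (cpl <= iopl)
        return true;                 /* IOPL alone grants access */

    for (unsigned i = 0; i < width; i++) {
        size_t p = (size_t)port + i;
        if (p >= bitmap_bits)
            return false;            /* port not covered by the bitmap */
        if (bitmap[p / 8] & (1u << (p % 8)))
            return false;            /* bit set: this port is denied */
    }
    return true;                     /* every port bit was clear */
}
```

With this model, system software that wants to expose a single device to a CPL = 3 program would keep IOPL below 3 and clear only the bitmap bits for that device's ports.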
3.9 Memory Optimization
Generally, application software is unaware of the memory hierarchy implemented within a particular
system design. The application simply sees a homogeneous address space within a single level of
memory. In reality, both system and processor implementations can use any number of techniques to
speed up accesses into memory, doing so in a manner that is transparent to applications. Application
software can be written to maximize this speed-up even though the methods used by the hardware are
not visible to the application. This section gives an overview of the memory hierarchy and access