Specifications

Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes
applied to an address range dedicated to memory mapped I/O devices to force
strong memory ordering.
For areas of memory where weak ordering is acceptable, the write back (WB)
memory type can be chosen. Here, reads can be performed speculatively and writes
can be buffered and combined. For this type of memory, cache locking is performed
on atomic (locked) operations that do not split across cache lines, which helps to
reduce the performance penalty associated with the use of the typical
synchronization instructions, such as XCHG, that lock the bus during the entire
read-modify-write operation. With the WB memory type, the XCHG instruction locks
the cache instead of the bus if the memory access is contained within a cache line.
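
The locked-exchange behavior described above can be sketched from user code as a simple spinlock. This is a minimal illustration, not a recommended lock implementation: the names xchg_lock, xchg_lock_acquire, xchg_lock_release, and run_xchg_lock_demo are hypothetical, and it assumes a C11 compiler with <stdatomic.h> and POSIX threads. Compilers targeting Intel 64 and IA-32 typically lower atomic_exchange to XCHG, but that is a code-generation detail the C standard does not guarantee.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical spinlock built on an atomic exchange. The lock word is a
   naturally aligned int, so the locked access cannot straddle two cache
   lines; on WB memory this is the case where cache locking can apply. */
typedef struct {
    atomic_int locked; /* 0 = free, 1 = held */
} xchg_lock;

static void xchg_lock_acquire(xchg_lock *l)
{
    /* Atomically write 1 and read the old value; loop until it was 0. */
    while (atomic_exchange_explicit(&l->locked, 1, memory_order_acquire) != 0) {
        /* spin */
    }
}

static void xchg_lock_release(xchg_lock *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

/* Static atomics are zero-initialized, which is a valid "free" state. */
static xchg_lock demo_lock;
static long demo_counter;

static void *demo_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 50000; i++) {
        xchg_lock_acquire(&demo_lock);
        demo_counter++; /* plain variable, protected by the lock */
        xchg_lock_release(&demo_lock);
    }
    return NULL;
}

/* Runs four workers and returns the final counter value. */
long run_xchg_lock_demo(void)
{
    pthread_t threads[4];
    demo_counter = 0;
    for (int i = 0; i < 4; i++) {
        int rc = pthread_create(&threads[i], NULL, demo_worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < 4; i++) {
        pthread_join(threads[i], NULL);
    }
    return demo_counter;
}
```

If the lock serializes the increments correctly, run_xchg_lock_demo() returns the sum of all per-thread iterations. Production code would normally add a pause or back-off in the spin loop or use an operating-system lock instead.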
The PAT was introduced in the Pentium III processor to enhance the caching
characteristics that can be assigned to pages or groups of pages. The PAT mechanism is
typically used to strengthen caching characteristics at the page level with respect to the
caching characteristics established by the MTRRs. Table 11-7 shows the interaction of
the PAT with the MTRRs.
Intel recommends that software written to run on Intel Core 2 Duo, Intel Atom, Intel
Core Duo, Pentium 4, Intel Xeon, and P6 family processors assume the
processor-ordering model or a weaker memory-ordering model. The Intel Core 2 Duo,
Intel Atom, Intel Core Duo, Pentium 4, Intel Xeon, and P6 family processors do not
implement a strong memory-ordering model, except when using the UC memory type.
Although Pentium 4, Intel Xeon, and P6 family processors support processor ordering,
Intel does not guarantee that future processors will support this model. To make
software portable to future processors, it is recommended that operating systems
provide critical region and resource control constructs and APIs (application program
interfaces) based on I/O, locking, and/or serializing instructions, and that these
constructs be used to synchronize access to shared areas of memory in
multiple-processor systems. Also, software should not depend on processor ordering in
situations where the system hardware does not support this memory-ordering model.
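
As one illustration of the recommendation above, the following sketch hands a value from one thread to another using only operating-system synchronization constructs, so its correctness does not depend on the processor's memory-ordering model. It assumes a POSIX threads environment; the names box_mutex, box_ready_cond, box_value, box_ready, box_producer, and run_mailbox_demo are hypothetical and chosen for this example only.

```c
#include <assert.h>
#include <pthread.h>

/* Shared state, accessed only while box_mutex is held. */
static pthread_mutex_t box_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t box_ready_cond = PTHREAD_COND_INITIALIZER;
static int box_value;
static int box_ready;

static void *box_producer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&box_mutex);
    box_value = 42;  /* publish the data ... */
    box_ready = 1;   /* ... and the flag, inside the same critical region */
    pthread_cond_signal(&box_ready_cond);
    pthread_mutex_unlock(&box_mutex);
    return NULL;
}

/* Starts a producer thread, waits for its value, and returns it. */
int run_mailbox_demo(void)
{
    pthread_t producer;
    int seen;

    box_value = 0;
    box_ready = 0;

    int rc = pthread_create(&producer, NULL, box_producer, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_mutex_lock(&box_mutex);
    while (!box_ready) {
        /* Releases the mutex while waiting and reacquires it on wakeup. */
        pthread_cond_wait(&box_ready_cond, &box_mutex);
    }
    seen = box_value;
    pthread_mutex_unlock(&box_mutex);

    pthread_join(producer, NULL);
    return seen;
}
```

The consumer never reads box_value without holding the mutex, so the visibility of the producer's writes is guaranteed by the locking API rather than by any assumption about store ordering in hardware.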
8.3 SERIALIZING INSTRUCTIONS
The Intel 64 and IA-32 architectures define several serializing instructions. These
instructions force the processor to complete all modifications to flags, registers, and
memory by previous instructions and to drain all buffered writes to memory before the
next instruction is fetched and executed. For example, when a MOV to control register
instruction is used to load a new value into control register CR0 to enable protected
mode, the processor must perform a serializing operation before it enters protected
mode. This serializing operation ensures that all operations that were started while the
processor was in real-address mode are completed before the switch to protected mode
is made.
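
The MOV to CR0 case above is privileged, but the same effect can be sketched from application code with CPUID, which the architecture documents as a serializing instruction that may be executed at any privilege level. The following is a minimal sketch under stated assumptions: it relies on GCC or Clang extended inline assembly, the function name serialize_with_cpuid is hypothetical, and on targets other than Intel 64 and IA-32 it does nothing and reports that fact.

```c
/* Returns 1 if a CPUID instruction was executed, 0 on other targets. */
int serialize_with_cpuid(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0; /* CPUID leaf 0; the leaf choice is arbitrary here */
    unsigned int ebx;
    unsigned int ecx = 0;
    unsigned int edx;

    /* "volatile" keeps the compiler from deleting or duplicating the
       instruction; the "memory" clobber keeps it from moving memory
       accesses across it. The hardware serialization itself comes from
       CPUID. */
    __asm__ __volatile__("cpuid"
                         : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
                         :
                         : "memory");
    (void)eax;
    (void)ebx;
    (void)ecx;
    (void)edx;
    return 1;
#else
    return 0; /* not an Intel 64 or IA-32 target; nothing executed */
#endif
}
```

The register outputs are discarded because the instruction is used here only for its serializing side effect, not for the identification data it returns.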
The concept of serializing instructions was introduced into the IA-32 architecture with
the Pentium processor to support parallel instruction execution. Serializing instructions
have no meaning for the Intel486 and earlier processors that do not implement parallel
instruction execution.
It is important to note that the execution of serializing instructions on P6 and more
recent processor families constrains speculative execution because the results of
speculatively executed instructions are discarded. The following instructions are
serializing instructions: