
Intel 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes

Documentation Changes

• The page attribute table (PAT) can be used to strengthen memory ordering for a specific page or group of pages (see Section 11.12, “Page Attribute Table (PAT)”). The PAT is available only in the Pentium 4, Intel Xeon, and Pentium III processors.

These mechanisms can be used as follows:

Memory-mapped devices and other I/O devices on the bus are often sensitive to the order of writes to their I/O buffers. I/O instructions (the IN and OUT instructions) can be used to impose strong write ordering on such accesses as follows. Prior to executing an I/O instruction, the processor waits for all previous instructions in the program to complete and for all buffered writes to drain to memory. Only instruction fetches and page table walks can pass I/O instructions. Execution of subsequent instructions does not begin until the processor determines that the I/O instruction has been completed.

Synchronization mechanisms in multiple-processor systems may depend upon a strong memory-ordering model. Here, a program can use a locking instruction such as the XCHG instruction or the LOCK prefix to ensure that a read-modify-write operation on memory is carried out atomically. Locking operations typically operate like I/O operations in that they wait for all previous instructions to complete and for all buffered writes to drain to memory (see Section 8.1.2, “Bus Locking”).
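As a rough illustration of the locking paragraph above, the following C11 sketch builds a tiny spinlock from an atomic exchange and increments a counter with an atomic read-modify-write. The names `demo_spinlock`, `spin_try_lock`, `spin_lock`, `spin_unlock`, `counter_increment`, and `lock_demo` are hypothetical and do not come from the manual. On x86 targets, compilers typically lower `atomic_exchange` to XCHG and `atomic_fetch_add` to a LOCK-prefixed instruction, but that is a property of the compiler, not something the C standard promises.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical minimal spinlock: 0 means free, 1 means held. */
typedef struct {
    atomic_int held;
} demo_spinlock;

/* Atomic exchange is a read-modify-write operation. On x86, compilers
 * typically emit XCHG here, which is locked implicitly. */
static bool spin_try_lock(demo_spinlock *l)
{
    return atomic_exchange_explicit(&l->held, 1, memory_order_acquire) == 0;
}

/* Busy-wait until the exchange observes the lock as free. */
static void spin_lock(demo_spinlock *l)
{
    while (!spin_try_lock(l)) {
        /* spin */
    }
}

static void spin_unlock(demo_spinlock *l)
{
    atomic_store_explicit(&l->held, 0, memory_order_release);
}

/* Atomic increment; on x86 this is typically a LOCK-prefixed instruction
 * such as LOCK XADD. Returns the new value. */
static int counter_increment(atomic_int *c)
{
    return atomic_fetch_add_explicit(c, 1, memory_order_seq_cst) + 1;
}

/* Single-threaded walk-through of the lock states. */
static int lock_demo(void)
{
    demo_spinlock lock = { 0 };
    atomic_int counter = 0;

    spin_lock(&lock);                 /* lock was free, so this returns at once */
    assert(!spin_try_lock(&lock));    /* a second attempt sees it held */
    counter_increment(&counter);      /* protected update */
    spin_unlock(&lock);

    assert(spin_try_lock(&lock));     /* free again after unlock */
    spin_unlock(&lock);
    return atomic_load(&counter);
}
```

Calling `lock_demo()` exercises the acquire, contended, and release paths in one thread. A real multiprocessor test would run `spin_lock` and `counter_increment` from several threads.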

Program synchronization can also be carried out with serializing instructions (see Section 8.3). These instructions are typically used at critical procedure or task boundaries to force completion of all previous instructions before a jump to a new section of code or a context switch occurs. As with the I/O and locking instructions, the processor waits until all previous instructions have been completed and all buffered writes have been drained to memory before executing the serializing instruction.

The SFENCE, LFENCE, and MFENCE instructions provide a performance-efficient way of ensuring load and store memory ordering between routines that produce weakly-ordered results and routines that consume that data. The functions of these instructions are as follows:

• SFENCE — Serializes all store (write) operations that occurred prior to the SFENCE instruction in the program instruction stream, but does not affect load operations.

• LFENCE — Serializes all load (read) operations that occurred prior to the LFENCE instruction in the program instruction stream, but does not affect store operations.

• MFENCE — Serializes all store and load operations that occurred prior to the MFENCE instruction in the program instruction stream.

Note that the SFENCE, LFENCE, and MFENCE instructions provide a more efficient method of controlling memory ordering than the CPUID instruction.
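The fence instructions described above can be reached from C through the compiler intrinsics `_mm_sfence`, `_mm_lfence`, and `_mm_mfence`. The sketch below is an illustration, not the manual's own example. It assumes a hypothetical single-slot `mailbox` in which a producer publishes a payload and a consumer reads it. The `STORE_FENCE`, `LOAD_FENCE`, and `FULL_FENCE` macros and the function names are invented for this example. On targets without SSE2 the macros fall back to C11 fences as an approximation. Ordinary x86 stores are already strongly ordered with respect to one another, so the fences matter mainly for weakly-ordered data such as non-temporal stores.

```c
#include <assert.h>
#include <stdatomic.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>              /* also pulls in xmmintrin.h for _mm_sfence */
#define STORE_FENCE() _mm_sfence()  /* SFENCE: orders earlier stores */
#define LOAD_FENCE()  _mm_lfence()  /* LFENCE: orders earlier loads */
#define FULL_FENCE()  _mm_mfence()  /* MFENCE: orders earlier loads and stores */
#else
/* Assumption: on other targets, C11 fences stand in as approximations. */
#define STORE_FENCE() atomic_thread_fence(memory_order_release)
#define LOAD_FENCE()  atomic_thread_fence(memory_order_acquire)
#define FULL_FENCE()  atomic_thread_fence(memory_order_seq_cst)
#endif

/* Hypothetical single-slot mailbox shared by a producer and a consumer. */
typedef struct {
    int payload;        /* the data being handed over */
    atomic_int ready;   /* nonzero once payload has been published */
} mailbox;

/* Producer: write the data, fence, then raise the flag. */
static void produce(mailbox *m, int value)
{
    m->payload = value;
    STORE_FENCE();      /* payload store is ordered before the flag store */
    atomic_store_explicit(&m->ready, 1, memory_order_relaxed);
}

/* Consumer: check the flag, fence, then read the data.
 * Returns 1 and fills *out if data was available, otherwise 0. */
static int consume(mailbox *m, int *out)
{
    if (!atomic_load_explicit(&m->ready, memory_order_relaxed)) {
        return 0;
    }
    LOAD_FENCE();       /* flag load is ordered before the payload load */
    *out = m->payload;
    return 1;
}

/* Single-threaded walk-through of the publish/consume sequence. */
static int fence_demo(void)
{
    mailbox m = { 0, 0 };
    int value = -1;

    assert(!consume(&m, &value));   /* nothing published yet */
    produce(&m, 42);
    FULL_FENCE();                   /* full barrier between the two phases */
    assert(consume(&m, &value));    /* flag is set, payload is read */
    return value;
}
```

The walk-through runs in one thread, so it shows where the fences sit and does not demonstrate a reordering. Portable code would normally express the same handoff with C11 release and acquire operations and leave instruction selection to the compiler.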

The MTRRs were introduced in the P6 family processors to define the cache characteristics for specified areas of physical memory. The following are two examples of how memory types set up with MTRRs can be used to strengthen or weaken memory ordering for the Pentium 4, Intel Xeon, and P6 family processors:

• The strong uncached (UC) memory type forces a strong-ordering model on memory accesses. Here, all reads and writes to the UC memory region appear on the bus and out-of-order or speculative accesses are not performed. This memory type can be

1. Specifically, LFENCE does not execute until all prior instructions have completed locally, and no later instruction begins execution until LFENCE completes. As a result, an instruction that loads from memory and that precedes an LFENCE receives data from memory prior to completion of the LFENCE. An LFENCE that follows an instruction that stores to memory might complete before the data being stored have become globally visible. Instructions following an LFENCE may be fetched from memory before the LFENCE, but they will not execute until the LFENCE completes.