User's guide

Accesses in this space are no larger than a quadword. Software must ensure
that the processor does not merge consecutive write transactions in its write
buffers by issuing a memory barrier after each write transaction. Architecturally,
if a byte, word, tribyte, or longword is written to the PCI, an STL instruction
must be executed to the lower longword in the corresponding quadword
address. An STQ instruction, or an STL instruction to the upper longword, is
not allowed.
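The barrier-after-every-write rule can be sketched in C as follows. This is a minimal illustration, not code from this guide: the helper names are hypothetical, and it assumes the compiler lowers a sequentially consistent fence to the Alpha MB instruction on the target.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: full memory barrier issued after a store.
 * Assumption: on an Alpha target the toolchain emits an MB instruction
 * for this fence; this guide does not state that mapping. */
static inline void sparse_write_barrier(void)
{
    atomic_thread_fence(memory_order_seq_cst);
}

/* Store one longword, then issue a barrier so that the next store
 * cannot be merged with this one in the processor's write buffers. */
static inline void sparse_store_longword(volatile uint32_t *addr,
                                         uint32_t value)
{
    *addr = value;          /* the longword store itself */
    sparse_write_barrier(); /* barrier after each write transaction */
}
```

A driver would call `sparse_store_longword` once per sparse-space write rather than batching several stores before a single barrier.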
One bit pair of cpucwmask<1:0>, <3:2>, <5:4>, and <7:6> must have a
value of 01 (binary). The other fields must be 00. The location of the 01
field indicates whether the data reference is byte, word, tribyte, or longword
(respectively).
Similarly, if a quadword is written to the PCI, software must execute an
STQ instruction to the corresponding address. For a quadword write, the only
legal value on cpucwmask<7:0> in sparse space is 11000000 (binary).
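The mask rules for sparse-space writes can be tabulated as a small lookup. The sketch below is illustrative only: the enum and function names are hypothetical, and the hexadecimal values are simply the binary patterns described in the text (01 in one bit pair for byte through longword, 11000000 for a quadword) written as full eight-bit masks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the five sparse-space transfer sizes. */
enum sparse_size {
    SPARSE_BYTE,
    SPARSE_WORD,
    SPARSE_TRIBYTE,
    SPARSE_LONGWORD,
    SPARSE_QUADWORD
};

/* Expected eight-bit cpucwmask value for a write of the given size. */
static uint8_t sparse_expected_cwmask(enum sparse_size size)
{
    switch (size) {
    case SPARSE_BYTE:     return 0x01; /* 01 in <1:0> */
    case SPARSE_WORD:     return 0x04; /* 01 in <3:2> */
    case SPARSE_TRIBYTE:  return 0x10; /* 01 in <5:4> */
    case SPARSE_LONGWORD: return 0x40; /* 01 in <7:6> */
    case SPARSE_QUADWORD: return 0xC0; /* 11000000, STQ only */
    }
    return 0x00; /* not a legal transfer size */
}

/* True if the mask matches one of the five legal patterns above. */
static bool sparse_cwmask_is_legal(uint8_t mask)
{
    for (int s = SPARSE_BYTE; s <= SPARSE_QUADWORD; s++) {
        if (mask == sparse_expected_cwmask((enum sparse_size)s))
            return true;
    }
    return false;
}
```

Any mask with more than one nonzero bit pair, such as 00000101, fails the check, which matches the requirement that the other fields be 00.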
If a byte, word, tribyte, or longword is read from the PCI, an LDL instruction
must be executed to the lower longword in the corresponding quadword
address. An LDL instruction to the upper longword, or an LDQ instruction,
returns the wrong data. If a quadword is read from the PCI, software must use
an LDQ instruction. An LDL instruction returns the wrong data.
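The instruction-selection rules for sparse-space reads and writes can be summarized as a predicate. This is a sketch under stated assumptions: the type and function names are hypothetical, a transfer is described by its size in bytes (1, 2, 3, 4, or 8), and the same check is applied to loads (LDL/LDQ) and stores (STL/STQ) because the text gives them parallel rules.

```c
#include <stdbool.h>

/* Hypothetical tag for the width of the Alpha instruction used:
 * LDL/STL are longword operations, LDQ/STQ are quadword operations. */
enum alpha_width { ALPHA_LONGWORD_OP, ALPHA_QUADWORD_OP };

/* Returns true if the instruction choice follows the rules above.
 *   nbytes         size of the PCI transfer: 1, 2, 3, 4, or 8
 *   op             width of the load or store instruction
 *   upper_longword true if a longword operation targets the upper
 *                  longword of the quadword address (ignored for
 *                  quadword operations, which address the whole
 *                  quadword) */
static bool sparse_access_is_legal(unsigned nbytes, enum alpha_width op,
                                   bool upper_longword)
{
    if (nbytes >= 1 && nbytes <= 4) {
        /* Byte through longword: LDL/STL to the lower longword only. */
        return op == ALPHA_LONGWORD_OP && !upper_longword;
    }
    if (nbytes == 8) {
        /* Quadword: LDQ/STQ only. */
        return op == ALPHA_QUADWORD_OP;
    }
    return false; /* no other transfer size exists in this space */
}
```

Note that hardware does not reject the illegal read combinations; according to the text they complete and return the wrong data, so a check of this kind belongs in software.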
4.1.9 PCI Dense Memory Space (3 0000 0000 to 3 FFFF FFFF)
PCI dense memory space is typically used for data buffers on the PCI and has
the following characteristics:
•  There is a one-to-one mapping between CPU addresses and PCI addresses.
   A longword address from the CPU maps to a longword on the PCI (thus
   the name dense space as opposed to PCI sparse memory space).
•  Byte or word accesses are not allowed in this space. Minimum access
   granularity is a longword. The maximum transfer length implemented by
   the 21072 chipset is a cache line (32 bytes) on write transactions, and a
   quadword on read transactions.
•  Read prefetching is allowed in this space; additional read transactions
   have no side effects. The 21064A does not specify a longword address
   on read transactions; it only specifies a quadword address. Therefore,
   read transactions in this space are always performed as a quadword read
   transaction with a burst length of two on the PCI.
•  Write transactions to addresses in this space can be buffered in the
   21064A. The 21072 chipset supports a maximum burst length of 8 on the
   PCI, corresponding to a cache line of data.
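The dense-space limits listed above can be expressed as a few helpers. This is a hedged sketch, not an interface defined by this guide: all names are hypothetical, the address range is taken from the section heading, and burst length is counted in longwords so that a 32-byte cache line gives 8 and a quadword read gives 2. It does not check whether a write crosses a cache-line boundary.

```c
#include <stdbool.h>
#include <stdint.h>

/* Range from the section heading: 3 0000 0000 to 3 FFFF FFFF. */
#define DENSE_BASE            0x300000000ULL
#define DENSE_LIMIT           0x3FFFFFFFFULL
#define DENSE_MAX_WRITE_BYTES 32u /* one cache line */
#define DENSE_READ_BURST      2u  /* reads are always one quadword */

/* True if the CPU address falls in PCI dense memory space. */
static bool in_dense_space(uint64_t cpu_addr)
{
    return cpu_addr >= DENSE_BASE && cpu_addr <= DENSE_LIMIT;
}

/* Reads carry only a quadword address, so the address seen for a
 * read is the CPU address rounded down to a quadword boundary. */
static uint64_t dense_read_address(uint64_t cpu_addr)
{
    return cpu_addr & ~(uint64_t)7;
}

/* Burst length in longwords for a write, or 0 if the write breaks a
 * dense-space rule (outside the range, not longword aligned, not a
 * whole number of longwords, or longer than a cache line). */
static unsigned dense_write_burst(uint64_t cpu_addr, unsigned nbytes)
{
    if (!in_dense_space(cpu_addr))
        return 0;
    if ((cpu_addr & 3) != 0 || nbytes == 0 || (nbytes & 3) != 0)
        return 0;
    if (nbytes > DENSE_MAX_WRITE_BYTES)
        return 0;
    return nbytes / 4;
}
```

For example, a full cache-line write yields a burst of 8, while a 2-byte write is rejected because the minimum granularity is a longword.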