2.5.3.4. Flash Streaming and Auxiliary Bus Slave
As the flash is generally much larger than SRAM, it’s often useful to stream chunks of data into memory from flash. It’s
convenient to have the DMA stream this data in the background while software in the foreground is doing other things,
and it’s even more convenient if code can continue to execute from flash whilst this takes place.
This doesn’t interact well with standard XIP operation, because of the lengthy bus stalls forced on the DMA whilst the SSI
is performing serial transfers. These stalls are tolerable for a processor, because an in-order processor tends to have
nothing better to do while waiting for an instruction fetch to retire, and because typical code execution tends to have
much higher cache hit rates than bulk streaming of infrequently accessed data. In contrast, stalling the DMA prevents any
other active DMA channels from making progress during this time, which slows overall DMA throughput.
The STREAM_ADDR and STREAM_CTR registers are used to program a linear sequence of flash reads, which the XIP subsystem
will perform in the background in a best-effort fashion. To minimise impact on code being executed from flash whilst the
stream is ongoing, the streaming hardware has lower priority access to the SSI than regular XIP accesses, and there is a
brief cooldown (seven cycles) between the last XIP cache miss and resuming streaming. This helps to avoid an increase in
initial access latency when an XIP cache miss occurs.
Pico Examples: https://github.com/raspberrypi/pico-examples/tree/pre_release/flash/xip_stream/flash_xip_stream.c Lines 45 - 48
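    // Read and discard words until the streaming FIFO reports empty
    while (!(xip_ctrl_hw->stat & XIP_STAT_FIFO_EMPTY))
        (void) xip_ctrl_hw->stream_fifo;
    // Program the start address, then the word count for the stream
    xip_ctrl_hw->stream_addr = (uint32_t) &random_test_data[0];
    xip_ctrl_hw->stream_ctr = count_of(random_test_data);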
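The priority and cooldown rules described above can be illustrated with a small host-side model. This is only a sketch under stated assumptions: the function name `stream_cycles_available` and the per-cycle `xip_busy` array are hypothetical, and the exact cycle on which the real hardware starts and ends its cooldown is not specified here. The model assumes that every cycle of regular XIP traffic re-arms a seven-cycle hold-off, and that streaming may use the SSI only once that hold-off has expired.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Cooldown length quoted in the text above. */
#define STREAM_COOLDOWN_CYCLES 7u

/*
 * Hypothetical cycle-level model, not a description of the real logic.
 * xip_busy[i] is true on cycles where a regular XIP access (a cache miss)
 * occupies the SSI. Returns how many of the n_cycles are left for streaming.
 */
size_t stream_cycles_available(const bool *xip_busy, size_t n_cycles)
{
    assert(xip_busy != NULL || n_cycles == 0);

    size_t available = 0;
    unsigned cooldown = 0;

    for (size_t i = 0; i < n_cycles; ++i) {
        if (xip_busy[i]) {
            /* Regular XIP traffic always wins and re-arms the cooldown. */
            cooldown = STREAM_COOLDOWN_CYCLES;
        } else if (cooldown > 0) {
            /* SSI is idle, but streaming holds off in case code misses again. */
            --cooldown;
        } else {
            /* Idle and cooled down: the stream may use the SSI. */
            ++available;
        }
    }
    return available;
}
```

In this model, code that misses the cache at least once every eight cycles leaves no cycles for the stream at all, which matches the best-effort description: streaming makes progress only while code execution is hitting the cache or running from SRAM.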
The streamed data is pushed to a small FIFO, which generates DREQ signals, telling the DMA to collect the streamed
data. As the DMA does not initiate a read until after the data has been read from flash, the DMA is not stalled when
accessing the data.
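The handshake between the streaming FIFO and the DMA can be sketched as a small software model. The type `stream_fifo_model`, the functions `fifo_push`, `fifo_dreq` and `fifo_pop`, and the depth `MODEL_FIFO_DEPTH` are all hypothetical names chosen for illustration; in particular the depth of two is an assumption, since the text above says only that the FIFO is small. The point of the sketch is that DREQ is asserted only when a word is already waiting, so a read triggered by DREQ never has to wait for flash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed depth for illustration only; the text just says the FIFO is small. */
#define MODEL_FIFO_DEPTH 2u

typedef struct {
    uint32_t data[MODEL_FIFO_DEPTH];
    unsigned head;   /* index of the oldest word */
    unsigned count;  /* number of words currently held */
} stream_fifo_model;

/* Flash side: push one streamed word. Returns false if the FIFO is full,
 * in which case the stream simply has to wait. */
bool fifo_push(stream_fifo_model *f, uint32_t word)
{
    if (f->count == MODEL_FIFO_DEPTH)
        return false;
    f->data[(f->head + f->count) % MODEL_FIFO_DEPTH] = word;
    f->count++;
    return true;
}

/* DREQ is modelled as "at least one word is waiting". */
bool fifo_dreq(const stream_fifo_model *f)
{
    return f->count > 0;
}

/* DMA side: pop the oldest word. Only legal while DREQ is asserted. */
uint32_t fifo_pop(stream_fifo_model *f)
{
    assert(fifo_dreq(f));
    uint32_t word = f->data[f->head];
    f->head = (f->head + 1) % MODEL_FIFO_DEPTH;
    f->count--;
    return word;
}
```

A consumer that calls `fifo_pop` only while `fifo_dreq` returns true never reads an empty FIFO, which is the software analogue of the DMA not being stalled on the read.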
Although this scheme ensures that the data is ready in the streaming FIFO once the DREQ is asserted, the DMA can still
be stalled if another master is currently stalled on the XIP slave, e.g. due to a cache miss. This is solved by the auxiliary
bus slave, which is a simple bus interface providing access only to the streaming FIFO. This slave is exposed on the
FASTPERI arbiter, which services only native AHB-Lite peripherals that do not generate wait states, so the DMA will never
experience stalls when accessing the FIFO at this address, assuming it has high bus priority.
Pico Examples: https://github.com/raspberrypi/pico-examples/tree/pre_release/flash/xip_stream/flash_xip_stream.c Lines 58 - 70
    const uint dma_chan = 0;
    dma_channel_config cfg = dma_channel_get_default_config(dma_chan);
    channel_config_set_read_increment(&cfg, false);
    channel_config_set_write_increment(&cfg, true);
    channel_config_set_dreq(&cfg, DREQ_XIP_STREAM);
    dma_channel_configure(
            dma_chan,
            &cfg,
            (void *) buf,                 // Write addr
            (const void *) XIP_AUX_BASE,  // Read addr
            count_of(random_test_data),   // Transfer count
            true                          // Start immediately!
    );
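The addressing choices in the listing above (read increment off, write increment on) can be illustrated with a host-side stand-in for a single DMA channel. This is a sketch only: `dma_transfer_model` is a hypothetical function, it ignores DREQ pacing entirely, and it assumes 32-bit transfers. It shows why the read address stays fixed: every word of the stream appears at the same FIFO address, while the destination buffer is filled word by word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical software stand-in for one DMA channel moving 32-bit words.
 * read_inc / write_inc mirror the read-increment and write-increment
 * settings in the listing above. DREQ pacing is not modelled.
 */
void dma_transfer_model(const volatile uint32_t *read_addr,
                        uint32_t *write_addr,
                        size_t transfer_count,
                        bool read_inc, bool write_inc)
{
    assert(read_addr != NULL && write_addr != NULL);

    for (size_t i = 0; i < transfer_count; ++i) {
        /* One transfer: read a word, then write it to the destination. */
        *write_addr = *read_addr;
        if (read_inc)
            ++read_addr;   /* memory-like source: step to the next word */
        if (write_inc)
            ++write_addr;  /* fill the destination buffer sequentially */
    }
}
```

With `read_inc` false and `write_inc` true, every transfer reads the same source location and the results land in consecutive buffer entries, which is the pattern wanted for a FIFO exposed at a single address.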
2.5.3.5. Performance Counters
The XIP subsystem provides two performance counters. These are 32 bits in size, saturate upon reaching 0xffffffff, and
are cleared by writing any value. They count:
1. The total number of XIP accesses, to any alias
RP2040 Datasheet