Descriptor Frontend

Descriptor Frontend Role

The descriptor frontend (idma_desc64_top) processes a linked list of transfer descriptors stored in shared memory. Hardware fetches the descriptors autonomously over an AXI read port, so software can enqueue multiple transfers without polling for the completion of each one.

Descriptor Format

The descriptor frontend’s reshaper (idma_desc64_reshaper) converts a descriptor’s flag bits into the idma_req_t option fields. It maps most flag bits to backend options, but it also hardcodes some values. Notably, src_max_llen and dst_max_llen are always set to 0 (full debursting: every transfer is broken into single-beat bus transactions, which maximizes protocol compliance at the cost of throughput), and lock, prot, qos, and region are zeroed. None of these options can be controlled per descriptor.

Each descriptor is a 256-bit packed struct stored in shared memory. In a SystemVerilog packed struct, the first field (flags) occupies the most-significant bits and the last field (dest_addr) occupies the least-significant bits. In memory (little-endian), dest_addr is at the lowest byte address:

typedef struct packed {
    logic [31:0] flags;     // Transfer flags (see below)
    logic [31:0] length;    // Transfer length in bytes
    addr_t       next;      // Address of next descriptor (0xFFFF...F = end of chain)
    addr_t       src_addr;  // Source address
    addr_t       dest_addr; // Destination address
} descriptor_t;             // Total: 256 bits (addr_t = 64 bits)

Descriptors must be in memory accessible to the DMA’s AXI read port. Descriptors are read as 256-bit naturally-aligned bursts.
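The byte layout described above can be checked with a small host-side model. The following is a minimal Python sketch, not part of the iDMA sources: it assumes addr_t is 64 bits and that the packed struct lands in little-endian memory exactly as stated, with dest_addr at the lowest byte address. The names pack_descriptor, unpack_descriptor, and END_OF_CHAIN are hypothetical helpers introduced for illustration.

```python
import struct

# All-ones next pointer marks the last descriptor of a chain.
END_OF_CHAIN = 0xFFFF_FFFF_FFFF_FFFF

# Little-endian layout implied by the packed struct:
# dest_addr (8 B), src_addr (8 B), next (8 B), length (4 B), flags (4 B).
DESCRIPTOR = struct.Struct("<QQQII")


def pack_descriptor(src_addr, dest_addr, length, flags=0, next_addr=END_OF_CHAIN):
    """Return the 32-byte memory image of one descriptor."""
    return DESCRIPTOR.pack(dest_addr, src_addr, next_addr, length, flags)


def unpack_descriptor(raw):
    """Decode a 32-byte memory image back into its fields."""
    dest_addr, src_addr, next_addr, length, flags = DESCRIPTOR.unpack(raw)
    return {
        "flags": flags,
        "length": length,
        "next": next_addr,
        "src_addr": src_addr,
        "dest_addr": dest_addr,
    }
```

A testbench or driver prototype could write the returned bytes to a 32-byte-aligned address in descriptor memory. Under these assumptions, flags ends up in the last four bytes of the image and dest_addr in the first eight.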

Flags Bitfield

| Bits | Field | Description |
|---|---|---|
| 0 | irq | Trigger interrupt on completion |
| 2:1 | src_burst | Source burst type: 00=FIXED, 01=INCR, 10=WRAP |
| 4:3 | dst_burst | Destination burst type: 00=FIXED, 01=INCR, 10=WRAP |
| 5 | decouple_rw | Fully decouple read and write channels: the backend can write without waiting for reads to complete (risk of deadlock if the buffer fills) |
| 6 | decouple_aw | Safer decoupling: write addresses are held back until the first read data arrives, but once data starts flowing, reads and writes proceed independently |
| 7 | reduce_len | Reduce burst length on both source and destination (opt.beo.src_reduce_len and opt.beo.dst_reduce_len) |
| 11:8 | src_cache | AXI cache attributes for source (bufferable, modifiable, read-alloc, write-alloc) |
| 15:12 | dst_cache | AXI cache attributes for destination |
| 23:16 | axi_id | AXI ID for the transfer |
| 31:24 | Reserved | |

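Typical flags values, with both burst fields set to INCR (01):

  • 0x0000000A: INCR bursts, no decoupling, no IRQ
  • 0x0000000B: INCR bursts, IRQ on completion
  • 0x0000006B: INCR bursts, decouple_rw + decouple_aw, IRQ on completion

Under the encoding above, an all-zero flags word (0x00000000) selects FIXED bursts on both ports, so the burst fields must be set explicitly for ordinary memory-to-memory copies.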
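To avoid hand-computing flag words, the bit positions from the table can be wrapped in a small helper. This is an illustrative Python sketch only; make_flags and the BURST_* constants are hypothetical names, and the bit positions are taken directly from the table above.

```python
BURST_FIXED, BURST_INCR, BURST_WRAP = 0b00, 0b01, 0b10


def make_flags(src_burst, dst_burst, *, irq=False, decouple_rw=False,
               decouple_aw=False, reduce_len=False,
               src_cache=0, dst_cache=0, axi_id=0):
    """Assemble the 32-bit flags word from the fields in the flags table."""
    for name, burst in (("src_burst", src_burst), ("dst_burst", dst_burst)):
        if burst not in (BURST_FIXED, BURST_INCR, BURST_WRAP):
            raise ValueError(f"{name} must be FIXED, INCR, or WRAP")
    if not 0 <= src_cache <= 0xF or not 0 <= dst_cache <= 0xF:
        raise ValueError("cache attributes are 4-bit fields")
    if not 0 <= axi_id <= 0xFF:
        raise ValueError("axi_id is an 8-bit field")
    return (
        int(irq)                    # bit 0
        | (src_burst << 1)          # bits 2:1
        | (dst_burst << 3)          # bits 4:3
        | (int(decouple_rw) << 5)   # bit 5
        | (int(decouple_aw) << 6)   # bit 6
        | (int(reduce_len) << 7)    # bit 7
        | (src_cache << 8)          # bits 11:8
        | (dst_cache << 12)         # bits 15:12
        | (axi_id << 16)            # bits 23:16; bits 31:24 stay zero
    )
```

The burst types are required arguments on purpose: because 00 encodes FIXED, a silent default would make it easy to request a fixed-address transfer by accident.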

Descriptor Chain Example

A two-descriptor chain that transfers 64 bytes, then 128 bytes, then stops. Assume the descriptors are allocated at addresses 0x4000 and 0x4020 (32 bytes apart, naturally aligned):

// Descriptor at address 0x4000:
//   src_addr  = 0x1000
//   dest_addr = 0x2000
//   next      = 0x4020 (address of the second descriptor)
//   length    = 64
//   flags     = 0x0000000A (src_burst = INCR, dst_burst = INCR, no IRQ, no decoupling)

// Descriptor at address 0x4020:
//   src_addr  = 0x1100
//   dest_addr = 0x2100
//   next      = 0xFFFFFFFF_FFFFFFFF (end of chain)
//   length    = 128
//   flags     = 0x0000000B (src_burst = INCR, dst_burst = INCR, IRQ on completion)
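The chain above can also be modeled in software. The sketch below is an illustration rather than the hardware's fetch logic: build_chain lays descriptors out back to back from a base address, and walk_chain follows the next pointers until it reaches the all-ones terminator. Both function names are hypothetical, and the byte layout assumes 64-bit addresses stored little-endian with dest_addr first, as described in the format section.

```python
import struct

END_OF_CHAIN = 0xFFFF_FFFF_FFFF_FFFF
DESC_SIZE = 32  # bytes per 256-bit descriptor
LAYOUT = "<QQQII"  # dest_addr, src_addr, next, length, flags


def build_chain(base_addr, transfers):
    """Lay out descriptors contiguously; return {address: 32-byte image}.

    transfers is a list of (src_addr, dest_addr, length, flags) tuples.
    """
    if base_addr % DESC_SIZE:
        raise ValueError("descriptors must be naturally aligned to 32 bytes")
    image = {}
    for i, (src, dst, length, flags) in enumerate(transfers):
        addr = base_addr + i * DESC_SIZE
        is_last = i == len(transfers) - 1
        next_addr = END_OF_CHAIN if is_last else addr + DESC_SIZE
        image[addr] = struct.pack(LAYOUT, dst, src, next_addr, length, flags)
    return image


def walk_chain(image, head_addr):
    """Follow next pointers from head_addr, yielding each descriptor's fields."""
    addr = head_addr
    while addr != END_OF_CHAIN:
        dst, src, next_addr, length, flags = struct.unpack(LAYOUT, image[addr])
        yield addr, src, dst, length, flags
        addr = next_addr
```

For the two transfers in this example, build_chain(0x4000, [...]) places the descriptors at 0x4000 and 0x4020, and walk_chain visits them in that order before stopping. The walker does not detect cycles, so a chain whose next pointer loops back would iterate forever in this model.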

Parameters

The descriptor frontend is configured through module parameters at instantiation time:

| Parameter | Default | Description |
|---|---|---|
| AddrWidth | 64 | Address width |
| DataWidth | 64 | AXI data width |
| AxiIdWidth | 3 | AXI ID width |
| InputFifoDepth | 8 | Depth of the descriptor address input FIFO |
| PendingFifoDepth | 8 | Depth of the pending request tracking FIFO |
| BackendDepth | 0 | Backend pipeline depth (NumAxInFlight + BufferDepth) |
| NSpeculation | 4 | Number of descriptors to prefetch speculatively |

Programming Sequence

  1. Allocate descriptors in memory accessible to both CPU and DMA (e.g., uncached or coherent region)
  2. Fill descriptor fields: Set src_addr, dest_addr, length, flags for each transfer
  3. Chain descriptors: Set each descriptor’s next field to the address of the following descriptor. Use 0xFFFFFFFF_FFFFFFFF to mark the end of the chain
  4. Write first descriptor address to the desc_addr register. This register is part of the descriptor frontend’s own small register file (separate from the descriptor memory). Writing a non-zero address to it triggers the fetch engine
  5. Hardware fetches autonomously: The frontend reads descriptors over AXI, submits them to the backend, and follows the next pointer chain
  6. Wait for completion: Poll the status register (bit 0 = busy) or wait for the IRQ (if flags.irq is set)
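Steps 4 and 6 can be sketched as two small functions. This is a hedged illustration, not a driver for a specific build: the register offsets DESC_ADDR_REG and STATUS_REG are hypothetical placeholders (take the real ones from the register description of your configuration), and write_reg / read_reg stand in for whatever memory-mapped access mechanism your platform provides.

```python
import time

# Hypothetical byte offsets within the frontend's register file.
DESC_ADDR_REG = 0x00
STATUS_REG = 0x08
STATUS_BUSY = 1 << 0  # bit 0 = busy, per step 6

DESC_SIZE = 32  # descriptors are naturally aligned 256-bit objects


def launch_chain(write_reg, head_addr):
    """Step 4: hand the first descriptor's address to the fetch engine."""
    if head_addr == 0:
        raise ValueError("a zero address does not trigger the fetch engine")
    if head_addr % DESC_SIZE:
        raise ValueError("descriptor address must be 32-byte aligned")
    write_reg(DESC_ADDR_REG, head_addr)


def wait_idle(read_reg, timeout_s=1.0, poll_interval_s=0.0):
    """Step 6: poll the busy bit until it clears or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while read_reg(STATUS_REG) & STATUS_BUSY:
        if time.monotonic() > deadline:
            raise TimeoutError("descriptor frontend still busy")
        time.sleep(poll_interval_s)
```

On a system with data caches, the descriptor writes from steps 1 to 3 need to be visible in memory before launch_chain runs; how that is guaranteed (uncached mapping, cache flush, or coherent interconnect) depends on the platform and is outside this sketch.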

Speculative Prefetch

The NSpeculation parameter controls how many descriptors the frontend may fetch ahead of the backend’s consumption. This hides the descriptor fetch latency — while the backend processes one transfer, the frontend is already reading the next NSpeculation descriptors from memory. Setting NSpeculation=0 disables prefetching (each descriptor is fetched only after the previous transfer completes).

Set NSpeculation to match your expected descriptor chain length or NumAxInFlight, whichever is smaller. Higher values waste memory bandwidth on speculative reads if chains are short. For most systems, NSpeculation=4 is a good default.
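The sizing rule above reduces to a one-line calculation. The helper below is a hypothetical convenience for configuration scripts, not something provided by the iDMA sources; it simply applies the stated rule of taking the smaller of the expected chain length and NumAxInFlight.

```python
def suggest_nspeculation(expected_chain_len, num_ax_in_flight):
    """Return the smaller of the expected chain length and NumAxInFlight."""
    if expected_chain_len < 0 or num_ax_in_flight < 0:
        raise ValueError("both inputs must be non-negative")
    return min(expected_chain_len, num_ax_in_flight)
```

For example, a system that typically submits 16-descriptor chains with NumAxInFlight = 4 would get 4, while one that mostly submits two-descriptor chains would get 2.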

Source Files

  • src/frontend/desc64/idma_desc64_top.sv — Top-level module
  • src/frontend/desc64/idma_desc64_reg_wrapper.sv — Register interface wrapper
  • src/frontend/desc64/idma_desc64_reshaper.sv — Descriptor-to-request conversion