.. include:: custom_tools.txt

.. _dma use policy:

**************
DMA Use Policy
**************

DMA is an intelligent part of most modern microcontrollers that transfers data between peripherals and RAM (or between blocks of RAM) without using CPU time, leaving the CPU free to do other work. Such transfers can be time-consuming, so DMA effectively provides another thread of execution whose entire purpose is to speed up execution. Whether DMA is used is entirely dependent on the system architect and the efficiency needs of the system.

When DMA is going to be implemented, the programmer should *first* get a data transfer going with a CPU-driven transfer of :bi:`known data` before implementing DMA. Reason: the DMA on PIC MCUs has undocumented (or poorly-documented) buffer alignment constraints, and unexpected (and sometimes undocumented) relationships with certain peripherals, that :bi:`cannot` be debugged unless you already know what the resulting data in the transfer *is supposed to contain*.

Example: The first CPU-driven read from ``PMDIN`` on the PMP (Parallel Master Port) peripheral returns junk, because CPU-driven PMP reads go through a 1-element FIFO. However, when you start a PMP DMA transfer with ``CFORCE = 1``, the DMA either knows to discard the first datum read, or has a way around the 1-element FIFO. If you instead prime the block transfer with a ``PMDIN`` read, the first valid datum is lost --- apparently the DMA discards it. KNOWN data in the transfer, like "Now is the time for all good men", isolated this problem immediately, because what arrived at the destination was "ow is the time for all good men"!

Example: Hidden booby trap: when implementing PMP DMA, I expected already-tested code from the PIC32MZ to work just fine. In fact, this assumption was not a bad one.
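The known-data technique is hardware-independent and cheap to sketch in plain C. The snippet below (all names hypothetical, and the "lost first datum" deliberately simulated with ``memcpy`` rather than real PMP hardware) shows why a recognizable pattern exposes the off-by-one immediately, where random or all-zero data would not:

.. code-block:: c

   #include <stdint.h>
   #include <stdio.h>
   #include <string.h>

   /* Hypothetical helper: compare a received buffer against the known
    * source data.  Returns the index of the first mismatch, or -1 if
    * every byte matches. */
   static int verify_known_data(const uint8_t *expected,
                                const uint8_t *received, size_t len)
   {
       for (size_t i = 0; i < len; i++) {
           if (expected[i] != received[i])
               return (int)i;
       }
       return -1;
   }

   int main(void)
   {
       static const char known[] = "Now is the time for all good men";
       uint8_t dest[sizeof known];

       /* Simulate the failure mode described above: the first datum is
        * silently discarded, so everything lands shifted by one byte. */
       memcpy(dest, known + 1, sizeof known - 1);
       dest[sizeof known - 1] = 0;

       int bad = verify_known_data((const uint8_t *)known, dest,
                                   sizeof known);
       if (bad >= 0)
           printf("first mismatch at byte %d\n", bad);
       return 0;
   }

With known text, the shifted result ("ow is the time...") is obvious at a glance in a debugger watch window; the ``verify_known_data`` check merely automates that glance.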
The one difference, unsuspected because CPU-driven PMP doesn't care: if ``PMPMODEbits.MODE16 == 1``, CPU-driven PMP transfers 16 bits, but we use only the lower 8 bits, so there is no behavioral difference (the PMP module sits in an environment with mixed bus widths, so we had simply been leaving it set to 16 bits). However, we NOW have a NAND chip with an 8-bit bus, whereas it had a 16-bit bus before, when it was running with the PIC32MZ. And surprise, surprise: with ``MODE16 == 1`` but a DMA cell size of 1 (meaning 1 byte), the DMA WILL NOT continue past a single cell transfer, despite the IRQ being set up correctly to trigger restarting the cell transfer. Setting ``MODE16 = 0`` made it work: the whole block transfer (2112 reads --- the size of the NAND chip's page) then completed successfully.

Example: A ``uint32_t`` array used as a NAND page buffer is aligned on 4 bytes. The DMA documentation indicates there are, literally, NO alignment constraints --- that when the cell transfer size is 1, the DMA operates just fine with only byte alignment. To my dismay, this is covered IN DETAIL in a whole section about DMA memory alignment. Yet while a buffer whose physical start address was 0x0000CAAC worked fine with CPU-driven 1-byte transfers --- the KNOWN DATA was 0x000000DB, 0x000000DC, 0x000000DD, 0x000000DE, 0x000000DF, etc. --- the DMA transfer gave us 0xDB000000, 0xDC000000, 0xDD000000, 0xDE000000, 0xDF000000, etc. See the difference? When the buffer was aligned on 16 bytes (new address 0x0000CAB0), we got 0x000000DB, 0x000000DC, 0x000000DD, 0x000000DE, 0x000000DF, etc., as expected.
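Given that surprise, it costs nothing to make the buffer's alignment explicit in the declaration and to assert it before arming the channel. A minimal sketch in standard C11 (``alignas`` from ``<stdalign.h>``; on XC32/GCC, ``__attribute__((aligned(16)))`` is equivalent) --- the 16-byte figure and the ``dma_buffer_ok`` guard are this example's assumptions, not a documented requirement:

.. code-block:: c

   #include <assert.h>
   #include <stdalign.h>
   #include <stdint.h>

   #define NAND_PAGE_WORDS 528  /* 2112 bytes / 4, per the page size above */

   /* Force 16-byte alignment explicitly, instead of relying on the 4-byte
    * natural alignment of uint32_t, which was not enough for the DMA. */
   static alignas(16) uint32_t nand_page_buf[NAND_PAGE_WORDS];

   /* Hypothetical guard: call before starting a DMA block transfer. */
   static int dma_buffer_ok(const void *buf)
   {
       return ((uintptr_t)buf % 16u) == 0;
   }

   int main(void)
   {
       assert(dma_buffer_ok(nand_page_buf));
       return 0;
   }

On PIC32 the DMA channel is programmed with physical addresses, but the fixed KSEG0/KSEG1 mappings preserve the low address bits, so checking the alignment of the virtual address as above is sufficient.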