Minimalistic DMA manager for MSX (2)

Continues from part 1: Basic DMA in MSX

We left off with a generic dma-manager comprising:

  • A "long" pointer which provides both "bus address" and "mapper config"
  • A set of registers for setting this pointer
  • A state machine for generating both the bus-cycles and the set/restore of the msx memory context 

We have not included a counter/length register in the dma-manager, so we'll assume a dma-device that tracks the length itself; software checks it in the driver code. A typical real-time dma consumer that tracks the length itself is a pcm player.

Example system: PCM player

We'll use our "single-channel" dma-manager to feed data to a pcm sample player. Our imagined pcm player can generate arbitrary sample rates and has a single-sample buffer plus a "buffer is empty" signal that triggers the dma-manager. When the data is ready, the dma-manager signals it, the player latches the sample into the buffer, and the dma request is (hopefully) deasserted right away. When the timer signals that the sample period has elapsed, the buffer contents go to the dac and the buffer is empty again, which re-asserts the dma request.

Our pcm player has a cpu interface for setting up its internal registers (omitted in the drawing). Once we program the sample rate and sample count, we enable the dma controller, and the sample count is decremented on each dma transaction. When the sample count is exhausted, dma requests stop and the cpu is notified.

Note that a sample may comprise one or more bytes but, for simplicity, we'll assume 1 byte per sample and thus a single byte per dma transaction (AKA cpu-stop).

The Page-2 Trick

Although we should save the slot configuration in every transaction, our "minimalistic version" can take a shortcut that works the same, at least in a basic system.

Under standard conditions, in basic and dos, system software never selects any slot other than the main-ram slot in page 2. Taking that assumption for granted, we can use page 2 for dma transactions and expect that whenever we interrupt the cpu, we'll find ram in that page.

That way we reduce the steps needed for setting/restoring the context. Otherwise, we would have to set page 2 (or another page, if used) to the desired ram slot (which should be configurable) using the ppi, and restore the original value in the ppi after the transaction, before restarting the cpu.

DMA transactions

When a dma transaction is requested, the state machine is started. The actual length of the bus transaction in clocks (and thus the actual delay until the data has been read) is not known in advance. Both the arbitration and each memory access may take longer than the minimum number of cycles.

In addition, the memory read cycle may repeat if the request signal is kept asserted (for multi-byte transactions).
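Although the exact length is not known in advance, a lower bound is easy to estimate. The following is a minimal sketch, assuming the typical cycle lengths quoted in this article (4 clocks per io cycle, 3 per memory cycle), the standard MSX cpu clock of 3.579545 MHz, and no /wait states; it ignores the /busreq and /busack arbitration time. The function names are hypothetical.

```python
MSX_CPU_CLOCK_HZ = 3_579_545  # standard MSX Z-80 clock (assumption of this sketch)


def min_transaction_clocks(bytes_per_transaction=1, io_clocks=4, mem_clocks=3):
    """Lower bound, in clocks, of one dma transaction.

    Counts one "set mapper" io write, one memory read per byte and one
    "restore mapper" io write. Arbitration and /wait states are not included.
    """
    return io_clocks + bytes_per_transaction * mem_clocks + io_clocks


def min_bus_share(sample_rate_hz, clocks_per_transaction,
                  cpu_clock_hz=MSX_CPU_CLOCK_HZ):
    """Minimum fraction of cpu clocks taken by dma at a given sample rate."""
    return sample_rate_hz * clocks_per_transaction / cpu_clock_hz


if __name__ == "__main__":
    clocks = min_transaction_clocks()        # single-byte transaction
    share = min_bus_share(22_050, clocks)    # 22.05 kHz, 1 byte per sample
    print(f"{clocks} clocks per transaction, at least {share:.1%} of the bus")
```

With single-byte transactions the bound is 11 clocks, so a 22.05 kHz player takes a bit under 7% of the cpu clocks before arbitration is added.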

(state machine flow)

  1. Assert /busreq
  2. Wait /busack (may take longer depending on the last cpu instruction)
  3. Place the "set mapper" (io write) on the bus
  4. Wait the io-cycle to complete (typ. 4 clocks, but may use /wait)
  5. Place the "read memory" (mem read) on the bus
  6. Wait the mem-cycle (typ. 3 clocks, but may use /wait)
  7. Assert "ready", increment pointer and, if still requested, loop to 5 (note1)
  8. Place the "restore mapper" (io write) on the bus
  9. Wait the io-cycle to complete (typ. 4 clocks, but may use /wait)
  10. De-assert /busreq
  11. Wait de-assertion of /busack

note 1: If the lower part of the address "carries", then loop to 3, to select the next segment of memory in the ram mapper.
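The flow above can be modelled in software to check the ordering of the bus operations. This is a behavioural sketch only, not the hardware implementation: it assumes page 2 is the fixed dma page, 16 KB mapper segments, and hypothetical names (`dma_transaction`, `snooped_segment`); wait states and clock counts are not modelled.

```python
SEGMENT_BITS = 14                      # 16 KB segments: A13..A0 inside a segment
SEGMENT_MASK = (1 << SEGMENT_BITS) - 1
PAGE = 2                               # fixed dma page (the "page-2 trick")
MAPPER_PORT = 0xFC + PAGE              # FEh selects the segment seen in page 2


def dma_transaction(pointer, snooped_segment, requested_bytes):
    """Return (bus_ops, new_pointer) for one dma transaction.

    pointer          -- long dma pointer (segment number in the upper bits)
    snooped_segment  -- last value the cpu wrote to the page-2 mapper port
    requested_bytes  -- reads performed while the request stays asserted
    """
    ops = ["assert /busreq", "wait /busack"]                      # steps 1-2
    need_set_mapper = True
    for _ in range(requested_bytes):
        if need_set_mapper:                                       # steps 3-4
            ops.append(("io write", MAPPER_PORT, pointer >> SEGMENT_BITS))
            need_set_mapper = False
        bus_address = (PAGE << SEGMENT_BITS) | (pointer & SEGMENT_MASK)
        ops.append(("mem read", bus_address))                     # steps 5-6
        pointer += 1                                              # step 7
        if (pointer & SEGMENT_MASK) == 0:                         # note 1: carry
            need_set_mapper = True
    ops.append(("io write", MAPPER_PORT, snooped_segment))        # steps 8-9
    ops += ["de-assert /busreq", "wait /busack de-asserted"]      # steps 10-11
    return ops, pointer


if __name__ == "__main__":
    # Two bytes read across a segment boundary (last byte of segment 1).
    for op in dma_transaction(0x7FFF, snooped_segment=1, requested_bytes=2)[0]:
        print(op)
```

The example crosses a segment boundary on purpose, so the second read is preceded by a new "set mapper" write, as note 1 describes.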

The "set mapper" cycle (step 3)

Although the typical memory cycle takes 3 clocks on the Z-80, the typical io cycle takes 4 clocks because of the "implicit" 1 wait-state. Note that it may take longer if the bus asserts /wait for any reason.

The dma-manager uses the higher part of the dma pointer to select a segment in the memory mapper. Unimplemented bits must be written as zero so that, if a mapper bigger than the supported size is present, the part used is always the lower one.

The io-address depends on the msx page used for the memory cycle in step 5:

  • FCh for page 0
  • FDh for page 1
  • FEh for page 2
  • FFh for page 3
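As an illustration, the port and data of the "set mapper" write can be derived from the pointer as follows. This is a sketch under assumptions: 16 KB segments, and a hypothetical `pointer_bits` parameter giving how many pointer bits the hardware implements (20 here, i.e. a 1 MByte mapper).

```python
SEGMENT_BITS = 14  # 16 KB segments: A13..A0 address a byte inside one segment


def set_mapper_cycle(pointer, page, pointer_bits=20):
    """Return (io_port, data) for the "set mapper" io write.

    pointer_bits is the number of implemented dma pointer bits (hypothetical
    parameter); anything above it is forced to zero, so a bigger mapper than
    supported only ever sees its lower segments.
    """
    if page not in (0, 1, 2, 3):
        raise ValueError("msx pages are numbered 0 to 3")
    pointer &= (1 << pointer_bits) - 1     # unimplemented bits written as zero
    segment = pointer >> SEGMENT_BITS      # higher part of the pointer
    io_port = 0xFC + page                  # FCh..FFh for pages 0..3
    return io_port, segment


if __name__ == "__main__":
    port, segment = set_mapper_cycle(0x12345, page=2)
    print(f"write {segment} to port {port:02X}h")
```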

The memory cycle (step 5)

The dma-manager uses the lower part (A13 to A0) of the dma pointer for the address of the memory cycle. Bus lines A14-A15 select the page (usually fixed).

This access usually takes three clocks, but may be longer when /wait is detected. 

Note that when the data is present on the bus (T3), it is passed to the dma-device.
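The address placed on the bus can be sketched as a small helper. It assumes page 2 as the default dma page; the function name is hypothetical.

```python
def memory_cycle_address(pointer, page=2):
    """Return the 16-bit bus address for the dma memory read.

    A13..A0 come from the low 14 bits of the dma pointer;
    A15..A14 carry the (usually fixed) msx page number.
    """
    offset = pointer & 0x3FFF          # low 14 bits of the dma pointer
    return (page << 14) | offset


if __name__ == "__main__":
    print(f"{memory_cycle_address(0x12345):04X}h")   # offset 2345h in page 2
```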

The "restore mapper" (step 8)

This step is just like step 3, but the data value written is the one snooped during normal operation, i.e. the last value written to the mapper by the cpu.
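The snooping can be pictured as a small shadow copy of the mapper ports. The sketch below is a behavioural model with hypothetical names; it assumes only the low 8 bits of the io address are decoded, and it leaves a page as unknown (None) until the cpu has written it at least once.

```python
class MapperSnooper:
    """Shadow copy of the cpu writes to the mapper ports FCh..FFh."""

    def __init__(self):
        self.last = [None, None, None, None]   # one entry per msx page

    def on_cpu_io_write(self, port, value):
        """Call for every io write the cpu performs during normal operation."""
        port &= 0xFF                           # only A7..A0 are decoded (assumed)
        if 0xFC <= port <= 0xFF:
            self.last[port - 0xFC] = value & 0xFF

    def restore_cycle(self, page):
        """Return (io_port, data) for the "restore mapper" io write."""
        return 0xFC + page, self.last[page]


if __name__ == "__main__":
    snooper = MapperSnooper()
    snooper.on_cpu_io_write(0xFE, 1)    # cpu selects segment 1 in page 2
    snooper.on_cpu_io_write(0x98, 7)    # unrelated port, ignored
    print(snooper.restore_cycle(2))
```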

DMA Registers

Our controller has a "slave" interface that we use from the cpu to prepare the configuration and enable/disable the functionality.

We need at least 3 registers for the dma pointer (it has more than 16 bits), so we'll decode 4 registers for writes. The fourth register holds general control bits, and a single status register is available for reads.

R0: DMA Pointer LOW (A7-A0)

R1: DMA Pointer MED (A15-A8)

R2: DMA Pointer HI (A23-A16) 

May have fewer bits; typical MSX mappers range from 128 KB (up to A16) to 1 MByte (up to A19).

R3: Control register

Only bit 0 is implemented; it enables the dma engine. When it is 0, the dma request input is ignored.
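From the driver's point of view, programming the controller could look like the sketch below. It is an illustration only: `write_reg` stands for whatever io write reaches registers R0 to R3 (the actual port addresses are not defined here), and disabling the engine while the pointer changes is a precaution assumed by this sketch.

```python
def program_dma(pointer, write_reg):
    """Load the dma pointer and enable the engine.

    write_reg(index, value) is a hypothetical callback that performs the
    io write to register R<index> of the dma-manager.
    """
    write_reg(3, 0x00)                      # R3 bit 0 = 0: ignore dma requests
    write_reg(0, pointer & 0xFF)            # R0: pointer LOW (A7-A0)
    write_reg(1, (pointer >> 8) & 0xFF)     # R1: pointer MED (A15-A8)
    write_reg(2, (pointer >> 16) & 0xFF)    # R2: pointer HI  (A23-A16)
    write_reg(3, 0x01)                      # R3 bit 0 = 1: enable the engine


if __name__ == "__main__":
    log = []
    program_dma(0x012345, lambda index, value: log.append((index, value)))
    for index, value in log:
        print(f"R{index} <- {value:02X}h")
```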

 

With all the elements defined, we can proceed with an implementation in programmable logic. In our example, we have everything necessary for our hardware to read samples and play them directly from system memory.

We'll see it in the upcoming part 3: MSX DMA implementation

 

 
