The steps are pretty simple. If DMA is not being used, the CPU has to do all the reading and writing itself. With DMA, either the CPU or a peripheral can initiate the transfer, and the DMA controller then moves the data without the CPU handling each byte.
An example would be the Commodore 64. The C64 has a 6510 processor and a 64K address space. RAM, ROM, and peripherals all share the same linear address space, $0000-$FFFF. I/O sits at $D000-$DFFF, with several chips mapped into this area, including video, sound, and keyboard. An example of I/O without DMA on the 6510 would be LDA $D000, which reads the Sprite 0 X position into the accumulator. The reverse, a write, would be STA $D000.
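To make the idea concrete, here is a rough C sketch of memory-mapped I/O without DMA. It is only a model, not how the real hardware is wired: the names bus_read, bus_write, cpu_copy, ram and io_regs are made up for illustration, and the D000-DFFF window is treated as always showing I/O registers (on a real C64 that region can also be banked to show RAM or character ROM).

```c
#include <stdint.h>

#define IO_BASE   0xD000u   /* start of the I/O window */
#define IO_END    0xDFFFu   /* end of the I/O window */
#define SPRITE0_X 0xD000u   /* Sprite 0 X position register */

/* Hypothetical backing stores: one flat 64K space plus a stand-in
   for the chip registers that answer in the I/O window. */
static uint8_t ram[0x10000];
static uint8_t io_regs[0x1000];

static int is_io(uint16_t addr)
{
    return addr >= IO_BASE && addr <= IO_END;
}

/* Rough equivalent of LDA addr: the CPU itself fetches the byte. */
uint8_t bus_read(uint16_t addr)
{
    return is_io(addr) ? io_regs[addr - IO_BASE] : ram[addr];
}

/* Rough equivalent of STA addr: the CPU itself stores the byte. */
void bus_write(uint16_t addr, uint8_t value)
{
    if (is_io(addr))
        io_regs[addr - IO_BASE] = value;
    else
        ram[addr] = value;
}

/* Without DMA, a block transfer is just the CPU looping over
   one read and one write per byte. */
void cpu_copy(uint16_t dst, uint16_t src, uint16_t count)
{
    for (uint16_t i = 0; i < count; i++)
        bus_write((uint16_t)(dst + i), bus_read((uint16_t)(src + i)));
}
```

Calling bus_write(SPRITE0_X, 100) followed by bus_read(SPRITE0_X) mirrors the STA $D000 / LDA $D000 pair. The point of cpu_copy is that every byte costs the CPU a read and a write, which is exactly the work a DMA controller would take over.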