After thinking more about what I need for device writes, I concluded that things would be much simpler if I just changed a device's buffer size to 8 bytes, so that it can be filled with a single LOD64. I was right: I wrote a working >send routine pretty quickly, it worked on the first try, and it took a third as many instructions as the most recent broken attempt with the 64B buffer. Perhaps tomorrow I can finally move on to the receiving end of devices?
I'm quickly getting the hang of this way of thinking, though I really need to be writing down the ideas I'm having for assisting tools, because this initial friction is generating a lot of good ones.