After spending some time trying to write device IO code, I set it aside to think more about the architecture of my stack machine's stack manipulation instructions. I was missing some fundamental operations for using the stack effectively, which meant I was mostly falling back on treating the various registers and stacks as individual registers. I did more reading on Forth to refresh my memory, and started thinking about better ways to implement "picking" and "rolling" for a variable-width stack. Perhaps I could use bit-masks for surfacing bytes? And integers for rolling and tucking?
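
For reference, here is a rough Python sketch of what the classic Forth words do on a stack of uniform cells. It only illustrates the standard `PICK`, `ROLL`, and `TUCK` behaviour, not the variable-width or bit-mask scheme I'm considering. The helper names and the use of a Python list (top of stack at the end) are assumptions made for the example.

```python
# Illustrative only: a Python list stands in for the data stack,
# with the top of the stack at the end of the list.

def pick(stack, n):
    """Copy the item n places below the top onto the top.

    0 pick behaves like DUP, 1 pick like OVER.
    """
    if n < 0 or n >= len(stack):
        raise IndexError("pick depth out of range")
    stack.append(stack[-1 - n])


def roll(stack, n):
    """Move the item n places below the top onto the top.

    0 roll does nothing, 1 roll behaves like SWAP, 2 roll like ROT.
    """
    if n < 0 or n >= len(stack):
        raise IndexError("roll depth out of range")
    stack.append(stack.pop(-1 - n))


def tuck(stack):
    """Copy the top item underneath the second item: ( a b -- b a b )."""
    if len(stack) < 2:
        raise IndexError("tuck needs at least two items")
    stack.insert(-2, stack[-1])


if __name__ == "__main__":
    s = [1, 2, 3]
    pick(s, 1)   # copies the 2
    print(s)     # [1, 2, 3, 2]
    roll(s, 3)   # moves the 1 to the top
    print(s)     # [2, 3, 2, 1]
    tuck(s)      # copies the 1 under the 2
    print(s)     # [2, 3, 1, 2, 1]
```

With uniform cells, the depth argument is just an index. The open question for a variable-width stack is what that argument should count: bytes, items, or something like a mask.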